Why LLMs Are Different

Large Language Models (LLMs) are trained on huge amounts of text data so that they can generate textual output in response to a question or some other kind of prompt.

On natural language processing (NLP) tasks, advanced LLMs show a level of intelligence that is unmatched by other technical approaches.

‘Intelligence’ here refers to the ability of an LLM to receive a complex prompt as input and generate an output that accurately predicts what the optimal response would be.

This capability is based on a specific type of deep learning architecture called a transformer, which employs the ‘multi-head attention’ mechanism proposed in the 2017 paper ‘Attention Is All You Need’.
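
To make the multi-head attention idea more concrete, here is a minimal sketch in plain Python (standard library only). It is an illustration under stated assumptions, not how any production LLM is implemented: the projection matrices are random rather than learned, the sizes (4 tokens, 8 features, 2 heads) are arbitrary, and all function and variable names are hypothetical. Each head projects the input into queries, keys and values, computes scaled dot-product attention, and the head outputs are then concatenated and projected once more.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head; returns (output, weights)."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose the keys
    scores = matmul(q, k_t)               # one score per (query, key) pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Random stand-in for a learned weight matrix (assumption for the demo)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(x, heads, w_o):
    """heads is a list of (w_q, w_k, w_v) projection triples, one per head."""
    per_head = []
    for w_q, w_k, w_v in heads:
        out, _ = attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        per_head.append(out)
    # Concatenate the head outputs along the feature dimension, token by token.
    concat = [sum((head[i] for head in per_head), []) for i in range(len(x))]
    return matmul(concat, w_o)


rng = random.Random(0)
seq_len, d_model, num_heads = 4, 8, 2  # arbitrary illustrative sizes
d_head = d_model // num_heads

x = random_matrix(seq_len, d_model, rng)  # stand-in for token embeddings
heads = [
    tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
    for _ in range(num_heads)
]
w_o = random_matrix(num_heads * d_head, d_model, rng)

output = multi_head_attention(x, heads, w_o)
print(len(output), len(output[0]))  # prints "4 8": one 8-feature vector per token
```

Each row of attention weights sums to 1, so every token's output is a weighted mix of all tokens' values. Running several heads in parallel lets each one learn a different mixing pattern over the same input.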

Sources

https://aws.amazon.com/what-is/large-language-model/
https://www.youtube.com/watch?v=zjkBMFhNj_g&t=0s