Tags
Large Language Models (LLMs) are trained on huge amounts of text data to generate text in response to a question or some other kind of prompt.
In natural language processing (NLP), advanced LLMs show a level of intelligence that is unmatched by other technical approaches.
‘Intelligence’ here refers to the ability of an LLM to receive a complex prompt as input and generate an output that accurately predicts what the optimal response would be.
This capability is based on a specific type of deep learning architecture called a transformer, which employs the ‘multi-head attention’ mechanism proposed in the 2017 paper ‘Attention Is All You Need’.
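To make the attention idea more concrete, here is a minimal sketch in plain Python. It is not how any production LLM is implemented. It assumes the learned projection matrices of a real transformer are left out (treated as identity), so each "head" simply attends over its own slice of the token vectors. The function names `softmax`, `attention`, and `multi_head_attention`, and the toy `tokens` values, are illustrative choices and do not come from the sources.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return outputs


def multi_head_attention(tokens, num_heads):
    """Run self-attention independently on each slice ("head"), then concatenate.

    Simplifying assumption: the learned projections W_Q, W_K, W_V and W_O
    used in a real transformer are omitted here (treated as identity).
    """
    d_model = len(tokens[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    merged = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        head_in = [t[lo:hi] for t in tokens]
        head_out = attention(head_in, head_in, head_in)  # self-attention
        for row, out in zip(merged, head_out):
            row.extend(out)
    return merged


# Three toy "token" vectors of dimension 4 (made-up values).
tokens = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
result = multi_head_attention(tokens, num_heads=2)
for row in result:
    print([round(x, 3) for x in row])
```

Each output row has the same length as its input token, but every entry is now a weighted mix of the corresponding entries across all tokens. Splitting the vector into several heads lets each slice compute its own set of weights, which is the intuition behind "multi-head".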
Sources
- https://aws.amazon.com/what-is/large-language-model/
- https://www.youtube.com/watch?v=zjkBMFhNj_g&t=0s