How LLM Prompts Work

Prompt engineering is the practice of crafting natural-language inputs to large language models (LLMs) that elicit the best possible outputs.

Inside an LLM, words or phrases from the input prompt are converted into high-dimensional vectors using embedding techniques. These vector representations capture not only the semantic meaning of individual words but also aspects of their context and relationships to other words.
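
As a minimal sketch of this embedding step, the snippet below uses the sentence-transformers library; the library choice and model name are assumptions made for illustration, not something specified by the article or its sources:

```python
# Turn text into high-dimensional embedding vectors.
# "all-MiniLM-L6-v2" is one common off-the-shelf model (an assumption here).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # maps text to 384-dim vectors

vectors = model.encode([
    "Summarize this article in one paragraph.",
    "Write a one-paragraph summary of this article.",
])
print(vectors.shape)  # (2, 384): one high-dimensional vector per input
```

Note that the two near-synonymous prompts above produce nearby vectors, which is exactly the property the next paragraph relies on.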

When prompts are specific and clear, their vector representations are more likely to align closely with the regions of vector space that represent the desired answers or outputs. The model can then access and generate the relevant information more easily and accurately, because the vectors guide it toward the specific area of the "knowledge" encoded in its parameters.
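
One way to get an intuition for this alignment is to compare cosine similarities between prompt and answer embeddings. The sketch below does this with sentence-transformers; the prompts, target text, and model are illustrative assumptions, and cosine similarity is only a rough proxy for the alignment described above:

```python
# Compare how closely a specific vs. a vague prompt sits to a target answer
# in embedding space. All texts here are invented for illustration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

specific = model.encode("List three causes of the 1929 stock market crash.")
vague = model.encode("Tell me about some history.")
target = model.encode("The 1929 crash was driven by speculation, margin "
                      "buying, and weak banking regulation.")

print(util.cos_sim(specific, target))  # noticeably higher similarity
print(util.cos_sim(vague, target))     # lower: the vague prompt sits farther away
```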

The attention mechanism allows the model to focus dynamically on different parts of the input prompt as it generates each word of the output. It does this by assigning each part of the input a weight that determines how much focus it receives, based on its relevance to the task at hand.
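
The toy NumPy implementation below shows the core of this mechanism, scaled dot-product attention; the shapes and random values are invented purely for illustration:

```python
# Scaled dot-product attention: scores measure how relevant each prompt
# token is to the current query, and softmax turns scores into weights.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # relevance of each prompt position
    weights = softmax(scores)        # normalized attention weights
    return weights @ V, weights      # weighted mix of the value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 8))  # query for the output word being generated
K = rng.normal(size=(5, 8))  # keys for five prompt tokens
V = rng.normal(size=(5, 8))  # values for those tokens

output, weights = attention(Q, K, V)
print(weights.round(2))  # how much focus each prompt token receives
```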

By including instructions and examples in a prompt, you effectively guide the model's attention toward the most relevant aspects of the input. Instructions act as a directive, highlighting what kind of information or reasoning process the model should prioritize. Examples serve as concrete instances from which the model can derive patterns or templates, making it easier for the attention mechanism to identify the relevant features or strategies for generating the correct output.
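
A short sketch of such a prompt, pairing an instruction with worked examples (often called few-shot prompting), is shown below; the classification task and example texts are invented for illustration:

```python
# Build a few-shot prompt: instruction, then labeled examples, then the
# new input the model should complete.
instruction = "Classify the sentiment of each review as positive or negative."

examples = [
    ("The battery lasts all day and the screen is gorgeous.", "positive"),
    ("It stopped working after a week and support never replied.", "negative"),
]

query = "Setup took two minutes and it just works."

parts = [instruction, ""]
for text, label in examples:
    parts += [f"Review: {text}", f"Sentiment: {label}", ""]
parts += [f"Review: {query}", "Sentiment:"]

prompt = "\n".join(parts)
print(prompt)  # send this string to any LLM completion endpoint
```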

Sources

  1. https://github.blog/2023-07-17-prompt-engineering-guide-generative-ai-llms/
  2. https://www.thoughtworks.com/en-gb/insights/blog/machine-learning-and-ai/how-to-make-use-of-llms