AI - LLM - What word comes next
Seed text → Predict next word
- Probability distribution of possible next words.
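A minimal sketch of this idea; the seed text, candidate words, and probabilities below are invented for illustration (a real model computes them from learned weights):

<code python>
# Hypothetical probability distribution over possible next words
# for the seed text "the cat sat on the".
next_word_probs = {
    "mat": 0.55,
    "floor": 0.20,
    "sofa": 0.15,
    "moon": 0.10,
}

# Simplest prediction strategy: take the highest-probability word.
prediction = max(next_word_probs, key=next_word_probs.get)
print(prediction)  # mat
</code>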
Read input (it does not have to be purely text; it could be any input, including sound or images).
- It is helpful if the input and output are real numbers, since calculus and gradients can then be applied.
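As a sketch of why real numbers help, assume each word is stored as a small vector of floats (the values and the gradient below are arbitrary placeholders; a real model learns them during training):

<code python>
# Toy numeric encoding: one real-valued vector per word.
embedding = {
    "cat": [0.2, -0.1, 0.7],
    "sat": [0.5, 0.3, -0.2],
}

# Because the representation is numeric, a loss can be differentiated
# with respect to each value and nudged by gradient descent.
gradient = [0.1, -0.05, 0.02]  # hypothetical dLoss/dEmbedding for "cat"
lr = 0.01                      # learning rate
embedding["cat"] = [v - lr * g for v, g in zip(embedding["cat"], gradient)]
print(embedding["cat"])
</code>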
Lookup Table
- Create a lookup table, associating each word with other words in the same context, such as in the same sentence.
- Encode the position of a word in the context.
- Are there nouns at specific positions?
- For each word, are there adjectives at certain positions in front of it?
- Determine the probability that a given word is the next word, out of the list of possible words.
- Use back-propagation to adjust weights.
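A minimal sketch of such a lookup table trained by gradient descent (the single-layer case of back-propagation). The vocabulary and word pairs are invented, and the table only associates one preceding word with the next word, which is far simpler than the positional context described above:

<code python>
import math
import random

vocab = ["the", "cat", "sat", "mat"]
# Invented training pairs: (context word, next word).
pairs = [("the", "cat"), ("cat", "sat"), ("sat", "the"), ("the", "mat")]

# weights[c][n] scores how strongly context word c predicts next word n.
weights = {c: {n: 0.0 for n in vocab} for c in vocab}

def softmax(scores):
    exps = {w: math.exp(s) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

lr = 0.5
for step in range(200):
    context, target = random.choice(pairs)
    probs = softmax(weights[context])
    # Cross-entropy gradient w.r.t. each score is (prob - 1) for the
    # target word and (prob - 0) for every other word.
    for word in vocab:
        grad = probs[word] - (1.0 if word == target else 0.0)
        weights[context][word] -= lr * grad

# "the" was followed by both "cat" and "mat" in training, so the
# learned distribution splits its probability between them.
print(softmax(weights["the"]))
</code>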
Weights = which words are associated with other words.
- Bigger numbers matter more.
- Smaller numbers indicate unrelated words; they matter less.
- Use the dot product, as it is cheaper to compute.
- Multiply each pair of components: v1 * w1 = v1w1.
- Sum the products: v1w1 + v2w2 + v3w3 = dot product.
- The dot product measures whether two vectors point in the same direction.
- A positive value means they point the same way (aligned).
- Zero means they are perpendicular (unrelated).
- A negative value means they point in opposite directions.
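A quick numeric illustration of these three cases (vectors invented for the example):

<code python>
def dot(v, w):
    # Multiply component pairs, then sum the products.
    return sum(vi * wi for vi, wi in zip(v, w))

print(dot([1.0, 2.0], [2.0, 1.0]))    # 4.0  -> positive: aligned, related
print(dot([1.0, 0.0], [0.0, 1.0]))    # 0.0  -> perpendicular, unrelated
print(dot([1.0, 2.0], [-1.0, -2.0]))  # -5.0 -> negative: opposite
</code>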
Activation
- Use softmax
- Converts the weights so they all lie between zero and one and sum to one.
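A minimal softmax sketch (the input scores are arbitrary):

<code python>
import math

def softmax(scores):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # each value lies between zero and one
print(sum(probs))  # 1.0
</code>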
Aim for a low cost in determining the next word.
- The cost formula is: cost = -log(probability).
- A low cost, close to zero, occurs when the probability is close to one.
- A low probability (i.e. failing to predict the next word) increases the cost very steeply.
- There is a cost per word, but the same applies to the entire network.
- The cost of the entire network is the sum (or average) of the per-word costs.
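A short numeric illustration of the cost behaviour (the probabilities are invented for the example):

<code python>
import math

# Per-word cost: -log of the probability assigned to the correct word.
print(-math.log(0.99))  # ~0.01 -> confident and correct: near-zero cost
print(-math.log(0.01))  # ~4.61 -> low probability: cost rises steeply

# Network-level cost: average (or sum) of the per-word costs.
word_probs = [0.9, 0.7, 0.05]
costs = [-math.log(p) for p in word_probs]
print(sum(costs) / len(costs))  # mean cost over the sequence
</code>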