ai:llm:what_word_comes_next
* It is helpful that the inputs and outputs are real numbers, because calculus and gradients can then be applied.
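
A minimal sketch of why real numbers help: with a real-valued cost we can estimate a gradient numerically and nudge a weight downhill. The toy cost function, starting weight, and learning rate are assumptions for illustration only.

<code python>
def toy_cost(weight):
    # Made-up cost curve with its minimum at weight = 2.0, for illustration only.
    return (weight - 2.0) ** 2

def numerical_gradient(f, x, h=1e-6):
    # Finite-difference estimate of the derivative df/dx.
    return (f(x + h) - f(x - h)) / (2 * h)

weight = 0.0
learning_rate = 0.1
for step in range(20):
    # Step the weight in the direction that lowers the cost.
    weight -= learning_rate * numerical_gradient(toy_cost, weight)

print(weight)  # close to 2.0: the gradient steps moved the weight towards the minimum
</code>
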
Lookup Table

* Create a lookup table, associating each word with other words in the same context, such as in the same sentence (a minimal sketch follows this list).
* Encode the position of each word in the context.
* Are there nouns at specific positions?
* For each word, are there adjectives at certain positions in front of the word?
* Determine the probability of each candidate word being the next word, out of the list of possible words.
* Use back-propagation to adjust weights.

Weights = how strongly each word is associated with other words.

* Bigger numbers matter more.
* Smaller numbers matter less; values near zero mean the words are unrelated.
* Use the dot product (see the sketch below):
  * v · w = v1*w1 + v2*w2 + v3*w3 (multiply matching components, then add them up).
* The dot product measures whether the two vectors point in the same direction.
  * A positive value means they point the same way (aligned).
  * Zero means they are perpendicular (unrelated).
  * A negative value means they point in opposite directions.

Activation

* Use softmax (see the sketch below).
* Softmax converts the raw scores so that they are all between zero and one and add up to one.

Aim for a low cost when determining the next word.

* The cost formula is: cost = -log(probability).
* A low cost, close to zero, corresponds to a probability close to one (see the sketch below).
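
A minimal check of the cost formula; the probabilities are assumed example values.

<code python>
import math

def cost(probability):
    # cost = -log(probability): low when the correct next word was given a high probability.
    return -math.log(probability)

print(cost(0.99))   # ~0.01: probability close to one -> cost close to zero
print(cost(0.50))   # ~0.69
print(cost(0.01))   # ~4.61: confidently wrong -> large cost
</code>

Back-propagation uses the gradient of this cost to nudge the weights so that the correct next word gets a higher probability, and therefore a lower cost, next time.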