
AI - LLM - What word comes next

Seed text → Predict next word

  • Probability distribution of possible next words.
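The seed-text → next-word idea can be sketched with a toy count-based model; the corpus and words here are made up purely for illustration:

```python
from collections import Counter, defaultdict

# Toy corpus, invented for this example.
corpus = "the cat sat on the mat the cat ate the food".split()

# For each word, count which words follow it in the corpus.
follow = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follow[word][nxt] += 1

def next_word_distribution(word):
    # Turn raw follow-counts into a probability distribution.
    counts = follow[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

dist = next_word_distribution("the")  # e.g. {"cat": 0.5, "mat": 0.25, "food": 0.25}
```

A real LLM learns this distribution with weights rather than raw counts, but the output has the same shape: one probability per candidate next word.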

Read input (it does not have to be purely text; it could be any input, including sound or images).

  • It helps if the input and output are real numbers, since calculus can then be applied to compute gradients.

Lookup Table

  • Create a lookup table, associating each word with other words in the same context, such as in the same sentence.
    • Encode the position of a word in the context.
    • Are there nouns at specific positions?
    • For each word, are there any adjectives in certain positions, in front of the word?
  • Determine the probability of each word being the next word, from the list of possible words.
  • Use back-propagation to adjust weights.
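The back-propagation step above can be sketched in miniature: a single weight is nudged against the gradient of a toy cost function. The cost function here is invented for illustration; a real network computes gradients of the prediction cost with respect to many weights at once:

```python
def cost(w):
    # Toy cost with its minimum at w = 3 (an assumption for this sketch).
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the toy cost with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1
for _ in range(100):
    w -= learning_rate * grad(w)  # gradient-descent weight update
```

After repeated updates, w settles near 3, the value that minimises the cost; this "adjust weights downhill" loop is the core of training.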

Weights = which words are associated with other words.

  • Bigger numbers matter more.
  • Smaller numbers indicate weaker associations; they matter less.
  • Use the dot product, as it is cheap to compute.
    • Multiply components pairwise: v1 * w1.
    • Sum the products: v1w1 + v2w2 + v3w3 = dot product.
      • Measures whether two vectors point in the same direction.
        • A positive value means they align.
        • Zero means they are perpendicular (unrelated).
        • A negative value means they point in opposite directions.
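The dot product described above is a one-line sum of pairwise products; the sign tells you whether two vectors align:

```python
def dot(v, w):
    # v1*w1 + v2*w2 + ... : pairwise products, summed.
    return sum(vi * wi for vi, wi in zip(v, w))

aligned = dot([1, 2], [2, 4])       # positive: same direction
unrelated = dot([1, 0], [0, 1])     # zero: perpendicular
opposite = dot([1, 2], [-1, -2])    # negative: opposite direction
```

Because it is just multiplies and adds, it is cheap, which is why it is the workhorse for measuring word-word association in these models.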

Activation

  • Use softmax
    • Converts the raw weights so they all lie between zero and one and add up to one (a probability distribution).
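A minimal softmax sketch; the max is subtracted before exponentiating, a standard trick for numerical stability that does not change the result:

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    # Each output is between 0 and 1, and the outputs sum to 1.
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

Larger scores map to larger probabilities, so the ordering of candidate next words is preserved.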

Aim for a low cost when determining the next word.

  • The cost formula is: cost = -log(probability).
    • A low cost, close to zero, occurs where the probability is close to one.
    • A low probability (i.e. failing to predict the next word) increases the cost very steeply.
  • There is a cost per word, but the same applies to the entire network.
    • The cost of the entire network is the sum or average of the per-word costs.
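The cost formula above, and the averaging over the network, can be written directly:

```python
import math

def word_cost(probability):
    # cost = -log(p): near zero when p is close to 1,
    # and climbing very steeply as p approaches 0.
    return -math.log(probability)

def network_cost(probabilities):
    # Average of the per-word costs (a sum would also work,
    # differing only by a constant factor).
    return sum(word_cost(p) for p in probabilities) / len(probabilities)
```

For example, a confident correct prediction (p = 0.99) costs almost nothing, while p = 0.01 costs far more, which is exactly the steepness the notes describe.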

ai/llm/what_word_comes_next.txt · Last modified: 2024/11/29 18:35 by peter
