GenAI – What Are Tokens In LLMs?



Table Of Contents:

  1. What Are Tokens In LLMs?
  2. Definition Of Tokens.
  3. Examples Of Tokens.
  4. How Does Tokenization Work?
  5. Why Do Tokens Matter?
  6. Tokens Vs Characters Vs Words.
  7. Tokenization Techniques.
  8. Tokenization Tools.

(1) What Is Tokenization?

(2) Definition Of Tokens.

(3) Examples Of Tokens.

(4) Why Do Tokens Matter?

(5) Tokens Vs Characters Vs Words

(6) What Is A Token ID & How Does An LLM Use It?
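
As a rough illustration of this heading, the sketch below maps tokens to integer IDs through a vocabulary and then uses each ID as a row index into an embedding table. The vocabulary, the `encode` and `embed` helpers, and the 4-dimensional random embeddings are all made up for illustration; a real model learns its embedding values during training and uses a far larger vocabulary.

```python
import random

# Toy vocabulary (hypothetical): token string -> integer token ID.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}


def encode(tokens):
    """Map each token to its ID, falling back to <unk> for unknown tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


# Toy embedding table: one small vector per token ID.
# Random values stand in for weights a model would learn.
random.seed(0)
EMBED_DIM = 4
embedding_table = [
    [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in vocab
]


def embed(token_ids):
    """Look up the vector for each token ID (the ID is just a row index)."""
    return [embedding_table[i] for i in token_ids]


token_ids = encode(["the", "cat", "sat", "on", "the", "mat"])
vectors = embed(token_ids)
```

Both occurrences of "the" get the same ID and therefore the same input vector; the model never sees the text itself, only these IDs and the vectors they select.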

(7) Will The Same Word Always Get The Same Token ID?

(8) Different Tokenization Techniques.

  1. Word Level Tokenization
  2. Character Level Tokenization
  3. Subword Tokenization (Most Popular)
    1. Byte Pair Encoding
    2. WordPiece
    3. SentencePiece
    4. Unigram Language Model
  4. Byte Level BPE
  5. Tokenization with Special Tokens

(9) Word Level Tokenization
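
A minimal sketch of word level tokenization, assuming a simple regex split into words and punctuation marks. The `word_tokenize` and `build_vocab` helpers are hypothetical names used only here; production tokenizers handle contractions, Unicode, and casing with more care.

```python
import re


def word_tokenize(text):
    """Split text into words and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(tokens):
    """Assign an ID to each distinct token, reserving 0 for unknown words."""
    vocab = {"<unk>": 0}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


tokens = word_tokenize("The cat sat. The dog sat too!")
vocab = build_vocab(tokens)

# "bird" was never seen while building the vocabulary, so it maps to <unk>.
ids = [vocab.get(t, vocab["<unk>"]) for t in word_tokenize("The bird sat.")]
```

The unseen word "bird" shows the main weakness of this approach: any word outside the vocabulary collapses to a single unknown ID.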

(10) Character Level Tokenization
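
A minimal sketch of character level tokenization, where every character is its own token. The `char_tokenize` helper and the vocabulary built from a single word are illustrative assumptions only.

```python
def char_tokenize(text):
    """Treat every character as a separate token."""
    return list(text)


text = "tokens"
chars = char_tokenize(text)

# Build a tiny vocabulary from the distinct characters, in sorted order.
char_vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
ids = [char_vocab[ch] for ch in chars]
```

The vocabulary stays very small, but sequences get long: a six-letter word becomes six tokens.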

(11) Subword Tokenization
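
To make the subword idea concrete, here is a small sketch of how Byte Pair Encoding merges could be learned: start from characters and repeatedly merge the most frequent adjacent pair. The toy corpus, its word counts, and the helper names (`get_pair_counts`, `merge_pair`, `learn_bpe`) are assumptions for illustration; real implementations add end-of-word markers, pre-tokenization, and much larger corpora.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs across all words, weighted by frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a word -> count mapping."""
    words = {tuple(w): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties go to the pair seen first
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words


# Toy corpus (hypothetical counts).
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, segmented = learn_bpe(corpus, num_merges=3)
```

After three merges on this corpus, the frequent ending "est" has become a single symbol, so "newest" and "widest" share a subword even though the whole words differ.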

(12) Byte Level BPE
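
Byte level BPE starts from raw UTF-8 bytes instead of characters, so the base vocabulary has only 256 entries and any string can be represented without an unknown token. The sketch below shows just that starting point (text to byte IDs and back), not the merge learning itself; the helper names are hypothetical.

```python
def to_byte_tokens(text):
    """Encode text as UTF-8 and treat each byte (0-255) as a base token."""
    return list(text.encode("utf-8"))


def from_byte_tokens(byte_ids):
    """Reverse the mapping: byte IDs back to the original string."""
    return bytes(byte_ids).decode("utf-8")


ascii_ids = to_byte_tokens("hi")             # one byte per ASCII character
accented_ids = to_byte_tokens("\u00e9")      # "e" with acute accent: two bytes
emoji_ids = to_byte_tokens("\U0001F642")     # an emoji: four bytes
```

Merges would then be learned over these byte sequences, in the same way as ordinary BPE merges are learned over characters.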

(13) Tokenization with Special Tokens
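
A minimal sketch of how special tokens might be added around a sequence: a start marker, an end marker, an unknown-word fallback, and padding up to a fixed length. The token names (`<pad>`, `<bos>`, `<eos>`, `<unk>`), their IDs, and the `encode_with_specials` helper are assumptions; each model family defines its own set of special tokens.

```python
# Hypothetical special tokens, reserved at the start of the vocabulary.
SPECIAL_TOKENS = {"<pad>": 0, "<bos>": 1, "<eos>": 2, "<unk>": 3}

vocab = dict(SPECIAL_TOKENS)
for word in ["hello", "world"]:
    vocab[word] = len(vocab)


def encode_with_specials(tokens, max_len):
    """Wrap tokens in <bos>/<eos>, map unknowns to <unk>, pad to max_len."""
    ids = [vocab["<bos>"]]
    ids += [vocab.get(t, vocab["<unk>"]) for t in tokens]
    ids.append(vocab["<eos>"])
    # Pad on the right; sequences longer than max_len are left unpadded.
    ids += [vocab["<pad>"]] * max(0, max_len - len(ids))
    return ids


ids = encode_with_specials(["hello", "there", "world"], max_len=7)
```

Here "there" is not in the vocabulary, so it becomes `<unk>`, and two `<pad>` IDs fill the sequence out to length 7.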

(14) Tokenization Libraries
