Tokenization
Tokenization is the process of converting text into a sequence of tokens, which can be words, sub-words, or individual characters. It may seem to us that LLMs understand Hindi, English, numbers, practically everything in human existence, but under the hood the model never sees raw text at all: it only ever sees sequences of these tokens, each represented by an integer ID.
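To make this concrete, here is a minimal sketch of word-level and character-level tokenization, plus the token-to-ID mapping that models actually consume. The toy vocabulary is built on the fly purely for illustration and is not how a real LLM tokenizer (which uses a fixed, pretrained sub-word vocabulary) works:

```python
# Illustrative sketch only: a real tokenizer uses a fixed, pretrained
# sub-word vocabulary (e.g. learned BPE merges), not one built per sentence.

text = "LLMs read numbers, not words"

# 1. Word-level tokens: split on whitespace.
word_tokens = text.split()

# 2. Character-level tokens: every character is its own token.
char_tokens = list(text)

# 3. Map tokens to integer IDs, since models only consume numbers.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(word_tokens)))}
token_ids = [vocab[tok] for tok in word_tokens]

print(word_tokens)      # ['LLMs', 'read', 'numbers,', 'not', 'words']
print(char_tokens[:5])  # ['L', 'L', 'M', 's', ' ']
print(token_ids)        # [0, 3, 2, 1, 4]
```

Sub-word tokenization, the approach actual LLMs use, sits between these two extremes: common words stay whole while rare words get split into smaller reusable pieces.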