Decoding AI jargons with Chai


Google published the Transformer architecture in 2017 in the paper “Attention Is All You Need”, whose title refers to the multi-head attention mechanism at the heart of the model.
Encoder:- The encoder analyses the input sequence and turns it into a numerical representation the model can work with. When the user provides an input, it is passed through the encoder’s layers so the model can build an understanding of the whole sequence.
Decoder:- The decoder takes the encoder’s representation and generates the output sequence, one token at a time.
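To make the encoder/decoder split concrete, here is a minimal sketch using PyTorch’s built-in `nn.Transformer`; the tensor sizes are made up for illustration, not taken from any real model:

```python
import torch
import torch.nn as nn

# A tiny encoder-decoder Transformer using PyTorch's built-in module.
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 1, 32)  # input sequence: 10 tokens, batch size 1, 32-dim embeddings
tgt = torch.rand(7, 1, 32)   # output generated so far: 7 tokens

# The encoder turns `src` into contextual representations;
# the decoder attends to them while producing the output sequence.
out = model(src, tgt)
print(out.shape)  # torch.Size([7, 1, 32])
```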
Vector:- An array of floating-point numbers that captures a token’s semantic meaning is called a vector.
Embeddings:- Embeddings convert input text (a word, sentence, phrase, etc.) into numerical vectors, so that the model can capture its semantic meaning and context.
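Here is a toy sketch of how embeddings behave; the words and 4-dimensional vectors below are invented for illustration (real embeddings have hundreds or thousands of dimensions and are learned from data):

```python
import numpy as np

# Toy 4-dimensional embeddings (made-up values for illustration).
embeddings = {
    "cat":   np.array([0.9, 0.1, 0.3, 0.0]),
    "dog":   np.array([0.8, 0.2, 0.4, 0.1]),
    "piano": np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine_similarity(a, b):
    # Measures how closely two vectors point in the same direction.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))    # high: similar meaning
print(cosine_similarity(embeddings["cat"], embeddings["piano"]))  # low: unrelated
```

Words with similar meanings end up with vectors pointing in similar directions, which is what lets the model reason about semantics numerically.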
Positional Encoding:- A token embedding alone says nothing about word order, so positional encoding adds each token’s position in the sequence to its embedding. Combined with self-attention, where each token attends to every other token, this lets the model work out context. For example, “the river Bank” and “the ICICI Bank” share the word “Bank”, but the surrounding tokens give it a different meaning, as the sketch below helps illustrate.
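A minimal NumPy version of the sinusoidal positional encoding from “Attention Is All You Need” (the sequence length and dimension below are arbitrary):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal encoding from "Attention Is All You Need":
    # PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]  # (1, d_model/2)
    angles = positions / np.power(10000, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

pe = positional_encoding(seq_len=6, d_model=8)
print(pe.shape)  # (6, 8) -- one 8-dim position signal per token, added to its embedding
```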
Semantic meaning:- Let’s understand it with the help of an example:

Modi ——→ India
Trump ——→ ?

What goes in place of the question mark? The answer is USA. Modi is the leader of India, so by the same relationship the semantic meaning pairs Trump with the USA. In vector terms, the “leader of” relationship points in roughly the same direction for both pairs, as the sketch below shows.
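The analogy can be expressed as vector arithmetic; the 3-dimensional vectors below are hand-picked toy values, not real learned embeddings:

```python
import numpy as np

# Made-up 3-dim vectors purely for illustration; real word vectors
# (e.g. word2vec, GloVe) learn this kind of structure from data.
vec = {
    "Modi":  np.array([0.9, 0.1, 0.2]),
    "India": np.array([0.9, 0.8, 0.2]),
    "Trump": np.array([0.1, 0.1, 0.9]),
    "USA":   np.array([0.1, 0.8, 0.9]),
}

# Analogy arithmetic: India - Modi + Trump should land near USA,
# the same way king - man + woman lands near queen.
result = vec["India"] - vec["Modi"] + vec["Trump"]
print(result)      # [0.1 0.8 0.9]
print(vec["USA"])  # [0.1 0.8 0.9] -- the closest vector, so the answer is "USA"
```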
Softmax:- Softmax turns the model’s raw scores for each candidate word into probabilities. When the user types “How are”, the LLM scores many possible next words (you, your, I, she, her, etc.); softmax converts those scores into probabilities, and the word with the highest probability is chosen as the output.
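A minimal NumPy softmax over hypothetical scores for the “How are” example (the logit values are invented for illustration):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability, then normalise to sum to 1.
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

# Hypothetical raw scores (logits) for the next word after "How are":
candidates = ["you", "your", "I", "she", "her"]
logits = np.array([4.0, 1.5, 0.5, 0.2, 0.1])

probs = softmax(logits)
for word, p in zip(candidates, probs):
    print(f"{word}: {p:.3f}")
# "you" gets by far the highest probability, so it is the likeliest next token.
```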
Temperature:- Temperature controls the randomness of the output. A high temperature makes the output more creative and varied; a low temperature makes it more concise and deterministic.
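Temperature is typically applied by dividing the raw scores before softmax; reusing the toy scores from the example above:

```python
import numpy as np

def softmax(logits):
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

logits = np.array([4.0, 1.5, 0.5, 0.2, 0.1])  # same hypothetical scores as above

# Temperature divides the logits before softmax.
for temp in (0.2, 1.0, 2.0):
    print(temp, np.round(softmax(logits / temp), 3))
# Low temperature (0.2) pushes almost all probability onto the top word (deterministic);
# high temperature (2.0) flattens the distribution, so sampling becomes more varied.
```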
Knowledge-Cutoff:- An LLM is pre-trained on a fixed snapshot of data. Anything that happened after that data was collected lies beyond the model’s knowledge cutoff, so the model cannot know about it.