AI: From Language Translator to the Biggest Revolution

We have all used ChatGPT or some kind of “AI” in one way or another. What we call AI today is built on an architecture Google originally developed years ago for translation. Different languages have different constraints, definitions and contexts for the same words, so translating something from English into a native language is not just a matter of swapping words; the model has to understand the whole sentence and the context in which each word is used. To solve this, Google developed the Transformer architecture, which is now used as “the next word predictor”, as Ilya Sutskever, OpenAI’s co-founder and former chief scientist, has described it.
No matter how complicated this diagram may look, it is as simple as it gets.
Step 1: The words are turned into something we call tokens. In the input-embedding stage, each word is mapped to a number called a token, and this process is called tokenisation. Since transformers don't understand order by default, positional encoding is added to indicate the position of each word in the sentence, as sketched below.
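To make Step 1 concrete, here is a minimal Python sketch. It assumes a toy whitespace tokeniser and a made-up five-word vocabulary (real models use learned subword tokenisers such as BPE), and the positional encoding follows the sinusoidal formula from the original Transformer paper.

```python
# A minimal sketch of Step 1: words -> token ids -> positional encoding.
# The vocabulary and tokeniser here are toy assumptions for illustration.
import numpy as np

sentence = "the cat sat on the mat"
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}  # toy vocabulary

# Tokenisation: each word is linked with a number (its token id)
tokens = [vocab[word] for word in sentence.split()]
print(tokens)  # [0, 1, 2, 3, 0, 4]

# Sinusoidal positional encoding, so the model knows *where* each token sits
def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
    i = np.arange(d_model)[None, :]                # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])         # even dimensions use sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])         # odd dimensions use cosine
    return enc

print(positional_encoding(len(tokens), 8).shape)   # (6, 8): one row per token
```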
Step 2: We now have words linked to numbers, but those numbers don't carry any real meaning yet. To understand what the words mean, what context they appear in, and what is important enough to shape the output, the model uses vector embeddings together with multi-head attention. Multi-head attention means looking at all the words at once and working out which ones matter, while vector embeddings store the semantic relationship of each word (what it means in real life) as numbers that can be visualised as points in a 3D graph.
This is what a basic vector-embedding graph looks like; you can visit https://projector.tensorflow.org/ to explore it interactively. A small sketch of embeddings and attention follows below.
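Here is a minimal sketch of Step 2, assuming random, untrained embedding vectors purely for illustration; in a trained model these vectors are learned so that words with related meanings end up close together, which is what the Projector link above visualises. The attention function is a single-head, projection-free simplification of what the real architecture does.

```python
# A minimal sketch of Step 2: embedding lookup + self-attention.
# The embedding table is random here; real models learn these values.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 5, 8
embedding_table = rng.normal(size=(vocab_size, d_model))

tokens = [0, 1, 2, 3, 0, 4]                  # token ids from Step 1
x = embedding_table[tokens]                   # (6, 8): one vector per word

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Scaled dot-product self-attention (one head): every word scores every other
# word, and the softmax weights say how much attention it pays to each.
def self_attention(x):
    q, k, v = x, x, x                         # toy case: no learned projections
    scores = q @ k.T / np.sqrt(x.shape[-1])   # (6, 6) word-to-word relevance
    weights = softmax(scores)                 # each row sums to 1
    return weights @ v                        # each word becomes a weighted mix

print(self_attention(x).shape)                # (6, 8)
```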
The model then applies the learnt information to generate the output: the input and the output of each layer are added together (a residual connection, sketched below), and the process is repeated across several stacked layers until the desired result is generated.
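Here is a rough sketch of that "add and repeat" idea, assuming a toy two-layer feed-forward network as the sub-layer; the only point is that the input is added back onto the layer's output and the block is stacked several times.

```python
# A minimal sketch of residual connections and stacked layers.
# The feed-forward sub-layer here is a toy assumption for illustration.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 32)) * 0.1
W2 = rng.normal(size=(32, 8)) * 0.1

def feed_forward(x):
    return np.maximum(0, x @ W1) @ W2   # simple two-layer network with ReLU

def block(x):
    return x + feed_forward(x)          # input and sub-layer output added together

x = rng.normal(size=(6, 8))             # token vectors from the earlier steps
for _ in range(6):                      # the block is repeated several times
    x = block(x)
print(x.shape)                          # still (6, 8)
```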
Step 3: The model calculates a probability for every word it could generate next, and the word with the highest probability is accepted as the output, as in the sketch below.
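Here is a minimal sketch of Step 3, assuming made-up scores (logits) over the same toy vocabulary; softmax turns the scores into probabilities, and the highest-probability word is chosen as the next output.

```python
# A minimal sketch of Step 3: scores -> probabilities -> next word.
# The logits are hypothetical values, not real model output.
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([1.2, 0.3, 2.5, -0.7, 0.9])   # hypothetical model scores

probs = np.exp(logits - logits.max())
probs = probs / probs.sum()                      # softmax: probabilities sum to 1

next_word = vocab[int(np.argmax(probs))]
print(dict(zip(vocab, probs.round(3))))
print("predicted next word:", next_word)         # "sat" has the highest probability
```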
With every input, these learnings are applied to generate and weigh probabilities more efficiently, and the outputs keep getting refined.