Why Do We Need Transformers?

What was going wrong with LSTMs/RNNs?
LSTMs/RNNs are good at processing text word by word. They carry information about past words through a hidden state (a kind of memory slot). To give a brief idea, the hidden state stores information about the position, order, and other properties of the input tokens. All of this information is compressed into the hidden state.
Alright, coming back to our discussion. Everything was going well until LSTMs/RNNs had to process very long sentences, where their performance took a hit. These networks struggled to retain relevant information across longer sentences, and on top of that, they failed to capture the relationships between distant words.
Note: Tokens are small chunks of text (whole words or pieces of words) converted into a form the model can work with. I will talk about tokens in an upcoming blog post.
Also, LSTMs/RNNs are slow to train because tokens are processed one at a time, sequentially, which prevents parallelization and demands more computation.
LSTMs/RNNs also struggle with vanishing/exploding gradients, all of which contributes to poor long-term memory and slow training.
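To see why the sequential part hurts, here is a minimal NumPy sketch of an RNN-style loop (the shapes and weights are made up purely for illustration): each step needs the previous hidden state, so there is no way to process the tokens in parallel.

```python
import numpy as np

hidden_size, embed_size = 8, 4
W_h = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden weights
W_x = np.random.randn(hidden_size, embed_size) * 0.1   # input-to-hidden weights

tokens = np.random.randn(10, embed_size)  # 10 toy token embeddings
h = np.zeros(hidden_size)                 # the "memory slot"

for x in tokens:                      # strictly sequential: step t needs step t-1
    h = np.tanh(W_h @ h + W_x @ x)    # all past info gets compressed into h
```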
How do Transformers solve these problems?
Transformers use something called "attention", a very smart approach that Google AI researchers came up with.
Solution to Forgetfulness
Attention lets the model focus on the relevant parts of the input. When processing any word, the model can look back at all the other words in the sentence and assign importance scores to them. This direct connection to every other word eliminates the long-range memory problem.
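To make "importance scores" concrete, here is a toy NumPy sketch for a single word (the vectors and sizes are made up, and real Transformers also apply learned query/key/value projections, which I skip here):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

words = np.random.randn(5, 4)   # 5 word vectors of size 4
query = words[2]                # the word currently being processed

# One similarity score per word in the sentence, normalized to sum to 1.
scores = softmax(words @ query / np.sqrt(4))

# A direct, weighted look at every other word, near or far.
context = scores @ words
```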
Solution to Slowness
Attention allows the model to calculate the importance scores for all words (tokens) at once, so the entire sequence can be processed in parallel. This drastically improves training speed.
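And here is that parallelism in the same toy NumPy sketch (same made-up vectors, same simplifications as above): the scores for every word against every other word come out of a single matrix multiplication instead of a loop.

```python
import numpy as np

words = np.random.randn(5, 4)           # 5 word vectors of size 4

# Every word compared with every word, all at once.
scores = words @ words.T / np.sqrt(4)

# Row-wise softmax: each word's importance scores sum to 1.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

contexts = weights @ words              # new representation for every word, in parallel
```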
That’s it for today. This was all about how Transformers improve on LSTMs/RNNs. I wanted to keep this short, a 3-5 minute read.
For a more visual and comprehensive understanding of the 'Attention Is All You Need' paper (the foundation of modern LLMs), I highly recommend watching this video.