The Questions That Kept Me Up All Night: My Sleepless Struggle to Understand NLP
What is the ultimate goal of NLP?
Is it just to make computers understand human language and perform tasks like summarization, translation, or answering questions?
Is it necessary to convert text into a tabular format for ML/NLP or DL-NLP?
When working with NLP models, do we always have to convert text data into a numerical/tabular format for the machine to understand?
Are the building blocks of NLP interlinked and sequential?
Do the building blocks of NLP, like morphology, syntax, semantics, and pragmatics, work together in a specific order, or can they be applied independently?
How does the syntax step understand the structure of a sentence after tokenization?
If we’ve already broken a sentence into tokens, how does the syntax step (like POS tagging) understand the structure of the sentence?
If words are lemmatized, how do the context and meaning still get preserved without POS tagging or full words?
After lemmatization, we might end up with words like “boy” instead of “boys” and “run” instead of “running.” How does the model still understand the meaning and context of the sentence?
Can a machine understand a sentence after we break it down into tokens, remove stop words, and lemmatize it?
How does a machine process a sentence when we remove words like “the” or “are” and lemmatize other words? Can it still capture the true meaning of the sentence?
How do models understand language when the input is just numbers (vectors)?
After converting words into vectors, how does a machine understand the meaning and context? How can numbers preserve the deeper meanings of words and sentences?
How does a word converted into a vector still have its meaning?
How can a vector representing a word like “king” retain its meaning, or even capture relationships like “king” is to “queen” as “man” is to “woman”?
How do modern NLP models handle words in different contexts, like "apple" in "apple pie" versus "Apple Inc."?
How do models like BERT or GPT manage to differentiate the meaning of the same word in different contexts?
Are NLP building blocks like morphology, syntax, semantics, and pragmatics all required to understand a sentence, or can we skip any steps?
Are all these building blocks necessary, or can we still understand a sentence by focusing on just one or two of these layers?
How does a machine know what it did during preprocessing?
After preprocessing text (like tokenization, lemmatization, etc.), do the models “remember” these steps, and how do they use this processed information for understanding tasks?
How do models learn to understand the meaning of a sentence just from vectors and numbers?
How do NLP models know that a sentence with words represented by vectors can be understood in terms of meaning, context, and relationships?
How does a model transform vectors back into human-readable language?
After processing text into vectors, how do NLP models generate sentences or words that make sense to humans?
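To make the “king” and “queen” question above a little more concrete, here is a minimal sketch in Python. The three-dimensional vectors and their dimension labels are made up by hand purely for illustration; real word embeddings are learned from large amounts of text and have far more dimensions. The helper names `cosine` and `analogy` are hypothetical as well.

```python
import math

# Hand-made toy vectors, NOT real embeddings (assumption for illustration).
# Hypothetical dimensions: [royalty, maleness, femaleness]
toy_vectors = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


# "king" - "man" + "woman": which remaining word is nearest?
result = analogy("king", "man", "woman", toy_vectors)
print(result)
```

With these toy numbers, `king - man + woman` lands on the vector for `queen`, which is one sense in which a relationship can live in the arithmetic between vectors. The sketch says nothing about how real models learn such vectors in the first place.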
I am still getting the answers to these questions from ChatGPT and from friends who are pros in NLP.
Conclusion: The NLP Journey (Beginning)
Learning NLP was a rollercoaster of questions and confusion. From tokenization to understanding how machines process meaning, I often found myself lost in the complexity of it all. But with each challenge, I gained new insights into how language can be broken down, understood, and transformed into something a machine can work with.
The key takeaway? The process of learning NLP isn't just about algorithms—it's about asking the right questions and embracing the journey. So, if you're feeling stuck, don't worry. Keep questioning, and the answers will come.
...Although there are still plenty of questions I have yet to fully answer. Maybe NLP is just one of those mysteries that keeps you up at night. 🤔
Written by
Manyue javvadi
Business Undergraduate | Ex-Software Engineer | Machine Learning Student | Interested in NLP | Creating a New NLP Product for the Retail and Hospitality Industry