The NLP Landscape - 1960s to 2024

Hello friends,

This is my first blog post about learning NLP (Natural Language Processing). I started learning on 18th Nov. 2024 from YouTube tutorials, blogs, and Google searches. I’m not a pro at writing blogs, but I’m starting anyway.

Today, I learned about the whole journey of Natural Language Processing up to the present day: from its definition to its real-world uses, its challenges, and how we’re surrounded by this technology.


Starting with the general definition,

What is NLP?

Natural language processing is a subfield of linguistics, CS, and AI concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.

In short, NLP sits at the intersection of CS, AI, and human language.


What is the need for NLP?

In neuropsychology, linguistics, and the philosophy of language, a Natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages can take different forms, such as speech or signing. They are distinguished from constructed and formal languages such as those used to program computers or to study logic.

In a few words: just as human language evolved through continuous practice and through sharing ideas with others, by signs or by speech, NLP is a continuously evolving way of sharing our ideas with machines, either through jugaad (quick workarounds) or through hard-coded rules.


REAL WORLD APPLICATIONS

  • Contextual Advertisements

    • When we scroll through reels or chat on Instagram, companies track our behaviour and way of living so that only relevant advertisements reach us.
  • Email Clients - spam filtering, smart reply, short suggestions while writing emails, etc.

  • Social media - Removing adult content, opinion mining.

  • Search Engines - for example, Google Search.

  • Chatbots


Common NLP Tasks

  1. Text/Document Classification

  2. Sentiment Analysis

  3. Information Retrieval

  4. Parts of Speech Tagging

  5. Language Detection and Machine Translation

  6. Conversational Agents

  7. Knowledge Graph and QA Systems

  8. Text Summarization

  9. Topic Modelling

  10. Text Generation

  11. Spell Checking and Grammar Correction

  12. Text Parsing

  13. Speech to Text


Approaches to NLP

  1. Heuristic Methods

  2. Machine Learning Based Methods

  3. Deep Learning Based Methods


  1. Heuristic Methods

First, look up the word "heuristic": it means a practical rule of thumb. The main heuristic tools are:

  • Regular Expressions

    • To find a pattern in a large body of text, we write a pre-defined pattern and match the text against it.
  • Wordnet

    • A dictionary-like database in which words are connected to each other by meaning.

      • For example, "car" and "automobile" are linked as synonyms, and "dog" is linked to "animal" because a dog is a kind of animal.
  • Open Mind Common Sense

    • This was an open project that collected a very large number of common-sense statements from volunteers (for example, that shoes are used for running), building A VERY BIG KNOWLEDGE BASE.

    • It was active from January 1999 to August 2016.
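To make the regular-expression idea concrete, here is a minimal sketch using Python's built-in re module. The sample text, the addresses in it, and the name find_emails are all made up for illustration, and the pattern is deliberately simple (real email validation is more involved).

```python
import re

# Hypothetical sample text; the addresses are invented for this example.
text = "Contact us at support@example.com or sales@example.org before 18 Nov 2024."

# A pre-defined pattern for email-like strings: name, "@", domain, dot, suffix.
# This is a simplification, not a complete email validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(s):
    """Return every substring of s that matches the email-like pattern."""
    return EMAIL_PATTERN.findall(s)

print(find_emails(text))
```

The pattern is written once by a human and then applied to any amount of text, which is what makes this a heuristic method: nothing is learned from data.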

Advantages

  • Answers come quickly, because matching a pattern is cheap.

  • Accuracy is high on the inputs the patterns cover, because each pattern is written for a specific case.

  • Errors are fewer because the patterns are built by us (humans).


  2. Machine Learning Methods

The Biggest Advantages

  • No need to build the patterns by hand.

  • Patterns are learned from the data fed to the model.

  • Complex patterns can be picked up from larger datasets.

ML Workflow
  • Convert the textual data into numbers.

  • Feed these numbers into a machine learning model.

  • Commonly used algorithms are:

    • Naive Bayes

      • This often gives strong results on text classification tasks.
    • Logistic Regression

    • SVM (Support Vector Machines)

    • LDA (Latent Dirichlet Allocation, used for topic modelling)

    • Hidden Markov Models
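The workflow above can be sketched in plain Python with a tiny Naive Bayes classifier. This is only an illustration under stated assumptions: the four training sentences, the labels "spam" and "ham", and the function names are all made up, and a real project would normally use a library and far more data. Word counts are the "numbers" the text is converted into.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up training set of (label, text) pairs; purely illustrative.
TRAIN = [
    ("spam", "win money now"),
    ("spam", "free money offer"),
    ("ham", "meeting at noon"),
    ("ham", "lunch at noon tomorrow"),
]

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train_naive_bayes(examples):
    """Step 1: turn text into numbers (document and word counts per label)."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(text, model):
    """Step 2: score each label with log-probabilities and pick the best."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict("free money", model))
print(predict("lunch meeting", model))
```

Notice that nobody wrote a rule saying "money means spam"; the model picked that up from the counts.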


  3. Deep Learning Methods

Big Advantage

  • It preserves sequence information (the order of the words).

  • Feature generation is done by the DL models themselves.
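To see why keeping word order matters, here is a small sketch. It does not train any deep learning model; it only compares two ways of turning a sentence into numbers. The two sentences, the vocab mapping, and the function names are made up for illustration.

```python
from collections import Counter

def bag_of_words(sentence):
    """Count each word; the order of the words is thrown away."""
    return Counter(sentence.lower().split())

def to_id_sequence(sentence, vocab):
    """Map each word to an integer ID, keeping the original order."""
    return [vocab[tok] for tok in sentence.lower().split()]

a = "dog bites man"
b = "man bites dog"
vocab = {"dog": 0, "bites": 1, "man": 2}  # hypothetical word-to-ID table

# The two sentences mean different things, but their word counts are equal.
same_bag = bag_of_words(a) == bag_of_words(b)
# As ordered sequences of IDs, the two sentences are different.
same_sequence = to_id_sequence(a, vocab) == to_id_sequence(b, vocab)

print(same_bag)       # True
print(same_sequence)  # False
```

Sequence models such as RNNs and Transformers take the ordered form as input, so they can tell these two sentences apart.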

Architectures Used

  • RNN

  • LSTM

  • GRU/CNN

  • Transformers (for example BERT, developed by Google)

  • Autoencoders


Challenges in NLP

  1. Ambiguity

    • I have never tasted a cake quite like that one before!
  2. Contextual Words

    • I ran to the store because we ran out of milk.
  3. Colloquialisms and slang

    • Piece of cake, pulling your leg
  4. Synonyms

  5. Irony, Sarcasm and tonal difference

  6. Spelling Errors

  7. Creativity

    • Poems, dialogue, scripts
  8. Diversity

    • There are lots of languages.

    • Very little data is available for many of them.


That wraps up the blog!

👍 If you like it, upvote it!

I’m not a pro blog writer; your comments will definitely help me improve.


Written by

Avdhesh Varshney

I am an aspiring data scientist. Currently, I'm pursuing a B.Tech at Dr. B R Ambedkar NIT Jalandhar. I have contributed to many open-source programs and secured top ranks in them.