#87 Machine Learning & Data Science Challenge 87
What do you understand by TF-IDF?
TF-IDF:
It stands for term frequency-inverse document frequency.
TF-IDF weight:
It is a statistical measure used to evaluate how important a word is to a document in a collection or corpus.
The importance increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word in the corpus.
- Term Frequency (TF):
It is a score of how often the word appears in the current document.
Since documents differ in length, a term may appear many more times in long documents than in shorter ones. The term frequency is therefore often divided by the document length (the total number of terms in the document) to normalize it.
- Inverse Document Frequency (IDF):
It is a score of how rare the word is across the documents in the corpus. The rarer the term, the higher its IDF score.
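The two scores described above are often written as follows. This is one common formulation and not necessarily the exact variant intended here; the symbols are notation introduced for illustration: f_{t,d} is the number of times term t appears in document d, N is the total number of documents in the corpus, and n_t is the number of documents that contain t.

```latex
\mathrm{TF}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{IDF}(t) = \log \frac{N}{n_t}
```

With this form, a term that appears in every document has N / n_t = 1 and so an IDF of 0, while a term found in only a few documents gets a larger IDF.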
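Thus, the TF-IDF weight of a term t in a document d is the product of the two scores: TF-IDF(t, d) = TF(t, d) × IDF(t).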
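To make the weighting concrete, here is a minimal sketch in Python that multiplies a length-normalized term frequency by an inverse document frequency. The function names, the toy corpus, the whitespace tokenization, the natural logarithm, and the choice to return 0.0 for a term that appears in no document are all assumptions made for illustration. Libraries such as scikit-learn use smoothed variants that give different numbers.

```python
import math
from collections import Counter


def term_frequency(term, document_tokens):
    """Occurrences of `term` divided by the document length."""
    if not document_tokens:
        return 0.0
    return Counter(document_tokens)[term] / len(document_tokens)


def inverse_document_frequency(term, corpus_tokens):
    """log(N / n_t), where n_t is the number of documents containing `term`."""
    containing = sum(1 for doc in corpus_tokens if term in doc)
    if containing == 0:
        # Assumption: treat a term unseen in the corpus as carrying no weight.
        return 0.0
    return math.log(len(corpus_tokens) / containing)


def tf_idf(term, document_tokens, corpus_tokens):
    """TF-IDF weight of `term` in one document, relative to the corpus."""
    tf = term_frequency(term, document_tokens)
    idf = inverse_document_frequency(term, corpus_tokens)
    return tf * idf


# Hypothetical toy corpus, tokenized by splitting on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

for word in ("the", "cat", "mat"):
    print(word, tf_idf(word, corpus[0], corpus))
```

In this toy corpus, "the" occurs in every document, so its IDF and therefore its TF-IDF weight is 0 even though it is the most frequent word in the first document. "mat" occurs in only one document and receives the highest weight of the three.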
Written by
Bhagirath Deshani
Hello everyone! I am a Machine Learning Engineer from India. I have been interested in machine learning since my engineering days. I have completed Andrew Ng's original Machine Learning course from Stanford University on Coursera and also completed the IBM course on Machine Learning and Deep Learning. Currently, I am working on Machine Learning and Data Science projects. My goal is to use the skills I have acquired to solve real-world problems and make a positive impact on the world.