Probability For Mastering Data Science - Part 3

Naymul Islam
5 min read

In the previous part, we covered everything up to Bayes' Law. Today we're going to talk about some probability distributions.

Probability Distribution👇

What is a Probability distribution?👇

A distribution shows the possible values a variable can take and how frequently they occur.

Upper-case letter (Y) means the actual outcome of an event.

Lower-case letter (y) means one of the possible outcomes.

One way to denote the likelihood of reaching a particular outcome y is -

P(Y=y) or P(y)

P(y) → The probability function.

We define a distribution using two characteristics: mean and variance.

The mean of a distribution is its average value.

The variance of a distribution is essentially how spread out the data is.

We denote the mean with the character ‘μ’.

We denote the variance with the character ‘σ²’.

Population data is the formal way to refer to all the data, while sample data is just a part of it.

In sample data, we denote the mean with ‘x̄’ and the variance with ‘s²’.

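The third characteristic of the distribution is called the ‘standard deviation’. It is the square root of the variance, so we denote it with ‘σ’ for a population and ‘s’ for a sample.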

The more congested the middle of the distribution, the more data falls within that interval.

The constant relationship between mean and variance is -

σ² = E[(Y − μ)²] = E[Y²] − μ²
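As a small sketch of these characteristics, the snippet below computes the mean, variance and standard deviation of a fair six-sided die, and checks the standard identity σ² = E[Y²] − μ². The die is only an illustrative choice, and the variable names are my own.

```python
from fractions import Fraction
from math import sqrt

# Fair six-sided die: each face 1..6 has probability 1/6 (illustrative example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(y * p for y, p in pmf.items())                    # mu = E[Y]
variance = sum((y - mean) ** 2 * p for y, p in pmf.items())  # sigma^2 = E[(Y - mu)^2]
second_moment = sum(y ** 2 * p for y, p in pmf.items())      # E[Y^2]
std_dev = sqrt(variance)                                     # sigma

print(mean, variance, round(std_dev, 3))      # 7/2 35/12 1.708
print(variance == second_moment - mean ** 2)  # True
```

Using Fraction keeps the arithmetic exact, so the identity can be checked with == instead of a tolerance.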

Types of probability distribution👇

Events such as rolling a die or picking a card have a finite number of outcomes, so they follow a discrete distribution.

Finite (or countable) number of outcomes → Discrete distribution.

Measurements such as time and distance in track and field can take infinitely many values within a range, so they follow a continuous distribution -

Infinitely many outcomes within a range → Continuous distribution.

Discrete distributions and their characteristics👇

We can express the entire discrete distribution with either a table, a graph or a formula.

In probability, we are often more interested in the likelihood of an interval than of an individual value.

With a discrete distribution, we can simply add up the probabilities of all the values within that range.
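Here is a minimal sketch of that idea. It assumes, purely for illustration, that the variable is the sum of two fair dice, and the helper name interval_probability is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Distribution of the sum of two fair dice, built by listing all 36 outcomes.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

def interval_probability(pmf, low, high):
    """P(low <= Y <= high): add the probabilities of every value in the range."""
    return sum(p for y, p in pmf.items() if low <= y <= high)

p_7_to_9 = interval_probability(pmf, 7, 9)
print(p_7_to_9)  # 5/12
```

The sums 7, 8 and 9 can be rolled in 6, 5 and 4 ways, so the interval has probability 15/36 = 5/12.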

One peculiarity of discrete events whose outcomes are whole numbers is -

P(Y ≤ y) = P(Y < y + 1)

Discrete distribution: The uniform distribution👇

We use the letter ‘U’ to denote the uniform distribution.

Events that follow the uniform distribution are ones where all outcomes have equal probability. For example -

If we roll a six-sided die, we have an equal chance of getting any value from one to six. The graph of the probability distribution would have six equally tall bars, all reaching up to ⅙.

When an event is following the uniform distribution -

  • Each outcome is equally likely

  • Both the mean and the variance are uninterpretable.

  • No predictive power
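To see the "each outcome is equally likely" property in practice, the sketch below simulates a fair die with Python's random module and prints the observed frequency of each face. The number of rolls and the seed are arbitrary choices of mine.

```python
import random
from collections import Counter

random.seed(0)    # fixed seed so the run is repeatable
n_rolls = 60_000  # arbitrary sample size for the illustration

rolls = [random.randint(1, 6) for _ in range(n_rolls)]
counts = Counter(rolls)

# Observed relative frequency of each face; all should be close to 1/6 ≈ 0.167.
frequencies = {face: counts[face] / n_rolls for face in range(1, 7)}
for face, freq in frequencies.items():
    print(face, round(freq, 3))
```

With this many rolls, every frequency should land within about one percentage point of ⅙, which is why knowing past rolls gives no help in predicting the next one.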

Discrete distribution: The Bernoulli distribution👇

We use ‘Bern(p)’ to define Bernoulli distribution.

Any event with 1 trial and 2 possible outcomes follows the Bernoulli distribution. For example -

A coin flip, a single true-or-false quiz question, etc.

Usually, when dealing with a Bernoulli distribution, we either know the probability of one of the outcomes occurring or have past data indicating some experimental probability.

We usually denote the higher probability with p and the lower one with 1 - p, and assign the value 1 to the more likely outcome and 0 to the other -

p → 1.

1 - p → 0.

The variance of Bernoulli events is -

σ² = p(1 − p)

The standard deviation of Bernoulli events is -

σ = √(p(1 − p))
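For a Bernoulli variable with success probability p, the standard results are mean p, variance p(1 − p) and standard deviation √(p(1 − p)). A minimal sketch, where the function name bernoulli_stats and the value p = 0.6 are illustrative assumptions:

```python
from math import sqrt

def bernoulli_stats(p):
    """Return (mean, variance, standard deviation) of Bern(p)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    mean = p                  # E(Y) = 1*p + 0*(1 - p)
    variance = p * (1 - p)    # sigma^2 = p(1 - p)
    return mean, variance, sqrt(variance)

mean, variance, std_dev = bernoulli_stats(0.6)  # p = 0.6 is a made-up example value
print(mean, round(variance, 2), round(std_dev, 3))  # 0.6 0.24 0.49
```

The variance is largest at p = 0.5 and shrinks to 0 as p approaches 0 or 1, which matches the intuition that a near-certain event has little spread.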

Discrete distribution: The Binomial distribution👇

What is Binomial Distribution?👇

Binomial events are a sequence of identical Bernoulli events.

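We use ‘B(n, p)’ to denote a binomial distribution with n trials and probability of success p -

Y ~ B(n, p)

Additionally, we can express a Bernoulli distribution as a binomial distribution with a single trial -

Bern(p) = B(1, p)

If you have a pop quiz of 10 true-or-false questions, guessing a single question is a Bernoulli event and guessing the entire quiz is a binomial event.

The expected value of a Bernoulli event tells us which outcome to expect for a single trial -

E(Y) = 1 · p + 0 · (1 − p) = p

The expected value of a binomial event tells us how many times we expect to get a specific outcome over all n trials.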

The graph of a Binomial distribution represents the likelihood of attaining our desired outcome a specific number of times.

If we run n trials, the graph would consist of (n + 1) bars, one for each possible number of successes from 0 to n.

If we wish to find the likelihood of getting a given outcome a precise number of times, we need the probability function of the binomial distribution.

The number of ways in which 4 out of the 6 trials can be successful = the number of ways of picking 4 elements out of a sample space of 6 = C(6, 4) = 15.

The probability function of the binomial distribution is -

P(y) = C(n, y) · p^y · (1 − p)^(n − y), where C(n, y) = n! / (y! · (n − y)!)

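Suppose you bought a single share of GM Motors. Historically, you know there is a 60% (0.6) chance the price of your stock will go up on any given day and a 40% (0.4) chance it will drop.

With the probability function you can calculate the likelihood of the stock price increasing 3 times during the 5 workdays of a week -

P(3) = C(5, 3) · 0.6³ · 0.4² = 10 · 0.216 · 0.16 = 0.3456

So there is roughly a 35% chance. Here C(5, 3) = 10 counts the ways of choosing which 3 of the 5 days see an increase.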
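The stock example can be checked with a few lines of Python. This is a sketch using the standard binomial probability function C(n, y) · p^y · (1 − p)^(n − y); the function name binomial_pmf is my own, and math.comb needs Python 3.8 or newer.

```python
from math import comb

def binomial_pmf(y, n, p):
    """Probability of exactly y successes in n independent trials."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

# Number of ways 4 out of 6 trials can be successful.
ways = comb(6, 4)

# Stock example: price goes up on 3 of 5 workdays, with p = 0.6 per day.
prob = binomial_pmf(3, 5, 0.6)

print(ways, round(prob, 4))  # 15 0.3456
```

Note that this treats each day as independent of the others, which is what the binomial model assumes.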

The expected value formula for the binomial event is -

E(Y) = n · p

The variance of the binomial event is -

σ² = n · p · (1 − p)
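As a quick numerical check of the standard results E(Y) = n · p and σ² = n · p · (1 − p), the sketch below builds the whole distribution for the stock example (n = 5, p = 0.6) and computes both quantities directly from their definitions.

```python
from math import comb

n, p = 5, 0.6  # values from the stock example

# One probability per possible number of successes 0..n, so n + 1 bars.
pmf = [comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(n + 1)]

mean = sum(y * prob for y, prob in enumerate(pmf))
variance = sum((y - mean) ** 2 * prob for y, prob in enumerate(pmf))

print(len(pmf), round(mean, 6), round(variance, 6))  # 6 3.0 1.2
```

So over a 5-day week we would expect about 3 up days, matching 5 · 0.6, with a variance of 5 · 0.6 · 0.4 = 1.2.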

Discrete distribution: The Poisson distribution👇

We denote a Poisson distribution with the letters ‘Po’ and a single parameter ‘λ’.

The Poisson distribution deals with the frequency with which an event occurs within a specific interval.

Instead of the probability of an event, the Poisson distribution requires knowing how often it occurs within a specific period of time or distance. For example -

A firefly lights up 3 times in 10 seconds on average. We should use a Poisson distribution if we want to determine the likelihood of it lighting up 8 times in 20 seconds.

The graph of a Poisson distribution plots the number of instances the event occurs in a standard interval of time and the probability for each one.
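The values behind such a graph can be tabulated for the firefly example. This sketch uses the standard Poisson probability function λ^y · e^(−λ) / y! and rescales the rate from 3 flashes per 10 seconds to 6 per 20 seconds; the function name poisson_pmf and the cut-off at 12 flashes are my own choices.

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    """Probability of exactly y events when lam events are expected."""
    return lam ** y * exp(-lam) / factorial(y)

rate_per_10s = 3
lam = rate_per_10s * (20 / 10)  # expected flashes in a 20-second interval

# One row per bar of the graph: number of flashes and its probability.
for y in range(13):
    print(y, round(poisson_pmf(y, lam), 4))

p_8 = poisson_pmf(8, lam)
print(round(p_8, 4))  # 0.1033
```

So lighting up exactly 8 times in 20 seconds has a probability of roughly 10%. The interval has to match the rate: λ must be expressed for the same 20-second window as the question.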

The probability function of the Poisson distribution is -

P(y) = (λ^y · e^(−λ)) / y!

If we usually receive 4 questions a day but yesterday we received 7, and we want to know how likely that is, we need the probability function of the Poisson distribution.

In that case, λ = 4 and y = 7 -

P(7) = (4⁷ · e⁻⁴) / 7! ≈ 0.06

So there is only about a 6% chance of receiving exactly 7 questions.
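The 6% figure can be reproduced step by step. This is a small sketch with λ = 4 (the usual value) and y = 7 (the observed value), using the standard Poisson probability function λ^y · e^(−λ) / y!.

```python
from math import exp, factorial

lam = 4  # usual (average) value
y = 7    # value observed yesterday

numerator = lam ** y * exp(-lam)  # 4^7 = 16384, multiplied by e^-4
denominator = factorial(y)        # 7! = 5040
probability = numerator / denominator

print(round(probability, 4))  # 0.0595
```

Rounded to whole percent, this is the roughly 6% chance quoted above.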

The expected value of the Poisson distribution is -

E(Y) = λ

The mean and the variance of the Poisson distribution are equal -

μ = σ² = λ
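A well-known property of the Poisson distribution is that its mean and variance both equal λ. The sketch below checks this numerically for λ = 4 by summing over the first 60 outcomes; the cut-off is an assumption of mine, chosen because the probabilities beyond it are negligible for such a small λ.

```python
from math import exp, factorial

lam = 4
max_y = 60  # truncation point; the tail beyond it is negligible for lam = 4

pmf = [lam ** y * exp(-lam) / factorial(y) for y in range(max_y)]

mean = sum(y * prob for y, prob in enumerate(pmf))
variance = sum((y - mean) ** 2 * prob for y, prob in enumerate(pmf))

print(round(mean, 6), round(variance, 6))  # 4.0 4.0
```

Because the spread grows with λ, an event that happens more often on average also varies more from one interval to the next.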

Before we end…

Thank you for taking the time to read my posts and share your thoughts. If you like my blog, please like, comment, and share it with your circle, and follow for more. I look forward to continuing this journey with you.

Let’s connect and grow together. I look forward to getting to know you better.

Here are my social links -

Linkedin: https://www.linkedin.com/in/ai-naymul/

Twitter: https://twitter.com/ai_naymul

Github: https://github.com/ai-naymul


Written by

Naymul Islam

👉 I'm an ML Research & Open-Source Dev Intern at Menlo Park Lab. 👉 I'm a Machine Learning and MLOps Enthusiast. 👉 I’m One Of The Semi-Finalists Of The Biggest ICT Olympiad In Bangladesh Called “ICT Olympiad Bangladesh” In 2022. 👉 I've More Than 15 Google Cloud Badges. ⭐️ Wanna Know More About Me? Drop Me An Email At: naymul504@gmail.com ★