CS50 AI with Python - Lecture 2: Uncertainty

Shichun Min

Uncertainty

Preface

This was a relatively difficult lecture. From the lecture video, I felt I only half understood what was being explained. It was only while organizing my notes that I began to grasp some of the concepts and the tricky formulas.


Why We Need Probability

In the real world, many situations are uncertain, for example:

  • The probability of rain is 30%.

  • The probability of someone having a disease is 0.1%.

  • If a person shows symptoms of a disease, what is the actual probability that they have it?

To handle this uncertainty, AI introduces probabilistic models.


Basic Probability Definitions

  • Random Variable
    Represents an event that can take on different possible values, for example:

    • Weather: {Sunny, Rainy}

    • Number of genes: {0, 1, 2}

  • Probability Distribution
    The sum of the probabilities of all possible values must equal 1:

    $\sum_{x} P(X = x) = 1$

Example: for the Weather variable above, one possible distribution is $P(Sunny) = 0.7$ and $P(Rainy) = 0.3$.


Joint Probability

  • Represents the probability of multiple variables taking certain values simultaneously, written $P(X = x, Y = y)$.

  • Rule: The sum of the probabilities over all combinations of values must equal 1.

Example: For the two variables Toothache and Cavity, the joint distribution can be expressed as:

  • $P(Toothache, Cavity) = 0.04$

  • $P(Toothache, \neg Cavity) = 0.06$

  • $P(\neg Toothache, Cavity) = 0.1$

  • $P(\neg Toothache, \neg Cavity) = 0.8$

All four combinations sum to 1: $0.04 + 0.06 + 0.1 + 0.8 = 1$.
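
As a quick sanity check, the joint table above can be stored as a small Python dictionary and queried directly. This is only an illustrative sketch (the variable names are mine, not from the course):

```python
# Joint distribution over (Toothache, Cavity), keyed by (toothache, cavity) booleans.
joint = {
    (True, True): 0.04,    # P(Toothache, Cavity)
    (True, False): 0.06,   # P(Toothache, ¬Cavity)
    (False, True): 0.10,   # P(¬Toothache, Cavity)
    (False, False): 0.80,  # P(¬Toothache, ¬Cavity)
}

# All combinations must sum to 1.
assert abs(sum(joint.values()) - 1.0) < 1e-9

# Marginalize out Cavity to get P(Toothache).
p_toothache = sum(p for (toothache, _), p in joint.items() if toothache)
print(p_toothache)  # about 0.10
```

Marginalizing like this (summing out the variables we do not care about) is also how the joint table gets used for inference later on.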


Normalization

During probability calculations, we may obtain “unnormalized probabilities.” We then need to rescale them so that their sum equals 1.

Formula:

$P(x) = \frac{\tilde{P}(x)}{\sum_{x'} \tilde{P}(x')}$, where $\tilde{P}$ denotes the unnormalized values.

Example:

  • Suppose we get results {A: 2, B: 3} (unnormalized).

  • Total = 5

  • After normalization: $A = 2/5 = 0.4$, $B = 3/5 = 0.6$.
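
A minimal sketch of the same rescaling in Python (the function name is mine):

```python
def normalize(unnormalized):
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(unnormalized.values())
    return {key: value / total for key, value in unnormalized.items()}

print(normalize({"A": 2, "B": 3}))  # {'A': 0.4, 'B': 0.6}
```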


Conditional Probability & Bayes’ Theorem

  • Conditional Probability:

    $P(A \mid B) = \frac{P(A, B)}{P(B)}$

    Read as: “The probability of A occurring given that B has occurred.”

Example: Among 10 balls, 4 are red and 6 are white. Of the 4 red balls, 2 are large.
The probability of drawing a large ball given that it is red:

$P(Large \mid Red) = \frac{P(Large, Red)}{P(Red)} = \frac{2/10}{4/10} = 0.5$
In other words, once a condition is known, we restrict the probability space to only the worlds where that condition holds, here the four red balls.

  • Bayes’ Theorem:
    Derived from the conditional probability formula:

    $P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$
Example: Disease Diagnosis

  • Disease prevalence: P(Disease) = 0.01

  • Probability of positive test given disease: P(Positive | Disease) = 0.9

  • Probability of false positive: P(Positive | No_Disease) = 0.1

Question: If the test is positive, what is the probability of actually having the disease?

By Bayes’ Theorem:

$P(Disease \mid Positive) = \frac{P(Positive \mid Disease)\, P(Disease)}{P(Positive)} = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.1 \times 0.99} = \frac{0.009}{0.108} \approx 0.083$

👉 This means that even if the test is positive, the real probability of having the disease is only about 8.3%, not the intuitive 90%.
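
The same calculation, written as a short Python function so the numbers are easy to change; the argument names are mine, not from the course:

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(Disease | Positive) for a binary test, via Bayes' Theorem."""
    # P(Positive) by the law of total probability.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

print(posterior(prior=0.01, sensitivity=0.9, false_positive_rate=0.1))  # ≈ 0.083
```

The small result comes from the low prior: most positives are false positives drawn from the much larger healthy population.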


Bayesian Network

  • A model using Directed Acyclic Graphs (DAGs) to represent causal relationships.

  • Nodes: random variables

  • Edges: dependencies

Example:

  • Cavity → Toothache

  • Cavity → Catch

Advantage: Avoids storing large joint probability tables by using Conditional Probability Tables (CPT).
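
To see why CPTs are cheaper than one big joint table, here is a hand-rolled sketch of the Cavity → Toothache, Cavity → Catch network. The CPT numbers are made up for illustration; any joint probability is recovered by multiplying the relevant entries (the chain rule of the network):

```python
# Made-up CPTs for the network Cavity -> Toothache, Cavity -> Catch.
P_CAVITY = {True: 0.2, False: 0.8}
P_TOOTHACHE = {True: 0.6, False: 0.1}  # P(Toothache=True | Cavity)
P_CATCH = {True: 0.9, False: 0.2}      # P(Catch=True | Cavity)

def joint(cavity, toothache, catch):
    """P(Cavity, Toothache, Catch) = P(Cavity) * P(Toothache|Cavity) * P(Catch|Cavity)."""
    p = P_CAVITY[cavity]
    p *= P_TOOTHACHE[cavity] if toothache else 1 - P_TOOTHACHE[cavity]
    p *= P_CATCH[cavity] if catch else 1 - P_CATCH[cavity]
    return p

print(joint(cavity=True, toothache=True, catch=False))  # 0.2 * 0.6 * 0.1 = 0.012
```

With three Boolean variables the saving is tiny, but in general the CPTs grow with the number of parents per node rather than with the full $2^n$ joint table.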


Inference Methods

Enumeration

  • List all possible combinations and calculate the target probability.

  • Downside: combinations grow exponentially with the number of variables.
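
For a concrete (if tiny) instance, enumeration over the joint table from the Joint Probability section amounts to keeping the rows consistent with the evidence, summing, and normalizing:

```python
# Joint distribution over (toothache, cavity) from the earlier example.
joint = {
    (True, True): 0.04,
    (True, False): 0.06,
    (False, True): 0.10,
    (False, False): 0.80,
}

# P(Cavity | Toothache): keep the rows where toothache is True, then normalize.
consistent = {key: p for key, p in joint.items() if key[0]}
print(consistent[(True, True)] / sum(consistent.values()))  # 0.04 / 0.10 = 0.4
```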

Sampling

  • Randomly generate samples to estimate the probability distribution.

  • More efficient, provides an approximate solution.
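
A rough sketch of one sampling approach (rejection sampling) on a two-node Cavity → Toothache network with made-up CPTs: draw many samples from the model, throw away those that contradict the evidence, and count what is left.

```python
import random

# Made-up CPTs for Cavity -> Toothache.
P_CAVITY = 0.2
P_TOOTHACHE = {True: 0.6, False: 0.1}  # P(Toothache=True | Cavity)

def sample():
    """Draw one (cavity, toothache) sample in topological order."""
    cavity = random.random() < P_CAVITY
    toothache = random.random() < P_TOOTHACHE[cavity]
    return cavity, toothache

# Estimate P(Cavity | Toothache=True): keep only samples where toothache came out True.
kept = [cavity for cavity, toothache in (sample() for _ in range(100_000)) if toothache]
print(sum(kept) / len(kept))  # roughly 0.6 for these CPTs
```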

Likelihood Weighting

  • Improved sampling method.

  • Assigns weights to “known conditions” to increase efficiency.

  • Particularly useful in conditional probability scenarios.
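
A compact sketch of the same query with likelihood weighting, using the same made-up CPTs: the evidence variable (Toothache = True) is fixed rather than sampled, and each sample carries a weight equal to the probability of that evidence given its sampled parents, so nothing is thrown away.

```python
import random

P_CAVITY = 0.2
P_TOOTHACHE = {True: 0.6, False: 0.1}  # P(Toothache=True | Cavity)

def weighted_sample():
    """Sample the non-evidence variable; weight by P(evidence | its parents)."""
    cavity = random.random() < P_CAVITY   # sampled as usual
    weight = P_TOOTHACHE[cavity]          # Toothache=True is fixed, so it contributes a weight
    return cavity, weight

samples = [weighted_sample() for _ in range(100_000)]
print(sum(w for cavity, w in samples if cavity) / sum(w for _, w in samples))  # again roughly 0.6
```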


Markov Models

  • Assumption: The future depends only on the present, not on the past (Markov property).

  • Applications: weather prediction, speech recognition, natural language processing.

Example: a simple weather chain with an illustrative transition model:

  • If today is Sunny: P(Sunny tomorrow) = 0.8, P(Rainy tomorrow) = 0.2

  • If today is Rainy: P(Sunny tomorrow) = 0.3, P(Rainy tomorrow) = 0.7
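
A short sketch that samples a week of weather from this transition model (the probabilities are the illustrative ones above):

```python
import random

# Transition model: P(tomorrow | today).
TRANSITIONS = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.3, "Rainy": 0.7},
}

def next_state(today):
    """Sample tomorrow's weather using only today's state (the Markov property)."""
    states = list(TRANSITIONS[today])
    weights = list(TRANSITIONS[today].values())
    return random.choices(states, weights=weights)[0]

chain = ["Sunny"]
for _ in range(6):
    chain.append(next_state(chain[-1]))
print(chain)  # e.g. ['Sunny', 'Sunny', 'Rainy', 'Rainy', 'Rainy', 'Sunny', 'Sunny']
```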


Summary

  1. Probability Distributions: describe uncertainty.

  2. Joint Probability & Normalization: combine variables, sum = 1.

  3. Conditional Probability & Bayes’ Theorem: core tools for reasoning.

  4. Bayesian Networks: efficient representation of causal dependencies.

  5. Inference Methods: Enumeration → Sampling → Likelihood Weighting.

  6. Markov Models: used for time-series prediction.


Reflection

This lecture involved probability theory. It was abstract and somewhat hard to understand, and the numerous formulas were initially intimidating. By repeatedly reviewing the notes and using simple examples, I gradually understood the key concepts. However, the assignments were not easy—they were more challenging than those from the first two lectures.
