What is Reinforcement Learning?

Reinforcement Learning is a trial-and-error learning process where an agent learns to make decisions by interacting with an environment.

Key Components

• Agent: The learner/decision-maker

• Environment: The world the agent interacts with

• State (s): The current configuration of the environment

• Action (a): What the agent does

• Reward (r): Scalar feedback signal from the environment

Core Idea

• Agent explores environment

• Takes actions

• Receives rewards

• Updates behavior

RL Algorithm

1. Observe state

2. Select action

3. Receive reward

4. Update policy

5. Repeat
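
The five steps above can be sketched as a short loop. This is a minimal illustration, not a reference implementation: the `Corridor` environment, the tabular Q-learning update, and the values chosen for `alpha`, `gamma`, and `epsilon` are all assumptions introduced here and do not come from the notes.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..4, reward 1.0 on reaching position 4."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning (one possible way to 'update policy'); returns the Q-table."""
    rng = random.Random(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.length)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()                                   # 1. observe state
        done = False
        while not done:
            # 2. select action: explore with probability epsilon or on ties,
            #    otherwise act greedily with respect to the current Q-table
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)       # 3. receive reward
            # 4. update: move Q(s, a) toward reward + discounted best next value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state                                # 5. repeat
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(train()):
        print(s, round(left, 3), round(right, 3))
```

In this sketch the policy is implicit: the agent acts greedily on the learned table, so updating the table is what updates the policy.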

RL vs Other Paradigms

Paradigm       Data            Learning
Supervised     Input-output    Labeled data
Unsupervised   Only inputs     Patterns
RL             No dataset      Environment

Markov Decision Process

• States

• Actions

• Transition probabilities

• Reward function

• Markov property: The next state depends only on the current state and action, not on the earlier history
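
As a rough illustration of how these components fit together, the sketch below writes a tiny MDP as plain Python data. The two states, two actions, probabilities, and rewards are made-up values chosen only for the example; they are not taken from the notes.

```python
import random

# Hypothetical two-state MDP; every name and number here is illustrative.
STATES = ["cool", "hot"]
ACTIONS = ["slow", "fast"]

# Transition probabilities: TRANSITIONS[(s, a)] maps each next state s' to P(s' | s, a).
TRANSITIONS = {
    ("cool", "slow"): {"cool": 1.0},
    ("cool", "fast"): {"cool": 0.5, "hot": 0.5},
    ("hot", "slow"): {"cool": 0.5, "hot": 0.5},
    ("hot", "fast"): {"hot": 1.0},
}

# Reward function: REWARDS[(s, a)] is the reward for taking action a in state s.
REWARDS = {
    ("cool", "slow"): 1.0,
    ("cool", "fast"): 2.0,
    ("hot", "slow"): 1.0,
    ("hot", "fast"): -10.0,
}


def is_valid_mdp():
    """Check that every (state, action) pair has a distribution that sums to 1."""
    return all(
        (s, a) in TRANSITIONS
        and abs(sum(TRANSITIONS[(s, a)].values()) - 1.0) < 1e-9
        for s in STATES
        for a in ACTIONS
    )


def sample_next_state(state, action, rng):
    """Markov property: the sample depends only on (state, action), not on history."""
    dist = TRANSITIONS[(state, action)]
    next_states = list(dist)
    weights = [dist[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    state = "cool"
    for _ in range(5):
        action = rng.choice(ACTIONS)
        next_state = sample_next_state(state, action, rng)
        print(state, action, REWARDS[(state, action)], next_state)
        state = next_state
```

Note that `sample_next_state` takes no history argument at all, which is one simple way to make the Markov property visible in code.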

Fundamental Concepts

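• Policy (π): $\pi(s) = a$ (deterministic) or $\pi(a \mid s) = P(a \mid s)$ (stochastic)

• Return: $G = \sum_{t} \gamma^{t} r_{t}$, where γ is the discount factor

• Value function: $V(s) = \mathbb{E}[G \mid s]$

• Q-function: $Q(s, a) = \mathbb{E}[G \mid s, a]$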
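
To make the return and the value function concrete, here is a small sketch that computes a discounted return from a reward sequence and then approximates the expectation by averaging over sampled episodes. The function names and the sample reward sequences are assumptions made for illustration; averaging sampled returns is one common way to estimate V(s), not the only one.

```python
def discounted_return(rewards, gamma):
    """Return G = sum over t of gamma**t * r_t for one episode's reward sequence."""
    g = 0.0
    # Work backwards so each step is G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_value(reward_sequences, gamma):
    """Monte Carlo estimate of E[G | s]: the mean return over sampled episodes.

    Every sequence is assumed to come from an episode that started in the same
    state s. Using episodes that start with the same (s, a) pair instead gives
    an estimate of Q(s, a).
    """
    returns = [discounted_return(rewards, gamma) for rewards in reward_sequences]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
    print(estimate_value([[1.0, 0.0], [0.0, 2.0]], gamma=0.5))
```

With γ close to 0 the return is dominated by the first reward; with γ close to 1 later rewards count almost as much as early ones.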

Applications

• Robotics

• Games

• Finance

• Healthcare

• Autonomous vehicles

1 Introduction

1.1 What is Deep Reinforcement Learning?
1.2 Three Machine Learning Paradigms

2 Tabular Value-Based Reinforcement Learning

2.1 Sequential Decision Problems
2.2 Tabular Value-Based Agents
2.3 Classic Gym Environments
Summary and Further Reading
Exercises

3 Deep Value-Based Reinforcement Learning

3.1 Large, High-Dimensional, Problems
3.2 Deep Value-Based Agents
3.3 Atari 2600 Environments
Summary and Further Reading
Exercises

4 Policy-Based Reinforcement Learning

4.1 Continuous Problems
4.2 Policy-Based Agents
4.3 Locomotion and Visuo-Motor Environments
Summary and Further Reading
Exercises

5 Model-Based Reinforcement Learning

5.1 Dynamics Models of High-Dimensional Problems
5.2 Learning and Planning Agents
5.3 High-Dimensional Environments
Summary and Further Reading
Exercises

6 Two-Agent Self-Play

6.1 Two-Agent Zero-Sum Problems
6.2 Tabula Rasa Self-Play Agents
6.3 Self-Play Environments
Summary and Further Reading
Exercises

7 Multi-Agent Reinforcement Learning

7.1 Multi-Agent Problems
7.2 Multi-Agent Reinforcement Learning Agents
7.3 Multi-Agent Environments
Summary and Further Reading
Exercises

8 Hierarchical Reinforcement Learning

8.1 Granularity of the Structure of Problems
8.2 Divide and Conquer for Agents
8.3 Hierarchical Environments
Summary and Further Reading
Exercises

9 Meta-Learning

9.1 Learning to Learn Related Problems
9.2 Transfer Learning and Meta-Learning Agents
9.3 Meta-Learning Environments
Summary and Further Reading
Exercises

10 Further Developments

10.1 Development of Deep Reinforcement Learning
10.2 Main Challenges
10.3 The Future of Artificial Intelligence

Reinforcement Learning Guide