Reinforcement Learning Guide

shalin Shah

What is Reinforcement Learning?

Reinforcement Learning is a trial-and-error learning process where an agent learns to make decisions by interacting with an environment.

Key Components

• Agent: The learner and decision-maker

• Environment: The world the agent interacts with

• State (s): The current configuration of the environment

• Action (a): What the agent does

• Reward (r): The feedback signal the agent receives after acting

Core Idea

• The agent explores the environment

• It takes actions

• It receives rewards

• It updates its behavior to earn more reward

RL Algorithm

1. Observe state

2. Select action

3. Receive reward

4. Update policy

5. Repeat
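
The five steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not a method prescribed by this guide: `CorridorEnv` is a hypothetical toy environment, and tabular Q-learning with epsilon-greedy selection is one assumed choice for the "select action" and "update policy" steps. The parameter names (`alpha`, `gamma`, `epsilon`) are likewise illustrative.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, reward 1.0 on reaching 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = max(0, min(self.length - 1, self.state + move))
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = CorridorEnv()
    # q[state][action] holds the current estimate of the action's value.
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()  # 1. observe state
        done = False
        while not done:
            # 2. select action: explore with probability epsilon or on ties
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # 3. receive reward (and the next state)
            next_state, reward, done = env.step(action)
            # 4. update: move the estimate toward reward + discounted best next value
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state  # 5. repeat from the new state
    return q


q_table = run_q_learning()
for s, (left, right) in enumerate(q_table[:-1]):
    print(f"state {s}: left={left:.3f} right={right:.3f}")
```

In this sketch the policy is implicit: the agent acts greedily on the table, so updating the table is what "update policy" means here. After training, moving right should score higher than moving left in every non-terminal state.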

RL vs Other Paradigms

Paradigm        Data                  Learning
Supervised      Input-output pairs    From labeled data
Unsupervised    Inputs only           Finds patterns
RL              No fixed dataset      From interaction with the environment

Markov Decision Process

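• States: the set of configurations the environment can be in

• Actions: the set of choices available to the agent

• Transition probabilities: P(s'|s, a), the probability of reaching state s' after taking action a in state s

• Reward function: R(s, a), the reward for taking action a in state s

• Markov property: the next state and reward depend only on the current state and action, not on earlier history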
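
The components listed above map naturally onto a small data structure. The sketch below is illustrative only: the `MDP` class, the two-state battery example, and all of its probabilities and rewards are hypothetical, and the reward is assumed to depend on the state and action alone. Note that `step` receives nothing but the current state and action, which is how the Markov property shows up in code.

```python
import random
from dataclasses import dataclass


@dataclass
class MDP:
    states: tuple
    actions: tuple
    transitions: dict  # (state, action) -> {next_state: probability}
    rewards: dict      # (state, action) -> reward

    def step(self, state, action, rng):
        """Sample (next_state, reward) using only the current state and action."""
        dist = self.transitions[(state, action)]
        next_state = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        return next_state, self.rewards[(state, action)]


# Hypothetical example: a robot whose battery is either "high" or "low".
robot = MDP(
    states=("high", "low"),
    actions=("work", "rest"),
    transitions={
        ("high", "work"): {"high": 0.7, "low": 0.3},
        ("high", "rest"): {"high": 1.0},
        ("low", "work"): {"low": 1.0},
        ("low", "rest"): {"high": 0.6, "low": 0.4},
    },
    rewards={
        ("high", "work"): 2.0,
        ("high", "rest"): 0.0,
        ("low", "work"): -1.0,
        ("low", "rest"): 0.0,
    },
)

rng = random.Random(0)
next_state, reward = robot.step("high", "work", rng)
print(next_state, reward)
```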

Fundamental Concepts

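• Policy (π): deterministic, $\pi(s) = a$, or stochastic, $\pi(a \mid s) = P(a \mid s)$

• Return: $G = \sum_{t} \gamma^{t} r_{t}$, the discounted sum of rewards with discount factor $\gamma$

• Value function: $V(s) = \mathbb{E}[G \mid s]$

• Q-function: $Q(s, a) = \mathbb{E}[G \mid s, a]$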
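
A short worked example may make these definitions concrete. The function names below are illustrative, and two assumptions are made: time steps are indexed from t = 0, and V and Q refer to the same policy, in which case averaging Q(s, a) over the policy's action probabilities gives V(s).

```python
def discounted_return(rewards, gamma):
    """Compute G = sum over t of gamma**t * rewards[t], working backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def state_value(action_probs, q_values):
    """V(s) = sum over a of pi(a|s) * Q(s, a), for a single state s."""
    return sum(p * q_values[a] for a, p in action_probs.items())


# Three rewards of 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75

# A stochastic policy over two actions in one state: 0.25 * 0.0 + 0.75 * 2.0
print(state_value({"left": 0.25, "right": 0.75}, {"left": 0.0, "right": 2.0}))  # 1.5
```

Working backwards through the rewards avoids computing powers of gamma explicitly and uses one multiplication per step.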

Applications

• Robotics

• Games

• Finance

• Healthcare

• Autonomous vehicles

1 Introduction

  • 1.1 What is Deep Reinforcement Learning?

  • 1.2 Three Machine Learning Paradigms

2 Tabular Value-Based Reinforcement Learning

  • 2.1 Sequential Decision Problems

  • 2.2 Tabular Value-Based Agents

  • 2.3 Classic Gym Environments

  • Summary and Further Reading

  • Exercises

3 Deep Value-Based Reinforcement Learning

  • 3.1 Large, High-Dimensional, Problems

  • 3.2 Deep Value-Based Agents

  • 3.3 Atari 2600 Environments

  • Summary and Further Reading

  • Exercises

4 Policy-Based Reinforcement Learning

  • 4.1 Continuous Problems

  • 4.2 Policy-Based Agents

  • 4.3 Locomotion and Visuo-Motor Environments

  • Summary and Further Reading

  • Exercises

5 Model-Based Reinforcement Learning

  • 5.1 Dynamics Models of High-Dimensional Problems

  • 5.2 Learning and Planning Agents

  • 5.3 High-Dimensional Environments

  • Summary and Further Reading

  • Exercises

6 Two-Agent Self-Play

  • 6.1 Two-Agent Zero-Sum Problems

  • 6.2 Tabula Rasa Self-Play Agents

  • 6.3 Self-Play Environments

  • Summary and Further Reading

  • Exercises

7 Multi-Agent Reinforcement Learning

  • 7.1 Multi-Agent Problems

  • 7.2 Multi-Agent Reinforcement Learning Agents

  • 7.3 Multi-Agent Environments

  • Summary and Further Reading

  • Exercises

8 Hierarchical Reinforcement Learning

  • 8.1 Granularity of the Structure of Problems

  • 8.2 Divide and Conquer for Agents

  • 8.3 Hierarchical Environments

  • Summary and Further Reading

  • Exercises

9 Meta-Learning

  • 9.1 Learning to Learn Related Problems

  • 9.2 Transfer Learning and Meta-Learning Agents

  • 9.3 Meta-Learning Environments

  • Summary and Further Reading

  • Exercises

10 Further Developments

  • 10.1 Development of Deep Reinforcement Learning

  • 10.2 Main Challenges

  • 10.3 The Future of Artificial Intelligence

