In this post, I compare different reinforcement learning algorithms. The code is available in my GitHub repository.

Reinforcement Learning (RL) is an interesting area of study. In RL, an agent learns to make decisions by mapping situations to actions, with the goal of maximizing a numerical reward. The agent isn’t told which actions to take. Instead, it has to discover the most rewarding actions through trial and error.

The agent’s actions affect both the immediate reward and the situations that follow, which means they also influence long-term rewards. This combination of exploration, trial and error, and delayed reward is what makes reinforcement learning distinctive.

Figure: agent-oriented learning.

Mini-project: Comparing reinforcement learning algorithms.

For this analysis, we’ll use the OpenAI Gym library, which provides a collection of games and control tasks that can be used as environments for training and comparing RL agents.
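
Every Gym game is driven through the same small interface: create the environment, reset it to get an initial observation, and repeatedly step it with an action. Below is a minimal sketch against the classic Gym API (where `step` returns an `(observation, reward, done, info)` tuple; newer Gymnasium releases changed this slightly), using CartPole as a stand-in environment:

```python
import gym

# Any registered Gym game exposes the same interface.
env = gym.make("CartPole-v0")

observation = env.reset()   # start a new episode
total_reward = 0.0
done = False

while not done:
    action = env.action_space.sample()                   # pick a random action
    observation, reward, done, info = env.step(action)   # apply it to the environment
    total_reward += reward

print("Episode finished with total reward:", total_reward)
env.close()
```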

We’ll focus on the game “N-chain”. In this game (a short interaction sketch follows this list):

  • The agent moves along a chain of states
  • There are two actions: ‘forward’ and ‘back’
  • ‘Forward’ moves the agent along the chain without giving a reward
  • ‘Back’ returns the agent to the start and provides a small reward
  • Reaching the end of the chain gives a big reward
  • Once at the end of the chain, the agent can keep collecting this big reward by continuing to choose ‘forward’
  • With some probability, the agent “slips” and performs the opposite action
  • The observed state is the agent’s current position (from 0 to n-1)

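To make these rules concrete, here is a rough sketch of running a purely random policy on the chain. It assumes the NChain-v0 registration name used by older Gym releases (the environment was later removed from Gym, so treat this as illustrative):

```python
import gym

# NChain-v0 (default length 5): action 0 = 'forward' (no immediate reward,
# big reward once the end of the chain is reached), action 1 = 'back'
# (small reward, sends the agent back to state 0). With some probability
# the agent "slips" and the opposite action is executed instead.
env = gym.make("NChain-v0")

state = env.reset()
total_reward = 0
for _ in range(1000):
    action = env.action_space.sample()        # random baseline policy
    state, reward, done, _ = env.step(action)
    total_reward += reward
    if done:
        break

print("Random policy collected a total reward of:", total_reward)
env.close()
```

A random baseline like this gives a floor against which the learning algorithms compared later can be judged.
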
This game was designed and used by Malcolm J. A. Strens in his paper “A Bayesian Framework for Reinforcement Learning”.