AI Simplified 1: Reinforcement Learning

What is reinforcement learning?

Basically, it is learning from interaction: an agent acts in an environment and is given a reward when it makes progress toward a goal. The term covers a class of problems, the class of solution methods that work on them, and the field that studies both. Of all the forms of machine learning, reinforcement learning is the closest to how animals and humans learn.

Elements of reinforcement learning

a) Policy - defines the learning agent's behavior at any given time. Basically, it is a mapping from states to actions.

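b) Reward Signal - defines the goal of a reinforcement learning problem. At each step the environment sends the agent a reward, and the agent's aim is to maximize the total reward it receives over time.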

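c) Value Function - while the reward signal specifies what is good in the short run, the value function specifies what is good in the long run. The value of a state is the total reward the agent can expect to accumulate starting from that state. Rewards are primary whereas values are secondary: values are predictions of future reward, so without rewards there would be nothing to estimate.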

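d) Model - optional. It mimics the environment, for example by predicting the next state and reward for a given state and action, and is used for planning. Methods that use a model are called model-based, whereas methods that learn purely by trial and error are called model-free.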
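
To make the four elements concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration and is not from any library: a four-cell corridor with the goal at the last cell, a policy that always steps right, a reward of 1 for reaching the goal, and a discount factor `gamma` that makes later rewards count for less.

```python
# Hypothetical 4-cell corridor: states 0..3, with the goal at state 3.
GOAL = 3

# Policy: a mapping from states to actions (here, always step right).
policy = {0: "right", 1: "right", 2: "right"}


def model(state, action):
    """Model: predicts the next state and reward for a state-action pair."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    # Reward signal: immediate feedback, 1 only when the goal is reached.
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value(state, gamma=0.9):
    """Value: discounted sum of rewards collected by following the policy."""
    total, discount = 0.0, 1.0
    while state != GOAL:
        state, reward = model(state, policy[state])
        total += discount * reward
        discount *= gamma
    return total
```

With these assumptions, the immediate reward for stepping right from state 0 is zero, yet `value(0)` is about 0.81, because the value also counts the reward waiting two steps further along. That is the short run versus long run distinction in item c).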

Reinforcement versus Supervised and Unsupervised Learning

Supervised learning entails a training set: we give the learner a set of problems and their corresponding solutions. Unsupervised learning is given no such training set; it has to find patterns in the data on its own. One might wonder how unsupervised learning differs from reinforcement learning, as neither is given solutions and both are expected to figure things out as they go. Well, the key difference is that reinforcement learning seeks to maximize a reward, whereas unsupervised learning seeks to find hidden structure in the data.

Reinforcement paradigm: the balance

That being said, reinforcement learning has to find a balance between exploitation and exploration. The agent exploits what it already knows to obtain a reward, but it also has to explore so that it can choose better actions in the future. Striking this balance is a challenge that arises in reinforcement learning but not in supervised or unsupervised learning.
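
One common and simple way to strike this balance is the epsilon-greedy rule: most of the time pick the action that currently looks best, and with a small probability `epsilon` pick one at random. The sketch below applies it to a made-up multi-armed bandit. The function name, the Gaussian rewards with standard deviation 1 and the default parameters are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a hypothetical bandit with noisy Gaussian rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # how often each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm to learn more about it.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample average for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts
```

Calling `epsilon_greedy_bandit([0.2, 0.5, 1.0])` returns the estimated average reward and the pull count for each arm. Setting `epsilon` to 0 gives a purely greedy agent that can get stuck on a mediocre arm, while setting it to 1 gives an agent that explores forever and never uses what it has learned.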

Shingirayi Mandebvu's Blog
