Skip to main content

AI Simplified 1 : Reinforcement Learning

What is reinforcement learning?

Basically, it is learning from interaction and giving our agent a reward for achieving a goal. This field is essentially a class of problems with a class of solutions and a study of these classes. Of all the forms of Machine Learning, reinforcement learning is closest to how animals and humans learn.

Elements of reinforcement learning

a) Policy - defines the learning agents' behavior at any given time. Basically mapping States to action.
b) Reward Signal - the goal of reinforcement learning. The aim is to maximize it.
c) Value Function - while reward specifies what's good in the short run, value function specifies what's good in the long run. It is an accumulation. Rewards are primary whereas Value function is secondary.
d) Model - Optional. Mimics the environment. Used for planning. Methods with models are called model-based whereas the opposite is called model free.

Reinforcement versus Supervised and Unsupervised Learning

Supervised learning entails a  training set where we give it a set of problems and their corresponding solutions. Unsupervised learning entails finding a pattern when given data in coming up with a solution, it is given no training set. One might wonder as to the differences between Unsupervised learning and Reinforcement learning as both are not given solutions and are expected to figure out things as they go. Well, the key difference is that reinforcement seeks to maximize the reward whereas unsupervised learning is more to finding a pattern.

Reinforcement paradigm: the balance

That being said, reinforcement seeks to find a balance between exploitation and exploration. The agent exploits what it already knows to obtain a reward but it also has to explore to get better rewards in the future. Striking a balance in this regard is the paradigm that mainly challenges reinforcement learning in comparison to supervised and unsupervised learning.

Comments

Popular posts from this blog

Gentlest introduction : The Cloud

How it began To me the concept of cloud started when people began virtualizing their systems using the like of Virtual-box and VMware. What bough this a;long was the evolution/advent of technology which made it possible for software so simulate hardware. Originally software could not simulate hardware drivers but the moment that became possible- virtualization was born. A few years after- companies started offering this virtualization at a much grander scale and Infrastructure as a service was born. Lets get down to the three cloud components namely: Infrastructure as a Service (IAAS) This outsourced hardware meaning that one noways noes not need to setup servers, air-con and the like of access control but could 'rent' from someone and one of the great things about this was that a backup not only meant software but also meant hardware (as software could now simulate hardware) so recovery in case of disaster became easier. Platform as a Service (PAAS) This is ...

Deploy Django app online for free!

So after a number of lines of code, brilliance and dreaming. Your next dream is for the world to see. Of course you can walk around with your computer and doing a 'manage.py runserver' But cumon guys, lets embrace the cloud. Not like this guy though! I choose to deploy on  PythonAnywhere . So you ask why? 1) Free amazing support - You actually talk to a live human ! 2) Easy - Very easy 3) Affordable - As you scale up, it gets way better! So by now I assume you are already on a version control system (So I will not waste much energy on that one). Maybe Ill someday write on my two favs  Github  and  Bitbucket . STEP 1: Create an account on pythonanywhere. Kindly note that your username will be included in your apps url. So it will be like : " yourUsername .pythonanywhere.com" STEP 2: Select other and set a bash console. STEP 3: Push your code from version control This will push from (in my example) github to your pythonanywhere. You...

PRG, PRF, PRP in Cryptography - What are they?

So I have been reading up on my cryptography and I figured I should give out a brief lesson on these three amazing concepts What are they ? a) PRG (Pseudo Random Generator) You probably know the difference between stream and block cipher. One of the main differences between them is key size. Stream ciphers require the key to be of equal length of greater than the plaintext ,   whereas Block Ciphers take a key smaller than the PT and is then expanded. This is the PRG The PRG expands the seed Considerations: Stream Ciphers base on Perfect Secrecy whereas Block Ciphers base on Semantic Security b) PRF (Pseudo Random Function) Lets share a secret- imagine something- you want to authenticate yourself with me by proving that you know a secret that we both share. Here's a possible option i) Possible Option 1:  PRNGs We both seed a PRNG with the shared secret, I pick and then send you some random number i.  You   then have to prove that you know the s...