
Posts

AI Simplified 3 : Q Learning - From State to Action

Q Learning

Previously we looked at the value of a state. Q-learning instead calculates the value of an action, so we now choose our moves based on the value of our actions as opposed to the value of a state. People tend to think that it's called Q as shorthand for "quality". Now let's determine the equation for Q-learning.

Deriving the equation

Remember the stochastic Markov equation?

V(s) = maxₐ (R(s,a) + ɣ ∑s' P(s, a, s') V(s'))

The value of a state equals the value of the best action available in it, i.e. V(s) = maxₐ Q(s,a). Dropping the max gives the value of a single action:

Q(s,a) = R(s,a) + ɣ ∑s' P(s, a, s') V(s')

There is no max, as we are not considering all of the alternative actions but just one action. We need to wean ourselves off V, so we need to replace V(s'). V(s') is the value of each possible next state s'. It is also worth noting that V(s') is also maxₐ' Q(s', a'). With that it becomes

Q(s,a) = R(s,a) + ɣ ∑s' P(s, a, s') maxₐ' Q(s', a')

Why max? Well, we still want to get all the po...
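The last equation in the excerpt above can be sketched in code. This is a minimal illustration, not the post's own implementation: the four-state corridor, the `SLIP` probability, the reward of 1.0 for stepping right from state 2, and every function name are assumptions made up for the example. Because the transition probabilities are known here, this is Q-value iteration (repeatedly applying the equation), not the sample-based Q-learning update.

```python
# Hypothetical corridor: states 0..3, state 3 is the terminal goal.
GAMMA = 0.9                 # discount factor (assumed value)
SLIP = 0.2                  # assumed chance the agent stays where it is
STATES = (0, 1, 2, 3)
TERMINAL = 3
ACTIONS = ("left", "right")


def transitions(s, a):
    """P(s, a, s') as a list of (probability, next_state) pairs."""
    target = max(s - 1, 0) if a == "left" else min(s + 1, TERMINAL)
    return [(1.0 - SLIP, target), (SLIP, s)]


def reward(s, a):
    """R(s, a): assumed reward of 1.0 for stepping right from state 2."""
    return 1.0 if (s == 2 and a == "right") else 0.0


def q_iteration(sweeps=200):
    """Repeatedly apply Q(s,a) = R(s,a) + gamma * sum P(s,a,s') * max Q(s',a')."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        for s in STATES:
            if s == TERMINAL:
                continue  # terminal state keeps Q = 0
            for a in ACTIONS:
                expected_next = sum(
                    p * max(q[(s2, a2)] for a2 in ACTIONS)
                    for p, s2 in transitions(s, a)
                )
                q[(s, a)] = reward(s, a) + GAMMA * expected_next
    return q


if __name__ == "__main__":
    for (s, a), value in sorted(q_iteration().items()):
        print(f"Q({s}, {a}) = {value:.3f}")
```

In this toy setup the agent should prefer "right" in every non-terminal state, since that is the only way to reach the rewarded step.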

AI Simplified 2 : Bellman and Markov

Bellman equation

It's named after Richard Bellman and is defined as a necessary condition for optimality. It is associated with the mathematical optimization method known as dynamic programming. The deterministic equation is listed below:

V(s) = maxₐ (R(s,a) + ɣV(s'))

maxₐ: the maximum over all the possible actions that we can take.
R(s, a): the reward for taking action a in state s.
s': the state we land in after taking action a in state s.
ɣ: the discount factor. It works like the time value of money: a reward now is worth more than the same reward later.

You can see the dynamic programming aspect, as we apply the same equation to s', so it operates recursively to solve the problem.

Deterministic vs non-deterministic: deterministic is definite, there is no randomness, whereas non-deterministic is stochastic. Above we were not adding any randomness (deterministic), but nothing in this world is truly predictable, so let's add randomness, whereby each step is not certain to be carried out (adding probability). It makes our agent more natural (being drunk! lol)...
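As a rough sketch of the deterministic equation in the excerpt above, the snippet below applies V(s) = maxₐ (R(s,a) + ɣV(s')) over and over on a tiny corridor until the values settle. The corridor, the reward of 1.0 for reaching the last state, the discount of 0.9, and the names `step` and `value_iteration` are all assumptions chosen for illustration.

```python
# Hypothetical corridor: states 0..3, state 3 is the terminal goal.
GAMMA = 0.9                 # discount factor (assumed value)
TERMINAL = 3
ACTIONS = ("left", "right")


def step(state, action):
    """Deterministic move: return (next_state, reward)."""
    if action == "left":
        next_state = max(state - 1, 0)
    else:
        next_state = min(state + 1, TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Apply V(s) = max_a (R(s,a) + gamma * V(s')) repeatedly."""
    values = [0.0] * (TERMINAL + 1)
    for _ in range(sweeps):
        for s in range(TERMINAL):  # the terminal state keeps value 0
            candidates = []
            for a in ACTIONS:
                next_state, reward = step(s, a)
                candidates.append(reward + GAMMA * values[next_state])
            values[s] = max(candidates)
    return values


if __name__ == "__main__":
    print(value_iteration())
```

Each step away from the goal multiplies the value by the discount once more, which is the "time value of money" effect described above.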

AI Simplified 1 : Reinforcement Learning

What is reinforcement learning?

Basically, it is learning from interaction and giving our agent a reward for achieving a goal. The term covers a class of problems, a class of solutions to them, and the field that studies both. Of all the forms of Machine Learning, reinforcement learning is closest to how animals and humans learn.

Elements of reinforcement learning

a) Policy - defines the learning agent's behavior at any given time. Basically, a mapping from states to actions.
b) Reward Signal - defines the goal of reinforcement learning. The aim is to maximize it.
c) Value Function - while the reward specifies what's good in the short run, the value function specifies what's good in the long run. It is an accumulation of rewards. Rewards are primary whereas the value function is secondary.
d) Model - Optional. Mimics the environment. Used for planning. Methods with models are called model-based, whereas those without are called model-free.

Reinforcement versus Supervised and Unsupervised Learnin...

Gentlest introduction : Data Science

Why Data Science?

We want to be an AI driven organisation, but for that we first need to be a data driven organisation.

It helps make better decisions (Scientific Method)
It helps with making smarter products
It helps with automating manual methods

Data science is the process of converting data to knowledge. Common sources of data include big data systems like Spark, databases like MongoDB, and spreadsheets like Excel.

Process of turning your organisation into a data driven organisation

Find a question
Collect data
Process (munge) the data
Create a model
Evaluate the model
Deploy
Repeat (if necessary)

Key components for a data driven organisation

Strategy
People
Data
Technology
Culture

The data science hierarchy of needs

Gentlest introduction : Databases

Why database

I know a quick Google search for the definition of a database will yield results such as "a repository of information", but what my tutor taught me is that this definition is akin to defining an engine as a collection of metal parts. So the definition must lean more towards why we use a database, because a database is more than just storing information (what's the point, when you can simply store the information in an Excel spreadsheet?). Why a database then? It's because of what happens next. Imagine these scenarios (which are perfectly normal in the real world):

Data grows
Data changes over time, and we need to be able to track the changes
Data should be fast and searchable
The system should have awesome up-time and work together with other systems
Data should be consistent and protected (secure)

So the why of a database is to be able to solve all the problems that come with data. It is to provide a more organized repository of information.

Relational Dat...

Gentlest introduction : The Cloud

How it began

To me, the concept of the cloud started when people began virtualizing their systems using the likes of VirtualBox and VMware. What brought this along was the evolution of technology, which made it possible for software to simulate hardware. Originally software could not simulate hardware drivers, but the moment that became possible, virtualization was born. A few years later, companies started offering this virtualization at a much grander scale and Infrastructure as a Service was born. Let's get down to the three cloud components, namely:

Infrastructure as a Service (IaaS)

This outsourced hardware, meaning that one nowadays does not need to set up servers, air-conditioning and the likes of access control, but can 'rent' them from someone. One of the great things about this is that a backup no longer covers only software but also hardware (as software can now simulate hardware), so recovery in case of disaster became easier.

Platform as a Service (PaaS)

This is ...

Some programming concepts

Library: Let's take an example. You write code that prints Hello World to the screen. Someone refines it and maybe it starts to ask for input before printing. Someone else adds different text formatting to it. Once that code is packaged as a library, you no longer need to rewrite the same code ever again to achieve that functionality.

Framework: Imagine you are building a house, and you get a system that provides the scaffolding, foundation and the frame at the same time so that you don't need to redo most of the work. This is like a library on a grander scale, with reduced flexibility in exchange.

SDK: In the old days, one would get a software development kit filled with floppy disks, documentation books etc. So it was an actual kit!

Class: This is a template or blueprint, like a recipe, from which objects are made.
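To make the class idea above concrete, here is a small sketch in Python. The `Greeter` class and its names are invented for this example; the point is only that one blueprint can produce several objects that share behavior but hold different data.

```python
class Greeter:
    """Blueprint: every Greeter stores a greeting and knows how to greet."""

    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        # Build the message from this object's own greeting.
        return f"{self.greeting}, {name}!"


# Two objects made from the same blueprint, each with its own data.
english = Greeter("Hello")
swahili = Greeter("Jambo")

if __name__ == "__main__":
    print(english.greet("World"))
    print(swahili.greet("World"))
```

If `Greeter` were saved in its own file and imported by other programs, it would also play the role of the small library described above.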