
BAYESIAN NETWORKS in Machine Learning

This topic is a mix of three theories, namely:

  1. Graph Theory
  2. Probability Theory
  3. Bayes Theory

1. Graph Theory

Graph theory is the branch of mathematics that studies graphs.

Graph = Nodes (or vertices) + Arcs (or lines, or edges!)

A node can be anything you want: a company, profits or a band.

You can have as many nodes as you like, and the arcs describe the relations between them. Graphs can be directed (each arc points from one node to another) or undirected (the relation goes both ways).
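To make the directed/undirected difference concrete, here is a minimal sketch that stores a graph as an adjacency dictionary. The node names reuse the examples above, but the arcs between them and the helper `build_adjacency` are made up purely for illustration.

```python
# Hypothetical arcs between the example nodes (assumed, not real data).
edges = [("company", "profits"), ("band", "profits")]


def build_adjacency(edges, directed=True):
    """Return a dict mapping each node to the set of nodes it links to."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())
        if not directed:
            # In an undirected graph the relation goes both ways.
            adjacency[target].add(source)
    return adjacency


directed_graph = build_adjacency(edges, directed=True)
undirected_graph = build_adjacency(edges, directed=False)

# "profits" has no outgoing arcs when directed, two neighbours when undirected.
print(directed_graph["profits"])
print(undirected_graph["profits"])
```

In the directed version an arc only says "company points to profits", which is the kind of one-way relation a Bayesian network relies on.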


2. Probability Theory

Probability is a measure of the likelihood of something happening, and is always a value between 0 and 1. The more likely the event, the closer the value is to 1, and vice versa.

Probability = (Number of outcomes in the event) ÷ (Number of all possible outcomes), assuming every outcome is equally likely
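As a rough sketch of that formula, the snippet below counts outcomes, assuming every outcome is equally likely. The helper name `probability` and the die example are illustrative assumptions, not part of the original post.

```python
from fractions import Fraction


def probability(event_outcomes, all_outcomes):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event_outcomes), len(all_outcomes))


# A fair coin: one favourable outcome out of two.
coin = ["H", "T"]
heads = [o for o in coin if o == "H"]
p_heads = probability(heads, coin)

# A fair six-sided die (hypothetical extra example): three even faces out of six.
die = [1, 2, 3, 4, 5, 6]
even = [o for o in die if o % 2 == 0]
p_even = probability(even, die)

print(p_heads, p_even)
```

Using `Fraction` keeps the result exact, so it always lands between 0 and 1 without rounding noise.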

An example is a coin flip. The probability of getting heads on a fair coin is 0.50. This brings us to our next idea, conditional probability.

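Instead of one event at a time, what is the probability of getting heads twice in a row? That is a joint probability, which we can write using a conditional one:

P(Heads1 and Heads2) = P(Heads1) * P(Heads2|Heads1)

The '|' simply means 'given that', so P(Heads2|Heads1) is the probability of heads on the second flip given that the first flip was heads. Coin flips do not affect each other, so P(Heads2|Heads1) is still 0.50, and as per our example the answer is 0.50*0.50 = 0.25. You might ask: why multiply? What does multiplying mean? Can't we just add? We multiply because we want the intersection of the two events, heads on the first flip and heads on the second. Adding is for 'either/or' questions about events that cannot happen together.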
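One way to check the two-flip numbers is to list every outcome and count. The sketch below assumes a fair coin and independent flips; the helper `prob` is a made-up name for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability that event(outcome) is true, found by counting outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


p_first_heads = prob(lambda o: o[0] == "H")
p_both_heads = prob(lambda o: o == ("H", "H"))

# Conditional probability: P(second is H | first is H) = P(both) / P(first).
p_second_given_first = p_both_heads / p_first_heads

print(p_both_heads)          # joint probability of heads twice
print(p_second_given_first)  # heads on the second flip, given heads on the first
```

Only one of the four outcomes is heads-heads, which is where the 0.25 comes from, while the conditional probability of the second flip stays at one half.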

3. Bayes Theory

This is an extension of conditional probability. With Bayes you use one conditional probability, P(B|A), to calculate the reverse one, P(A|B).

P(A|B) = P(B|A) * P(A)/P(B)

i.e. Posterior = Likelihood * Prior / Evidence
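Here is a small sketch of the formula in code. All the numbers are made up for illustration (A = "has the disease", B = "test is positive"), and the evidence P(B) is worked out from the law of total probability, which is an extra step the formula above takes as given.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) using Bayes' rule."""
    # Evidence P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical values, not real medical data.
p_disease = 0.01            # prior P(A)
p_pos_given_disease = 0.95  # likelihood P(B|A)
p_pos_given_healthy = 0.05  # false positive rate P(B|not A)

p = posterior(p_disease, p_pos_given_disease, p_pos_given_healthy)
print(round(p, 3))
```

With these assumed numbers the posterior comes out at roughly 0.16: even after a positive test the disease is still unlikely, because the prior was so small.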

If you want more, I refer you to this article.


For great examples, follow this link. 

Sample code : 

How AI can help fight Cholera. Feel free to contribute! https://goo.gl/kmcvKv


 

