Introduction IV – Conditional Probability

It’s all about what happened before!

Conditional probability answers one of the most interesting questions: How does the probability of an event change if we have new, additional information? Translated into real-life examples, the question becomes: How does the probability of a car accident change if I’m drunk? How does the probability of an increasing stock price change if the price/book value increases? How does the probability of having a disease change if a test for the disease is positive? All these questions can be answered with conditional probability.

Let us first talk about what we have to know before we deepen our understanding with an example:

  1. We denote the probability of A given B, or in other words the probability that A happens given that B happened before, with P(A|B)
  2. P(A|B) is the proportion of the intersection of A and B to B: P(A|B)=\frac { P(A\cap B) }{ P(B) }\;given\;P(B)\neq 0

P(B) can’t be equal to 0, because if B can never happen, then “A given that B happened before” can never happen either.

Example: Suppose we toss a fair coin three times. Let the events be: A = “3 Heads” and B = “The first two tosses were heads”. How does the probability of A change if we take the information given by B into account?

Answer: If we toss a coin three times we have eight possible outcomes {HHH, HHT, HTT, HTH, THT, THH, TTH, TTT}. The probability of event A is therefore \frac { 1 }{ 8 } , but if we take into account that B happened, the sample space for A shrinks to {HHH, HHT}. The probability of A given that B happened is therefore \frac { 1 }{ 2 } . We could have calculated this with the formula from above as well: P(A\cap B)=\frac { 1 }{ 8 } , since HHH is the only one of the eight outcomes in which both A and B occur, and P(B)= \frac { 2 }{ 8 } , since two outcomes out of eight have heads in the first two tosses. If we put these probabilities into the formula from above, we get \frac { 1/8 }{ 2/8 } = \frac { 1 }{ 2 } , the same answer as before.
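Because the sample space is small, we can check the answer by brute-force enumeration. A minimal sketch (the variable names are my own):

```python
from itertools import product

# Enumerate all 2^3 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

A = [o for o in outcomes if o == ("H", "H", "H")]           # "3 Heads"
B = [o for o in outcomes if o[:2] == ("H", "H")]            # first two tosses heads
A_and_B = [o for o in A if o in B]

p_A = len(A) / len(outcomes)               # 1/8
p_B = len(B) / len(outcomes)               # 2/8
p_A_and_B = len(A_and_B) / len(outcomes)   # 1/8

# Conditional probability via the formula P(A|B) = P(A∩B) / P(B):
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 0.5
```

The enumeration agrees with both the counting argument and the formula.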

Multiplication Rule

I will just give the multiplication rule at this point as it is the conditional probability formula just rearranged: P(A \cap B)=P(A|B)\cdot P(B)
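Plugging in the coin-toss numbers from the example above shows the rearrangement in action:

```python
# Multiplication rule with the coin-toss example:
p_B = 2 / 8          # P(B): first two tosses are heads
p_A_given_B = 1 / 2  # P(A|B): heads on the remaining third toss
p_A_and_B = p_A_given_B * p_B  # P(A ∩ B) = P(A|B) * P(B)
print(p_A_and_B)  # 0.125, i.e. 1/8
```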

Law of Total Probability

The Law of Total Probability enables us to calculate the total probability of an event when we know its conditional probabilities with respect to events B_{1}, ..., B_{n} that partition the sample space. We do that by summing up the conditional probabilities of the event, each weighted by the probability of its conditioning event. We need the total probability of an event for Bayes’ Theorem, which follows shortly.

P(A)=P(A \cap B_{ 1 }) +P(A \cap B_{ 2 })+P(A \cap B_{ 3 })+...+P(A \cap B_{ n }) or P(A)=P(A|B_{ 1 })\cdot P(B_{ 1 })+P(A|B_{ 2 })\cdot P(B_{ 2 })+P(A|B_{ 3 })\cdot P(B_{ 3 })+...+P(A|B_{ n })\cdot P(B_{ n })
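A small numerical sketch of the weighted sum. The two-supplier scenario and all the numbers in it are assumptions for illustration only:

```python
# Law of Total Probability with a made-up example:
# B1, B2 partition "which supplier a part came from"; A = "part is defective".
p_B = [0.6, 0.4]            # P(B1), P(B2): market shares (assumed, sum to 1)
p_A_given_B = [0.01, 0.05]  # defect rate of each supplier (assumed)

# P(A) = sum over i of P(A|Bi) * P(Bi)
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))
print(p_A)  # 0.6*0.01 + 0.4*0.05 = 0.026
```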


Sometimes the additional information doesn’t change the probability of an event. The events are then independent of each other:

P(A \cap B)=P(A)\cdot P(B) or P(A|B)=P(A)

Example: Let’s take the same events as above, A = “3 Heads” and B = “The first two tosses were heads”: Are events A and B independent?

Answer: P(A) = \frac { 1 }{ 8 } ; P(B) = \frac { 2 }{ 8 } and P(A|B) = \frac { 1 }{ 2 } . Since P(A|B)\neq P(A), the events are NOT independent.

Bayes’ Theorem:

Bayes’ Theorem allows us to invert the conditional probability P(A|B) to obtain P(B|A). Bayes’ Theorem is also the basis of some machine learning algorithms and is used in many applications of probability.

P(B|A)=\frac { P(A|B)\cdot P(B) }{ P(A) }
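The disease-test question from the introduction is the classic application. A sketch with assumed numbers (prevalence, sensitivity, and false-positive rate are all made up for illustration), where B = “has the disease” and A = “test is positive”:

```python
# Bayes' Theorem for the disease-test question.
# All three input probabilities are assumptions for illustration.
p_disease = 0.01             # P(B): prevalence in the population
p_pos_given_disease = 0.95   # P(A|B): test sensitivity
p_pos_given_healthy = 0.05   # false-positive rate

# P(A) via the Law of Total Probability over B and "not B":
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(B|A) = P(A|B) * P(B) / P(A)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```

Even with a fairly accurate test, a positive result here implies only about a 16 % chance of actually having the disease, because the disease is rare; this is exactly the kind of shift in probability that conditioning captures.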

Some great examples with good explanations can be found on the Wikipedia page.

