In a prior post, we look at some of the basics of probability. The prior forms of probability we looked at focused on independent events, which are events that are unrelated to each other.
In this post, we will look at conditional probability which involves calculating probabilities for events that are dependent on each other. We will understand conditional probability through the use of Bayes’ theorem.
If all events were independent of it would be impossible to predict anything because there would be no relationships between features. However, there are many examples of on event affecting another. For example, thunder and lighting can be used to predictors of rain and lack of study can be used as a predictor of test performance.
Thomas Bayes develop a theorem to understand conditional probability. A theorem is a statement that can be proven true through the use of math. Bayes’ theorem is written as follows
P(A | B)
This complex notation simply means
The probability of event A given event B occurs
Calculating probabilities using Bayes’ theorem can be somewhat confusing when done by hand. There are a few terms however that you need to be exposed too.
- prior probability is the probability of an event without a conditional event
- likelihood is the probability of a given event
- posterior probability is the probability of an event given that another event occurred. the calculation or posterior probability is the application of Bayes’ theorem
Naive Bayes Algorithm
Bayes’ theorem has been used to develop the Naive Bayes Algorithm. This algorithm is particularly useful in classifying text data, such as emails. This algorithm is fast, good with missing data, and powerful with large or small data sets. However, naive Bayes struggles with large amounts of numeric data and it has a problem with assuming that all features are of equal value, which is rarely the case.
Probability is a core component of prediction. However, prediction cannot truly take place with events being dependent. Thanks to the work of Thomas Bayes, we have one approach to making prediction through the use of his theorem.
In a future post, we will use naive Bayes algorithm to make predictions about text.