Important Terms in Statistics- Machine Learning

Statistics and Probability Concepts

Photo by Mikael Blomkvist from Pexels

Statistics and probability play an important role in machine learning. Whenever we infer population parameters from sample statistics, we attach a probability to that inference. Predictions are also expressed in terms of probability, so statistics and probability go hand in hand.

While learning statistics for machine learning, I came across many important terms. In this article, I would like to summarise the important terms I have studied so far.

Table of Contents

  1. Random Variable
  2. Probability distribution
  3. PMF vs PDF vs CDF
  4. Expected value
  5. Independent Events vs Mutually Exclusive Events vs Dependent Events
  6. Joint Probability vs Marginal Probability vs Conditional Probability vs Union Probability
  7. Bayes Theorem
  8. Normal distribution vs Uniform distribution
  9. Descriptive Statistics, Inferential Statistics
  10. Sampling Distribution, Central Limit Theorem
  11. Confidence Interval
  12. Hypothesis Testing

1. Random Variable

A random variable assigns a numerical value to each outcome of an experiment. It is denoted by X.
Example: Rolling a die. The random variable X can take the values [1,2,3,4,5,6].

A random variable can be discrete or continuous.

2. Probability distribution

It describes how probability is distributed over the values of the random variable.

Probability function P(X) is used to describe the probability distribution

Example: Probability of getting 2 while rolling a die.
Here X is the random variable and 2 is one of the values it can take.

P(X=2) = 1/6
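As a rough check of this value, the sketch below simulates a large number of rolls and compares the relative frequency of each face with 1/6. The fair die, the seed and the number of rolls are assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Simulate many rolls of a six-sided die (assumed to be fair).
n_rolls = 60_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]

# The relative frequency of each value approximates P(X = x) = 1/6.
estimated = {x: rolls.count(x) / n_rolls for x in range(1, 7)}

for x, p in estimated.items():
    print(f"P(X={x}) ~ {p:.3f}")
```

Each estimate should land close to 0.167, and the estimates add up to 1.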

3. PDF vs PMF vs CDF

PMF — Probability Mass function

The probability distribution of discrete variables is known as the probability mass function.

PMF — Image by Author

CDF -Cumulative Distribution Function

CDF is used to calculate the cumulative probability for a given random variable (X)

Example: What is the probability of getting a value less than or equal to 4 while rolling a die? P(X≤4)

P(X≤4) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = 4/6 ≈ 0.67

CDF for a discrete variable [Image by Author]
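The relationship between the PMF and the CDF of a discrete variable can be sketched in a few lines: the CDF is the running total of the PMF. The example assumes a fair six-sided die and uses exact fractions.

```python
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]

# PMF of a fair die: every value has probability 1/6.
pmf = [Fraction(1, 6)] * len(values)

# CDF: cumulative sum of the PMF, so cdf[x] = P(X <= x).
cdf = dict(zip(values, accumulate(pmf)))

p_le_4 = cdf[4]
print(p_le_4, round(float(p_le_4), 2))
```

Here `cdf[4]` is 4/6, which reduces to 2/3, and `cdf[6]` is 1 because the CDF always ends at 1.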

PDF — Probability density function

The probability distribution of continuous variables is known as the probability density function.

Example: Probability of weight of students in a class.

The probability density function for continuous variable [Image by Author]

CDF for continuous random variable [Image by Author]
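For a continuous variable, the PDF gives a density and probabilities come from areas under it, which the CDF provides. The sketch below assumes, purely for illustration, that student weights follow a normal distribution with mean 60 kg and standard deviation 10 kg; these numbers are hypothetical.

```python
from statistics import NormalDist

# Hypothetical model: student weights ~ Normal(mean 60 kg, sd 10 kg).
weights = NormalDist(mu=60, sigma=10)

# The PDF returns a density (height of the curve), not a probability.
density_at_60 = weights.pdf(60)

# Probabilities are areas under the PDF, obtained from the CDF.
p_between = weights.cdf(65) - weights.cdf(55)  # P(55 <= X <= 65)
p_at_most_70 = weights.cdf(70)                 # P(X <= 70)

print(f"density at 60 kg: {density_at_60:.4f}")
print(f"P(55 <= X <= 65): {p_between:.4f}")
print(f"P(X <= 70):       {p_at_most_70:.4f}")
```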

4. Expected value

The expected value is the mean of the random variable: each value is weighted by its probability, and the results are summed.

E(X) = Σ x * P(X=x)

Example: What is the expected value while rolling a die?

Values of the random variable → X={1,2,3,4,5,6}

Probability distribution → P(X=x)=1/6

[x can be 1 or 2 or 3 or 4 or 5 or 6]

Expected Value → E(X) = 1/6*1 + 1/6*2 + 1/6*3 + 1/6*4 + 1/6*5 + 1/6*6 = 21/6 = 3.5
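The same sum can be written as a short computation. This is a minimal sketch for a fair die, using exact fractions so that no rounding is involved.

```python
from fractions import Fraction

values = range(1, 7)       # faces of the die
p = Fraction(1, 6)         # probability of each face (fair die assumed)

# Expected value: sum of value * probability over all values.
expected = sum(x * p for x in values)

print(expected, float(expected))
```

The result is 7/2, that is 3.5. No single roll can give this value; it is the long-run average over many rolls.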

5. Independent Events vs Mutually Exclusive Events vs Dependent Events

Mutually Exclusive Events:

Event A and Event B are said to be mutually exclusive if they have no common outcomes and both can’t occur at the same time.

P( A and B )=0

Example:

Event A= Drawing a King from a deck of cards
Event B= Drawing a Queen from a deck of cards

Mutually Exclusive and Collectively Exhaustive Events:

The sum of probabilities of mutually exclusive and collectively exhaustive events is 1.

Example: Throwing a fair coin.

A → Getting Head 
B → Getting Tail

Both events A and B are mutually exclusive and collectively exhaustive events. The sum of probabilities of both the events is 1.

P(A)+P(B)=1

Independent Events

Event A and Event B are said to be independent if the occurrence of event A is not dependent on the occurrence of event B.

P(A and B)=P(A) * P(B)
Example: Probability of getting 2 heads in a row = 1/2 * 1/2 = 1/4

[The probability of getting heads in the second trial is not affected by the outcome of the first trial.]
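The product rule for independent events can be checked by listing every outcome of two coin tosses. The sketch assumes a fair coin; the helper `prob` is a hypothetical name introduced for this example.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_first = prob(lambda o: o[0] == "H")     # heads on the first toss
p_second = prob(lambda o: o[1] == "H")    # heads on the second toss
p_both = prob(lambda o: o == ("H", "H"))  # heads on both tosses

print(p_both, p_first * p_second)
```

Both printed values are 1/4, so the joint probability equals the product of the individual probabilities.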

Dependent Events

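Event A and Event B are said to be dependent if the occurrence of one changes the probability of the other, that is, P(A|B) ≠ P(A).

The multiplication rule holds for any two events:

P(A and B) = P(A|B) * P(B)

Example: Drawing two cards from a deck without replacement.

B → First card is a King, P(B) = 4/52
A → Second card is a King, P(A|B) = 3/51 [one King has already been removed]

P(A and B) = 3/51 * 4/52 = 12/2652 = 1/221

The same rule can be checked on a single draw:

P(A) → Probability of drawing a King = 4/52
P(B) → Probability of drawing a red card = 26/52
P(A|B) → Probability of drawing a King given a red card = 2/26

P(A and B) = 2/26 * 26/52 = 2/52

[Here P(A|B) = 2/26 = 4/52 = P(A), so drawing a King and drawing a red card are actually independent events. The two-card example above is the dependent case.]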
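The King and red-card figures can be checked by building a standard 52-card deck and counting. This is a minimal sketch; the rank and suit labels are just illustrative names.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
red_suits = {"hearts", "diamonds"}

deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

reds = [card for card in deck if card[1] in red_suits]
red_kings = [card for card in reds if card[0] == "K"]

p_b = Fraction(len(reds), len(deck))               # P(red) = 26/52
p_a_given_b = Fraction(len(red_kings), len(reds))  # P(King | red) = 2/26
p_a_and_b = Fraction(len(red_kings), len(deck))    # P(King and red) = 2/52

# Multiplication rule: P(A and B) = P(A|B) * P(B)
print(p_a_and_b, p_a_given_b * p_b)
```

Counting directly and applying the multiplication rule give the same result.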

6. Joint Probability vs Marginal Probability vs Conditional Probability vs Union Probability

Joint Probability — P( A and B)

Probability of A and B occurring.

Joint Probability for different events [Image by Author]

Union Probability — P(A or B)

The probability of A or B occurring.

P(A∪B) = P(A) + P(B) - P(A∩B)

Marginal Probability -P(A)

Probability of A occurring

Conditional Probability -P(A|B)

Probability of A occurring given that B has occurred.

P(A|B) =P(A∩B)/P(B)

P(A and B) = Joint Probability
P(B) → Marginal Probability

If A and B are independent events,

P(A|B) = P(A)
P(B|A)=P(B)
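The four kinds of probability can be compared on one small example. The sketch below uses a single roll of a fair die with two hypothetical events, A (an even number) and B (a number greater than 3), and represents events as sets of outcomes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # hypothetical event: even number
B = {4, 5, 6}                      # hypothetical event: greater than 3

def prob(event):
    """Probability of an event (a set of equally likely outcomes)."""
    return Fraction(len(event), len(sample_space))

p_a, p_b = prob(A), prob(B)    # marginal probabilities
p_joint = prob(A & B)          # joint: P(A and B)
p_union = prob(A | B)          # union: P(A or B)
p_a_given_b = p_joint / p_b    # conditional: P(A|B)

print(p_a, p_b, p_joint, p_union, p_a_given_b)
```

Here P(A|B) = 2/3 differs from P(A) = 1/2, so these two events are not independent.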

7. Bayes Theorem

Bayes theorem lets us calculate one conditional probability from the other conditional probability:

P(A|B) = P(B|A) * P(A) / P(B)

In many scenarios, one of P(A|B) or P(B|A) is easy to compute from the data and the other is not. Compute the easy one directly, then use Bayes theorem to obtain the one that is hard to compute.
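As an illustration, suppose we want the probability that an email is spam given that it contains a certain word. The opposite conditional, the probability of the word given spam, is usually easier to estimate from labelled data. All numbers below are made up for the example.

```python
from fractions import Fraction

# Hypothetical inputs, as if estimated from labelled emails.
p_spam = Fraction(1, 5)                  # P(spam)
p_word_given_spam = Fraction(3, 5)       # P(word | spam)
p_word_given_not_spam = Fraction(1, 20)  # P(word | not spam)

# Total probability of seeing the word in any email.
p_word = (p_word_given_spam * p_spam
          + p_word_given_not_spam * (1 - p_spam))

# Bayes: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word

print(p_word, p_spam_given_word)
```

With these inputs P(word) is 4/25 and P(spam | word) is 3/4, much higher than the prior P(spam) of 1/5.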

8. Normal Distribution vs Uniform Distribution

Uniform Distribution:

The probability is uniformly distributed across all possible outcomes of the random variable

Example: Rolling a die

Probability is uniformly distributed across all possible outcomes {1,2,3,4,5,6}

P(X=1),P(X=2),P(X=3),P(X=4),P(X=5),P(X=6) →1/6

Uniform Distribution — Rectangle

Normal Distribution

A normal distribution is also known as a Gaussian distribution. In a normal distribution, data points are concentrated around the mean and become less frequent further away from it. It is symmetric in shape.

Parameters for normal distribution →Mean and Variance

The shape of normal distribution — Bell shape

Mean=Median =Mode

Normal distribution [Image by Author]
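A quick way to see how strongly values cluster around the mean is to compute the probability of falling within 1, 2 and 3 standard deviations of it. The sketch uses the standard normal distribution (mean 0, standard deviation 1) from Python's `statistics` module.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} sd: {p:.4f}")

# Mean, median and mode coincide for a normal distribution.
print(z.mean, z.median, z.mode)
```

The three probabilities come out at roughly 68%, 95% and 99.7%.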

9. Descriptive Statistics, Inferential Statistics

Descriptive Statistics:

Descriptive statistics are used to describe and summarize the data.

Measure of Central Tendency:
1. Mean: Average value
2. Median: Middle value
3. Mode: The most common value

Measure of Spread:
1. Variance: Average squared distance of the data points from the mean value
2. Standard Deviation: Square root of the variance
3. Range: Difference between the maximum value and minimum value

Measure of Skewness:
1. Right skewed: The distribution is skewed towards the positive side. It has a long right tail.
2. Left skewed: The distribution is skewed towards the negative side. It has a long left tail.
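These measures can be computed with Python's built-in `statistics` module. The data set below is made up, and the population versions of variance and standard deviation are used as one possible choice.

```python
from statistics import mean, median, mode, pvariance, pstdev

data = [2, 3, 3, 4, 5, 7, 18]  # hypothetical values with one large outlier

summary = {
    "mean": mean(data),
    "median": median(data),
    "mode": mode(data),
    "variance": pvariance(data),     # population variance
    "std_dev": pstdev(data),         # square root of the variance
    "range": max(data) - min(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")

# A mean above the median is a common sign of a long right tail.
right_skew_hint = summary["mean"] > summary["median"]
```

For this data the mean (6) sits above the median (4) because the single large value 18 pulls it to the right.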

Inferential Statistics:

Inferential statistics are used to infer a population parameter from a sample statistic.

Main tools: Central Limit Theorem, Confidence Interval, Hypothesis Testing

10. Sampling Distribution, Central Limit Theorem

Sampling Distribution

Sampling: taking representative samples from the population.
The sampling distribution of the mean is the distribution of the sample means obtained from repeated samples of the same size.

Sampling distribution properties

Mean of the sampling distribution = population mean
Standard deviation of the sampling distribution (standard error) = population standard deviation / sqrt(sample size)

Central Limit Theorem:

If the sample size is large enough (greater than 30 is a common rule of thumb), the sampling distribution of the mean is approximately normal, even when the population itself is not normally distributed.
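A small simulation makes these properties concrete. The sketch draws repeated samples from a clearly non-normal population, an exponential distribution with mean 1 and standard deviation 1, and looks at the resulting sample means. The population, sample size, number of samples and seed are all arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)    # fixed seed so the run is repeatable

sample_size = 40   # size of each sample
n_samples = 5_000  # number of repeated samples

# Each entry is the mean of one sample from an exponential population.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(sample_size)) / sample_size
    for _ in range(n_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
expected_se = 1.0 / sqrt(sample_size)  # population sd / sqrt(sample size)

print(f"mean of sample means: {mean_of_means:.3f} (population mean 1.0)")
print(f"sd of sample means:   {sd_of_means:.3f} (expected {expected_se:.3f})")
```

A histogram of `sample_means` would look close to a bell shape even though the population is heavily right skewed.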

11. Confidence Interval

A confidence interval is a range of values that is likely to contain the population parameter at a stated confidence level, such as 95%. It is an interval estimate rather than a point estimate, so it also conveys how uncertain the estimate of the population parameter is.
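One common way to build such an interval for a mean is the z-interval: sample mean ± z * standard error. The sketch below assumes that the population standard deviation is known and that the sampling distribution of the mean is approximately normal; the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample summary.
sample_mean = 50.0
population_sd = 8.0  # assumed known
n = 64
confidence = 0.95

# Critical value that leaves (1 - confidence) / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

standard_error = population_sd / sqrt(n)
margin = z_crit * standard_error
ci = (sample_mean - margin, sample_mean + margin)

print(f"z = {z_crit:.2f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With these numbers the standard error is 1, so the 95% interval is roughly 50 ± 1.96. When the population standard deviation is unknown and the sample is small, a t-based interval is the usual alternative.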

12. Hypothesis Testing

Hypothesis testing is used to decide whether an assumption about a population parameter should be rejected or not, based on sample data.

Null Hypothesis: The status quo.
Alternate Hypothesis: Challenges the status quo.

Status quo means the accepted norm.
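As one possible illustration, the sketch below runs a two-sided one-sample z-test. The null hypothesis says the population mean is 100 and the alternate hypothesis says it is not. The sample figures, the known population standard deviation and the 5% significance level are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 100.0          # population mean under the null hypothesis
sample_mean = 103.0   # hypothetical sample result
population_sd = 15.0  # assumed known
n = 100
alpha = 0.05          # significance level

# Test statistic: how many standard errors the sample mean is from mu_0.
z = (sample_mean - mu_0) / (population_sd / sqrt(n))

# Two-sided p-value: probability of a result at least this extreme
# if the null hypothesis were true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject null: {reject_null}")
```

Here z is 2.0 and the p-value is about 0.046, which is just below 0.05, so the null hypothesis is rejected at the 5% level.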

Conclusion:

I have covered some important terms in statistics and probability for machine learning. Thanks for reading, and I hope you like it.


My other blog on statistics.

https://pub.towardsai.net/inferential-statistics-for-data-science-91cf4e0692b1



If you would like to read more of my tutorials, follow me on Medium, LinkedIn, Twitter.

Become a Medium Member by Clicking here: https://indhumathychelliah.medium.com/membership
