NumPy Statistical Functions Explained – Python Mean, Median, Std, Variance Tutorial

NumPy – Statistical Functions 

Statistics plays a key role in data analysis, machine learning, and scientific computing. It helps us understand data distribution, trends, and variability.

NumPy provides simple, fast functions for running statistical calculations on large datasets.


Why Statistical Functions Matter

Statistical functions help you:

  • Understand data behavior
  • Find central tendency
  • Measure variability
  • Detect outliers
  • Make predictions

Import NumPy

import numpy as np

1. Mean (Average)

The mean is the sum of all values divided by the number of values.

import numpy as np

data = [10, 20, 30, 40, 50]

mean_value = np.mean(data)

print(mean_value)

Output

30.0

Explanation

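The mean summarizes the dataset with a single central value. It is sensitive to extreme values, so one very large or very small number can shift it noticeably.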
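As a quick check of the definition above, the mean can be computed by hand and compared with np.mean(). This is only a small sketch; the variable names manual_mean and numpy_mean are chosen for illustration.

```python
import numpy as np

data = [10, 20, 30, 40, 50]

# Mean by definition: sum of the values divided by how many there are
manual_mean = sum(data) / len(data)

# np.mean() should agree with the manual calculation
numpy_mean = np.mean(data)

print(manual_mean)  # 30.0
print(numpy_mean)   # 30.0
```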


2. Median

The median is the middle value of the sorted data. When the number of values is even, it is the average of the two middle values.

import numpy as np

data = [10, 20, 30, 40, 50]

median_value = np.median(data)

print(median_value)

Output

30.0

Explanation

The median is useful when the data contains outliers, because extreme values do not shift it the way they shift the mean.
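To illustrate the point about outliers, the sketch below replaces the largest value with a made-up extreme value (5000 is a hypothetical number chosen for the example) and compares the mean with the median. It also shows the even-length case.

```python
import numpy as np

# Same data as above, but the last value is replaced by a
# hypothetical outlier
data_with_outlier = [10, 20, 30, 40, 5000]

mean_value = np.mean(data_with_outlier)
median_value = np.median(data_with_outlier)

print(mean_value)    # 1020.0 (pulled far up by the outlier)
print(median_value)  # 30.0   (unchanged)

# With an even number of values, the median is the average
# of the two middle values (20 and 30 here)
even_median = np.median([10, 20, 30, 40])
print(even_median)   # 25.0
```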


3. Standard Deviation

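The standard deviation measures how spread out the data is around the mean. By default, np.std() computes the population standard deviation (it divides by N, i.e. ddof=0).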

import numpy as np

data = [10, 20, 30, 40, 50]

std_value = np.std(data)

print(std_value)

Output

14.142135623730951

Explanation

A higher standard deviation means the values vary more around the mean.
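np.std() divides by N unless told otherwise. When the data is a sample from a larger population, the sample standard deviation (dividing by N - 1) is often what is wanted; the ddof parameter controls this. A minimal sketch, with variable names chosen for illustration:

```python
import numpy as np

data = [10, 20, 30, 40, 50]

# Default: population standard deviation (divides by N, ddof=0)
population_std = np.std(data)

# Sample standard deviation (divides by N - 1)
sample_std = np.std(data, ddof=1)

print(population_std)  # 14.142135623730951
print(sample_std)      # roughly 15.81
```

The same ddof parameter is accepted by np.var().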


4. Variance

Variance is the average of the squared deviations from the mean. It is equal to the square of the standard deviation.

import numpy as np

data = [10, 20, 30, 40, 50]

var_value = np.var(data)

print(var_value)

Output

200.0

Explanation

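Variance measures data spread in squared units of the data, which is why the standard deviation (its square root) is usually easier to interpret.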
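The relationship between variance and standard deviation can be verified directly. The sketch below computes the variance by hand from the squared deviations and compares it with np.var() and with the square of np.std(); manual_var is an illustrative name.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Deviations from the mean: -20, -10, 0, 10, 20
deviations = data - np.mean(data)

# Variance by definition: mean of the squared deviations
manual_var = np.mean(deviations ** 2)

print(manual_var)                                    # 200.0
print(np.isclose(manual_var, np.var(data)))          # True
print(np.isclose(np.std(data) ** 2, np.var(data)))   # True
```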


5. Minimum and Maximum Values

import numpy as np

data = [10, 20, 30, 40, 50]

print(np.min(data))
print(np.max(data))

Output

10
50

Explanation

  • Min = smallest value
  • Max = largest value

6. Percentiles

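Percentiles divide sorted data into 100 equal parts: the p-th percentile is the value below which roughly p percent of the data falls. When a percentile falls between two data points, np.percentile() interpolates linearly between them by default.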

import numpy as np

data = [10, 20, 30, 40, 50]

print(np.percentile(data, 25))
print(np.percentile(data, 50))
print(np.percentile(data, 75))

Output

20.0
30.0
40.0

Explanation

  • 25th percentile = first quartile (about a quarter of the data lies below it)
  • 50th percentile = median
  • 75th percentile = third quartile (about three quarters of the data lies below it)
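Percentiles are also a common starting point for detecting outliers. The sketch below uses the conventional 1.5 × IQR rule, which is one widely used heuristic rather than the only option. The sample values, including the 500, are made up for illustration.

```python
import numpy as np

# Hypothetical sample with one suspiciously large value
values = np.array([10, 20, 30, 40, 50, 500])

# First and third quartiles in a single call
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1  # interquartile range

# Conventional fences: 1.5 * IQR beyond each quartile
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only the values that fall outside the fences
outliers = values[(values < lower) | (values > upper)]

print(q1, q3)    # 22.5 47.5
print(outliers)  # [500]
```

Note that q1 and q3 are not values from the dataset here; they are interpolated between neighboring data points.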

7. Range of Data

import numpy as np

data = [10, 20, 30, 40, 50]

data_range = np.max(data) - np.min(data)

print(data_range)

Output

40

Explanation

The range shows the spread between the smallest and largest values. Because it depends only on the two extremes, a single outlier can change it a lot.
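NumPy also offers np.ptp() ("peak to peak"), which returns the same max-minus-min value in one call. A short sketch comparing the two approaches:

```python
import numpy as np

data = [10, 20, 30, 40, 50]

# Range computed explicitly
manual_range = np.max(data) - np.min(data)

# Same result with the built-in peak-to-peak function
ptp_range = np.ptp(data)

print(manual_range)  # 40
print(ptp_range)     # 40
```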


8. Sum and Cumulative Sum

import numpy as np

data = [1, 2, 3, 4, 5]

print(np.sum(data))
print(np.cumsum(data))

Output

15
[ 1  3  6 10 15]

Explanation

  • Sum = total
  • Cumulative sum = running total
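One practical use of the running total is a running mean: dividing each cumulative sum by the number of values seen so far. This is a small sketch; the names running_total, counts, and running_mean are illustrative.

```python
import numpy as np

data = [1, 2, 3, 4, 5]

# Running total after each element: 1, 3, 6, 10, 15
running_total = np.cumsum(data)

# Number of elements seen so far: 1, 2, 3, 4, 5
counts = np.arange(1, len(data) + 1)

# Running mean after each element: 1.0, 1.5, 2.0, 2.5, 3.0
running_mean = running_total / counts

print(running_mean)
```

The last entry of running_mean equals np.mean(data).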

9. Correlation

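Correlation measures the strength and direction of the linear relationship between two datasets. np.corrcoef() returns the Pearson correlation coefficients as a matrix.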

import numpy as np

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

correlation = np.corrcoef(x, y)

print(correlation)

Output

[[1. 1.]
 [1. 1.]]

Explanation

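The result is a 2×2 matrix. The diagonal entries are always 1 (each dataset correlated with itself), and the off-diagonal entries hold the correlation between x and y. A value close to 1 means a strong positive linear relationship, a value close to -1 means a strong negative one, and a value near 0 means little or no linear relationship.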
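In practice, the single number of interest is usually the off-diagonal entry of the matrix. The sketch below extracts it with index [0, 1] and adds a made-up decreasing series (y_neg is a hypothetical name) to show a negative correlation.

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]       # increases with x
y_neg = [10, 8, 6, 4, 2]   # hypothetical series that decreases with x

# Entry [0, 1] of the matrix is the correlation between the two inputs
r_pos = np.corrcoef(x, y)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]

print(r_pos)  # approximately 1.0
print(r_neg)  # approximately -1.0
```

Because of floating-point rounding, the printed values may differ from exactly 1.0 and -1.0 in the last digits.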


10. Histogram Data Analysis

import numpy as np

data = np.random.randn(1000)

hist, bins = np.histogram(data, bins=10)

print(hist)

Explanation

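A histogram shows how the data is distributed. np.histogram() returns two arrays: hist holds the number of values that fall into each bin, and bins holds the bin edges (one more edge than there are bins). Because the data here is random, the printed counts differ from run to run.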
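To make the histogram reproducible, the random data can be drawn from a seeded generator. The sketch below uses np.random.default_rng with an arbitrary seed (42 is just an example value) and checks the shape of the returned arrays.

```python
import numpy as np

# Seeded generator so the same data is produced on every run
rng = np.random.default_rng(seed=42)
data = rng.standard_normal(1000)

hist, bins = np.histogram(data, bins=10)

# Every value lands in exactly one bin, so the counts add up to 1000
print(hist.sum())  # 1000

# 10 bins are delimited by 11 edges
print(len(hist))   # 10
print(len(bins))   # 11
```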


Real-World Applications

1. Data Science

  • Data analysis
  • Feature engineering
  • Data cleaning

2. Machine Learning

  • Model evaluation
  • Feature scaling
  • Data normalization

3. Finance

  • Risk analysis
  • Market trends
  • Portfolio management

4. Healthcare

  • Patient data analysis
  • Medical research
  • Diagnosis support

5. Engineering

  • Quality control
  • Signal analysis
  • System monitoring

Common NumPy Statistical Functions

Function           Purpose
np.mean()          Average
np.median()        Middle value
np.std()           Standard deviation
np.var()           Variance
np.min()           Minimum
np.max()           Maximum
np.percentile()    Percentiles
np.corrcoef()      Correlation

Why Use NumPy for Statistics?

Using NumPy provides:

  • Fast computation on large datasets
  • Easy syntax for complex operations
  • High-performance numerical processing
  • Integration with data science tools

Combined with the wider Python ecosystem, it becomes a powerful statistical computing environment.


Summary

NumPy statistical functions include:

np.mean()
np.median()
np.std()
np.var()
np.min()
np.max()
np.percentile()
np.corrcoef()

These functions help analyze and understand data efficiently.


Conclusion

Statistical functions are essential for understanding and interpreting data. NumPy provides a powerful and simple way to perform statistical analysis, making it a core tool for data science, machine learning, finance, and scientific research.



