NumPy – Descriptive Statistics

Descriptive statistics helps us summarize and understand data in a simple way.

It provides insights such as:

Central tendency (mean, median, mode)
Data spread (variance, standard deviation)
Distribution shape
Minimum and maximum values

Using NumPy, we can easily compute descriptive statistics on large datasets with fast and efficient functions.

What is Descriptive Statistics?

Descriptive statistics is the process of summarizing raw data into meaningful information.

It answers questions like:

What is the average value?
How spread out is the data?
What is the range?
Are there extreme values?

Why Is Descriptive Statistics Important?

It is used in:

Data analysis
Machine learning
Business intelligence
Finance
Healthcare
Research and surveys

Import NumPy


import numpy as np

1. Mean (Central Tendency)


import numpy as np

data = [10, 20, 30, 40, 50]

mean_value = np.mean(data)

print(mean_value)

Output


30.0

Explanation

The mean is the sum of all values divided by the number of values: (10 + 20 + 30 + 40 + 50) / 5 = 30.
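
For multi-dimensional data, the mean can also be taken along one axis. The sketch below assumes a small, made-up 2-D array of scores (the array and variable names are illustrative only) and uses the axis parameter of np.mean():

```python
import numpy as np

# Hypothetical scores: 2 students (rows) x 3 tests (columns)
scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

overall_mean = np.mean(scores)          # mean of all six values
column_means = np.mean(scores, axis=0)  # one mean per column (per test)
row_means = np.mean(scores, axis=1)     # one mean per row (per student)

print(overall_mean)   # 35.0
print(column_means)   # [25. 35. 45.]
print(row_means)      # [20. 50.]
```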

2. Median (Middle Value)


import numpy as np

data = [10, 20, 30, 40, 50]

median_value = np.median(data)

print(median_value)

Output


30.0

Explanation

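The median is the middle value of the sorted data. It is more robust than the mean when the data contains outliers, because a few extreme values do not shift it.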
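
To see the effect of an outlier, the following sketch replaces the last value of the sample data with an assumed extreme value of 1000 and compares the mean with the median:

```python
import numpy as np

# Same data as above, but the last value is replaced by an outlier
with_outlier = [10, 20, 30, 40, 1000]

outlier_mean = np.mean(with_outlier)      # pulled up by the outlier
outlier_median = np.median(with_outlier)  # still the middle value

print(outlier_mean)    # 220.0
print(outlier_median)  # 30.0
```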

3. Standard Deviation (Data Spread)


import numpy as np

data = [10, 20, 30, 40, 50]

std_value = np.std(data)

print(std_value)

Output


14.142135623730951

Explanation

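Standard deviation shows how much the values typically vary from the mean. By default, np.std() computes the population standard deviation (ddof=0); pass ddof=1 to get the sample standard deviation.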
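
np.std() divides by N by default (population standard deviation). When the data is a sample from a larger population, dividing by N - 1 is often preferred. A minimal comparison using the ddof parameter:

```python
import numpy as np

data = [10, 20, 30, 40, 50]

population_std = np.std(data)      # ddof=0: divide by N
sample_std = np.std(data, ddof=1)  # ddof=1: divide by N - 1

print(round(population_std, 3))  # 14.142
print(round(sample_std, 3))      # 15.811
```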

4. Variance (Dispersion Measure)


import numpy as np

data = [10, 20, 30, 40, 50]

var_value = np.var(data)

print(var_value)

Output


200.0

Explanation

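Variance is the average of the squared deviations from the mean, which makes it the square of the standard deviation. Like np.std(), np.var() uses ddof=0 by default.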
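
As a check on what np.var() computes with its default settings, the variance can be worked out by hand as the mean of the squared deviations. This is a sketch for illustration; the variable names are not part of NumPy:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

deviations = data - data.mean()         # distance of each value from the mean
manual_var = np.mean(deviations ** 2)   # average squared deviation

print(manual_var)                                      # 200.0
print(np.isclose(manual_var, np.var(data)))            # True
print(np.isclose(np.sqrt(manual_var), np.std(data)))   # True
```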

5. Minimum and Maximum


import numpy as np

data = [10, 20, 30, 40, 50]

print("Min:", np.min(data))
print("Max:", np.max(data))

Output


Min: 10
Max: 50

Explanation

Minimum = smallest value
Maximum = largest value
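
It is often useful to know where the extreme values sit, not only what they are. The sketch below uses np.argmin() and np.argmax() on an assumed, unsorted array to get their positions:

```python
import numpy as np

# Unsorted example data (illustrative)
values = np.array([30, 10, 50, 20, 40])

min_index = np.argmin(values)  # position of the smallest value
max_index = np.argmax(values)  # position of the largest value

print(min_index, values[min_index])  # 1 10
print(max_index, values[max_index])  # 2 50
```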

6. Range of Data


import numpy as np

data = [10, 20, 30, 40, 50]

data_range = np.max(data) - np.min(data)

print(data_range)

Output


40

Explanation

Range is the difference between the highest and lowest values: 50 - 10 = 40.
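
NumPy also offers np.ptp() ("peak to peak"), which returns the same max-minus-min difference in a single call. A short sketch:

```python
import numpy as np

data = [10, 20, 30, 40, 50]

manual_range = np.max(data) - np.min(data)
ptp_range = np.ptp(data)  # peak-to-peak: max - min

print(ptp_range)  # 40
```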

7. Percentiles (Data Distribution)


import numpy as np

data = [10, 20, 30, 40, 50]

print(np.percentile(data, 25))
print(np.percentile(data, 50))
print(np.percentile(data, 75))

Output


20.0
30.0
40.0

Explanation

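A percentile is the value below which a given percentage of the data falls. The 25th, 50th, and 75th percentiles are the quartiles, which split the sorted data into four equal parts:

25% = first quartile (Q1)
50% = median (Q2)
75% = third quartile (Q3)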
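
np.percentile() also accepts a list of percentiles, so the three values can be computed in one call. The sketch below additionally derives the interquartile range (IQR), a common spread measure that is not covered above:

```python
import numpy as np

data = [10, 20, 30, 40, 50]

# Request several percentiles at once; the result is an array
q1, q2, q3 = np.percentile(data, [25, 50, 75])

iqr = q3 - q1  # interquartile range: spread of the middle 50% of the data

print(q1, q2, q3)  # 20.0 30.0 40.0
print(iqr)         # 20.0
```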

8. Sum and Cumulative Sum


import numpy as np

data = [1, 2, 3, 4, 5]

print(np.sum(data))
print(np.cumsum(data))

Output


15
[ 1  3  6 10 15]

Explanation

Sum = total of all values
Cumulative sum = running total after each element
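
One use of the cumulative sum is a running mean: divide each running total by the number of values seen so far. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

running_total = np.cumsum(data)         # [1, 3, 6, 10, 15]
counts = np.arange(1, len(data) + 1)    # [1, 2, 3, 4, 5]
running_mean = running_total / counts   # mean of the first k values

print(running_mean.tolist())  # [1.0, 1.5, 2.0, 2.5, 3.0]
```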

9. Correlation (Relationship Between Data)


import numpy as np

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print(np.corrcoef(x, y))

Output


[[1. 1.]
 [1. 1.]]

Explanation

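np.corrcoef() returns a correlation matrix. The diagonal entries are always 1 (each variable compared with itself), and the off-diagonal entries hold the Pearson correlation between x and y. A value close to 1 means a strong positive linear relationship, a value close to -1 means a strong negative one, and a value close to 0 means little or no linear relationship.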
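
In practice, a single coefficient is usually more convenient than the full matrix. The sketch below picks out the off-diagonal entry and uses an assumed decreasing series to show a negative correlation:

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y_decreasing = [10, 8, 6, 4, 2]  # falls as x rises (illustrative data)

corr_matrix = np.corrcoef(x, y_decreasing)
r = corr_matrix[0, 1]  # correlation between x and y_decreasing

print(round(r, 6))  # -1.0
```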

10. Data Distribution Overview (Histogram)


import numpy as np

data = np.random.randn(1000)

hist, bins = np.histogram(data, bins=10)

print(hist)

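Explanation

np.histogram() counts how many values fall into each range (bin). It returns the counts (hist) and the bin edges (bins); with bins=10 there are 10 counts and 11 edges. Because the data is random, the counts differ on every run, so no fixed output is shown.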
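
For reproducible results, the random data can come from a seeded generator. The sketch below assumes np.random.default_rng() with an arbitrary seed of 42 and checks two properties that hold for any input: the counts add up to the number of values, and there is one more bin edge than there are bins.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seed chosen arbitrarily
data = rng.standard_normal(1000)      # 1000 draws from a standard normal

hist, bin_edges = np.histogram(data, bins=10)

print(hist.sum())       # 1000 (every value lands in exactly one bin)
print(len(bin_edges))   # 11 (10 bins need 11 edges)
```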

Real-World Applications

1. Data Science

Data summarization
Feature analysis
Data cleaning

2. Machine Learning

Feature scaling
Model evaluation
Data preprocessing

3. Finance

Risk analysis
Market trends
Portfolio evaluation

4. Healthcare

Patient data analysis
Medical research
Diagnosis insights

5. Business Analytics

Sales analysis
Customer behavior
Performance tracking

Common NumPy Descriptive Functions

Function	Purpose
np.mean()	Average
np.median()	Middle value
np.std()	Standard deviation
np.var()	Variance
np.min()	Minimum value
np.max()	Maximum value
np.percentile()	Value at a given percentile
np.corrcoef()	Correlation
np.sum()	Total
np.cumsum()	Running total
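
The functions in the table can be combined into one small helper. The summarize() function below is a hypothetical convenience wrapper written for this tutorial, not part of NumPy; it assumes a one-dimensional sequence of numbers.

```python
import numpy as np

def summarize(values):
    """Return common descriptive statistics for a 1-D sequence of numbers."""
    arr = np.asarray(values, dtype=float)
    return {
        "count": int(arr.size),
        "mean": float(np.mean(arr)),
        "median": float(np.median(arr)),
        "std": float(np.std(arr)),  # population std (ddof=0)
        "var": float(np.var(arr)),  # population variance (ddof=0)
        "min": float(np.min(arr)),
        "max": float(np.max(arr)),
    }

stats = summarize([10, 20, 30, 40, 50])
for name, value in stats.items():
    print(f"{name}: {value}")
```

Returning a plain dictionary keeps the result easy to print, log, or pass on to other tools.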

Why Use NumPy for Descriptive Statistics?

Using NumPy provides:

Fast processing of large datasets
Simple statistical functions
High-performance computation
Easy integration with data tools

Combined with the wider Python ecosystem, it provides a powerful environment for data analysis and scientific computing.

Summary

NumPy's descriptive statistics functions include:


np.mean()
np.median()
np.std()
np.var()
np.min()
np.max()
np.percentile()
np.corrcoef()

These functions help summarize and understand data effectively.

Conclusion

Descriptive statistics is the foundation of data analysis. NumPy provides powerful tools to quickly compute and analyze data summaries, helping developers, analysts, and scientists make better decisions based on data.

Posted by: Roger John Williams
