What Are NumPy Statistics Functions?
NumPy statistics functions are built-in functions used to analyze numerical data stored in arrays. They help summarize data by calculating central tendency, spread, and distribution values. These functions are fast, accurate, and commonly used in data analysis, scientific computing, and machine learning.
Why Use NumPy for Statistics?
NumPy statistical functions:
- Work efficiently on large datasets
- Support multi-dimensional arrays
- Allow axis-wise calculations (per row or per column)
- Are typically faster than equivalent pure-Python loops, because the work runs in compiled code
- Integrate easily with data science workflows
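As a rough illustration of the axis-wise point above, the sketch below computes a mean over a small 2-D array three ways. The array `scores` and its values are made up for the example; `axis=0` collapses the rows (one result per column) and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

# Hypothetical 2-D data: 2 rows x 3 columns (values invented for illustration).
scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

overall_mean = np.mean(scores)       # mean of all six values
col_means = np.mean(scores, axis=0)  # collapse rows: one mean per column
row_means = np.mean(scores, axis=1)  # collapse columns: one mean per row

print(overall_mean)  # 3.5
print(col_means)     # [2.5 3.5 4.5]
print(row_means)     # [2. 5.]
```

The same `axis` argument is accepted by most of the functions covered in this section, such as `np.sum`, `np.median`, `np.var`, and `np.std`.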
Measures of Central Tendency
Mean
Calculates the average value.
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.mean(data))
    # Output:
    # 30.0

Median
Returns the middle value when data is sorted.
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.median(data))
    # Output:
    # 30.0

Sum
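Adds all elements of the array. The sum is a total rather than a measure of central tendency, but it is the numerator of the mean.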
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.sum(data))
    # Output:
    # 150

Measures of Dispersion
Minimum and Maximum
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.min(data))
    print(np.max(data))
    # Output:
    # 10
    # 50

Range (Max − Min)
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.ptp(data))
    # Output:
    # 40

Variance
Measures how far values are spread from the mean.
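A minimal sketch of variance on the same sample array used elsewhere in this section. By default `np.var` computes the population variance (dividing by n); passing `ddof=1` gives the sample variance (dividing by n - 1). The variable names are chosen for illustration.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Population variance (default, ddof=0): mean of squared deviations from the mean.
pop_var = np.var(data)

# Sample variance (ddof=1): divides by n - 1 instead of n.
sample_var = np.var(data, ddof=1)

print(pop_var)     # 200.0
print(sample_var)  # 250.0
```

Which one is appropriate depends on whether the array is the whole population or a sample drawn from it.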
Standard Deviation
Square root of variance.
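The relationship to variance can be checked directly. The sketch below assumes the same sample array as the other examples; `np.std` takes the same `ddof` argument as `np.var`.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

pop_std = np.std(data)             # square root of the population variance
sample_std = np.std(data, ddof=1)  # square root of the sample variance

# Confirm that the standard deviation is the square root of the variance.
matches_variance = np.isclose(pop_std, np.sqrt(np.var(data)))

print(round(float(pop_std), 4))     # 14.1421
print(round(float(sample_std), 4))  # 15.8114
print(bool(matches_variance))       # True
```

Because it is in the same units as the data, the standard deviation is usually easier to interpret than the variance.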
Percentiles & Quantiles
Percentile
    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.percentile(data, 25))  # second argument: percentile, from 0 to 100
    # Output:
    # 20.0

Returns the value below which 25% of the data lies.
Quantile
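    import numpy as np

    data = np.array([10, 20, 30, 40, 50])
    print(np.quantile(data, 0.5))  # second argument: quantile, from 0 to 1
    # Output:
    # 30.0

Quantiles express the same idea as percentiles, but the cut point is given as a fraction between 0 and 1 rather than a percentage between 0 and 100.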
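To make the correspondence concrete, the sketch below asks for the three quartiles both ways. Both functions accept a sequence of cut points as the second argument; the variable names are illustrative.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Quartiles requested as fractions (0 to 1) and as percentages (0 to 100).
quartiles_q = np.quantile(data, [0.25, 0.5, 0.75])
quartiles_p = np.percentile(data, [25, 50, 75])

print(quartiles_q)                            # [20. 30. 40.]
print(np.allclose(quartiles_q, quartiles_p))  # True
```

These values use NumPy's default linear interpolation between data points; other interpolation methods can give different results when the cut point falls between two values.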
Other Useful Statistical Functions
| Function | Purpose |
|---|---|
| `np.argmin()` | Index of minimum value |
| `np.argmax()` | Index of maximum value |
| `np.cumsum()` | Cumulative sum |
| `np.cumprod()` | Cumulative product |
| `np.average()` | Weighted average |
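A short sketch of the functions in the table, run on the same sample array. The `weights` values are invented for illustration; results are converted with `tolist()` so the printed output does not depend on NumPy's array formatting.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

min_index = np.argmin(data)      # position of the smallest value
max_index = np.argmax(data)      # position of the largest value
running_sum = np.cumsum(data)    # running total
running_prod = np.cumprod(data)  # running product

# Hypothetical weights: the last value counts six times as much as the others.
weights = np.array([1, 1, 1, 1, 6])
weighted_avg = np.average(data, weights=weights)

print(min_index, max_index)   # 0 4
print(running_sum.tolist())   # [10, 30, 60, 100, 150]
print(running_prod.tolist())  # [10, 200, 6000, 240000, 12000000]
print(weighted_avg)           # 40.0
```

Without the `weights` argument, `np.average` returns the same result as `np.mean`.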