NumPy – Permutations and Shuffling
In data science and machine learning, we often need to randomize data order or generate different arrangements of data.
NumPy provides two powerful tools for this:
- np.random.permutation() → returns a new shuffled copy
- np.random.shuffle() → shuffles in-place
These are widely used in:
- Machine learning
- Data preprocessing
- Simulations
- Statistical sampling
What is Permutation?
In NumPy, permutation means:
Creating a randomly rearranged copy of the data while leaving the original unchanged.
What is Shuffling?
Shuffling means:
Randomly reordering the elements of the array itself, so the original order is overwritten.
Import NumPy
import numpy as np
1. Permutation Example (New Array)
import numpy as np
A = np.array([1, 2, 3, 4, 5])
result = np.random.permutation(A)
print("Original:", A)
print("Permutation:", result)
Output (example):
Original: [1 2 3 4 5]
Permutation: [3 1 5 2 4]
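The output above changes on every run. When a result needs to be repeatable, one option is a seeded generator. The sketch below assumes NumPy's newer Generator interface (np.random.default_rng), which NumPy's documentation recommends for new code. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

A = np.array([1, 2, 3, 4, 5])

perm_a = rng_a.permutation(A)  # new array; A is left untouched
perm_b = rng_b.permutation(A)

print("Original:", A)
print("Same seed, same order:", np.array_equal(perm_a, perm_b))
```

The generator also has a shuffle() method that works in place, so the same seeding approach applies to both operations.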
2. Shuffle Example (In-place)
import numpy as np
A = np.array([10, 20, 30, 40, 50])
np.random.shuffle(A)
print(A)
Output (example):
[40 10 50 20 30]
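A common slip is to assign the result of shuffle() back to the variable. Because the function works in place, it returns None, and the assignment would throw the array away. A minimal sketch of what the call actually returns (the name returned is only for illustration):

```python
import numpy as np

A = np.array([10, 20, 30, 40, 50])

returned = np.random.shuffle(A)  # reorders A itself

print(returned)  # None: there is no new array to return
print(A)         # A now holds the shuffled order
```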
3. Difference Between Permutation and Shuffle
import numpy as np
A = np.array([1, 2, 3, 4])
perm = np.random.permutation(A)  # shuffled copy; A is still [1 2 3 4]
np.random.shuffle(A)  # reorders A itself and returns None
print("Permutation:", perm)
print("Shuffled:", A)
Key Difference:
| Feature | permutation | shuffle |
|---|---|---|
| Return value | New array | None (the same array is modified) |
| Original data | Unchanged | Changed |
| Usage | Safe random copy | Fast in-place shuffle |
4. Permutation of Range
import numpy as np
print(np.random.permutation(10))
Output (example):
[7 2 9 1 5 0 3 8 4 6]
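Passing an integer n gives a shuffled version of the numbers 0 to n-1, which makes the result usable as a set of random indices. One use is shuffling two arrays in step so that paired rows stay together. The sketch below assumes the Generator interface; the arrays X and y and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily

X = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])  # hypothetical features
y = np.array([10, 20, 30, 40])                  # hypothetical labels; y[i] belongs to X[i]

idx = rng.permutation(len(X))  # the indices 0..3 in random order

# Indexing both arrays with the same idx keeps each pair aligned.
X_shuffled = X[idx]
y_shuffled = y[idx]

print(X_shuffled)
print(y_shuffled)
```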
5. Shuffling 2D Arrays
import numpy as np
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
np.random.shuffle(A)
print(A)
Note:
shuffle() works along the first axis only. Whole rows change position, but the values inside each row keep their original order.
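If the values inside each row need to be shuffled as well, the Generator interface offers permuted(), which takes an axis argument. This is a sketch that assumes a NumPy version recent enough to include Generator.permuted (1.20 or later); the seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

rows_shuffled = rng.permutation(A)     # reorders whole rows (first axis)
within_rows = rng.permuted(A, axis=1)  # shuffles each row independently

print(rows_shuffled)
print(within_rows)
```

Both calls return new arrays, so A itself is unchanged.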
6. Permutation for Data Splitting
import numpy as np
data = np.array([10, 20, 30, 40, 50])
shuffled = np.random.permutation(data)
train = shuffled[:3]
test = shuffled[3:]
print("Train:", train)
print("Test:", test)
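The split above hard-codes the cut at index 3. A slightly more general sketch wraps the same idea in a function that takes a test fraction and an optional seed. The helper split_train_test and its parameters are hypothetical names, not part of NumPy.

```python
import numpy as np

def split_train_test(data, test_fraction=0.4, seed=None):
    """Hypothetical helper: shuffle a copy of `data`, then slice it in two."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(data)  # the input array is not modified
    n_test = int(round(len(data) * test_fraction))
    n_train = len(data) - n_test
    return shuffled[:n_train], shuffled[n_train:]

data = np.array([10, 20, 30, 40, 50])
train_part, test_part = split_train_test(data, test_fraction=0.4, seed=0)

print("Train:", train_part)
print("Test:", test_part)
```

With five items and a test fraction of 0.4, three items go to the training part and two to the test part.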
Real-World Applications
1. Machine Learning
- Train/test data shuffling
- Random batch generation
2. Data Science
- Data randomization
- Sampling datasets
3. Statistics
- Monte Carlo simulations
- Random experiments
4. Gaming
- Random events
- Level generation
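As an illustration of the random batch generation mentioned above, here is a minimal sketch that shuffles the indices once and then walks through them in fixed-size chunks. The function iter_batches is a hypothetical helper, and the batch size and seed are arbitrary.

```python
import numpy as np

def iter_batches(data, batch_size, rng):
    """Hypothetical helper: yield shuffled batches that cover every item once."""
    idx = rng.permutation(len(data))
    for start in range(0, len(data), batch_size):
        yield data[idx[start:start + batch_size]]

rng = np.random.default_rng(7)
data = np.arange(10)

batches = list(iter_batches(data, batch_size=4, rng=rng))
for batch in batches:
    print(batch)
```

The last batch is shorter when the data size is not a multiple of the batch size; here the batches have 4, 4, and 2 items.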
Why Use NumPy Random Tools?
Using NumPy provides:
- Fast array operations
- Efficient randomization
- Easy data manipulation
- Scalable performance
Because NumPy arrays are used throughout Python's data science libraries, these tools fit directly into AI and data science workflows.
Summary
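NumPy provides two key functions:
- np.random.permutation() returns a shuffled copy and leaves the original unchanged
- np.random.shuffle() reorders the array in place and returns None
Both randomize data efficiently; choose based on whether the original order must be kept.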
Conclusion
Permutation and shuffling are essential techniques for data preparation in machine learning and statistics. NumPy makes them fast, simple, and powerful for real-world applications.

