NumPy Union of Arrays
When working with datasets, you often need to combine values from multiple arrays while removing duplicates.
NumPy provides a powerful function called:
union1d()
This function returns the sorted, unique values that appear in either of the input arrays.
Union operations are commonly used in:
- Data analysis
- Data cleaning
- Machine learning
- Database operations
- Scientific computing
What is Union of Arrays?
The union of two arrays contains:
All unique elements that exist in either array.
For example:
Array A = [1, 2, 3, 4]
Array B = [3, 4, 5, 6]
Union = [1, 2, 3, 4, 5, 6]
Notice that duplicate values appear only once.
Why Use Array Union?
Array union helps you:
- Remove duplicate values
- Merge datasets
- Create unique lists
- Compare collections of data
- Simplify data preprocessing
NumPy Function for Union
NumPy provides:
np.union1d(ar1, ar2)
Syntax
np.union1d(ar1, ar2)
Parameters
| Parameter | Description |
|---|---|
| ar1 | First input array (any array-like object, such as a list) |
| ar2 | Second input array (any array-like object, such as a list) |
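Because the function takes exactly two inputs, combining three or more arrays needs an extra step. One common approach, sketched here with illustrative array names, is to apply union1d() repeatedly with functools.reduce from the standard library:

```python
from functools import reduce

import numpy as np

# Three illustrative arrays (names and values are made up for this sketch)
a = np.array([1, 3, 4, 3])
b = np.array([3, 1, 2, 1])
c = np.array([6, 3, 4, 2])

# reduce applies union1d pairwise: union1d(union1d(a, b), c)
merged = reduce(np.union1d, (a, b, c))
print(merged)  # [1 2 3 4 6]
```

For many arrays, concatenating them once and calling np.unique() on the result gives the same values and may be simpler.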
Return Value
Returns:
A sorted, one-dimensional array containing the unique values from both arrays. Inputs with more than one dimension are flattened first.
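The result is always one-dimensional, even when an input is not. A small sketch, using a made-up 2-D array, shows how a multi-dimensional input is flattened before the union is computed:

```python
import numpy as np

# A 2-D input (illustrative values) and a 1-D input
grid = np.array([[3, 1],
                 [2, 1]])
extra = np.array([5, 3])

result = np.union1d(grid, extra)
print(result)       # [1 2 3 5]
print(result.ndim)  # 1
```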
Basic Example
import numpy as np
a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])
result = np.union1d(a, b)
print(result)
Output
[1 2 3 4 5 6]
How Union Works
Step 1
Combine arrays:
[1, 2, 3, 4]
[3, 4, 5, 6]
Combined:
[1, 2, 3, 4, 3, 4, 5, 6]
Step 2
Remove duplicates:
[1, 2, 3, 4, 5, 6]
Step 3
Sort values:
[1, 2, 3, 4, 5, 6]
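The steps above can be reproduced by hand with np.concatenate() and np.unique(). This is a conceptual sketch rather than a statement about NumPy's internals; note that np.unique() removes duplicates and sorts in a single call, so steps 2 and 3 happen together:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])

# Step 1: combine both arrays into one
combined = np.concatenate((a, b))
print(combined)  # [1 2 3 4 3 4 5 6]

# Steps 2 and 3: np.unique drops duplicates and returns sorted values
manual = np.unique(combined)
print(manual)  # [1 2 3 4 5 6]

# The manual result matches what union1d returns
print(np.array_equal(manual, np.union1d(a, b)))  # True
```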
Example with Duplicate Values
import numpy as np
a = np.array([10, 20, 20, 30])
b = np.array([20, 30, 40, 40])
print(np.union1d(a, b))
Output
[10 20 30 40]
Example with Strings
import numpy as np
a = np.array(["Python", "NumPy"])
b = np.array(["NumPy", "Pandas"])
print(np.union1d(a, b))
Output
['NumPy' 'Pandas' 'Python']
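String values are compared character by character using their code points, so uppercase letters sort before lowercase ones and "Apple" and "apple" count as different values. The sketch below uses made-up fruit names to show this, and lowercases the inputs with np.char.lower() as one possible way to get a case-insensitive union:

```python
import numpy as np

a = np.array(["banana", "Apple"])
b = np.array(["apple", "Cherry"])

# Case-sensitive: "Apple" and "apple" are distinct, capitals sort first
case_sensitive = np.union1d(a, b)
print(case_sensitive)  # ['Apple' 'Cherry' 'apple' 'banana']

# Case-insensitive: normalize to lowercase before taking the union
case_insensitive = np.union1d(np.char.lower(a), np.char.lower(b))
print(case_insensitive)  # ['apple' 'banana' 'cherry']
```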
Union of Arrays with No Common Values
import numpy as np
a = np.array([1, 3, 5, 7, 9])
b = np.array([2, 4, 6, 8, 10])
print(np.union1d(a, b))
Output
[ 1  2  3  4  5  6  7  8  9 10]
Union with Lists
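union1d() accepts any array-like input, so plain Python lists are converted to arrays automatically. The result is still a NumPy array, not a list.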
import numpy as np
a = [1, 2, 3]
b = [3, 4, 5]
print(np.union1d(a, b))
Output
[1 2 3 4 5]
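A quick check confirms the return type when lists are passed in. If a plain Python list is needed afterwards, tolist() converts the result back:

```python
import numpy as np

a = [1, 2, 3]
b = [3, 4, 5]

result = np.union1d(a, b)

# The inputs were lists, but the output is a NumPy array
print(type(result).__name__)  # ndarray

# Convert back to a regular Python list if required
print(result.tolist())  # [1, 2, 3, 4, 5]
```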
Practical Example: Student Enrollment
import numpy as np
class_a = np.array([101, 102, 103, 104])
class_b = np.array([103, 104, 105, 106])
all_students = np.union1d(class_a, class_b)
print(all_students)
Output
[101 102 103 104 105 106]
Practical Example: Product Categories
import numpy as np
electronics = np.array(["Laptop", "Phone"])
accessories = np.array(["Phone", "Mouse"])
products = np.union1d(
electronics,
accessories
)
print(products)
Output
['Laptop' 'Mouse' 'Phone']
Union vs Concatenate
| Feature | union1d() | concatenate() |
|---|---|---|
| Removes duplicates | Yes | No |
| Sorts output | Yes | No |
| Set operation | Yes | No |
| Preserves duplicates | No | Yes |
Example Comparison
concatenate()
import numpy as np
a = np.array([1, 2, 3])
b = np.array([3, 4, 5])
print(np.concatenate((a, b)))
Output:
[1 2 3 3 4 5]
union1d()
print(np.union1d(a, b))
Output:
[1 2 3 4 5]
Related NumPy Set Operations
| Function | Purpose |
|---|---|
| union1d() | Union |
| intersect1d() | Common elements |
| setdiff1d() | Difference |
| setxor1d() | Symmetric difference |
| unique() | Unique values |
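To see how these functions differ, the following sketch applies each of them to the same two arrays used earlier in this article:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])

union = np.union1d(a, b)          # values in a or b
common = np.intersect1d(a, b)     # values in both a and b
only_in_a = np.setdiff1d(a, b)    # values in a that are not in b
exclusive = np.setxor1d(a, b)     # values in exactly one of a or b
distinct = np.unique([1, 2, 2, 3])  # unique values of a single array

print(union)      # [1 2 3 4 5 6]
print(common)     # [3 4]
print(only_in_a)  # [1 2]
print(exclusive)  # [1 2 5 6]
print(distinct)   # [1 2 3]
```

Note that setdiff1d() depends on argument order: swapping a and b would return [5 6] instead.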
Real-World Applications
Array unions are used in:
- Customer databases
- Student management systems
- Product catalogs
- Data cleaning
- Machine learning preprocessing
- Inventory management
- Scientific datasets
Advantages of NumPy Union
- Removes duplicates automatically
- Fast and optimized
- Easy syntax
- Works with numeric and text data
- Ideal for large datasets
Summary
The NumPy union1d() function combines two arrays and returns a sorted, one-dimensional array containing only unique values. Together with intersect1d(), setdiff1d(), and setxor1d(), it covers the set operations most often needed for data preprocessing and analysis in Python projects.
Conclusion
Understanding array unions helps you efficiently merge datasets while eliminating duplicate values. Whether you're working in data science, machine learning, or analytics, union1d() is an essential NumPy function that simplifies data management and improves workflow efficiency.

