NumPy Loading Arrays
In real-world data science projects, data is usually stored in external files.
Instead of entering that data by hand, you can use NumPy's functions to load arrays directly from files.
This makes data handling faster and less error-prone.
Why Is Loading Arrays Important?
- Work with real datasets
- Import CSV and text files
- Save time on manual input
- Handle large-scale data
- Essential for data science workflows
1. Loading Arrays from Text File (loadtxt)
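numpy.loadtxt() loads data from simple text files in which every row has the same number of values. By default it splits each line on whitespace and returns floating-point values.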
import numpy as np
data = np.loadtxt("data.txt")
print(data)
Example: Custom Delimiter
data = np.loadtxt("data.csv", delimiter=",")
print(data)
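The two calls above can be tried without creating any files. This is a minimal sketch that assumes small, made-up file contents and uses io.StringIO as a stand-in for a file on disk; the variable names are illustrative only.

```python
import io
import numpy as np

# Hypothetical file contents; io.StringIO stands in for a real file.
whitespace_text = "1 2 3\n4 5 6\n"
comma_text = "1,2,3\n4,5,6\n"

# Default behaviour: split each line on whitespace.
from_whitespace = np.loadtxt(io.StringIO(whitespace_text))

# Comma-separated values need an explicit delimiter.
from_commas = np.loadtxt(io.StringIO(comma_text), delimiter=",")

print(from_whitespace.dtype)  # float64
print(from_commas.shape)      # (2, 3)
```

Both calls produce the same 2x3 array of floats; only the separator in the input differs.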
2. Handling Missing Values (genfromtxt)
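genfromtxt() is more flexible than loadtxt() and can handle missing values. By default a missing field is read as nan; the filling_values argument replaces it with a value you choose.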
data = np.genfromtxt("data.csv", delimiter=",", filling_values=0)
print(data)
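To see what filling_values changes, it helps to load the same input twice. The sketch below assumes a small made-up CSV with two empty fields, again supplied through io.StringIO rather than a real file.

```python
import io
import numpy as np

# Hypothetical CSV content with two empty fields.
csv_text = "1,2,3\n4,,6\n7,8,\n"

# Without filling_values, empty fields become nan.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# With filling_values=0, the same fields become 0.0.
with_zero = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0)

print(np.isnan(with_nan).sum())  # number of missing fields
print(with_zero)
```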
3. Loading Specific Columns
The usecols argument selects which columns to read. Column indices start at 0, so (0, 2) reads the first and third columns.
data = np.loadtxt("data.csv", delimiter=",", usecols=(0, 2))
print(data)
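A short sketch of column selection, assuming a made-up three-column CSV held in memory. It also shows unpack=True, a loadtxt option that returns the selected columns as separate arrays.

```python
import io
import numpy as np

# Hypothetical three-column CSV content.
csv_text = "1,10,100\n2,20,200\n3,30,300\n"

# Keep only the first and third columns.
selected = np.loadtxt(io.StringIO(csv_text), delimiter=",", usecols=(0, 2))

# unpack=True transposes the result so each column can be assigned separately.
first, third = np.loadtxt(
    io.StringIO(csv_text), delimiter=",", usecols=(0, 2), unpack=True
)

print(selected.shape)  # (3, 2)
print(third)
```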
4. Loading Integer Data
The dtype argument sets the element type of the resulting array. Without it, loadtxt() returns floats.
data = np.loadtxt("numbers.txt", dtype=int)
print(data)
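The effect of dtype can be checked by loading the same text with and without it. This sketch assumes made-up whole-number input; the exact integer width (for example int64) depends on the platform, so the check below only tests that the type is an integer type.

```python
import io
import numpy as np

# Hypothetical file content containing whole numbers only.
text = "1 2 3\n4 5 6\n"

as_float = np.loadtxt(io.StringIO(text))
as_int = np.loadtxt(io.StringIO(text), dtype=int)

print(as_float.dtype)                            # float64
print(np.issubdtype(as_int.dtype, np.integer))   # True
```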
5. Saving and Loading NumPy Arrays (.npy)
NumPy can save an array in its own binary .npy format, which preserves the array's shape and dtype.
Save Array
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
np.save("array.npy", arr)
Load Array
loaded = np.load("array.npy")
print(loaded)
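A round trip makes the behaviour of .npy files easier to verify. The sketch below writes to a temporary directory so that it leaves no files behind; the array values are made up for illustration.

```python
import os
import tempfile
import numpy as np

arr = np.array([[1.5, 2.5], [3.5, 4.5]])

# Save and reload inside a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array.npy")
    np.save(path, arr)
    loaded = np.load(path)

# Shape, dtype, and values all survive the round trip.
print(loaded.shape == arr.shape)
print(loaded.dtype == arr.dtype)
print(np.array_equal(arr, loaded))
```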
6. Saving Multiple Arrays (.npz)
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.savez("data.npz", arr1=a, arr2=b)
Loading .npz
data = np.load("data.npz")
print(data["arr1"])
print(data["arr2"])
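The keyword names passed to np.savez() become the keys of the loaded archive, and the archive's files attribute lists them. The sketch below is one way to read everything back; it assumes a temporary directory and opens the archive in a with block so that the underlying file is closed afterwards.

```python
import os
import tempfile
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.npz")
    np.savez(path, arr1=a, arr2=b)

    # The archive is closed when the with block ends.
    with np.load(path) as archive:
        names = sorted(archive.files)
        arrays = {name: archive[name] for name in names}

print(names)           # ['arr1', 'arr2']
print(arrays["arr2"])
```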
7. Loading CSV Files
CSV files often start with a header row of column names. Passing skip_header=1 tells genfromtxt() to skip that first line.
data = np.genfromtxt("file.csv", delimiter=",", skip_header=1)
print(data)
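As a sketch of two ways to deal with a header row, the example below assumes a made-up CSV with the columns height and weight. The first call skips the header; the second uses names=True, a genfromtxt option that reads the header and returns a structured array whose fields can be accessed by column name.

```python
import io
import numpy as np

# Hypothetical CSV content with a header row.
csv_text = "height,weight\n1.70,65\n1.82,80\n"

# Option 1: skip the header and get a plain 2-D float array.
values = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Option 2: use the header as field names of a structured array.
table = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

print(values.shape)       # (2, 2)
print(table.dtype.names)  # ('height', 'weight')
print(table["weight"])
```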
8. Real-World Example: Student Data
data = np.loadtxt("students.csv", delimiter=",", dtype=str)
print(data)
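Loading with dtype=str keeps every field as text, so numeric columns have to be converted before any arithmetic. The sketch below assumes a hypothetical layout of students.csv with a name column and a score column; the names and scores are made up.

```python
import io
import numpy as np

# Hypothetical students.csv content: name, score.
csv_text = "Asha,82\nBen,91\nChloe,77\n"

students = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=str)

names = students[:, 0]
scores = students[:, 1].astype(float)  # convert the text column to numbers

top = names[scores.argmax()]
print(top)            # Ben
print(scores.mean())
```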
9. Real-World Example: Sales Data
sales = np.genfromtxt("sales.csv", delimiter=",", filling_values=0)
print(sales)
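Filling missing sales figures with 0 is convenient for totals but pulls averages down. The sketch below, which assumes made-up sales figures with two missing entries, compares that approach with leaving the gaps as nan and using np.nanmean() to ignore them.

```python
import io
import numpy as np

# Hypothetical sales.csv content; two figures are missing.
csv_text = "100,200,150\n120,,180\n90,210,\n"

filled = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0)
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# Row totals treat a missing figure as zero sales.
totals_per_row = filled.sum(axis=1)

# Column means: zeros count as data, nan values are skipped by nanmean.
mean_filled = filled.mean(axis=0)
mean_ignoring_missing = np.nanmean(raw, axis=0)

print(totals_per_row)
print(mean_filled)
print(mean_ignoring_missing)
```

Which of the two is appropriate depends on whether a missing entry really means zero sales or simply an unrecorded value.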
Common NumPy Loading and Saving Functions
| Function | Purpose |
|---|---|
| loadtxt() | Load simple text data |
| genfromtxt() | Load text data that may have missing values |
| load() | Load .npy and .npz files |
| save() | Save one array to a .npy file |
| savez() | Save multiple arrays to a .npz file |
Advantages of Loading Arrays
- Fast data import
- Supports multiple formats
- Handles large datasets
- Easy integration with ML pipelines
- Saves preprocessing time
Summary
NumPy provides functions for loading data from text files, CSV files, and its own binary formats (.npy and .npz). These functions are essential for working with real-world datasets in data science.
Conclusion
Loading arrays in NumPy is a crucial step in any data analysis workflow. With loadtxt() and genfromtxt() for text files, and save(), savez(), and load() for binary files, you can easily import and manage data in Python.