NumPy – File Formats Supported
When working with data in Python, storing and sharing datasets efficiently is very important.
NumPy supports multiple file formats to save and load data depending on the use case.
Each format has its own advantages in speed, readability, and compatibility.
1. NumPy Binary Format (.npy)
The .npy format is NumPy’s native binary format for storing a single array, including its shape and dtype.
Save Example
import numpy as np
data = np.array([1, 2, 3, 4, 5])
np.save("data.npy", data)
Load Example
import numpy as np
data = np.load("data.npy")
print(data)
Features:
- Typically the fastest of the formats covered here to read and write
- Stores a single array per file
- Efficient binary storage
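As a minimal sketch of what "native" means in practice, the round trip below checks that a .npy file restores the dtype and shape along with the values. The file name matrix.npy and the sample array are illustrative assumptions, and a temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

# Arbitrary illustration data: a 2x3 array stored as 32-bit floats.
original = np.arange(6, dtype=np.float32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "matrix.npy")  # hypothetical file name
    np.save(path, original)
    restored = np.load(path)

print(restored.dtype, restored.shape)      # float32 (2, 3)
print(np.array_equal(original, restored))  # True
```

For arrays too large to fit comfortably in memory, np.load also accepts a mmap_mode argument (for example mmap_mode="r") that maps the file instead of reading it all at once.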
2. NumPy Archive Format (.npz)
The .npz format is used to store multiple arrays in one file.
Save Example
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.savez("data.npz", arr1=a, arr2=b)
Load Example
import numpy as np
data = np.load("data.npz")
print(data["arr1"])
print(data["arr2"])
Features:
- Stores multiple arrays
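- Zip archive of .npy files; np.savez stores it uncompressed, while np.savez_compressed applies compression
- Convenient for bundling related arrays, such as the features and labels of a machine learning dataset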
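Note that np.savez does not compress its contents; compression requires np.savez_compressed. The sketch below, which assumes two made-up arrays named features and labels, writes both variants, compares their sizes on disk, and lists the stored array names through the files attribute.

```python
import os
import tempfile

import numpy as np

# Hypothetical dataset: highly repetitive values compress well.
features = np.zeros((100, 10))
labels = np.arange(100)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "plain.npz")
    packed_path = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain_path, features=features, labels=labels)
    np.savez_compressed(packed_path, features=features, labels=labels)

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # The archive is opened lazily, so close it once the arrays are read.
    with np.load(packed_path) as archive:
        names = sorted(archive.files)
        restored_labels = archive["labels"]

print(names)  # ['features', 'labels']
print(f"uncompressed: {plain_size} bytes, compressed: {packed_size} bytes")
```

How much space compression saves depends heavily on the data; random floating-point values, for instance, shrink far less than the zeros used here.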
3. Text Format (.txt)
The .txt format stores data in a human-readable form.
Save Example
import numpy as np
data = np.array([[1, 2, 3],
                 [4, 5, 6]])
np.savetxt("data.txt", data)
Load Example
import numpy as np
data = np.loadtxt("data.txt")
print(data)
Features:
- Human-readable
- Slower and larger on disk than the binary formats
- Easy to inspect manually
- Does not store the dtype; np.loadtxt returns floats unless told otherwise
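By default, np.savetxt writes every value in scientific notation (format "%.18e") and np.loadtxt reads values back as float64, so integer data does not round-trip as integers unless you ask for it. The following sketch shows one way to control this with the fmt, header, and dtype parameters; the column names in the header are made up for illustration.

```python
import os
import tempfile

import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")

    # fmt="%d" writes plain integers; the header is written as a "# " comment line.
    np.savetxt(path, data, fmt="%d", header="x y z")  # hypothetical column names

    with open(path) as handle:
        text = handle.read()

    as_float = np.loadtxt(path)           # comment lines are skipped by default
    as_int = np.loadtxt(path, dtype=int)  # request integers explicitly

print(text)
print(as_float.dtype)  # float64
print(as_int.dtype)
```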
4. CSV Format (.csv)
CSV is one of the most widely used formats for data exchange.
Save Example
import numpy as np
data = np.array([[10, 20, 30],
                 [40, 50, 60]])
np.savetxt("data.csv", data, delimiter=",")
Load Example
import numpy as np
data = np.loadtxt("data.csv", delimiter=",")
print(data)
Features:
- Can be opened in Excel and imported by most databases
- Widely used in data science
- Easy integration with other tools
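CSV files exchanged with other tools usually carry a header row of column names. A minimal sketch of handling that with NumPy alone is shown below: comments="" stops np.savetxt from prefixing the header with "# ", and skiprows=1 tells np.loadtxt to ignore that row. The column names a, b, c are placeholders.

```python
import os
import tempfile

import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")

    # Write a plain header row (no "# " prefix) followed by integer values.
    np.savetxt(path, data, delimiter=",", fmt="%d",
               header="a,b,c", comments="")  # hypothetical column names

    with open(path) as handle:
        first_line = handle.readline().strip()

    # Skip the header row when loading the numeric values.
    loaded = np.loadtxt(path, delimiter=",", skiprows=1)

print(first_line)  # a,b,c
print(loaded)
```

For files with missing values or mixed column types, np.genfromtxt or a dedicated library such as pandas is generally a better fit than np.loadtxt.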
5. Comparison of File Formats
| Format | Type | Speed | Human-readable | Use Case |
|---|---|---|---|---|
| .npy | Binary | Fastest | No | Single array storage |
| .npz | Binary | Fast | No | Multiple arrays in one file |
| .txt | Text | Slow | Yes | Small data inspected by hand |
| .csv | Text | Slow | Yes | Data exchange with other tools |
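The relative speeds in the table are rules of thumb, and the actual numbers depend on the data, the disk, and the NumPy version. A rough way to check them on your own machine is sketched below; the array size and the helper name timed are arbitrary choices made for this illustration, not a rigorous benchmark.

```python
import os
import tempfile
import time

import numpy as np

rng = np.random.default_rng(0)
data = rng.random((1000, 10))  # arbitrary test array of random floats


def timed(save):
    """Run a zero-argument save function and return elapsed seconds."""
    start = time.perf_counter()
    save()
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "data.npy")
    csv_path = os.path.join(tmp_dir, "data.csv")

    npy_seconds = timed(lambda: np.save(npy_path, data))
    csv_seconds = timed(lambda: np.savetxt(csv_path, data, delimiter=","))

    npy_bytes = os.path.getsize(npy_path)
    csv_bytes = os.path.getsize(csv_path)

print(f".npy: {npy_seconds:.4f} s, {npy_bytes} bytes")
print(f".csv: {csv_seconds:.4f} s, {csv_bytes} bytes")
```

With the default text format each float64 value takes roughly 25 characters instead of 8 bytes, which is why the CSV file comes out several times larger than the .npy file.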
Real-World Applications
1. Data Science
- Dataset storage
- Preprocessing pipelines
2. Machine Learning
- Model input/output storage
- Feature saving
3. Engineering
- Simulation data storage
- Sensor logs
4. Analytics
- Reporting systems
- Data sharing
Why NumPy File Formats Matter
Using NumPy file formats provides:
- High-performance storage
- Easy data serialization
- Flexible format options
- Seamless integration with Python workflows
These properties make NumPy's file formats a practical building block for data engineering and machine learning pipelines.
Summary
NumPy supports multiple file formats:
- .npy
- .npz
- .txt
- .csv
Each format serves a different data storage need.
Conclusion
Understanding NumPy file formats is essential for efficient data handling in data science and machine learning. Choosing the right format improves performance, compatibility, and workflow efficiency.

