NumPy – Reading Data from Files
In real-world data science projects, data is often stored in external files such as:
- CSV files
- Text files
- Structured datasets
NumPy provides functions such as np.loadtxt and np.genfromtxt that load this data directly into arrays.
Why Is File Reading Important?
Instead of manually entering data, we can directly import datasets for:
- Machine learning
- Data analysis
- Statistical modeling
- Scientific computing
Import NumPy
import numpy as np
1. Reading Simple Text Files (np.loadtxt)
import numpy as np
data = np.loadtxt("data.txt")
print(data)
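Explanation:
- Reads numerical data from a text file into a NumPy array
- Splits each line on whitespace by default and returns floating-point values
- Assumes clean, structured data: every row must have the same number of values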
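To try this without creating a file, the sketch below passes an in-memory io.StringIO object in place of a file name (np.loadtxt accepts file-like objects as well as paths). The sample contents are made up for illustration and stand in for data.txt.

```python
import io
import numpy as np

# Stand-in for data.txt (hypothetical contents): whitespace-separated
# numbers with the same count on every row.
sample = io.StringIO("1 2 3\n4 5 6\n")

data = np.loadtxt(sample)

print(data)        # 2-D array with two rows and three columns
print(data.shape)  # (2, 3)
print(data.dtype)  # float64, the default dtype of loadtxt
```

Note that the integers in the text come back as floats unless a different dtype is requested.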
2. Reading CSV Files
import numpy as np
data = np.loadtxt("data.csv", delimiter=",")
print(data)
Meaning:
- delimiter="," tells loadtxt to split each line on commas instead of whitespace
- CSV is one of the most common formats for tabular datasets
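The effect of the delimiter argument can be seen in a small sketch. The CSV text is invented for illustration and is read from an in-memory io.StringIO object rather than from data.csv. The second call shows what happens when the delimiter is left out.

```python
import io
import numpy as np

# Hypothetical CSV contents standing in for data.csv.
csv_text = "1.5,2.0,3.25\n4.0,5.5,6.0\n"

# With delimiter=",", each line is split into three numeric fields.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")
print(data)

# Without a delimiter, loadtxt splits on whitespace, so the whole line
# "1.5,2.0,3.25" is treated as a single value and cannot be converted.
try:
    np.loadtxt(io.StringIO(csv_text))
    parsed_without_delimiter = True
except ValueError as err:
    parsed_without_delimiter = False
    print("loadtxt failed without delimiter:", err)
```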
3. Handling Missing Data (np.genfromtxt)
import numpy as np
data = np.genfromtxt("data_with_missing.csv", delimiter=",", filling_values=0)
print(data)
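Why use genfromtxt?
- Handles missing values: empty fields become nan by default, or the value passed as filling_values
- More flexible than loadtxt
- Useful for real-world datasets, which are rarely complete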
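As a minimal sketch of the difference filling_values makes, the example below reads the same text twice. The CSV contents are made up for illustration, with one empty field in the second row.

```python
import io
import numpy as np

# Hypothetical CSV contents: the middle field of the second row is empty.
csv_text = "1,2,3\n4,,6\n7,8,9\n"

# Default behaviour: the missing field is read as nan.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# With filling_values=0, the missing field is replaced by 0 instead.
with_zero = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0)

print(with_nan[1])   # second row, missing field shown as nan
print(with_zero[1])  # second row, missing field shown as 0
```

Whether 0 is a sensible replacement depends on the data. Keeping nan and handling it later (for example with np.nanmean) is often the safer choice.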
4. Reading Specific Columns
import numpy as np
data = np.loadtxt("data.csv", delimiter=",", usecols=(0, 2))
print(data)
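Meaning:
- Reads only the selected columns (here the first and third; column indexes start at 0)
- Saves memory because unselected columns are not stored in the result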
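The column selection can be checked on a small example. The three-column CSV below is invented for illustration. The second call adds unpack=True, which returns each selected column as its own array.

```python
import io
import numpy as np

# Hypothetical CSV contents with three columns.
csv_text = "1,10,100\n2,20,200\n3,30,300\n"

# Keep only columns 0 and 2; column 1 is left out of the result.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", usecols=(0, 2))
print(data.shape)  # (3, 2)

# unpack=True transposes the result so each column can be assigned
# to a separate variable.
ids, values = np.loadtxt(
    io.StringIO(csv_text), delimiter=",", usecols=(0, 2), unpack=True
)
print(ids)
print(values)
```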
5. Skipping Header Rows
import numpy as np
data = np.loadtxt("data.csv", delimiter=",", skiprows=1)
print(data)
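Use case:
- skiprows=1 skips the first line, typically the row of column names
- Loads only the numeric data; without it, loadtxt fails when it tries to convert the header text to numbers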
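A short sketch with a header row makes this concrete. The column names and values below are hypothetical.

```python
import io
import numpy as np

# Hypothetical CSV contents: one header line followed by numeric rows.
csv_text = "height,weight\n1.70,65\n1.82,80\n"

# skiprows=1 drops the header line before parsing begins.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(data.shape)  # (2, 2): the header is not part of the array
print(data)
```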
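6. Reading Large Datasets Efficiently
import numpy as np
# genfromtxt loads the whole file into memory; skip_header=1 skips the column names
data = np.genfromtxt("large_data.csv", delimiter=",", skip_header=1)
# Print only the first five rows instead of the entire array
print(data[:5])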
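When only a preview is needed, one option is to stop reading early with the max_rows parameter of genfromtxt instead of loading everything and slicing afterwards. This is a sketch under assumptions: the 1000-row dataset is generated in memory for illustration and stands in for large_data.csv.

```python
import io
import numpy as np

# Build a hypothetical dataset in memory: a header line plus 1000 rows.
lines = ["x,y"] + [f"{i},{i * 2}" for i in range(1000)]
big_csv = "\n".join(lines) + "\n"

# Read only the first five data rows after the header.
preview = np.genfromtxt(
    io.StringIO(big_csv), delimiter=",", skip_header=1, max_rows=5
)

print(preview.shape)  # (5, 2)
print(preview)
```

np.loadtxt accepts max_rows as well, so the same approach works when the file has no missing values.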
Real-World Applications
1. Data Science
- Dataset loading
- Data preprocessing
2. Machine Learning
- Training data import
- Feature extraction
3. Scientific Computing
- Experimental data reading
- Simulation results
4. Business Analytics
- CSV reports
- Financial datasets
Why Use NumPy File Reading?
Using NumPy provides:
- Fast data loading
- Efficient array conversion
- Easy handling of large datasets
- Seamless integration with analysis tools
Because the result is a regular NumPy array, it plugs directly into the rest of a Python data science workflow.
Summary
NumPy provides key functions for file reading:
np.loadtxt()
np.genfromtxt()
These functions make data importing simple and efficient.
Conclusion
Reading data from files is a core skill in data science. NumPy makes it fast, flexible, and powerful for working with real-world datasets.

