If you work with large datasets, you are probably familiar with CSV (comma-separated values) files. The name is self-explanatory: it is a file with data separated by commas. Handling CSV files has become an essential skill for anyone working with Python and data. The Python csv module provides an easy-to-use interface for reading, writing, and manipulating CSV files. These capabilities make it a powerful tool for data analysis, reporting, and automation.
Today at PythonCentral, let us take you through the instructions to efficiently work with CSV files in Python, covering reading, writing, appending, and advanced operations.
What Is a CSV File?
In simple terms, a CSV file is a plain text file that stores tabular data, typically one record per line, with values separated by commas. It is commonly used for data exchange between software applications.
CSV File Example
Here is what a CSV file looks like when you open it in a simple text editor:
name,age,email
Alfa,25,[email protected]
Bravo,30,[email protected]
Charlie,28,[email protected]
This is a simple CSV dataset, and we will use this same example throughout this article to keep things easy to follow.
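To follow along with the scripts in this article, you need a sample file on disk. Here is a minimal sketch for creating one. The example.com addresses are placeholders we made up for illustration, and the file is written to a temporary folder (our assumption) so that nothing in your working directory gets overwritten.

```python
import csv
import tempfile
from pathlib import Path

# Placeholder rows; the example.com addresses are invented for illustration.
rows = [
    ["name", "age", "email"],
    ["Alfa", 25, "alfa@example.com"],
    ["Bravo", 30, "bravo@example.com"],
    ["Charlie", 28, "charlie@example.com"],
]

# Write into a temporary folder so no existing file is overwritten.
sample_path = Path(tempfile.mkdtemp()) / "data.csv"
with open(sample_path, mode="w", newline="", encoding="utf-8") as file:
    csv.writer(file).writerows(rows)

# Read the raw text back to confirm what was written.
sample_text = sample_path.read_text(encoding="utf-8")
print(sample_text)
```

If you would rather have the file next to your scripts, swap the temporary path for 'data.csv'.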
How to Read CSV Files in Python
Python provides multiple ways to read CSV files, but the built-in csv module is the most common and simplest approach.
Reading a CSV File Using "csv.reader()"
Here is a sample Python script that reads a CSV file using the built-in csv.reader() function:
import csv

# newline='' lets the csv module handle line endings itself
with open('data.csv', mode='r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is printed as a list of strings
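The script above prints the header row along with the data. If you want to treat the header separately, one common approach is to pull it off with next() before looping. The sketch below uses an in-memory io.StringIO object as a stand-in for data.csv, and the email addresses are placeholders.

```python
import csv
import io

# In-memory stand-in for data.csv; the addresses are placeholders.
sample = io.StringIO(
    "name,age,email\n"
    "Alfa,25,alfa@example.com\n"
    "Bravo,30,bravo@example.com\n"
)

reader = csv.reader(sample)
header = next(reader)   # Consume the first row so the loop sees data only
records = list(reader)  # Remaining rows, each a list of strings

print(header)
print(records)
```

Note that every value comes back as a string, including the age column.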
Reading CSV as a Dictionary with csv.DictReader()
Now let us see a sample script that uses the csv.DictReader class:
import csv

with open('data.csv', mode='r', newline='') as file:
    reader = csv.DictReader(file)  # The first row supplies the keys
    for row in reader:
        print(row['name'], row['email'])  # Access specific columns by name
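Because csv.DictReader returns every value as a string, numeric columns usually need converting before you can compare or sum them. Here is a small sketch, again using an in-memory stand-in for data.csv with placeholder addresses, that converts age to an integer and finds the oldest person.

```python
import csv
import io

# In-memory stand-in for data.csv; the addresses are placeholders.
sample = io.StringIO(
    "name,age,email\n"
    "Alfa,25,alfa@example.com\n"
    "Bravo,30,bravo@example.com\n"
    "Charlie,28,charlie@example.com\n"
)

people = []
for row in csv.DictReader(sample):
    row["age"] = int(row["age"])  # Convert the string "25" to the integer 25
    people.append(row)

# With real integers, numeric comparisons behave as expected.
oldest = max(people, key=lambda person: person["age"])
print(oldest["name"])
```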
How to Create a CSV File with Python
You can create a CSV file and write data to it using the csv.writer() function.
How to Write a List to a CSV File
If you want to write a list of rows to a CSV file, here is how you can do it:
import csv

data = [
    ['name', 'age', 'email'],
    ['Alfa', 25, '[email protected]'],
    ['Bravo', 30, '[email protected]'],
]

with open('output.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)  # Writes multiple rows at once
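One thing worth knowing when writing rows is what happens if a value itself contains a comma. With its default settings, csv.writer wraps such a field in double quotes so it still reads back as a single column. The sketch below writes to an in-memory buffer instead of output.csv, and the note column is a made-up example.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "note"])
writer.writerow(["Alfa", "likes tea, coffee"])  # This field contains a comma

text = buffer.getvalue()
print(text)  # The second field is wrapped in double quotes

# Reading the text back recovers the original two-column row.
round_trip = list(csv.reader(io.StringIO(text)))
print(round_trip)
```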
How to Write a Dictionary to a CSV File with csv.DictWriter()
If you would like to write a list of dictionaries to a CSV file, the csv.DictWriter class helps you do that. Here is an example script:
import csv

fieldnames = ['name', 'age', 'email']
data = [
    {'name': 'Alfa', 'age': 25, 'email': '[email protected]'},
    {'name': 'Bravo', 'age': 30, 'email': '[email protected]'},
]

with open('output.csv', mode='w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()  # Writes the column headers
    writer.writerows(data)
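Real-world dictionaries do not always match the field list exactly. As a rough sketch, csv.DictWriter accepts a restval argument to fill in missing keys and an extrasaction argument to control what happens with unexpected keys. The team key and the "N/A" filler below are our own illustrative choices, and the addresses are placeholders.

```python
import csv
import io

fieldnames = ["name", "age", "email"]
records = [
    # Has an extra "team" key that is not in fieldnames.
    {"name": "Alfa", "age": 25, "email": "alfa@example.com", "team": "blue"},
    # Has no "email" key at all.
    {"name": "Bravo", "age": 30},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval="N/A",          # Written wherever a key is missing
    extrasaction="ignore",  # Silently drop keys not listed in fieldnames
)
writer.writeheader()
writer.writerows(records)

lines = buffer.getvalue().splitlines()
print(lines)
```

With the default extrasaction='raise', the first record would raise a ValueError instead of being written.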
How to Append Data to a CSV File
You will often want to append data to an existing CSV file, for example when continuously logging information. Opening the file in append mode ('a') adds rows to the end without touching the existing ones:
import csv

new_data = [['Charlie', 28, '[email protected]']]

with open('output.csv', mode='a', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(new_data)
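When logging, the file may not exist yet on the first run, in which case you probably want the header written exactly once. Here is one possible sketch. The append_row helper and the log.csv name are hypothetical and not part of the csv module, and the file lives in a temporary folder for the sake of the demo.

```python
import csv
import tempfile
from pathlib import Path


def append_row(path, fieldnames, row):
    """Append one dict row, writing the header first if the file is new or empty."""
    path = Path(path)
    needs_header = not path.exists() or path.stat().st_size == 0
    with open(path, mode="a", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)


# Hypothetical log file in a temporary folder.
log_path = Path(tempfile.mkdtemp()) / "log.csv"
fields = ["name", "age"]
append_row(log_path, fields, {"name": "Alfa", "age": 25})
append_row(log_path, fields, {"name": "Bravo", "age": 30})

log_lines = log_path.read_text(encoding="utf-8").splitlines()
print(log_lines)
```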
How to Handle Different Delimiters
By default, CSV files use commas, but sometimes other delimiters like tabs (\t) or semicolons (;) are used. When your CSV files contain other delimiters, here is how you can tackle them.
import csv

with open('data.tsv', mode='r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')  # See how we are specifying the delimiter?
    for row in reader:
        print(row)
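The delimiter argument works for writing as well as reading, which makes it easy to convert between formats. The following sketch reads semicolon-separated text and rewrites it as tab-separated text, using in-memory buffers and placeholder addresses rather than real files.

```python
import csv
import io

# Semicolon-separated input; the addresses are placeholders.
semicolon_text = (
    "name;age;email\n"
    "Alfa;25;alfa@example.com\n"
    "Bravo;30;bravo@example.com\n"
)

# Read with one delimiter...
rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))

# ...and write with another. lineterminator="\n" keeps the output easy to compare.
out = io.StringIO()
csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
tab_text = out.getvalue()
print(tab_text)
```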
How to Handle Large CSV Files Efficiently
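For large datasets, it is recommended to process files row by row to avoid loading everything into memory. csv.reader already reads lazily, so you can pass each row to your own function as it arrives:
import csv

def process(row):
    # Placeholder: replace with your own per-row logic
    print(row)

with open('large_data.csv', mode='r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        process(row)  # Only one row is held in memory at a time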
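To make the row-by-row idea concrete, here is a sketch of a streaming calculation that never builds a list of rows. The average_age function is a hypothetical example of per-row processing, and the three-row sample stands in for a much larger file.

```python
import csv
import io


def average_age(lines):
    """Compute the mean of the age column while holding one row at a time."""
    total = 0
    count = 0
    for row in csv.DictReader(lines):
        total += int(row["age"])
        count += 1
    return total / count if count else 0.0


# Tiny stand-in for a large file; any open file object works the same way.
sample = io.StringIO("name,age\nAlfa,25\nBravo,30\nCharlie,28\n")
result = average_age(sample)
print(round(result, 2))
```

Only the running total and count are kept in memory, so the same function should scale to files far larger than available RAM.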
How to Use Pandas for CSV Operations
For advanced CSV handling, Pandas is a powerful library that simplifies data manipulation.
How to Read CSV Files with Pandas
Here is how you can read a CSV file into a DataFrame with Pandas:
import pandas as pd

df = pd.read_csv('data.csv')
print(df.head())  # Displays the first 5 rows
How to Write to CSV with Pandas
Here is how you can use Python Pandas to write to a CSV file:
df.to_csv('output.csv', index=False) # Saves dataframe without index
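Putting reading and writing together, the sketch below loads only the columns it needs, filters the rows, and produces CSV text again. It assumes Pandas is installed, uses an in-memory stand-in for data.csv with placeholder addresses, and the age threshold of 26 is an arbitrary choice for illustration.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv; the addresses are placeholders.
sample = io.StringIO(
    "name,age,email\n"
    "Alfa,25,alfa@example.com\n"
    "Bravo,30,bravo@example.com\n"
    "Charlie,28,charlie@example.com\n"
)

# usecols limits which columns are loaded; dtype fixes the column type up front.
df = pd.read_csv(sample, usecols=["name", "age"], dtype={"age": "int64"})

# Keep only rows where age is above the (arbitrary) threshold.
over_26 = df[df["age"] > 26]

# With no path given, to_csv returns the CSV text as a string.
csv_text = over_26.to_csv(index=False)
print(csv_text)
```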
How to Handle Errors in CSV Processing
You will run into errors from time to time. Below are some common pitfalls, along with ways to handle them.
How to Handle Missing Values
If some rows are missing values, you can detect and skip them. This script assumes every complete row has three fields:
import csv

with open('data.csv', mode='r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        if len(row) < 3:
            print("Skipping incomplete row:", row)
        else:
            print(row)
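Counting fields catches truncated rows, but a row can also have the right number of fields with one of them left empty (for example Bravo,,email). One way to handle that case, sketched here with csv.DictReader, is to report the empty fields and store None in place of a missing age. The sample data and the choice of None as the default are assumptions for illustration.

```python
import csv
import io

# The second data row has an empty age field; the addresses are placeholders.
sample = io.StringIO(
    "name,age,email\n"
    "Alfa,25,alfa@example.com\n"
    "Bravo,,bravo@example.com\n"
)

cleaned = []
for row in csv.DictReader(sample):
    # Empty strings (and None, used for short rows) count as missing.
    missing = [key for key, value in row.items() if not value]
    if missing:
        print(f"Row for {row['name']} is missing: {missing}")
    # Convert age when present, otherwise fall back to None.
    row["age"] = int(row["age"]) if row["age"] else None
    cleaned.append(row)

print(cleaned)
```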
Handling File Not Found Error
If there is a chance that the file does not exist, wrap the code in a try/except block:
import csv

try:
    with open('non_existent.csv', mode='r', newline='') as file:
        reader = csv.reader(file)
        for row in reader:
            print(row)
except FileNotFoundError:
    print("Error: The file was not found.")
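A missing file is not the only thing that can go wrong. The file may use an unexpected encoding, or the csv module may raise csv.Error on malformed input. Here is a sketch of a helper that handles all three cases and returns an empty list on failure. The load_rows name and the UTF-8 assumption are ours, not something the csv module prescribes.

```python
import csv
import tempfile
from pathlib import Path


def load_rows(path):
    """Return all rows from a CSV file, or an empty list if it cannot be read."""
    try:
        # Assumes the file is UTF-8 encoded; adjust for your data.
        with open(path, mode="r", newline="", encoding="utf-8") as file:
            return list(csv.reader(file))
    except FileNotFoundError:
        print(f"Error: {path} was not found.")
    except UnicodeDecodeError:
        print(f"Error: {path} is not valid UTF-8 text.")
    except csv.Error as exc:
        print(f"Error: could not parse {path}: {exc}")
    return []


# A path that is guaranteed not to exist, inside a fresh temporary folder.
missing_path = Path(tempfile.mkdtemp()) / "non_existent.csv"
rows = load_rows(missing_path)
print(rows)
```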
Wrapping Up
The Python csv module provides flexible tools for reading, writing, and processing CSV files efficiently. Whether you’re working with small datasets or handling large-scale data processing, mastering CSV operations in Python has become essential rather than a nice-to-have.
By learning Python CSV operations, you can streamline data management, reporting, and automation tasks efficiently.