What Is read_csv() in Pandas?

read_csv() is a Pandas function used to read CSV (Comma-Separated Values) files and load them into a DataFrame. CSV files are one of the most common formats for storing tabular data, and Pandas provides powerful options to handle them efficiently.

Why Use read_csv()?

Using read_csv() allows you to:

  • Load large datasets easily

  • Automatically create a DataFrame

  • Handle headers, separators, and encodings

  • Manage missing values

  • Select specific columns or rows

  • Preprocess data while reading

Basic Syntax

import pandas as pd

df = pd.read_csv("data.csv")

This reads the CSV file and stores the data in a DataFrame called df.

Simple Example

Assume a file students.csv:

Name,Age,City
Alice,25,Delhi
Bob,30,Mumbai
Charlie,35,Chennai

Python
import pandas as pd

df = pd.read_csv("students.csv")
print(df)

# Output:
#       Name  Age     City
# 0    Alice   25    Delhi
# 1      Bob   30   Mumbai
# 2  Charlie   35  Chennai

Reading CSV Without Header

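If the CSV file does not contain column headers, pass header=None so that Pandas does not treat the first record as the header; the columns are then labelled 0, 1, 2, and so on. The example below applies header=None to students.csv, which does have a header line, so that line shows up as the first data row: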

Python
import pandas as pd

df = pd.read_csv("students.csv", header=None)
print(df)

# Output:
#          0    1        2
# 0     Name  Age     City
# 1    Alice   25    Delhi
# 2      Bob   30   Mumbai
# 3  Charlie   35  Chennai

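For a file with no header line, you can assign column names manually (if the file already has a header line that you want to replace, pass header=0 instead of header=None):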

df = pd.read_csv("students.csv", header=None, names=["Name", "Age", "City"])
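As a minimal sketch of how these options differ, the snippet below reads small in-memory samples three ways. It uses io.StringIO in place of a file on disk so that it is self-contained; the sample rows and the replacement column names are illustrative assumptions, not part of the original example.

```python
import io

import pandas as pd

# Headerless data: every line is a record.
no_header = "Alice,25,Delhi\nBob,30,Mumbai\n"

# Default behaviour promotes the first record to the header, losing a row.
wrong = pd.read_csv(io.StringIO(no_header))

# header=None keeps every line as data; names= supplies the labels.
right = pd.read_csv(
    io.StringIO(no_header), header=None, names=["Name", "Age", "City"]
)

# For data that already has a header line, header=0 plus names= replaces it.
with_header = "Name,Age,City\nAlice,25,Delhi\nBob,30,Mumbai\n"
renamed = pd.read_csv(
    io.StringIO(with_header), header=0, names=["student", "age", "city"]
)

print(len(wrong), list(wrong.columns))      # 1 ['Alice', '25', 'Delhi']
print(len(right), list(right.columns))      # 2 ['Name', 'Age', 'City']
print(len(renamed), list(renamed.columns))  # 2 ['student', 'age', 'city']
```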

Custom Separator

CSV files may use separators other than commas (such as ; or |).

df = pd.read_csv("data.csv", sep=";")
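To see why the separator matters, here is a small sketch that reads the same semicolon-delimited text with and without sep=";". The sample text is made up for illustration, and io.StringIO stands in for a real file.

```python
import io

import pandas as pd

semicolon_data = "Name;Age;City\nAlice;25;Delhi\nBob;30;Mumbai\n"

# With the default separator (a comma), each line ends up in a single column.
one_column = pd.read_csv(io.StringIO(semicolon_data))

# With sep=";" the three fields are split as intended.
df = pd.read_csv(io.StringIO(semicolon_data), sep=";")

print(one_column.shape)  # (2, 1)
print(df.shape)          # (2, 3)
```

A DataFrame with a single, oddly named column is often the first sign that the separator is wrong.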

Reading Specific Columns

Python
import pandas as pd

df = pd.read_csv("students.csv", usecols=["Name", "Age"])
print(df)

# Output:
#       Name  Age
# 0    Alice   25
# 1      Bob   30
# 2  Charlie   35
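One detail worth knowing is that usecols selects columns but does not reorder them. The sketch below, which uses an in-memory copy of the students data as an assumption for self-containment, shows selection by name and by position, and one way to get a specific order afterwards.

```python
import io

import pandas as pd

csv_text = "Name,Age,City\nAlice,25,Delhi\nBob,30,Mumbai\nCharlie,35,Chennai\n"

# Selection by name: the result keeps the file's column order, not the list's.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["City", "Name"])

# Selection by zero-based position works as well.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

# To get a specific order, index the DataFrame after reading.
reordered = by_name[["City", "Name"]]

print(list(by_name.columns))      # ['Name', 'City']
print(list(by_position.columns))  # ['Name', 'City']
print(list(reordered.columns))    # ['City', 'Name']
```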

Handling Missing Values

df = pd.read_csv("data.csv", na_values=["NA", "null", "--"])

Pandas converts the listed strings to NaN, in addition to the markers it already treats as missing by default (such as empty fields, NA, and null).
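The effect of na_values can be checked with a small sketch like the one below. The Score column and its values are invented for illustration, and io.StringIO replaces a file on disk.

```python
import io

import pandas as pd

data = "Name,Score\nAlice,90\nBob,--\nCharlie,NA\nDana,\n"

# Without na_values, "--" stays a string and the column cannot be numeric.
plain = pd.read_csv(io.StringIO(data))

# With na_values, "--" joins the default markers ("NA", empty field) as NaN.
df = pd.read_csv(io.StringIO(data), na_values=["--"])

print(plain["Score"].isna().tolist())  # [False, False, True, True]
print(df["Score"].isna().tolist())     # [False, True, True, True]
print(df["Score"].dtype)               # float64
```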

Skipping Rows

df = pd.read_csv("data.csv", skiprows=2)

This skips the first 2 lines of the file, so the third line is read as the header. It is useful when files contain metadata or comments at the top.
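A minimal sketch of this situation, assuming a file whose first two lines are notes rather than data (the note text is made up):

```python
import io

import pandas as pd

data = (
    "# student export\n"
    "# values are illustrative\n"
    "Name,Age\n"
    "Alice,25\n"
    "Bob,30\n"
)

# An integer skips that many lines from the top of the file.
df = pd.read_csv(io.StringIO(data), skiprows=2)

# A list skips specific zero-based line numbers instead.
same = pd.read_csv(io.StringIO(data), skiprows=[0, 1])

print(list(df.columns))  # ['Name', 'Age']
print(len(df))           # 2
print(df.equals(same))   # True
```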

Limiting Rows

df = pd.read_csv("data.csv", nrows=5)

Reads only the first 5 data rows; the header line is not counted.
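As a quick illustration, the sketch below builds a 100-row CSV in memory (the id and value columns are hypothetical) and previews only the first five rows.

```python
import io

import pandas as pd

# Build a header line plus 100 data rows.
lines = ["id,value"] + [f"{i},{i * 10}" for i in range(100)]
data = "\n".join(lines)

# nrows=5 stops parsing after five data rows.
preview = pd.read_csv(io.StringIO(data), nrows=5)

print(preview.shape)           # (5, 2)
print(preview["id"].tolist())  # [0, 1, 2, 3, 4]
```

This is a cheap way to inspect column names and types before loading a large file in full.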

Encoding Issues

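read_csv() decodes files as UTF-8 by default. For a file saved in another encoding, pass its name through the encoding parameter. The call below spells out the default explicitly: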

df = pd.read_csv("data.csv", encoding="utf-8")

Common encodings:

  • utf-8 (the default)

  • latin1

  • ISO-8859-1 (the same encoding as latin1, under its official name)
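The sketch below shows what an encoding mismatch typically looks like. It encodes a small sample as Latin-1 bytes (the names are illustrative) and reads them from io.BytesIO, first with the default encoding and then with encoding="latin1". The failed read is expected to raise UnicodeDecodeError, which the code catches.

```python
import io

import pandas as pd

text = "Name,City\nJosé,São Paulo\n"
raw = text.encode("latin1")  # bytes as they would be stored in a Latin-1 file

# Reading Latin-1 bytes as UTF-8 is expected to fail on the accented letters.
try:
    pd.read_csv(io.BytesIO(raw))
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# Naming the actual encoding lets the text decode correctly.
df = pd.read_csv(io.BytesIO(raw), encoding="latin1")

print(decoded_as_utf8)
print(df.loc[0, "Name"], "/", df.loc[0, "City"])
```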

Checking the Loaded Data

print(df.head())  # First 5 rows
print(df.tail())  # Last 5 rows
df.info()         # Structure and data types (prints directly, returns None)

Important read_csv() Parameters

Parameter           Description
filepath_or_buffer  Path to CSV file
sep                 Column separator
header              Row number for column names
names               Custom column names
usecols             Select specific columns
skiprows            Skip rows
nrows               Limit number of rows
encoding            File encoding
na_values           Define missing values
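These parameters can be combined in a single call. The following sketch is one possible combination, using made-up sample data with a comment line, a semicolon separator, and "--" as a missing-value marker:

```python
import io

import pandas as pd

raw = (
    "# student export\n"
    "Name;Age;City;Notes\n"
    "Alice;25;Delhi;--\n"
    "Bob;--;Mumbai;ok\n"
    "Charlie;35;Chennai;ok\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",                         # fields are separated by semicolons
    skiprows=1,                      # skip the comment line at the top
    usecols=["Name", "Age", "City"], # drop the Notes column
    na_values=["--"],                # treat "--" as missing
    nrows=2,                         # read only the first two data rows
)

print(df.shape)                  # (2, 3)
print(df["Age"].isna().tolist()) # [False, True]
```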

Key Points to Remember

  • read_csv() loads CSV files into DataFrames

  • Highly customizable through parameters

  • Handles missing values automatically

  • Supports large datasets; usecols and nrows limit how much is loaded

  • One of the most commonly used Pandas I/O functions