Analyse your data with python using pandas

learn how to use pandas package to clean up or analyze data using python

Posted Feb 22, 2025 Updated Apr 4, 2025

By Vivek Patel

2 min read

Analyse your data with python using pandas

Overview

In this tutorial, I’ll show you how to use pandas Library to manipulate your tabular data
Pandas Library is open source and free to use and my favorite library to manipulate tabular data due it it’s simple API
In this tutorial I am going to focus on
- Basic Data manipulation
- Reading writing data from and to excel file

Setup

use Pip install pandas to install pandas package
use import pandas as pd to import pandas package
i am also going to use csv file and excel file which contain column data from below table
you can copy paste data from this table and create your own version

ID	BREATH	DEPTH
C1	750	400
C2	900	400
C3	1050	400
C4	830	630
C5	1285	400
C6	600	600
C7	1200	400
C8	800	400
C9	900	600
C10	750	300

Create Dataframe

List

  
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=[
                  "i", "ii", "iii"], columns=["A", "B", "C"])
print(df)

  
df = pd.DataFrame(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12],
        [13, 14, 15],
        [16, 17, 18],
        [19, 20, 21],
        [22, 23, 24],
        [25, 26, 27],
        [28, 29, 30]
    ],
    index=["i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix", "x"],
    columns=["A", "B", "C"]
)

CSV File

  
df = pd.read_csv('Column Data.csv')

Excel file

  
df = pd.read_excel('Column Data.xlsx')

if you have multiple sheets then specify sheet name

  
df = pd.read_excel('Column Data.xlsx', sheet_name="Data")

Manipulation Dataframe

Print Data

print first/last 5 rows

  
print(df.head())
print(df.tail())

if you want to print only first/last 2 rows

  
print(df.head(2))
print(df.tail(2))

print only rowIndex or column Index this is usefull to create dropdowns

  
print(df.index)
print(df.columns)
print(df["B"])

print specific row

  
print(df.loc[0:2])

print specific value based on row and column index

  
print(df.iat[0, 0])
print(df.iat[2, 1])
print(df.loc["i"]["A"])
# use Row and Column Name
print(df.loc["v", "B"])

you can also use this to update specific values

  
df.loc["i"]["A"] = 100
print(df.loc["i"]["A"])

Sort Data

sort values by specific column

  
print(df.sort_values(by=["A"]))
print(df.sort_values(by=["A"], ascending=False))

Filter using multiple columns

  
print(df.sort_values(by=["A", "B"]))

Filter Data

filter by specific value

  
print(df[df["B"] > 10])
print(df[df["DEPTH"] == 600])
print(df[df["DEPTH"] > 600])

filter but only show specific columns

  
print(df[df["B"] > 10, ["A", "C"]])

filter by multiple conditions

  
print(df[(df["BREATH"] > 600) & (df["DEPTH"] > 400)])

Write data to file

CSV file

  
df.to_csv("data.csv", index=True)

Excel file

  
df.to_excel("data.xlsx", index=True)

Conclusion

pandas is a very useful library to manipulate tabular data
It’s best usecase is when you combine with other libraries/framework like
- xlWings to automate excel
- JupyterNotebook to create interactive notebooks
- Streamlit to create interactive web apps
- matplotlib or plotly to create interactive graphs

If you have any questions or want to discuss something : Join our comment section

Python, Python-Libraries

This post is licensed under CC BY-NC-ND 4.0 by the author.

Overview

Setup

Create Dataframe

List

CSV File

Excel file

Manipulation Dataframe

Print Data

Sort Data

Filter Data

Write data to file

CSV file

Excel file

Conclusion

Trending Tags