Skill - Selecting a subset of a DataFrame based on row / column positions using the ‘iloc’ indexer
Skills Required
- Setup python development environment
- Basic Printing in Python
- Commenting in Python
- Managing Variables in python
- Pandas DataFrame Basics
Please make sure you have all the skills listed above before working through the code below. Go through them again if you need a reference or a revision.
Pandas is a python library.
DataFrame is a data structure provided by the pandas library.
Please go through Pandas DataFrame Basics to learn the basics of pandas DataFrame.
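In this post, we will learn how to select a subset of a DataFrame using the iloc indexer. iloc selects rows and columns by their integer positions. Positions start at 0 and the end of a slice is excluded, so 11:28 covers positions 11 to 27, which are the 12th to 28th rows.
Instructions to run the code below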
- Create a folder and place the csv file used in this post from here
- Open the folder in Visual Studio Code
- Create and work on python files in this folder
The csv file should look like the image below
- Suppose that, for a DataFrame df, we want a subset with the 2nd to 5th columns and the 12th to 28th rows. We can use
  df.iloc[11:28, 1:5]
- If we want all rows but only the 5th to 9th columns, we can use
  df.iloc[:, 4:9]
- If we want all columns but only the 45th to 64th rows, we can use
  df.iloc[44:64, :]
- If we want all rows but only the columns at positions 1, 5 and 8 (the 2nd, 6th and 9th columns), we can use
  df.iloc[:, [1,5,8]]
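To check how the positions map to rows and columns without needing the csv file, here is a minimal sketch on a made-up DataFrame. The 70 x 10 size and the column names col0 to col9 are assumptions chosen only for illustration; they are not taken from the file used in this post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the csv data: 70 rows x 10 columns of integers
data = np.arange(700).reshape(70, 10)
df = pd.DataFrame(data, columns=[f"col{i}" for i in range(10)])

# 12th to 28th rows and 2nd to 5th columns
subset = df.iloc[11:28, 1:5]
print(subset.shape)                        # (17, 4)
print(list(subset.columns))                # ['col1', 'col2', 'col3', 'col4']
print(subset.index[0], subset.index[-1])   # 11 27

# columns at positions 1, 5 and 8, i.e. the 2nd, 6th and 9th columns
picked = df.iloc[:, [1, 5, 8]]
print(list(picked.columns))                # ['col1', 'col5', 'col8']
```

The slice 11:28 returns 17 rows because position 28 itself is left out, and the list [1, 5, 8] picks exactly those three positions.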
Example
import pandas as pd
# create DataFrame from csv
df = pd.read_csv('gen_schedules.csv')
# get 2nd to 5th columns and 12th to 28th rows
df1 = df.iloc[11:28, 1:5]
print(df1)
# get all rows but only 5th to 9th columns
df2 = df.iloc[:, 4:9]
print(df2)
# get all columns but only 45th to 64th rows
df3 = df.iloc[44:64, :]
print(df3)
# get all rows but only the columns at positions 1, 5 and 8 (2nd, 6th and 9th columns)
df4 = df.iloc[:, [1,5,8]]
print(df4)
A similar indexer is loc, but it uses index labels and column names instead of integer positions to get a subset of a DataFrame.
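As a rough comparison of the two, the sketch below selects the same subset once by position and once by label. The row labels 'a' to 'd' and the column names unit, mw and rate are hypothetical and are not from the csv file used in this post.

```python
import pandas as pd

# Hypothetical data: row labels and column names are made up for illustration
df = pd.DataFrame(
    {
        "unit": ["U1", "U2", "U3", "U4"],
        "mw": [100, 150, 200, 250],
        "rate": [2.5, 3.0, 3.5, 4.0],
    },
    index=["a", "b", "c", "d"],
)

# iloc: rows at positions 1 and 2, columns at positions 0 and 1 (end excluded)
by_position = df.iloc[1:3, 0:2]

# loc: rows labelled 'b' to 'c', columns 'unit' to 'mw' (both ends included)
by_label = df.loc["b":"c", "unit":"mw"]

print(by_position.equals(by_label))  # True
```

Note that a label slice with loc includes its end label, while a position slice with iloc excludes its end position.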
Video
Video for this post can be found here
Online Interpreter
Although we recommend practicing the above examples in Visual Studio Code, you can also run them online at https://www.tutorialspoint.com/execute_python_online.php
References
- Official tutorial - https://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/03_subset_data.html#min-tut-03-subset