Hello Pandas lovers, in today’s article I will teach you loc and iloc in Pandas with the help of examples and explanations.
As data analysts and data engineers, we need to know loc and iloc because these two indexers are the main way to select and update data in a Pandas DataFrame or Series.
Throughout this article, we will cover multiple use cases of Pandas loc and iloc.
Make sure you have installed the Pandas library in your Python environment. If you have not installed Pandas, you can refer to our Pandas installation tutorial.
Now let’s start the tutorial.
Table of Contents
- 1 Master Loc and iLoc in Pandas DataFrame
- 1.1 Pandas Loc -> Label-Based Indexing
- 1.2 Pandas iLoc -> Integer-based Indexing
- 1.3 Difference Between Loc and iLoc in Pandas
- 1.4 Conclusion
Master Loc and iLoc in Pandas DataFrame
loc and iloc in Pandas are used to access and modify data within a DataFrame or Series. loc selects data by label, while iloc selects data by integer position.
To demonstrate Pandas loc and iloc, I have created a simple Pandas DataFrame, which you can see below.
import pandas as pd
data = {
    "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
    "age": [26, 25, 30, 33],
    "country": ["India", "India", "India", "USA"],
}
index = ['a', 'b', 'c', 'd']
df = pd.DataFrame(data, index=index)
print(df)
Output:
        name  age country
a  Vishvajit   26   India
b      Harsh   25   India
c       Sonu   30   India
d      Peter   33     USA
In the above output, name, age, and country are the column labels, and a, b, c, and d are the row labels.
Let’s get started with Pandas loc.
Pandas Loc -> Label-Based Indexing
Pandas loc is used for label-based indexing. This means you use row and column labels to select data from a Pandas DataFrame. When you slice with loc, both the start label and the end label are included in the result.
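To make the inclusive behavior concrete, here is a minimal sketch that rebuilds the sample DataFrame from this article and slices it by label on both axes. The variable name subset is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

# Label slices include both endpoints:
# rows 'b' through 'c' and columns 'name' through 'age'.
subset = df.loc["b":"c", "name":"age"]
print(subset)
```

Both 'c' and 'age' appear in the result, even though they are the end labels of the slices.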
Example Usages
Now, let’s see various use cases of Pandas loc with examples.
To select data from a Pandas DataFrame, use the Pandas loc syntax below.
Syntax:
df.loc[row_labels, column_labels]
Selecting a Single Row by Label
Pass a single row label to select a single row from the Pandas DataFrame.
row_b = df.loc['b']
print(row_b)
Output
name       Harsh
age           25
country    India
Name: b, dtype: object
Note: df.loc['b'] returns a Pandas Series because a single row label is passed. To get a DataFrame instead, pass a list of labels (e.g., df.loc[['b']] or df.loc[['a', 'b']]).
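The difference described in the note can be checked directly. This is a small sketch on the same sample data; row_series and row_frame are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

row_series = df.loc["b"]    # single label -> Series
row_frame = df.loc[["b"]]   # list with one label -> one-row DataFrame

print(type(row_series).__name__)
print(type(row_frame).__name__, row_frame.shape)
```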
Selecting Multiple Rows by Label
Pass multiple row labels as a Python list to select multiple rows.
# Select rows with labels 'a' and 'c'
rows_ac = df.loc[['a', 'c']]
print(rows_ac)
Selecting Specific Rows and Columns by Label
To select specific rows and columns, pass the row labels as the first argument and the column labels as the second argument.
# Select rows 'a' and 'b', and columns 'name' and 'country'
subset = df.loc[['a', 'b'], ['name', 'country']]
print(subset)
Output
        name country
a  Vishvajit   India
b      Harsh   India
In the above example, I selected rows a and b with a list, but you can also use label slicing to select a range of rows.
# Select rows from a to c, and columns 'name' and 'country'
subset = df.loc['a':'c', ['name', 'country']]
print(subset)
Selecting Specific Rows and Columns by Using Boolean Values
You can also pass a boolean list or array whose length matches the axis being selected. Rows and columns marked True are kept.
result = df.loc[[True, True, False, True], [True, True, False]]
print(result)
Output
        name  age
a  Vishvajit   26
b      Harsh   25
d      Peter   33
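Typing True and False by hand gets error-prone on larger DataFrames. One possible alternative, sketched here with Index.isin, is to build the boolean arrays from the labels themselves. The names row_mask and col_mask are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

# One boolean flag per row and one per column, derived from the labels.
row_mask = df.index.isin(["a", "b", "d"])
col_mask = df.columns.isin(["name", "age"])

result = df.loc[row_mask, col_mask]
print(result)
```

This selects the same rows and columns as the hand-written boolean lists above.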
Changing Value by Using Loc
You can also change values in a Pandas DataFrame by using Pandas loc.
Changing a Single Value
Change the value at row label 'b' and column 'age'. Column labels are case-sensitive, so the label must match 'age' exactly.
df.loc['b', 'age'] = 26
print(df)
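Label matching in loc is case-sensitive, and assigning to a column label that does not exist adds a new column instead of raising an error. The sketch below works on a copy (df_copy is an illustrative name) to show the effect of writing 'Age' instead of 'age'.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

df_copy = df.copy()

# 'Age' is not an existing column, so this adds a new column
# and leaves the original 'age' column untouched.
df_copy.loc["b", "Age"] = 26

print(df_copy.columns.tolist())
print(df_copy.loc["b", "age"])
```

The other rows of the new column are filled with missing values, so a typo in the label is easy to overlook.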
Changing Multiple Values
# Change age for rows 'b' and 'c'
df.loc[['b', 'c'], 'age'] = [26, 31]
print(df)
Change All Values in a Column
To update all values in the 'name' column:
df.loc[:, 'name'] = df['name'].str.upper()
print(df)
Note: df.loc[:, 'name'] selects all rows of the name column.
Conditional Selection
Selecting records whose age is 30 or greater
You can filter rows by passing a boolean condition to Pandas loc. For example, I want to select only those people whose age is 30 or greater. The outputs in this section assume the original DataFrame, before the updates made above.
condition = df['age'] >= 30
result = df.loc[condition]
print(result)
Output
    name  age country
c   Sonu   30   India
d  Peter   33     USA
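A condition can also be combined with other conditions and with a column selection in the same loc call. This is a minimal sketch on the original sample data; note that each condition needs its own parentheses when joined with &.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

# Rows where both conditions hold, restricted to two columns.
mask = (df["age"] >= 30) & (df["country"] == "India")
result = df.loc[mask, ["name", "age"]]
print(result)
```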
Selecting records whose name contains 'sh'
# Select people whose name contains 'sh'
condition = df['name'].str.contains('sh')
result = df.loc[condition]
print(result)
Output
        name  age country
a  Vishvajit   26   India
b      Harsh   25   India
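str.contains is case-sensitive by default, so the match above would find nothing if the names had been converted to uppercase. A small sketch of the case-insensitive variant, using the case=False parameter on an uppercased copy of the column (upper_names is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

upper_names = df["name"].str.upper()

# Case-sensitive search finds nothing in the uppercased names.
strict = upper_names.str.contains("sh")

# case=False ignores letter case, so 'SH' matches 'sh'.
relaxed = upper_names.str.contains("sh", case=False)

result = df.loc[relaxed]
print(result)
```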
Pandas iLoc -> Integer-based Indexing
Pandas iloc is used for integer-based indexing. This means you use integer positions, starting at 0, to access data. Unlike loc, an iloc slice excludes the end position.
Use the Pandas iloc syntax below to get data from a Pandas DataFrame.
df.iloc[row_indices, column_indices]
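As a quick sketch of how positions behave, the example below uses a negative position to count from the end and a slice whose end position is excluded, following normal Python list rules. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

last_row = df.iloc[-1]     # negative positions count from the end
first_two = df.iloc[0:2]   # positions 0 and 1; position 2 is excluded

print(last_row["name"])
print(first_two.index.tolist())
```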
Let’s see the use cases of Pandas iloc.
Selecting a Single Row by Index Position
# Select the row at index position 1
row_1 = df.iloc[1]
print(row_1)
Output
name       Harsh
age           25
country    India
Name: b, dtype: object
Selecting Multiple Rows by Index Position
You can pass multiple integer positions as a Python list.
# Select the rows at index positions 0 and 2
result = df.iloc[[0, 2]]
print(result)
Output
        name  age country
a  Vishvajit   26   India
c       Sonu   30   India
Selecting Specific Rows and Columns by Index Position
You can pass a range of positions for both rows and columns to select the data. The end position of each range is excluded.
# Select rows at positions 0 and 1, and columns at positions 0 and 1
subset = df.iloc[0:2, 0:2]
print(subset)
Output
        name  age
a  Vishvajit   26
b      Harsh   25
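Sometimes you know the row positions but only the column label, or the other way around. One way to handle this, sketched below, is to translate between the two with Index.get_loc or by indexing df.index, so that a single indexer can be used. The names age_pos, by_position, and by_label are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

# Turn the column label into a position, then use iloc throughout.
age_pos = df.columns.get_loc("age")
by_position = df.iloc[0:2, age_pos]

# Or turn the row positions into labels, then use loc throughout.
by_label = df.loc[df.index[0:2], "age"]

print(by_position.tolist())
print(by_label.tolist())
```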
Changing Value by Using iLoc
Change a Single Value
To change the value at row position 1 and column position 1, use the code below.
df.iloc[1, 1] = 26
print(df)
Change Multiple Values
To change values in multiple rows, use the iloc example below.
# Change the age for rows at index positions 1 and 2
df.iloc[1:3, 1] = [32, 37]
print(df)
Change All Values in a Row or Column
To update all values in the first column, use this Pandas iloc example.
# The column at position 0 must contain strings for .str to work
df.iloc[:, 0] = df.iloc[:, 0].str.upper()
print(df)
Conditional Selection Using Indexing
Pandas iloc works with integer positions, so it does not use row labels. A condition such as df['age'] >= 30 produces a boolean Series indexed by row label, so we first need to find the integer positions of the labels that meet the condition.
Let’s see how we can do that.
Here, I have taken two examples.
Selecting rows whose age is 30 or greater:
# Select rows whose age is 30 or greater
condition = df['age'] >= 30
print(condition)
indexes = df.index[condition].to_list()
print(indexes)
idx = [df.index.get_loc(label) for label in indexes]
print(idx)
result = df.iloc[idx]
print(result)
Output
a    False
b    False
c     True
d     True
Name: age, dtype: bool
['c', 'd']
[2, 3]
    name  age country
c   Sonu   30   India
d  Peter   33     USA
In the above example:
- First, I apply a condition to check whose age is 30 or more. This returns a boolean Series.
- Second, I display the boolean Series stored in condition.
- Third, I get the row labels whose corresponding boolean value in condition is True.
- Fourth, I display the indexes value, which is ['c', 'd'] in the output.
- Fifth, I get the integer position of each label stored in the indexes variable, because iloc works with integer positions, not row labels.
- Sixth, I display the integer positions of the labels ['c', 'd'], which are [2, 3] in the output.
- Finally, I select those rows from the DataFrame with iloc and display them.
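The steps above can be shortened. The sketch below shows three possible alternatives, none of which is the article's original method: passing the mask to iloc as a NumPy array (iloc accepts a boolean array but rejects a boolean Series), using numpy.flatnonzero to get the positions of the True values, and using Index.get_indexer to convert labels to positions in one call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

condition = df["age"] >= 30

# Option 1: pass the mask as a plain NumPy boolean array.
result_mask = df.iloc[condition.to_numpy()]

# Option 2: positions of the True values.
positions = np.flatnonzero(condition.to_numpy())
result_positions = df.iloc[positions]

# Option 3: convert the matching labels to positions in one call.
idx = df.index.get_indexer(df.index[condition])
result_indexer = df.iloc[idx]

print(positions.tolist())
print(result_mask)
```

All three options select the same rows as the step-by-step version. If you do not specifically need positions, df.loc[condition] remains the simplest choice.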
Now, let’s move on to the second example.
Selecting rows whose name contains 'sh':
The process is the same as above; only the condition changes.
# Select rows whose name contains 'sh'
condition = df['name'].str.contains('sh')
indexes = df.index[condition].to_list()
idx = [df.index.get_loc(label) for label in indexes]
result = df.iloc[idx]
print(result)
Output
        name  age country
a  Vishvajit   26   India
b      Harsh   25   India
The steps are the same as in the first example, so the explanation above applies here as well.
Difference Between Loc and iLoc in Pandas
Here, I have listed some important points about Pandas loc and Pandas iloc.
- Use Pandas loc when you need to access rows or columns by their labels. It is more intuitive if you are working with labeled axes.
- Use Pandas iloc when you need to access rows or columns by their integer position. This is useful when the exact label is unknown or when you care about position, such as the first or last few rows.
- Both loc and iloc can select and update data in Pandas DataFrames, which makes them essential tools for data analysis.
Let’s summarize the difference between loc and iloc in Pandas in a table.
Features | Pandas Loc | Pandas iLoc |
---|---|---|
Access Type | Label-based indexing | Integer-based indexing |
Indexing | Uses row and column labels | Uses integer positions (indices) |
Syntax | df.loc[row_label, column_label] | df.iloc[row_index, column_index] |
Row/Column Selection | Selects rows and columns by their labels | Selects rows and columns by their integer positions |
Supports Slicing | Supports slicing with labels (e.g., 'a':'c') | Supports slicing with integer positions (e.g., 1:3) |
Boolean Indexing | Supports boolean lists, arrays, and label-aligned boolean Series | Supports boolean lists and arrays, but not a boolean Series |
Label Errors | Raises KeyError if the label is not found | Raises IndexError if a requested position is out of bounds |
Includes Last Label | Yes, the end label is included in slicing | No, the end position is excluded from slicing |
DataFrame Example | df.loc['a':'c', 'name':'age'] | df.iloc[1:3, 0:2] |
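The error behavior in the table can be seen with a short sketch on the sample data. The label 'z' and position 10 are chosen only because they do not exist in this DataFrame. Note that an out-of-range iloc slice does not raise; it simply returns the rows that exist.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Vishvajit", "Harsh", "Sonu", "Peter"],
        "age": [26, 25, 30, 33],
        "country": ["India", "India", "India", "USA"],
    },
    index=["a", "b", "c", "d"],
)

errors = []

try:
    df.loc["z"]            # no row is labeled 'z'
except KeyError:
    errors.append("KeyError")

try:
    df.iloc[10]            # only positions 0 to 3 exist
except IndexError:
    errors.append("IndexError")

# A slice past the end is allowed and is simply truncated.
tail = df.iloc[2:10]

print(errors)
print(tail.index.tolist())
```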
Conclusion
In this article, we covered loc and iloc in Pandas with the help of examples. Pandas loc accesses rows and columns by their labels, and iloc accesses rows and columns by their integer positions.
Pandas loc and iloc are two of the most important tools for selecting data from a Pandas DataFrame.
If you found this article helpful, please share it and keep visiting for further Pandas tutorials.
Happy Learning…