In this article, we will look at several ways to create a Pandas DataFrame from a dictionary, with examples. The dictionary is one of the most widely used data types in Python, and it stores data as key-value pairs. In real-world applications, we usually create a Pandas DataFrame by reading CSV files or other data sources, but sometimes we need to build one from a dictionary.
To follow this article, you should be familiar with Python dictionaries.
There are two ways to create a Pandas DataFrame from a dictionary:
- Using DataFrame constructor
- Using from_dict() method
Now, let’s explore these ways of creating a Pandas DataFrame from a Python dictionary.
Headings of Contents
- 1 Create Pandas DataFrame using DataFrame() Constructor
- 2 Create Pandas DataFrame with Required Columns
- 3 Create Pandas DataFrame with user-defined indexes
- 4 Create Pandas DataFrame from Nested Dictionary
- 5 Create Pandas DataFrame from Dictionary with Single Value
- 6 Create Pandas DataFrame from List of Dictionaries
- 7 Pandas from_dict() Function
- 8 Conclusion
Create Pandas DataFrame using DataFrame() Constructor
The DataFrame() constructor is defined in the Pandas package and can be used to create a Pandas DataFrame from a Python dictionary.
In the example below, I have prepared a sample Python dictionary and then created a Pandas DataFrame from it. Each key becomes a column, and each list supplies that column's values.
from pandas import DataFrame
dictionary = {
"name": ["John", "Vishvajit", "Harsh", "Harshita"],
"age": [20, 25, 31, 25],
"gender": ['Male', 'Male', 'Male', 'Female']
}
df = DataFrame(data=dictionary)
print(df)
Create Pandas DataFrame with Required Columns
When we convert a Python dictionary into a Pandas DataFrame, sometimes we want only some specific columns in the DataFrame, even though the dictionary has more keys.
For example, the dictionary above has three keys, name, age, and gender, but we want only name and age in the DataFrame. To achieve this, pass the required columns as a list to the columns parameter of the DataFrame() constructor, like columns = [‘name’, ‘age’].
Let’s see how we can do that.
from pandas import DataFrame
dictionary = {
"name": ["John", "Vishvajit", "Harsh", "Harshita"],
"age": [20, 25, 31, 25],
"gender": ['Male', 'Male', 'Male', 'Female']
}
df = DataFrame(data=dictionary, columns=['name', 'age'])
print(df)
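As a side note on how the columns parameter behaves, here is a minimal sketch, assuming a recent pandas version. The list passed to columns also sets the column order, and a name with no matching key produces a column filled with NaN rather than an error. The name salary is hypothetical and is not a key of the dictionary above.

```python
from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ['Male', 'Male', 'Male', 'Female']
}

# Columns come out in the order listed here, not in dictionary order.
reordered = DataFrame(data=dictionary, columns=['age', 'name'])
print(list(reordered.columns))

# "salary" is a hypothetical name with no matching key in the dictionary,
# so pandas creates that column and fills it with NaN.
with_missing = DataFrame(data=dictionary, columns=['name', 'salary'])
print(with_missing)
```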
Create Pandas DataFrame with user-defined indexes
As you can see, in all the DataFrames above each row has an index number that identifies that particular row. By default, pandas generates an integer index from 0 to the number of rows minus 1, but we can also create a Pandas DataFrame with our own index.
To use your own index, first define a list of index labels. The labels do not have to be integers; you can use any values you wish. One thing to remember is that the number of labels must equal the total number of rows, otherwise you will get an error. After defining the labels in a list like [‘index1’, ‘index2’, ‘index3’, …], pass that list to the index parameter of the DataFrame constructor.
For example, I have defined a user-defined index with a length of 4 because the DataFrame has a total of four rows.
from pandas import DataFrame
dictionary = {
"name": ["John", "Vishvajit", "Harsh", "Harshita"],
"age": [20, 25, 31, 25],
"gender": ['Male', 'Male', 'Male', 'Female']
}
index = ['first', 'second', 'third', 'fourth']
df = DataFrame(data=dictionary, index=index)
print(df)
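To illustrate the length rule, the following minimal sketch passes five labels for four rows and catches the resulting error. The list name too_long is an assumption made for this illustration, and the exact wording of the error message may differ between pandas versions.

```python
from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
}

# Five labels for four rows: the lengths do not match.
too_long = ['first', 'second', 'third', 'fourth', 'fifth']

try:
    DataFrame(data=dictionary, index=too_long)
    raised = False
except ValueError as error:
    # pandas refuses to build the DataFrame and reports the mismatch.
    raised = True
    print("ValueError:", error)
```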
Create Pandas DataFrame from Nested Dictionary
In Python, a nested dictionary is a dictionary inside another dictionary. When we have this kind of dictionary, we can use the DataFrame() constructor along with the transpose() method.
All the examples above used simple Python dictionaries, but sometimes the data we want to load into a Pandas DataFrame is hierarchical.
For example, in the Python dictionary below, the keys are 0, 1, and 2, and the value of each key is itself a Python dictionary. That is why it is called a nested Python dictionary.
Nested dictionaries can take other shapes as well.
The constructor treats the outer keys as columns and the inner keys as row labels, so we transpose the result to turn each inner dictionary into a row.
Let’s create a Pandas DataFrame from a nested Python dictionary.
from pandas import DataFrame
dictionary = {
0: {
"name": "Vishvajit",
"gender": "Male",
"age": 25
},
1: {
"name": "Vinay",
"gender": "Male",
"age": 20
},
2: {
"name": "Harshita",
"gender": "Female",
"age": 24
}
}
# The outer keys (0, 1, 2) become columns, so transpose to make each inner dictionary a row
df = DataFrame(data=dictionary).transpose()
print(df)
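As an alternative to transposing, the from_dict() method with orient='index' can read the same nested dictionary directly. This is a minimal sketch of that option, not the approach used in the example above.

```python
from pandas import DataFrame

dictionary = {
    0: {"name": "Vishvajit", "gender": "Male", "age": 25},
    1: {"name": "Vinay", "gender": "Male", "age": 20},
    2: {"name": "Harshita", "gender": "Female", "age": 24},
}

# orient='index' treats each outer key as a row label,
# so no transpose() call is needed afterwards.
df = DataFrame.from_dict(dictionary, orient='index')
print(df)
```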
Create Pandas DataFrame from Dictionary with Single Value
If we have a dictionary with a single value for each key, we can also use the Pandas DataFrame constructor to convert it into a Pandas DataFrame.
Remember, you have to provide an index to the DataFrame constructor if the dictionary has a single value for each key, otherwise pandas will raise a ValueError.
from pandas import DataFrame
dictionary = {
"name": "Vishvajit",
"age": 25,
"gender": 'Male'
}
df = DataFrame(data=dictionary, index=[0])
print(df)
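The following minimal sketch shows the error that appears when the index is left out, and one possible alternative: wrapping the dictionary in a list so that pandas reads it as a single record. The exact error message may vary between pandas versions.

```python
from pandas import DataFrame

dictionary = {"name": "Vishvajit", "age": 25, "gender": 'Male'}

# Without an index, pandas cannot tell how many rows to build.
try:
    DataFrame(data=dictionary)
    raised = False
except ValueError as error:
    raised = True
    print("ValueError:", error)

# One alternative: wrap the dictionary in a list, so it is read as one record.
df = DataFrame(data=[dictionary])
print(df)
```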
Create Pandas DataFrame from List of Dictionaries
We often have to convert a list of dictionaries into a Pandas DataFrame. Let me take an example so that you can understand it easily. Suppose we have company data in the form of a list of dictionaries, and each dictionary holds different information: one has an employee’s phone number, another has an email, and so on. The dictionaries are not required to share the same keys.
The Pandas DataFrame constructor can also convert a list of dictionaries into a Pandas DataFrame. It treats the keys of all the dictionaries as columns of the resulting DataFrame and fills in NaN wherever a dictionary is missing a key.
from pandas import DataFrame
dictionary = [
{
"name": "Vishvajit",
"gender": "Male",
"age": 25,
"salary": 20000
},
{
"name": "Vinay",
"gender": "Male",
"age": 20,
"country": 'India'
},
{
"name": "Harshita",
"gender": "Female",
"age": 24,
"designation": "Developer"
}
]
df = DataFrame(data=dictionary)
print(df)
As you can see in the DataFrame, NaN has been assigned wherever a dictionary was missing a key.
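To see where those gaps are, the isna() method can be used, as in this minimal sketch. It uses a shortened version of the list above, and the variable names records and missing are assumptions made for this illustration.

```python
from pandas import DataFrame

# Shortened version of the list above: each record has one extra key.
records = [
    {"name": "Vishvajit", "salary": 20000},
    {"name": "Vinay", "country": 'India'},
    {"name": "Harshita", "designation": "Developer"},
]
df = DataFrame(data=records)

# isna() marks the cells pandas filled in because the key was absent;
# sum() then counts the missing cells per column.
missing = df.isna().sum()
print(missing)
```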
Pandas from_dict() Function
This is another way to create a Pandas DataFrame from the dictionary. It takes some important parameters that can be used in different cases based on the nature of the Python dictionary.
Let’s see all those parameters of the from_dict() method.
DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)
Parameters:
data:- The first parameter. It must be a Python dictionary.
orient:- {‘columns’, ‘index’, ‘tight’}, default ‘columns’. It indicates the orientation of the data. If the keys of the dictionary should be the columns of the resulting DataFrame, pass ‘columns’, which is the default. If the keys should be the rows, pass ‘index’. Pass ‘tight’ when the dictionary uses the tight layout (keys ‘index’, ‘columns’, ‘data’, ‘index_names’, and ‘column_names’), which can describe a MultiIndex DataFrame. The orient parameter is optional.
dtype:- An optional parameter that forces the data to the given data type while the DataFrame is created. If it is not passed, pandas infers the data types from the input data.
columns:- The column labels to use when orient=’index’. from_dict() raises a ValueError if it is passed with orient=’columns’ or orient=’tight’.
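The dtype and columns rules above can be sketched as follows. The dictionary scores and its values are hypothetical, chosen so that a single data type fits every column, and the exact error message may differ between pandas versions.

```python
from pandas import DataFrame

# Hypothetical numeric data, chosen so one dtype fits every column.
scores = {"math": [90, 85], "physics": [70, 95]}

# dtype is applied while the DataFrame is built.
df = DataFrame.from_dict(scores, dtype="float64")
print(df.dtypes)

# columns is only accepted together with orient='index'.
try:
    DataFrame.from_dict(scores, orient='columns', columns=['a', 'b'])
    raised = False
except ValueError as error:
    raised = True
    print("ValueError:", error)
```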
Now, let’s see some examples of the from_dict() method to create Pandas DataFrame from the dictionary.
Create Pandas DataFrame from Dictionary using from_dict()
I have created a Python dictionary in which the value of each key is an array-like object, here a Python list, that stores some values. Now I will use the from_dict() method to create a Pandas DataFrame from that dictionary.
from pandas import DataFrame
dictionary = {
"name": ["Ankit", "Harsh", "Pankaj", "Anshika"],
"age": [20, 24, 30, 25],
"gender": ['Male', 'Male', 'Male', 'Female']
}
df = DataFrame.from_dict(dictionary)
print(df)
After executing the above code, the following DataFrame will be created.
      name  age  gender
0    Ankit   20    Male
1    Harsh   24    Male
2   Pankaj   30    Male
3  Anshika   25  Female
As you can see in the resulting DataFrame, all the keys of the dictionary have been converted into columns. This happens because the orient parameter defaults to ‘columns’.
What happens if we have a dictionary with a different structure? Let’s see.
Convert Dictionary to Pandas DataFrame with orient=’index’
Now let’s assume we have a Python dictionary whose keys should be the rows of the resulting DataFrame. In that scenario, we will use orient=’index’ in the from_dict() method.
from pandas import DataFrame
dictionary = {
"row_1": ["Ankit", 20, 'Male'],
"row_2": ["Harsh", 24, 'Male'],
"row_3": ["Pankaj", 30, 'Male'],
"row_4": ["Anshika", 25, 'Female'],
}
df = DataFrame.from_dict(dictionary, orient='index')
print(df)
Output
             0   1       2
row_1    Ankit  20    Male
row_2    Harsh  24    Male
row_3   Pankaj  30    Male
row_4  Anshika  25  Female
As you can see in the above DataFrame, the column names are numbers starting from 0 by default. Numbered columns are rarely useful in real-life applications; column names should describe the data, like name, age, and so on.
To provide column names, pass them as a list to the columns parameter of the from_dict() method.
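A minimal sketch of this, reusing the dictionary from the example above. The labels name, age, and gender are chosen here to match the values in each row.

```python
from pandas import DataFrame

dictionary = {
    "row_1": ["Ankit", 20, 'Male'],
    "row_2": ["Harsh", 24, 'Male'],
    "row_3": ["Pankaj", 30, 'Male'],
    "row_4": ["Anshika", 25, 'Female'],
}

# With orient='index', each key is a row label and
# the columns list names the positions inside each row.
df = DataFrame.from_dict(
    dictionary,
    orient='index',
    columns=['name', 'age', 'gender'],
)
print(df)
```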
So this is how you can use the from_dict() method in order to create Pandas DataFrame from the dictionary.
Conclusion
So, in this tutorial, we have seen multiple ways to create a Pandas DataFrame from a dictionary. You can choose any of them based on your requirements, because all of them can convert a dictionary to a Pandas DataFrame. If you need to control the orientation, go with the from_dict() method; otherwise, the DataFrame constructor is enough to create a Pandas DataFrame from a dictionary.
If you found this article helpful, please share and keep visiting for further tutorials.
Happy Coding…..