How to Create Pandas DataFrame in Python – Data to Fish

In this short guide, you will see two different methods to create a Pandas DataFrame:

  • Writing the values in Python to create the DataFrame
  • Importing the values from a file (such as a CSV file) and then creating the DataFrame in Python based on the imported values

Method 1: Write values in Python to create Pandas DataFrame

To create a Pandas DataFrame in Python, you can follow this generic template:

    import pandas as pd

    data = {'first_column': ['first_value', 'second_value', ...],
            'second_column': ['first_value', 'second_value', ...],
            ....
            }

    df = pd.DataFrame(data)

    print(df)

Note that you don’t need to use quotation marks around numeric values (unless you want to capture those values as strings).
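To make the note about quotation marks concrete, here is a minimal sketch. The column names `as_numbers` and `as_strings` are hypothetical and used only for illustration; the idea is that values written without quotation marks end up in a numeric column, while quoted values are kept as text.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical columns, used only to illustrate the note about quotation marks
data = {'as_numbers': [1200, 150, 300],        # no quotation marks: numeric values
        'as_strings': ['1200', '150', '300']   # quotation marks: captured as strings
        }

df = pd.DataFrame(data)

# Check how each column was captured
numbers_are_numeric = is_numeric_dtype(df['as_numbers'])
strings_are_numeric = is_numeric_dtype(df['as_strings'])

# Arithmetic works as expected on the numeric column
total = df['as_numbers'].sum()

print(numbers_are_numeric)
print(strings_are_numeric)
print(total)
```

If you plan to do any calculations on a column, it is usually simpler to leave the quotation marks off so the values are numeric from the start.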

Now let’s see how to apply the above template using a simple example.

To get started, let’s say you have the following product data and you want to capture that data in Python using Pandas DataFrame:

product_name    price
laptop          1200
printer         150
tablet          300
desktop         450
chair           200

You can then use the following code to create the DataFrame for our example:

    import pandas as pd

    data = {'product_name': ['laptop', 'printer', 'tablet', 'desktop', 'chair'],
            'price': [1200, 150, 300, 450, 200]
            }

    df = pd.DataFrame(data)

    print(df)

Run the code in Python and you will get the following DataFrame:

      product_name  price
    0       laptop   1200
    1      printer    150
    2       tablet    300
    3      desktop    450
    4        chair    200

You may have noticed that each row is represented by a number (also known as an index) that starts from 0. Alternatively, you can assign another value/name to represent each row.

For example, in the following code, index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'] was added:

    import pandas as pd

    data = {'product_name': ['laptop', 'printer', 'tablet', 'desktop', 'chair'],
            'price': [1200, 150, 300, 450, 200]
            }

    df = pd.DataFrame(data, index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'])

    print(df)

You will now see the newly assigned index:

              product_name  price
    product_1       laptop   1200
    product_2      printer    150
    product_3       tablet    300
    product_4      desktop    450
    product_5        chair    200
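One practical benefit of a named index is that you can look up rows by label. The following is a small sketch of that idea using `df.loc`; it reuses the product data from this guide, and the variable names `row` and `tablet_price` are just illustrative.

```python
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desktop', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data, index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'])

# Look up a whole row by its index label
row = df.loc['product_3']

# Look up a single value by index label and column name
tablet_price = df.loc['product_3', 'price']

print(row)
print(tablet_price)
```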

Let’s now review the second method of importing the values into Python to create the DataFrame.

Method 2: Import values from a CSV file to create Pandas DataFrame

You can use the following template to import a CSV file into Python to create your DataFrame:

    import pandas as pd

    data = pd.read_csv(r'Path where the CSV file is stored\File name.csv')
    df = pd.DataFrame(data)

    print(df)

Suppose you have the following data stored in a CSV file (where the name of the CSV file is ‘products’):

product_name    price
laptop          1200
printer         150
tablet          300
desktop         450
chair           200

In the Python code below, you will need to change the path to reflect the location where the CSV file is stored on your computer.

For example, suppose the CSV file is stored in the following path:

    C:\Users\Ron\Desktop\products.csv

Here is the complete Python code for our example:

    import pandas as pd

    data = pd.read_csv(r'C:\Users\Ron\Desktop\products.csv')
    df = pd.DataFrame(data)

    print(df)

As before, you will get the same Pandas DataFrame in Python:

      product_name  price
    0       laptop   1200
    1      printer    150
    2       tablet    300
    3      desktop    450
    4        chair    200
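If you would like to try Method 2 without creating a file first, one option is to read the same values from an in-memory string. This is only a sketch: it assumes the CSV file has a header row and comma-separated values (the raw file contents are not shown in this guide), and `csv_text` is a stand-in for the file itself.

```python
import io
import pandas as pd

# Stand-in for the contents of products.csv
# (assumed layout: header row, comma-separated values)
csv_text = """product_name,price
laptop,1200
printer,150
tablet,300
desktop,450
chair,200
"""

# read_csv accepts a file path or a file-like object and returns a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

print(type(df))
print(df)
```

Because `read_csv` already returns a DataFrame, wrapping its result in `pd.DataFrame(...)` is optional.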

You can also create the same DataFrame by importing an Excel file into Python using Pandas.

Find the maximum value in the DataFrame

Once you have your values in the DataFrame, you can perform a variety of operations. For example, you can calculate statistics using Pandas.

Let’s say you want to find the maximum price among all products within the DataFrame.

Obviously, you can derive this value simply by looking at the dataset, but the method presented below would work for much larger datasets.

To get the maximum price for our example, you’ll need to add the following part to the Python code (and then print the results):

    max_price = df['price'].max()

Here is the full Python code:

    import pandas as pd

    data = {'product_name': ['laptop', 'printer', 'tablet', 'desktop', 'chair'],
            'price': [1200, 150, 300, 450, 200]
            }

    df = pd.DataFrame(data)

    max_price = df['price'].max()

    print(max_price)

Once you run the code, you will get the value of 1200, which is in fact the maximum price:

    1200
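The same pattern extends to other statistics. As a rough sketch (the variable names here are illustrative and not part of the original example), you could also get the minimum and mean price, and use `idxmax` to find which product carries the maximum price:

```python
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desktop', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data)

min_price = df['price'].min()
mean_price = df['price'].mean()

# idxmax returns the index label of the row holding the maximum price
top_row_label = df['price'].idxmax()
top_product = df.loc[top_row_label, 'product_name']

print(min_price)
print(mean_price)
print(top_product)
```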

You can consult the Pandas documentation for more information on how to create a DataFrame.