Let’s learn how to create DataFrames and access their rows and columns.
How to create a DataFrame and access rows and columns from a DataFrame.
Pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.
Installing Pandas Library:
pip install pandas
Installing Jupyter Notebook:
Working in a Jupyter notebook makes it easier to visualize data. The command below installs JupyterLab, which provides the notebook interface.
pip install jupyterlab
- Creating DataFrames
- Attributes and Methods
- Accessing
- Slicing
Creating DataFrames.
A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.
With pandas, we can read data from different sources such as CSV, JSON, SQL, and Excel.
How to read data from a CSV file into a pandas DataFrame:
First we have to import pandas.
import pandas as pd
Then we pass the path of the CSV file to read_csv.
df = pd.read_csv('C:datadeveloper.csv')
Similarly, to read data from a JSON file, we pass the path of the JSON file to read_json.
df1 = pd.read_json('dataDeveloper.json')
https://gist.github.com/IndhumathyChelliah/bc74559c6d886bcba4a13e3b31b67a12
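To try the two readers without any files on disk, here is a minimal sketch that feeds them in-memory text instead of a path. The CSV and JSON contents below are made up for illustration, and the names csv_text, json_text, df_csv, and df_json are assumptions, not part of the original example.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "firstname,lastname,EmpId\nIndhu,mathy,12\nKarthi,Palani,15\n"

# read_csv accepts a file path or a file-like object such as StringIO.
df_csv = pd.read_csv(io.StringIO(csv_text))

# Hypothetical JSON content: a list of records, one object per row.
json_text = '[{"firstname": "Indhu", "EmpId": 12}, {"firstname": "Karthi", "EmpId": 15}]'

# read_json also accepts a file path or a file-like object.
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv)
print(df_json)
```

With real files, the only change is passing the file path in place of the StringIO object.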
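Creating a DataFrame from a dictionary.
We can also create a DataFrame from a dictionary. Each key becomes a column name, and each list holds the values of that column.
import pandas as pd

# Each key is a column name; each list holds the values for that column.
developer = {
    'firstname': ['Indhu', 'Karthi', 'Sarvesh'],
    'lastname': ['mathy', 'Palani', 'Palani'],
    'EmpId': [12, 15, 21],
    'Pay': [5000, 10000, 15000],
    'Skill': ['Python,SQL', 'Java,Hadoop', 'C,Java'],
}
df = pd.DataFrame(developer)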
https://gist.github.com/IndhumathyChelliah/a538048aac3effdbd9d82e974f327dae
Attributes and methods:
df.shape - Returns a tuple with the number of rows and columns, in that order. For the DataFrame above it is (3, 5).
df.info() - Prints a summary of the DataFrame, including each column's name, non-null count, and data type.
df.head() - Returns the first 5 rows by default. We can also specify the number of rows needed.
df.head(1) - Returns the first row.
df.tail() - Returns the last 5 rows by default. We can also specify the number of rows needed.
df.tail(1) - Returns the last row.
df.columns - Returns an Index object containing all column names.
Index(['firstname', 'lastname', 'EmpId', 'Pay', 'Skill'], dtype='object')
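The attributes and methods above can be tried together in one short script. This is a small sketch that rebuilds the developer DataFrame from the earlier example so that it runs on its own.

```python
import pandas as pd

developer = {
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "lastname": ["mathy", "Palani", "Palani"],
    "EmpId": [12, 15, 21],
    "Pay": [5000, 10000, 15000],
    "Skill": ["Python,SQL", "Java,Hadoop", "C,Java"],
}
df = pd.DataFrame(developer)

print(df.shape)          # tuple of (number of rows, number of columns)
df.info()                # prints the summary itself and returns None
print(df.head(2))        # first two rows
print(df.tail(1))        # last row
print(list(df.columns))  # convert the Index object to a plain list
```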
Accessing Data from DataFrame:
We can access both the rows and the columns of a DataFrame.
Accessing columns from DataFrame:
1. We can access a single column in two ways.
df.firstname
df['firstname']
If we specify a single column name, a Series object is returned.
Series:
A pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). A Series holds the rows of a single column.
If we access the 'firstname' column alone, we get a Series object containing the rows of that column. Both df['firstname'] and df.firstname return the same result.
df['firstname']
Output: Series object
0      Indhu
1     Karthi
2    Sarvesh
Name: firstname, dtype: object
df.firstname
Output:
0      Indhu
1     Karthi
2    Sarvesh
Name: firstname, dtype: object
2. We can access two or more columns by passing a list of column names. Selecting more than one column returns a DataFrame, filtered down to only those columns. A DataFrame is a two-dimensional structure.
df[['firstname','lastname']]
Output:
  firstname lastname
0     Indhu    mathy
1    Karthi   Palani
2   Sarvesh   Palani
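To confirm which kind of object each form of column access returns, the following sketch checks the types directly. The variable names (one, also_one, two, single_as_frame) are only illustrative, and the DataFrame is a trimmed copy of the developer data.

```python
import pandas as pd

df = pd.DataFrame({
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "lastname": ["mathy", "Palani", "Palani"],
    "EmpId": [12, 15, 21],
})

one = df["firstname"]                 # single name -> Series
also_one = df.firstname               # attribute access -> the same Series
two = df[["firstname", "lastname"]]   # list of names -> DataFrame
single_as_frame = df[["firstname"]]   # one-item list -> one-column DataFrame

print(type(one).__name__)
print(type(two).__name__)
print(single_as_frame.shape)
```

Bracket access is the safer habit: attribute access does not work for column names that contain spaces or that clash with an existing DataFrame attribute or method.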
Accessing rows from DataFrame:
We can access rows from a DataFrame in two ways.
- loc
- iloc
loc - Accesses single or multiple rows (and optionally columns) by label.
iloc - Accesses single or multiple rows (and optionally columns) by integer position.
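With the default index 0, 1, 2, labels and positions coincide, so loc and iloc look interchangeable. The difference becomes visible once the row labels are something else. The sketch below assumes we use EmpId as the index, which the original example does not do; by_emp and the other variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "EmpId": [12, 15, 21],
    "Pay": [5000, 10000, 15000],
})

# Use EmpId as the row labels instead of the default 0, 1, 2.
by_emp = df.set_index("EmpId")

row_by_label = by_emp.loc[15]      # label 15 -> Karthi's row
row_by_position = by_emp.iloc[1]   # position 1 -> also Karthi's row

# There is no row labelled 0 any more, so loc[0] raises KeyError.
try:
    by_emp.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(row_by_label)
print(label_zero_found)
```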
iloc
df.iloc[0] - Returns the row at position 0 as a Series. The column names become the index of that Series, and its values are the data of the first row.
df.iloc[0]
Output:
firstname         Indhu
lastname          mathy
EmpId                12
Pay                5000
Skill        Python,SQL
Name: 0, dtype: object
df.iloc[[0,1]] - Returns row 0 and row 1 as a DataFrame.
df.iloc[[0,1]]
Output:
  firstname lastname  EmpId    Pay        Skill
0     Indhu    mathy     12   5000   Python,SQL
1    Karthi   Palani     15  10000  Java,Hadoop
df.iloc[[0,1],2] - We can also specify both rows and columns.
[0,1] - selects row 0 and row 1.
2 - selects the column at position 2, which is the third column because positions start at 0.
It returns the "EmpId" column for row 0 and row 1.
df.iloc[[0,1],2]
Output:
0    12
1    15
Name: EmpId, dtype: int64
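The row and column positions passed to iloc can each be a single integer or a list, and the shape of the result changes accordingly. A short sketch, using illustrative variable names and the same developer data:

```python
import pandas as pd

df = pd.DataFrame({
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "lastname": ["mathy", "Palani", "Palani"],
    "EmpId": [12, 15, 21],
    "Pay": [5000, 10000, 15000],
    "Skill": ["Python,SQL", "Java,Hadoop", "C,Java"],
})

cell = df.iloc[0, 2]               # one row, one column -> a single value
column = df.iloc[[0, 1], 2]        # two rows, one column -> Series
block = df.iloc[[0, 1], [0, 2]]    # two rows, two columns -> DataFrame

print(cell)
print(column)
print(block)
```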
loc
df.loc[0] - Returns the row with label 0 as a Series. The column names become the index of that Series. Because this DataFrame uses the default labels 0, 1, 2, df.loc[0] and df.iloc[0] return the same row.
df.loc[0]
Output:
firstname         Indhu
lastname          mathy
EmpId                12
Pay                5000
Skill        Python,SQL
Name: 0, dtype: object
df.loc[[0,1]] - Returns the rows with labels 0 and 1.
df.loc[[0,1]]
Output:
  firstname lastname  EmpId    Pay        Skill
0     Indhu    mathy     12   5000   Python,SQL
1    Karthi   Palani     15  10000  Java,Hadoop
We can also select specific columns from specific rows.
df.loc[[0,1],['firstname','lastname']]
Output:
  firstname lastname
0     Indhu    mathy
1    Karthi   Palani
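As with iloc, the column part of loc can be a single label instead of a list, in which case the result is a single value or a Series rather than a DataFrame. A brief sketch with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "lastname": ["mathy", "Palani", "Palani"],
    "Pay": [5000, 10000, 15000],
})

pay_first = df.loc[0, "Pay"]                            # single value
pay_two = df.loc[[0, 1], "Pay"]                         # Series
names_two = df.loc[[0, 1], ["firstname", "lastname"]]   # DataFrame

print(pay_first)
print(pay_two)
print(names_two)
```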
Slicing:
Slicing with the [] operator selects a range of rows and/or columns from a DataFrame. To slice out a set of rows, use the syntax data[start:stop]. With iloc, the row at position start is included and the row at position stop is excluded.
df.iloc[0:2]
Output:
  firstname lastname  EmpId    Pay        Skill
0     Indhu    mathy     12   5000   Python,SQL
1    Karthi   Palani     15  10000  Java,Hadoop
df.iloc[:] - Returns all rows.
df.iloc[:1] - Returns row 0.
df.iloc[1:] - Returns row 1 and row 2.
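One detail worth checking by hand is how the end of a slice is treated. The sketch below compares position-based and label-based slices on the same developer data and also slices columns; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "firstname": ["Indhu", "Karthi", "Sarvesh"],
    "lastname": ["mathy", "Palani", "Palani"],
    "EmpId": [12, 15, 21],
    "Pay": [5000, 10000, 15000],
})

by_position = df.iloc[0:2]   # positions 0 and 1; stop is excluded
by_label = df.loc[0:2]       # labels 0, 1 and 2; stop is included
plain = df[0:2]              # plain [] with an integer slice selects rows by position

first_two_cols = df.iloc[:, 0:2]                  # all rows, first two columns
name_to_id = df.loc[:, "firstname":"EmpId"]       # all rows, columns by label range

print(len(by_position), len(by_label))
print(list(first_two_cols.columns))
print(list(name_to_id.columns))
```

Label-based slices with loc include both endpoints, while position-based slices with iloc follow the usual Python rule of excluding the stop value.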