Friday 15 April 2022

P#24 Pandas loc() and iloc()

Differences between loc() and iloc()

With these methods, we can simulate SQL for filtering rows or columns. Let us dive in to this exercise. 

import pandas as pd

df = pd.read_csv('uforeports.csv')

# df =pd.read_csv('http://bit.ly/uforeports') Alternateively directly from web

print(df.shape) # to print rows xcolumns (18241, 5)

print(df.describe)
"""
<bound method NDFrame.describe of City Colors Reported ... State Time
0 Ithaca NaN ... NY 6/1/1930 22:00
1 Willingboro NaN ... NJ 6/30/1930 20:00
2 Holyoke NaN ... CO 2/15/1931 14:00
3 Abilene NaN ... KS 6/1/1931 13:00
4 New York Worlds Fair NaN ... NY 4/18/1933 19:00
... ... ... ... ... ...
18236 Grant Park NaN ... IL 12/31/2000 23:00
18237 Spirit Lake NaN ... IA 12/31/2000 23:00
18238 Eagle River NaN ... WI 12/31/2000 23:45
18239 Eagle River RED ... WI 12/31/2000 23:45
18240 Ybor NaN ... FL 12/31/2000 23:59

[18241 rows x 5 columns]>
"""



# print(df.head(5)) # to Print top 5 rows
"""
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
3 Abilene NaN DISK KS 6/1/1931 13:00
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00
"""

# print(df.tail(5)) # to Print bottom 5 rows

"""
City Colors Reported Shape Reported State Time
18236 Grant Park NaN TRIANGLE IL 12/31/2000 23:00
18237 Spirit Lake NaN DISK IA 12/31/2000 23:00
18238 Eagle River NaN NaN WI 12/31/2000 23:45
18239 Eagle River RED LIGHT WI 12/31/2000 23:45
18240 Ybor NaN OVAL FL 12/31/2000 23:59
"""


# To filter rows()

# print(df.loc[6])

"""
City Crater Lake
Colors Reported NaN
Shape Reported CIRCLE
State CA
Time 6/15/1935 0:00
Name: 6, dtype: object
"""

print(df.loc[0:2])

"""
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00

"""

print(df.loc[0:2, :]) # same output as above


print(df.loc[0:2 , 'City':'State']) # To filter rows and display columns from city to state Time column dropped
"""
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
"""

print(df.loc[: , 'City':'State']) # To print all rows
"""
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
3 Abilene NaN DISK KS
4 New York Worlds Fair NaN LIGHT NY
... ... ... ... ...
18236 Grant Park NaN TRIANGLE IL
18237 Spirit Lake NaN DISK IA
18238 Eagle River NaN NaN WI
18239 Eagle River RED LIGHT WI
18240 Ybor NaN OVAL FL

[18241 rows x 4 columns]

"""
print(df.head(3).drop('Time', axis = 1))

"""
Name: City, Length: 18241, dtype: object
City Colors Reported Shape Reported State
0 Ithaca NaN TRIANGLE NY
1 Willingboro NaN OTHER NJ
2 Holyoke NaN OVAL CO
"""
print(df.loc[: , 'City':'State']) # To print all rows drop timecolumn

print(df.loc[: , 'City']) # To print all rows and city column only

"""
0 Ithaca
1 Willingboro
2 Holyoke
3 Abilene
4 New York Worlds Fair
...
18236 Grant Park
18237 Spirit Lake
18238 Eagle River
18239 Eagle River
18240 Ybor
Name: City, Length: 18241, dtype: object

"""

print(df[df.City == 'Abilene']) # select Filtering by value

"""
City Colors Reported Shape Reported State Time
3 Abilene NaN DISK KS 6/1/1931 13:00
6654 Abilene NaN TRIANGLE TX 9/1/1991 1:00
8357 Abilene NaN SPHERE TX 7/15/1995 0:00
8783 Abilene NaN NaN KS 10/14/1995 23:20
10883 Abilene NaN NaN TX 10/19/1997 20:45

"""

print(df[df.City == 'Abilene'].State) # Filter by value, select column

"""
3 KS
6654 TX
8357 TX
8783 KS
10883 TX
"""

print(df.columns)
# Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time'], dtype='object')

print(df.iloc[0:3 , 0:3]) #Filter 3 rows and 3 columns

""" City Colors Reported Shape Reported
0 Ithaca NaN TRIANGLE
1 Willingboro NaN OTHER
2 Holyoke NaN OVAL

"""

print(df.iloc[0:3 , :]) #Filter 3 rows and all columns
"""
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00

"""

print(df.loc[:, ['City','Time']]) # to filter only two columns for all rows

"""
City Time
0 Ithaca 6/1/1930 22:00
1 Willingboro 6/30/1930 20:00
2 Holyoke 2/15/1931 14:00
3 Abilene 6/1/1931 13:00
4 New York Worlds Fair 4/18/1933 19:00
... ... ...
18236 Grant Park 12/31/2000 23:00
18237 Spirit Lake 12/31/2000 23:00
18238 Eagle River 12/31/2000 23:45
18239 Eagle River 12/31/2000 23:45
18240 Ybor 12/31/2000 23:59

[18241 rows x 2 columns]
"""

df1 = pd.read_csv('http://bit.ly/uforeports', index_col='City')

print(df1.head(5))

"""
Colors Reported Shape Reported State Time
City
Ithaca NaN TRIANGLE NY 6/1/1930 22:00
Willingboro NaN OTHER NJ 6/30/1930 20:00
Holyoke NaN OVAL CO 2/15/1931 14:00
Abilene NaN DISK KS 6/1/1931 13:00
New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00

"""

 Happy Learning @AMET!!!

No comments:

Post a Comment

Making Prompts for Profile Web Site

  Prompt: Can you create prompt to craft better draft in a given topic. Response: Sure! Could you please specify the topic for which you...