Wednesday, 6 April 2022

P#22 Python for finding wifi password

Saved Wi-Fi passwords can be found using a CLI command in Windows:

 CLI: netsh wlan show profile    (lists the saved Wi-Fi profile names)
#      netsh wlan show profile campuswifi key=clear  > out
# notepad out ... the password is the "Key Content" value in the "Security settings" section.


import subprocess

data = subprocess.check_output(['netsh', 'wlan', 'show', 'profiles']).decode('utf-8', errors="backslashreplace").split('\n')

profiles = [i.split(":")[1][1:-1] for i in data if "All User Profile" in i]

for i in profiles:
    try:

        results = subprocess.check_output(['netsh', 'wlan', 'show', 'profile', i, 'key=clear']).decode('utf-8', errors="backslashreplace").split('\n')

        results = [b.split(":")[1][1:-1] for b in results if "Key Content" in b]

        try:
            print ("{:<30}|  {:<}".format(i, results[0]))

        except IndexError:
            print ("{:<30}|  {:<}".format(i, ""))

    except subprocess.CalledProcessError:

        print ("{:<30}|  {:<}".format(i, "ENCODING ERROR"))

input("i am also.. waiting") # waiting to read

# wifiname                    |  wifipassword
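
The parsing step in the script can be tried on any machine without running netsh. Below is a minimal sketch, assuming the output lines look like the hypothetical sample_output text (it is typed by hand, not captured from a real computer). The helper name parse_profiles is made up for illustration. It splits on the first colon only and strips whitespace, so a profile name that itself contains a colon is not cut short.

```python
# Hypothetical sample of `netsh wlan show profiles` output (typed by hand).
sample_output = (
    "Profiles on interface Wi-Fi:\r\n"
    "\r\n"
    "User profiles\r\n"
    "-------------\r\n"
    "    All User Profile     : campuswifi\r\n"
    "    All User Profile     : Cafe: Guest\r\n"
)


def parse_profiles(text):
    """Return the profile names found in netsh-style output text."""
    names = []
    for line in text.splitlines():
        if "All User Profile" in line:
            # Split on the first colon only, then drop surrounding whitespace.
            names.append(line.split(":", 1)[1].strip())
    return names


profiles = parse_profiles(sample_output)
print(profiles)  # ['campuswifi', 'Cafe: Guest']
```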

Eureka... I found 👀👀👀 the password! Happy learning with AMET ODL!!!!

Monday, 4 April 2022

P#21 3D Plots

3D plots are easy to draw using Axes3D from mpl_toolkits.mplot3d. Let us look at some examples to understand 3D plots.

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
plt.plot([2,4,6],[4,8,12],color='Red')
plt.xlabel('xlabel')
plt.ylabel('ylabel')
plt.savefig('3dline41.png')
# plt.show()

The above is a 3D line plot. We set the projection with ax = fig.add_subplot(111, projection='3d').
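
In the first example only x and y values are passed, so the line lies flat in the z = 0 plane. Here is a small sketch of the same line with explicit heights. The zs values, the 'Agg' backend choice and the output filename are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen, so no window is needed
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (only needed on old Matplotlib)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

xs = [2, 4, 6]
ys = [4, 8, 12]
zs = [1, 2, 3]  # hypothetical heights, not from the original post

# Calling plot on the 3D axes with three sequences gives a true 3D line.
ax.plot(xs, ys, zs, color='red')
ax.set_xlabel('xlabel')
ax.set_ylabel('ylabel')
ax.set_zlabel('zlabel')
fig.savefig('3dline_explicit_z.png')  # hypothetical filename
```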

import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt

mpl.rcParams['legend.fontsize'] = 10
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')  # fig.gca(projection='3d') no longer works in Matplotlib 3.6+
theta = np.linspace(-4 * np.pi, 4 * np.pi, 100)
z = np.linspace(-2, 2, 100)
r = z**2 + 1
x = r * np.sin(theta)
y = r * np.cos(theta)
ax.plot(x, y, z, label='parametric curve')
ax.legend()
plt.savefig('3dline42.png')
plt.show()

Now let us draw a 3D scatter plot.

from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np  # needed by randrange() below

def randrange(n, vmin, vmax):
    return (vmax - vmin)*np.random.rand(n) + vmin

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

n = 100

# For each set of style and range settings, plot n random points in the box
# defined by x in [23, 32], y in [0, 100], z in [zlow, zhigh].
for c, m, zlow, zhigh in [('r', 'o', -50, -25), ('b', '^', -30, -5)]:
    xs = randrange(n, 23, 32)
    ys = randrange(n, 0, 100)
    zs = randrange(n, zlow, zhigh)
    ax.scatter(xs, ys, zs, c=c, marker=m)

ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
# plt.savefig('3dscatter43.png')
plt.show()

Please note that each axis label is set with the corresponding set_xlabel(), set_ylabel() and set_zlabel() method.

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Grab some test data.
X, Y, Z = axes3d.get_test_data(0.05)
# Plot a basic wireframe.
ax.plot_wireframe(X, Y, Z, rstride=10, cstride=10)
plt.savefig('3dwireframe44.png')
plt.show()

Please note the wireframe model diagram and corresponding code.

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')  # fig.gca(projection='3d') no longer works in Matplotlib 3.6+
# Make data.
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

# Plot the surface.
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
# Customize the z axis.
ax.set_zlim(-1.01, 1.01)
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.savefig('3dsurface45.png')
plt.show()

from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import numpy as np


fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')  # fig.gca(projection='3d') no longer works in Matplotlib 3.6+


def cc(arg):
    return mcolors.to_rgba(arg, alpha=0.6)

xs = np.arange(0, 10, 0.4)
verts = []
zs = [0.0, 1.0, 2.0, 3.0]
for z in zs:
    ys = np.random.rand(len(xs))
    ys[0], ys[-1] = 0, 0
    verts.append(list(zip(xs, ys)))

poly = PolyCollection(verts, facecolors=[cc('r'), cc('g'), cc('b'),
                                         cc('y')])
poly.set_alpha(0.7)
ax.add_collection3d(poly, zs=zs, zdir='y')

ax.set_xlabel('X')
ax.set_xlim3d(0, 10)
ax.set_ylabel('Y')
ax.set_ylim3d(-1, 4)
ax.set_zlabel('Z')
ax.set_zlim3d(0, 1)
plt.savefig('3dploy46.png')
plt.show()

See the beauty of the colors and how nicely they are handled for better visualization. Happy visualizing in 3D with Python, Matplotlib and AMET.

Sunday, 3 April 2022

DUPLICATE REMOVAL

In any dataset, duplicates are a perennial problem in data cleaning. This article briefly shows how we can handle them.

Method 1: (Traditional ..loop way)

# Create a list with duplicates

dlist = [10,20,30,40,50,60,10,20,30]
print(dlist)
# remove duplicates
dupFreeList = []
for element in dlist:
    print(element)
    if element not in dupFreeList:
        dupFreeList.append(element)
#
print(dupFreeList) # [10, 20, 30, 40, 50, 60]

Method 2 : (List Comprehension Way)


res = []
[res.append(x) for x in dlist if x not in res]

# printing list after removal
print ("The list after removing duplicates : " + str(res)) 
# The list after removing duplicates : [10, 20, 30, 40, 50, 60]
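
Methods 1 and 2 check membership against a list, so the list is scanned again for every element. For longer lists, one possible variation is to remember the seen values in a set, which still keeps the original order. This is only a sketch. The helper name dedupe_keep_order is made up for illustration, and it assumes the elements are hashable.

```python
def dedupe_keep_order(items):
    """Return items without duplicates, keeping the first occurrence of each."""
    seen = set()      # values already met (fast membership test)
    result = []       # output list in original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


dlist = [10, 20, 30, 40, 50, 60, 10, 20, 30]
print(dedupe_keep_order(dlist))  # [10, 20, 30, 40, 50, 60]
```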

Method 3:

You can convert to set and then convert to list to remove duplicates.



dlistset = set(dlist)
print(dlistset)
                                # {40, 10, 50, 20, 60, 30}
dupFreeList = list(dlistset)
print(dupFreeList)              # [40, 10, 50, 20, 60, 30] # Order is not Maintained
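
If the set route is preferred but the original order matters, the unique values can be sorted by the position where each one first appears in the list. This is a small sketch rather than the recommended way, because dlist.index scans the list for every unique value.

```python
dlist = [10, 20, 30, 40, 50, 60, 10, 20, 30]

# set() removes the duplicates; sorting by first position restores the order.
dupFreeOrdered = sorted(set(dlist), key=dlist.index)
print(dupFreeOrdered)  # [10, 20, 30, 40, 50, 60]
```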

Method 4:


from collections import OrderedDict

dupFreeList = list(OrderedDict.fromkeys(dlist))

print(dupFreeList)  # [10, 20, 30, 40, 50, 60]  # order is maintained

Here, we have imported the OrderedDict class from the collections module and used list(OrderedDict.fromkeys(dlist)).

Method 5: list(dict.fromkeys(dlist)) usage


dlist = ["10","20", "30","40","20","30"]  # String
dflist = list(dict.fromkeys(dlist))
print(dlist, dflist)
#['10', '20', '30', '40', '20', '30'] ## ['10', '20', '30', '40']


dlist = [10,20,30,40,50,10,20] # integer 
dflist = list(dict.fromkeys(dlist))
print(dlist, dflist) #[10, 20, 30, 40, 50, 10, 20] [10, 20, 30, 40, 50]
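
When the data sits in a pandas DataFrame rather than a plain list, the same cleaning can be sketched with duplicated() and drop_duplicates(). The toy data and column names below are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data; the third row repeats the first one.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Asha", "Mala"],
    "score": [90, 85, 90, 70],
})

print(df.duplicated().tolist())   # [False, False, True, False]

dup_free = df.drop_duplicates()   # keeps the first occurrence by default
print(len(dup_free))              # 3

names = df["name"].drop_duplicates().tolist()
print(names)                      # ['Asha', 'Ravi', 'Mala']
```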

Happy Open Learning at AMET ODL!

Friday, 1 April 2022

Exploratory Data Analysis

For a quick overview, we can use the following methods and attributes of a DataFrame df:

df.head() # show first 5 rows

df.tail() # last 5 rows

df.columns # list all column names

df.shape # get number of rows and columns

df.info() # additional info about dataframe

df.describe() # statistical description, only for numeric values

df['col_name'].value_counts(dropna=False) # count occurrences of each unique value in a column (NaN included)
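
To see what dropna=False changes, here is a minimal sketch on a tiny hand-made DataFrame. The data and the column name species are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with one missing value.
toy = pd.DataFrame({"species": ["setosa", "setosa", "virginica", np.nan]})

print(toy.shape)  # (4, 1)

counts = toy["species"].value_counts(dropna=False)  # NaN gets its own row
print(counts)

counts_no_nan = toy["species"].value_counts()       # default: NaN is ignored
print(counts_no_nan.sum())  # 3
```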

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('iris_csv.csv')
# print(df)
"""
    sepallength  sepalwidth  petallength  petalwidth           class
0            5.1         3.5          1.4         0.2     Iris-setosa
1            4.9         3.0          1.4         0.2     Iris-setosa
2            4.7         3.2          1.3         0.2     Iris-setosa
3            4.6         3.1          1.5         0.2     Iris-setosa
4            5.0         3.6          1.4         0.2     Iris-setosa
..           ...         ...          ...         ...             ...
145          6.7         3.0          5.2         2.3  Iris-virginica
146          6.3         2.5          5.0         1.9  Iris-virginica
147          6.5         3.0          5.2         2.0  Iris-virginica
148          6.2         3.4          5.4         2.3  Iris-virginica
149          5.9         3.0          5.1         1.8  Iris-virginica

[150 rows x 5 columns]
 """
# print(df.head()) # show first 5 rows
"""
   sepallength  sepalwidth  petallength  petalwidth        class
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa
"""
# print(df.tail()) # last 5 rows
"""
    sepallength  sepalwidth  petallength  petalwidth           class
145          6.7         3.0          5.2         2.3  Iris-virginica
146          6.3         2.5          5.0         1.9  Iris-virginica
147          6.5         3.0          5.2         2.0  Iris-virginica
148          6.2         3.4          5.4         2.3  Iris-virginica
149          5.9         3.0          5.1         1.8  Iris-virginica
"""
#print(df.columns) # list all column names
#Index(['sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'class'], dtype='object')

# print(df.shape) # get number of rows and columns # (150, 5)

# print(df.info()) # additional info about dataframe
"""
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   sepallength  150 non-null    float64
 1   sepalwidth   150 non-null    float64
 2   petallength  150 non-null    float64
 3   petalwidth   150 non-null    float64
 4   class        150 non-null    object 
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
None
"""
# print(df.describe()) # statistical description, only for numeric values
"""
       sepallength  sepalwidth  petallength  petalwidth
count   150.000000  150.000000   150.000000  150.000000
mean      5.843333    3.054000     3.758667    1.198667
std       0.828066    0.433594     1.764420    0.763161
min       4.300000    2.000000     1.000000    0.100000
25%       5.100000    2.800000     1.600000    0.300000
50%       5.800000    3.000000     4.350000    1.300000
75%       6.400000    3.300000     5.100000    1.800000
max       7.900000    4.400000     6.900000    2.500000
"""
# df['col_name'].value_counts(dropna=False) #

#Another way to quickly check the data is by visualizing it.
#We use bar plots for discrete data counts
#and histogram for continuous.

y = df['sepalwidth']
plt.hist(y)
plt.savefig('eda1.png')
plt.show()


#
df.boxplot(column='sepallength', by='class')
plt.savefig('eda2.png')
plt.show()


## Scatter plot to depict relationship between two variables and show outliers if any

df.plot(kind='scatter', x='sepallength', y='class')
plt.savefig('edascatter.png')
plt.show()

Histograms and box plots help to spot outliers visually. The scatter plot shows the relationship between two variables (here sepallength against class).
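
The outliers a box plot marks can also be listed numerically. A minimal sketch, assuming the usual 1.5 × IQR whisker rule (Matplotlib's default) and a small hand-made series in place of the iris column:

```python
import pandas as pd

# Hypothetical measurements; 50 is the value we expect to be flagged.
values = pd.Series([10, 12, 11, 13, 12, 11, 50])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Points outside [q1 - 1.5*IQR, q3 + 1.5*IQR] are drawn as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]

print(lower, upper)       # 8.75 14.75
print(outliers.tolist())  # [50]
```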

To check the correlation between variables:

import seaborn as sns
plt.figure(figsize=(8,4))
sns.heatmap(df.corr(numeric_only=True), cmap='Reds', annot=False)  # numeric_only skips the text column 'class' (pandas >= 1.5)
plt.savefig('heatmap.png')
plt.show()

Above, with the 'Reds' colormap, stronger positive correlation is shown by darker shades, and weak or negative correlation by lighter shades. Change to annot=True, and each grid cell will also show the value by which the two features are correlated.
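
What the cells of the heatmap contain can be seen on a toy table where the answer is known in advance. This is only a sketch: the data is hypothetical, and select_dtypes is used here to drop the text column so that it also runs on older pandas versions.

```python
import pandas as pd

# Hypothetical toy data: y rises with x, z falls with x.
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],     # y = 2x     -> correlation with x is +1
    "z": [4.0, 3.0, 2.0, 1.0],     # z = 5 - x  -> correlation with x is -1
    "label": ["a", "b", "a", "b"], # text column, left out of the correlation
})

corr = toy.select_dtypes(include="number").corr()
print(corr.round(2))
```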


k = 12
cols = df.corr(numeric_only=True).nlargest(k, 'sepallength')['sepallength'].index
cm = df[cols].corr()
plt.figure(figsize=(8,6))
sns.heatmap(cm, annot=True, cmap = 'viridis')
plt.savefig('sepallength.png')
plt.show()

From this figure we infer a strong correlation between petalwidth and petallength, since that pair has the highest positive value.

From the heatmap we can check the strong and weak correlations of every variable with its counterparts, as shown above.

Happy learning at AMET -ODL!!!

Pandas #05 Pivot Tables

PIVOT TABLE

A pivot table is an operation similar to GroupBy that is commonly seen in spreadsheets and other programs that operate on tabular data.

The pivot table takes simple column-wise data as input, and groups the entries into a two-dimensional table that provides a multidimensional summarization of the data.

The difference between pivot tables and GroupBy can sometimes cause confusion; it helps me to think of pivot tables as essentially a multidimensional version of GroupBy aggregation.

That is, you split-apply-combine, but both the split and the combine happen across not a one-dimensional index, but across a two-dimensional grid.
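
The "multidimensional GroupBy" idea can be checked on a tiny table. The sketch below uses hypothetical numbers and compares pivot_table with a two-key groupby followed by unstack.

```python
import pandas as pd

# Hypothetical toy data in the same shape as the births dataset.
toy = pd.DataFrame({
    "year":   [1969, 1969, 1971, 1971, 1975, 1975],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "births": [10, 12, 20, 22, 30, 32],
})
toy["decade"] = 10 * (toy["year"] // 10)

# Pivot table: rows are decades, columns are genders, cells are sums.
pivot = toy.pivot_table("births", index="decade", columns="gender", aggfunc="sum")

# The same result from GroupBy on two keys, then moving gender to the columns.
grouped = toy.groupby(["decade", "gender"])["births"].sum().unstack()

print(pivot)
print(grouped)
```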

We will take one sample dataset, births.csv, as below:

import pandas  as pd

births = pd.read_csv('https://raw.githubusercontent.com/jakevdp/data-CDCbirths/master/births.csv')

print(births)
"""
      year  month  day gender  births
0      1969      1  1.0      F    4046
1      1969      1  1.0      M    4440
2      1969      1  2.0      F    4454
3      1969      1  2.0      M    4548
4      1969      1  3.0      F    4548
...     ...    ...  ...    ...     ...
15542  2008     10  NaN      M  183219
15543  2008     11  NaN      F  158939
15544  2008     11  NaN      M  165468
15545  2008     12  NaN      F  173215
15546  2008     12  NaN      M  181235

[15547 rows x 5 columns]

"""

How do we count male and female births in each decade? We have to create a column, say decade, then group by decade and gender and sum the births in those rows.


births['decade'] = 10 * (births['year'] // 10)  # e.g. 1969 -> 1960
print(births.pivot_table('births', index='decade', columns='gender', aggfunc='sum'))
"""
gender         F         M
decade                    
1960     1753634   1846572
1970    16263075  17121550
1980    18310351  19243452
1990    19479454  20420553
2000    18229309  19106428

"""

Take a deep breath and look at the output. It is easy to see that male births outnumbered female births in all decades. How can we plot this to explore it visually?
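
The observation can also be checked numerically. The sketch below re-types the totals printed in the pivot table output above into a small DataFrame (the name decade_table is my own) and computes the male-to-female ratio per decade.

```python
import pandas as pd

# Totals copied from the pivot table output shown above.
decade_table = pd.DataFrame(
    {
        "F": [1753634, 16263075, 18310351, 19479454, 18229309],
        "M": [1846572, 17121550, 19243452, 20420553, 19106428],
    },
    index=pd.Index([1960, 1970, 1980, 1990, 2000], name="decade"),
)

ratio = decade_table["M"] / decade_table["F"]   # above 1 means more male births
print(ratio.round(3))

more_males_everywhere = bool((decade_table["M"] > decade_table["F"]).all())
print(more_males_everywhere)  # True
```

With Matplotlib installed, calling decade_table.plot() on the same table should draw one line per gender, which is one simple way to explore it visually.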

Cross Tab ( Contingency Table)

A contingency table is a tabular representation of categorical data. A contingency table usually shows frequencies for particular combinations of values of two discrete random variables X and Y. Each cell in the table represents a mutually exclusive combination of X-Y values.

It is used to summarize large data sets.
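
Before using pandas, the idea can be sketched with the standard library alone: every record falls into exactly one (X, Y) cell, and the table just counts the records per cell. The records below are hypothetical.

```python
from collections import Counter

# Hypothetical (region, diet) records, one tuple per dish.
records = [
    ("East", "vegetarian"),
    ("East", "non vegetarian"),
    ("West", "vegetarian"),
    ("West", "vegetarian"),
    ("North", "vegetarian"),
]

table = Counter(records)   # frequency of each (region, diet) combination

regions = sorted({region for region, _ in records})
diets = sorted({diet for _, diet in records})

print("region", diets)
for region in regions:
    # A missing combination simply counts as 0.
    print(region, [table[(region, diet)] for diet in diets])
```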

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('indian_food.csv') # From Kaggle
# print(df)
"""
               name  ...      region
0        Balu shahi  ...        East
1            Boondi  ...        West
2    Gajar ka halwa  ...       North
3            Ghevar  ...        West
4       Gulab jamun  ...        East
..              ...  ...         ...
250       Til Pitha  ...  North East
251         Bebinca  ...        West
252          Shufta  ...       North
253       Mawa Bati  ...     Central
254          Pinaca  ...        West

[255 rows x 9 columns]

"""
#print(df.describe())
#print(df.shape)
#print(df.columns)


# Cross tab
# Compute a simple cross-tabulation of two (or more) factors.
# By default computes a frequency table of the factors unless an array of values and
# an aggregation function are passed.
# implementing crosstab on state & diet columns

print(pd.crosstab(df['state'], df['diet']))
"""
diet             non vegetarian  vegetarian
state                                      
-1                            0          24
Andhra Pradesh                0          10
Assam                        10          11
Bihar                         0           3
Chhattisgarh                  0           1
Goa                           1           2
Gujarat                       0          35
Haryana                       0           1
Jammu & Kashmir               0           2
Karnataka                     0           6
Kerala                        1           7
Madhya Pradesh                0           2
Maharashtra                   2          28
Manipur                       1           1
NCT of Delhi                  1           0
Nagaland                      1           0
Odisha                        0           7
Punjab                        4          28
Rajasthan                     0           6
Tamil Nadu                    1          19
Telangana                     1           4
Tripura                       1           0
Uttar Pradesh                 0           9
Uttarakhand                   0           1
West Bengal                   5          19
"""



print(pd.crosstab(df['region'], df['diet']))
"""
diet        non vegetarian  vegetarian
region                                
-1                       0          13
Central                  0           3
East                     5          26
North                    5          44
North East              13          12
South                    3          56
West                     3          71
"""

print(pd.crosstab(df['region'], df['diet'], normalize='all'))  # Note Normalization

"""
diet        non vegetarian  vegetarian
region                                
-1                0.000000    0.051181
Central           0.000000    0.011811
East              0.019685    0.102362
North             0.019685    0.173228
North East        0.051181    0.047244
South             0.011811    0.220472
West              0.011811    0.279528
"""

print(pd.crosstab(df['region'], df['diet'], normalize='index')) # index normalization

# Plotting
pd.crosstab(df['region'], df['diet']).plot(kind='line')


plt.savefig('regionline.png')
plt.show()

pd.crosstab(df['region'], df['diet']).plot(kind='bar')
plt.savefig('regionbar.png')
plt.show()



pd.crosstab(df['region'], df['diet']).plot(kind='barh')
plt.savefig('regionbarh.png')
plt.show()


print(pd.crosstab(df['flavor_profile'], df['diet']))
"""
diet            non vegetarian  vegetarian
flavor_profile
-1                           3          26
bitter                       0           4
sour                         0           1
spicy                       26         107
sweet                        0          88
"""

crosstab has many uses. Here we discussed its most popular usage.
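
The two normalize options used above can be reproduced by hand from the region counts printed earlier. This is a sketch of the arithmetic only: normalize='all' divides every cell by the grand total, while normalize='index' divides each row by its own total. The variable names are my own.

```python
import pandas as pd

# Counts copied from the region x diet crosstab output above.
counts = pd.DataFrame(
    {
        "non vegetarian": [0, 0, 5, 5, 13, 3, 3],
        "vegetarian": [13, 3, 26, 44, 12, 56, 71],
    },
    index=["-1", "Central", "East", "North", "North East", "South", "West"],
)

total = counts.values.sum()
print(total)  # 254

share_of_all = counts / total                          # like normalize='all'
share_of_row = counts.div(counts.sum(axis=1), axis=0)  # like normalize='index'

print(share_of_all.round(6))
print(share_of_row.round(3))
```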
