Thursday, 21 April 2022

DS #3 Types of DSA

 Types of Analytics





Descriptive [ What]

 - Depends upon past and historical data 

 - Metrics: mean, median, mode, SD, quartiles,

   aggregation and summary statistics
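
As a rough illustration of the metrics listed above, here is a minimal sketch using only Python's standard `statistics` module. The `orders` list is made-up sample data, not something from the course material.

```python
import statistics

# Hypothetical sample: number of orders per day over ten days.
orders = [12, 15, 11, 15, 18, 20, 15, 13, 17, 14]

mean = statistics.mean(orders)
median = statistics.median(orders)
mode = statistics.mode(orders)
sd = statistics.stdev(orders)                    # sample standard deviation
q1, q2, q3 = statistics.quantiles(orders, n=4)   # quartile cut points

print('mean=', mean)
print('median=', median)
print('mode=', mode)
print('SD=', round(sd, 2))
print('quartiles=', q1, q2, q3)
```

All of these summarise what has already happened; none of them explains or predicts anything.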

 

Diagnostic [Why]

   - Why it happened

   - Rooted in the past and based on probability

   - Techniques: PCA, Sensitivity Analysis, Regression Analysis

   - Identify outliers, isolate patterns, uncover relationships
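
One simple way to "identify outliers" is the 1.5 x IQR rule. This is just one common choice picked for illustration (the notes do not name a specific rule), and the `sales` figures are invented.

```python
import statistics

# Hypothetical daily sales; 95 is planted as an unusual value.
sales = [21, 23, 22, 25, 24, 95, 23, 22, 26, 24]

q1, _, q3 = statistics.quantiles(sales, n=4)
iqr = q3 - q1                      # interquartile range

# Values far outside the middle 50% of the data are flagged.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [v for v in sales if v < lower or v > upper]

print('bounds=', lower, upper)
print('outliers=', outliers)
```

Once a value is flagged, the diagnostic question is why it occurred (a promotion, a data-entry error, and so on).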


Predictive [IF]

   - What might happen if specific conditions occur

   - Based on probability

   - Techniques : Quantities, predictive modeling, Machine Learning Algorithms
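
A minimal sketch of predictive modeling: fit a straight line to past data by least squares and use it to estimate an unseen value. The data and the helper name `fit_line` are hypothetical and only for illustration.

```python
# Hypothetical history: advertising spend (xs) and units sold (ys).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)

# "What might happen IF spend is 6?"
prediction = slope * 6 + intercept
print('slope=', slope, 'intercept=', intercept)
print('predicted units at spend 6 =', prediction)
```

In practice a library such as Scikit-Learn (listed in the lab work) would normally do the fitting; the hand-written version just shows the idea.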



Prescriptive:[How]

   - Best Action for better performance

   - Future SoP

   - Simulation Analysis, Recommendation Systems, AI & NN
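
A toy sketch of simulation analysis used prescriptively: simulate the outcome of each candidate action and recommend the best one. The demand model, the unit cost of 2, and the name `simulate_profit` are all assumptions made up for this example.

```python
import random

def simulate_profit(price, trials=2000, seed=0):
    """Average profit at a given price under a made-up noisy demand model."""
    rng = random.Random(seed)       # fixed seed so every run gives the same result
    total = 0.0
    for _ in range(trials):
        # Assumed demand: falls as price rises, plus random noise.
        demand = max(0.0, 100 - 8 * price + rng.gauss(0, 5))
        total += (price - 2) * demand   # assumed unit cost of 2
    return total / trials

candidate_prices = [4, 5, 6, 7, 8, 9, 10]

# Prescriptive step: pick the action with the best simulated outcome.
best_price = max(candidate_prices, key=simulate_profit)
print('recommended price =', best_price)
```

The output is an action ("charge this price"), not just a forecast, which is the difference between prescriptive and predictive analytics.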


DS#4 Types of Data

 Types of Data



Examples of Nominal Data :

  • Colour of hair (Blonde, red, Brown, Black, etc.)
  • Marital status (Single, Widowed, Married)
  • Nationality (Indian, German, American)
  • Gender (Male, Female, Others)
  • Eye Color (Black, Brown, etc.)

Examples of Ordinal Data :

  • When companies ask for feedback, experience, or satisfaction on a scale of 1 to 10
  • Letter grades in the exam (A, B, C, D, etc.)
  • Ranking of people in a competition (First, Second, Third, etc.)
  • Economic Status (High, Medium, and Low)
  • Education Level (Higher, Secondary, Primary)


Difference between Nominal and Ordinal Data

Nominal Data | Ordinal Data
Nominal data can't be quantified, nor do they have any intrinsic ordering | Ordinal data have a sequential order given by their position on the scale
Nominal data are qualitative or categorical data | Ordinal data are said to be "in-between" qualitative and quantitative data
They don't provide any quantitative value, and no arithmetic operations can be performed on them | They provide a sequence and can be assigned numbers, but arithmetic operations still cannot be performed on them
Nominal data cannot be ranked against one another | Ordinal data help to compare one item with another by ranking or ordering
Examples: eye colour, housing style, gender, hair colour, religion, marital status, ethnicity, etc. | Examples: economic status, customer satisfaction, education level, letter grades, etc.
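
A small sketch of what the difference means in code: nominal values can only be counted, while ordinal values can also be sorted once their order is stated. The sample lists are invented, and the `rank` mapping is just one simple way to encode the order.

```python
from collections import Counter

# Nominal: no order, so counting (and the mode) is all that makes sense.
hair = ['Black', 'Brown', 'Black', 'Blonde', 'Black']
most_common_hair = Counter(hair).most_common(1)
print(most_common_hair)

# Ordinal: the order is meaningful, so state it explicitly (lowest to highest).
levels = ['Primary', 'Secondary', 'Higher']
rank = {level: position for position, level in enumerate(levels)}

education = ['Higher', 'Primary', 'Secondary', 'Primary']
ordered = sorted(education, key=rank.get)
highest = max(education, key=rank.get)
print(ordered)
print(highest)
```

Note that the rank numbers only give position; "Higher" minus "Primary" is still not a meaningful quantity.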

Quantitative Data

Discrete Data

Examples of Discrete Data : 

  • Total numbers of students present in a class
  • Cost of a cell phone
  • Numbers of employees in a company
  • The total number of players who participated in a competition
  • Days in a week

Continuous Data

Examples of Continuous Data : 

  • Height of a person
  • Speed of a vehicle
  • “Time-taken” to finish the work 
  • Wi-Fi Frequency
  • Market share price

Difference between Discrete and Continuous Data

Discrete Data | Continuous Data
Discrete data are countable and finite; they are whole numbers or integers | Continuous data are measurable; they can take fractional or decimal values
Discrete data are represented mainly by bar graphs | Continuous data are represented in the form of a histogram
The values cannot be subdivided into smaller pieces | The values can be subdivided into smaller pieces
Discrete data have gaps between the values | Continuous data form a continuous sequence
Examples: total students in a class, number of days in a week, size of a shoe, etc. | Examples: temperature of a room, weight of a person, length of an object, etc.
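
The bar graph versus histogram row can be sketched without any plotting library: discrete values get one bar per distinct value, while continuous values must first be grouped into intervals. The sample data and the 5 cm bin width are assumptions for illustration.

```python
from collections import Counter

# Discrete: students present per day; each distinct count gets its own bar.
attendance = [28, 30, 28, 29, 30, 30]
bar_counts = Counter(attendance)
print(dict(bar_counts))

# Continuous: heights in cm; group into 5 cm-wide intervals (bins) first.
heights = [151.2, 158.9, 162.4, 163.0, 167.5, 171.8]
bin_width = 5
histogram = Counter(int(h // bin_width) * bin_width for h in heights)

for start in sorted(histogram):
    print(f'{start}-{start + bin_width} cm:', histogram[start])
```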

Tuesday, 19 April 2022

DS Syllabus for AMET

Course Code: UEAI002

Course Title: Introduction to Data Science

Number of Credits: 3 (L: 2; T: 0; P: 2)

Course Category: DAS

 

Course Objective:

·         To provide the knowledge and expertise to become a proficient data scientist;

·         Demonstrate an understanding of statistics and machine learning concepts that are vital for data science;

·         Produce Python code to statistically analyse a dataset;

·         Critically evaluate data visualisations based on their design and use for communicating stories from data.

 

Course Contents:

Module 1: [ 7 Lectures]

 

Introduction to Data Science,

 

 

Different Sectors using Data science

https://mail.google.com/mail/u/0/?tab=rm&ogbl#inbox/FMfcgzGpFWQsDnsGDsLLmPxJHTLkDfsz,

 

Purpose and Components of Python in Data Science.

 https://towardsdatascience.com/top-10-reasons-why-you-need-to-learn-python-as-a-data-scientist-e3d26539ec00

 

 

 

 

 

Module 2: [ 7 Lectures]

Data Analytics Process, Knowledge Check, Exploratory Data Analysis (EDA), EDA- Quantitative technique, EDA- Graphical Technique, Data Analytics Conclusion and Predictions.

Module 3: [ 11 Lectures]

Feature Generation and Feature Selection (Extracting Meaning from Data)- Motivating application: user (customer) retention- Feature Generation (brainstorming, role of domain expertise, and place for imagination)- Feature Selection algorithms.

Module 4: [ 10 Lectures]

Data Visualization- Basic principles, ideas and tools for data visualization, Examples of inspiring (industry) projects- Exercise: create your own visualization of a complex dataset.

Module 5: [ 7 Lectures]

Applications of Data Science, Data Science and Ethical Issues- Discussions on privacy, security, ethics- A look back at Data Science- Next-generation data scientists.

Lab Work:

1.  Python Environment setup and Essentials.

2.  Mathematical computing with Python (NumPy).

3.  Scientific Computing with Python (SciPy).

4.  Data Manipulation with Pandas.

5.  Prediction using Scikit-Learn.

6.  Data Visualization in Python using Matplotlib.


Text Books/References:

 

1.      Business Analytics: The Science of Data - Driven Decision Making, U Dinesh Kumar, John Wiley & Sons.

 

2.      Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools, Davy Cielen, John Wiley & Sons.

 

3.      Joel Grus, Data Science from Scratch, Shroff Publisher/O’Reilly Publisher Media

 

4.      Annalyn Ng, Kenneth Soo, Numsense! Data Science for the Layman, Shroff Publisher

 

5.      Cathy O’Neil and Rachel Schutt. Doing Data Science, Straight Talk from The Frontline. O’Reilly Publisher.

 

6.      Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. Mining of Massive Datasets. v2.1, Cambridge University Press.

 

7.      Jake VanderPlas, Python Data Science Handbook, Shroff Publisher/O’Reilly Publisher Media.

 

8.      Philipp Janert, Data Analysis with Open Source Tools, Shroff Publisher/O’Reilly Publisher Media.

 

Tutorials

https://realpython.com/tutorials/data-science/

https://www.w3schools.com/datascience/

https://learn.theprogrammingfoundation.org/getting_started/intro_data_science/

 

 

 

Sunday, 17 April 2022

P#01.1 Fundas Operators

# Python divides the operators into the following groups:
# Arithmetic operators   + -  * / % ** // 
# Assignment operators  =   +=  -= *= /= %= //= **=  &= |= ^= >>= <<= 
# Comparison operators == != > < >= <=
# Logical operators and or not 
# Identity operators is is not 
# Membership operators in not in 
# Bitwise operators & |  ^ ~ << >>

# Arithmetic operators   + -  * / % ** // 


x,y = 911,247

print('x=',x)

print('y=',y)

print()


print('x+y=',x+y)

print('x-y=',x-y)

print('x*y=',x*y)

print('x/y=',x/y)


print()

print('3**2=',3**2) #Exponentiation


print()

print('x%y=',x%y) #Modulus

print('x//y=',x//y)# FloorDivision


z = (y * ( x // y) + (x % y))

print(z)        # check Modulus + floor division



Output:

x= 911
y= 247
x+y= 1158
x-y= 664
x*y= 225017
x/y= 3.688259109311741
3**2= 9
x%y= 170
x//y= 3
911
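
The check above, x == y * (x // y) + (x % y), also holds when an operand is negative, because `//` rounds towards negative infinity and `%` takes the sign of the divisor. A short sketch using the built-in `divmod`, which returns both results at once (the variable name `results` is only for this example):

```python
results = []
for x, y in [(911, 247), (-911, 247), (911, -247)]:
    q, r = divmod(x, y)          # same as (x // y, x % y)
    assert x == y * q + r        # the identity holds for every sign combination
    results.append((q, r))
    print('x=', x, 'y=', y, 'x//y=', q, 'x%y=', r)

# x= 911 y= 247 x//y= 3 x%y= 170
# x= -911 y= 247 x//y= -4 x%y= 77
# x= 911 y= -247 x//y= -4 x%y= -77
```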

# Assignment operators = += -= *= /= %= //= **= &= |= ^= >>= <<=
a = 5
b = 3
print(a,b)
print()
a += b
print('a += b ' ,a)
a -= b
print('a -= b ' ,a)
a *= b
print('a *= b ' ,a)
a /= b          # true division: a becomes a float from here on
print('a /= b ' ,a)
print()
a **= b
print('a **= b ' ,a)
a %= b
print('a %= b ' ,a)
a //= b
print('a //= b ' ,a)
print()
a=5             # reset to integers: bitwise operators do not accept floats
b=3
a &= b
print('a &= b ' ,a)
a |= b
print('a |= b ' ,a)
a ^= b
print('a ^= b ' ,a)
print()
a=5
b=3
a <<=b
print('a <<= b ' ,a)
a >>=b
print('a >>= b ' ,a)
Output:
5 3
a += b 8
a -= b 5
a *= b 15
a /= b 5.0
a **= b 125.0
a %= b 2.0
a //= b 0.0
a &= b 1
a |= b 3
a ^= b 0
a <<= b 40
a >>= b 5
--------------------------------
# Comparison Operators
a = 15
b = 13
print(a,b)
print()
print('a == b ' ,a==b)
print('a != b ' ,a!=b)
print('a > b ' ,a>b)
print('a < b ' ,a<b)
print('a >= b ' ,a>=b)
print('a <= b ' ,a<=b)
print()
output:
15 13
a == b False
a != b True
a > b True
a < b False
a >= b True
a <= b False
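
Two small points that go with the comparison operators, sketched below: Python lets comparisons be chained, and `==` on floats can surprise because of rounding, so `math.isclose` is usually the safer test. The variable names are only for this example.

```python
import math

a = 15
b = 13

# Chained comparison: same as (b < a) and (a <= 20)
in_range = b < a <= 20
print('b < a <= 20 ', in_range)          # True

# Floats are stored approximately, so exact equality can fail.
exact = (0.1 + 0.2 == 0.3)
close = math.isclose(0.1 + 0.2, 0.3)
print('0.1 + 0.2 == 0.3 ', exact)        # False
print('isclose(0.1 + 0.2, 0.3) ', close) # True
```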
-----------------------------
#Logical operators and or not
a=6
print ('a>4 and a <11 ',a>4 and a<11)
print()
print('a>4 or a <11 ', a>4 or a<11)
print()
print('not(a>4 and a <11) ', not(a>4 and a<11))
print()
output:
a>4 and a <11 True
a>4 or a <11 True
not(a>4 and a <11) False
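
The operator list at the top of this post also names identity, membership and bitwise operators, which the examples so far have not exercised. A short sketch in the same style (the lists and numbers are arbitrary sample values):

```python
# Identity operators: is, is not (same object, not just equal value)
p = [1, 2, 3]
q = p              # q refers to the same list object as p
r = [1, 2, 3]      # equal contents, but a different object
print('p is q ', p is q)          # True
print('p is r ', p is r)          # False
print('p == r ', p == r)          # True
print('p is not r ', p is not r)  # True
print()

# Membership operators: in, not in
print('2 in p ', 2 in p)          # True
print('5 not in p ', 5 not in p)  # True
print()

# Bitwise operators: & | ^ ~ << >>
x, y = 5, 3        # binary 101 and 011
print('x & y ', x & y)    # 1   (001)
print('x | y ', x | y)    # 7   (111)
print('x ^ y ', x ^ y)    # 6   (110)
print('~x ', ~x)          # -6  (same as -x - 1)
print('x << 1 ', x << 1)  # 10
print('x >> 1 ', x >> 1)  # 2
```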


P#26 Natural Language Processing

 NLTK

Tokenizing a statement is very easy with Python's NLTK library.


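import nltk

# The tokenizers need NLTK's Punkt data. If it is missing, run
# nltk.download('punkt') once (newer NLTK releases may ask for 'punkt_tab').

# Adjacent string literals inside parentheses are joined into one string.
word_data = ('Sitting pretty,impatient, work from home, '
             'we can work from home')

tokens = nltk.word_tokenize(word_data)   # split into words and punctuation

print(tokens)

# ['Sitting', 'pretty', ',', 'impatient', ',', 'work',
#  'from', 'home', ',', 'we', 'can', 'work', 'from', 'home']


Another Exercise:


sentence_data = ("Johhny Jonny Yes papa. "
                 "Eating Sugar No. NO. papa ")

nltk_tokens = nltk.sent_tokenize(sentence_data)   # split into sentences

print(nltk_tokens)

# ['Johhny Jonny Yes papa.',
#  'Eating Sugar No.', 'NO.', 'papa']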
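
To see roughly what tokenization does, here is a minimal stand-in written with the standard `re` module. This is not how NLTK works internally (its tokenizers handle abbreviations, contractions and many other cases); the function names `simple_word_tokenize` and `simple_sent_tokenize` are made up for this sketch.

```python
import re

def simple_word_tokenize(text):
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def simple_sent_tokenize(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

word_data = 'Sitting pretty,impatient, work from home, we can work from home'
words = simple_word_tokenize(word_data)
print(words)

sentence_data = "Johhny Jonny Yes papa. Eating Sugar No. NO. papa "
sentences = simple_sent_tokenize(sentence_data)
print(sentences)
```

For real text, prefer the NLTK tokenizers shown above; a bare regex like this will wrongly split on abbreviations such as "Dr." or "e.g.".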




Happy Learning @AMET ODL!!!

