Monday 25 April 2022

carsales.csv

"Month","Sales"
"1960-01",6550
"1960-02",8728
"1960-03",12026
"1960-04",14395
"1960-05",14587
"1960-06",13791
"1960-07",9498
"1960-08",8251
"1960-09",7049
"1960-10",9545
"1960-11",9364
"1960-12",8456
"1961-01",7237
"1961-02",9374
"1961-03",11837
"1961-04",13784
"1961-05",15926
"1961-06",13821
"1961-07",11143
"1961-08",7975
"1961-09",7610
"1961-10",10015
"1961-11",12759
"1961-12",8816
"1962-01",10677
"1962-02",10947
"1962-03",15200
"1962-04",17010
"1962-05",20900
"1962-06",16205
"1962-07",12143
"1962-08",8997
"1962-09",5568
"1962-10",11474
"1962-11",12256
"1962-12",10583
"1963-01",10862
"1963-02",10965
"1963-03",14405
"1963-04",20379
"1963-05",20128
"1963-06",17816
"1963-07",12268
"1963-08",8642
"1963-09",7962
"1963-10",13932
"1963-11",15936
"1963-12",12628
"1964-01",12267
"1964-02",12470
"1964-03",18944
"1964-04",21259
"1964-05",22015
"1964-06",18581
"1964-07",15175
"1964-08",10306
"1964-09",10792
"1964-10",14752
"1964-11",13754
"1964-12",11738
"1965-01",12181
"1965-02",12965
"1965-03",19990
"1965-04",23125
"1965-05",23541
"1965-06",21247
"1965-07",15189
"1965-08",14767
"1965-09",10895
"1965-10",17130
"1965-11",17697
"1965-12",16611
"1966-01",12674
"1966-02",12760
"1966-03",20249
"1966-04",22135
"1966-05",20677
"1966-06",19933
"1966-07",15388
"1966-08",15113
"1966-09",13401
"1966-10",16135
"1966-11",17562
"1966-12",14720
"1967-01",12225
"1967-02",11608
"1967-03",20985
"1967-04",19692
"1967-05",24081
"1967-06",22114
"1967-07",14220
"1967-08",13434
"1967-09",13598
"1967-10",17187
"1967-11",16119
"1967-12",13713
"1968-01",13210
"1968-02",14251
"1968-03",20139
"1968-04",21725
"1968-05",26099
"1968-06",21084
"1968-07",18024
"1968-08",16722
"1968-09",14385
"1968-10",21342
"1968-11",17180
"1968-12",14577

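A minimal sketch of loading this data with pandas, assuming the rows above are saved as carsales.csv. Only the first three rows are inlined here so the snippet runs on its own; the names sample, sales and total are illustrative.

```python
import io

import pandas as pd

# First three rows of carsales.csv, inlined so the example is self-contained.
# With the real file, pass the path 'carsales.csv' to read_csv instead.
sample = io.StringIO(
    '"Month","Sales"\n'
    '"1960-01",6550\n'
    '"1960-02",8728\n'
    '"1960-03",12026\n'
)

sales = pd.read_csv(sample)

# Convert the 'YYYY-MM' strings to timestamps and use them as the index
sales["Month"] = pd.to_datetime(sales["Month"], format="%Y-%m")
sales = sales.set_index("Month")

total = sales["Sales"].sum()
print(sales)
print("Total sales:", total)
```

With a datetime index in place, the series can be plotted or resampled by year in the same way as the share price data in the post below.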
0.0#000 Environments

Popular Latest Language Learning Environments


IDE : Integrated Development Environment 

VS Code

Jupyter Notebook


IDLE : Integrated Development and Learning Environment

PyCharm 10.0 for Windows

Online, without any download or installation

Python Shell

Multi-Language Online Compiler & Debugger


ML#1 Linear Regression

 Machine Learning using Linear Regression


import pandas as pd

# Load .csv file as DataFrame
df = pd.read_csv('TSLA.csv')

# print the data
print(df)

# print some summary statistics
print(df.describe())


# Indexing data using a DatetimeIndex
df.set_index(pd.DatetimeIndex(df['Date']), inplace=True)

# Keep only the 'Adj Close' Value
df = df[['Adj Close']]

# Re-inspect data
print(df)

print(df.info())

import matplotlib.pyplot as plt
plt.plot(df[['Adj Close']])
plt.title('TESLA Share Price')
plt.xlabel('Year')
plt.ylabel('Adj Close Price')
plt.savefig('TESLA.png')
plt.show()

import pandas_ta
df.ta.ema(close='Adj Close',length=10, append=True)

# The first rows of EMA_10 are NaN until the 10-period window fills,
# so drop the first 10 rows instead of filling them

df = df.iloc[10:]

print(df.head(10))

# Plot the price with the 10-period EMA overlaid
plt.plot(df['Adj Close'])
plt.plot(df['EMA_10'])
plt.xlabel('Year')
plt.ylabel('Price')
plt.title('TESLA Share Price with EMA overlaid')
plt.legend(['Adj Close', 'EMA_10'], loc=0)
plt.savefig('TESLA_EMA_10.png')
plt.show()




from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df[['Adj Close']], df[['EMA_10']], test_size=.2)

from sklearn.linear_model import LinearRegression
# Create Regression Model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
# Use model to make predictions
y_pred = model.predict(X_test)


# Test set
print(X_test.describe())


# Training set
print(X_train.describe())


from sklearn.metrics import r2_score, mean_absolute_error
# Print the relevant metrics
print("Model Coefficients:", model.coef_) # [[0.98176283]]
print("Mean Absolute Error:", mean_absolute_error(y_test, y_pred)) #6.21531704292117
print("Coefficient of Determination:", r2_score(y_test, y_pred)) #0.9942788743625711

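This is a simple linear regression model that predicts the 10-period EMA from the adjusted close price and reports the mean absolute error (MAE) and the coefficient of determination (R-squared).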
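To make the two metrics concrete, here is a minimal sketch that computes MAE and R-squared by hand with NumPy. The arrays y_true and y_pred are small made-up values for illustration, not Tesla data; the formulas are the standard definitions behind mean_absolute_error and r2_score.

```python
import numpy as np

# Hypothetical actual and predicted values (illustration only)
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 10.0])

# MAE: the average absolute difference between actual and predicted values
mae = np.mean(np.abs(y_true - y_pred))

# R-squared: 1 minus (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print("Mean Absolute Error:", mae)
print("Coefficient of Determination:", r2)
```

MAE is in the same units as the target, so lower is better; R-squared is unitless, and a value close to 1 means the model explains most of the variation in the target.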



Thursday 21 April 2022

DS #3 Types of DSA

 Types of Analytics





Descriptive [ What]

 - Depends upon past and historical data 

 - Metrics: mean, median, mode, SD, quartiles,

   aggregation and summary statistics
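The descriptive metrics listed above can be sketched with Python's built-in statistics module. This example uses the 1960 monthly figures from carsales.csv; the ratings list is a hypothetical sample added only to show the mode.

```python
import statistics

# Monthly car sales for 1960, copied from carsales.csv
sales_1960 = [6550, 8728, 12026, 14395, 14587, 13791,
              9498, 8251, 7049, 9545, 9364, 8456]

mean_sales = statistics.mean(sales_1960)
median_sales = statistics.median(sales_1960)
sd_sales = statistics.stdev(sales_1960)            # sample standard deviation
quartiles = statistics.quantiles(sales_1960, n=4)  # three cut points: Q1, Q2, Q3

# The mode is only informative when values repeat, so it is shown on a
# hypothetical list of satisfaction ratings instead
ratings = [4, 5, 3, 4, 4]
mode_rating = statistics.mode(ratings)

print("Mean:", round(mean_sales, 2))
print("Median:", median_sales)
print("SD:", round(sd_sales, 2))
print("Quartiles:", quartiles)
print("Mode of ratings:", mode_rating)
```

For a DataFrame, df.describe() reports most of these figures in a single call.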

 

Diagnostic [Why]

   - Why it happened

   - Rooted in the past and based on probability

   - Techniques: PCA, Sensitivity Analysis, Regression Analysis

   - Identify Outliers, Isolate Patterns, Uncover relationships
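As one possible illustration of identifying outliers, here is a minimal sketch using the interquartile range (IQR) rule. The IQR rule is an assumption chosen for this example, since the notes do not name a specific method, and the values array is made up.

```python
import numpy as np

# Hypothetical measurements with one suspicious value (illustration only)
values = np.array([10, 12, 11, 13, 12, 11, 40])

# Quartiles and interquartile range
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Common rule of thumb: flag points more than 1.5 * IQR outside the quartiles
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]

print("Bounds:", lower, upper)
print("Outliers:", outliers)
```

The 1.5 multiplier is a convention, not a law; a diagnostic analysis would still need to ask why the flagged value occurred.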


Predictive [If]

   - What might happen if specific conditions occur

   - Based on probability

   - Techniques: Quantitative analysis, predictive modeling, Machine Learning Algorithms



Prescriptive [How]

   - Best Action for better performance

   - Future SoP

   - Simulation Analysis, Recommendation Systems, AI & NN


DS#4 Types of Data

 Types of Data



Examples of Nominal Data :

  • Colour of hair (Blonde, red, Brown, Black, etc.)
  • Marital status (Single, Widowed, Married)
  • Nationality (Indian, German, American)
  • Gender (Male, Female, Others)
  • Eye Color (Black, Brown, etc.)

Examples of Ordinal Data :

  • When companies ask for feedback, experience, or satisfaction on a scale of 1 to 10
  • Letter grades in the exam (A, B, C, D, etc.)
  • Ranking of people in a competition (First, Second, Third, etc.)
  • Economic Status (High, Medium, and Low)
  • Education Level (Higher, Secondary, Primary)


Difference between Nominal and Ordinal Data

Nominal Data | Ordinal Data
Nominal data cannot be quantified and have no intrinsic ordering | Ordinal data have a sequential order given by their position on the scale
Nominal data are qualitative or categorical data | Ordinal data are said to be "in-between" qualitative and quantitative data
They provide no quantitative value, and no arithmetic operations can be performed on them | They provide a sequence and can be assigned numbers, but arithmetic operations cannot be performed on them
Nominal values cannot be ranked against one another | Ordinal values can be compared with one another by ranking or ordering
Examples: eye colour, housing style, gender, hair colour, religion, marital status, ethnicity, etc. | Examples: economic status, customer satisfaction, education level, letter grades, etc.
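The difference shows up directly in how the two kinds of data can be encoded. Below is a minimal sketch using pandas Categorical; the sample values and variable names are illustrative assumptions.

```python
import pandas as pd

# Nominal: categories with no intrinsic order
eye_colour = pd.Categorical(["Brown", "Black", "Brown"], ordered=False)

# Ordinal: the same kind of labels, but with an explicit order
economic_status = pd.Categorical(
    ["Low", "High", "Medium"],
    categories=["Low", "Medium", "High"],
    ordered=True,
)

print(eye_colour.ordered)            # nominal data carry no ordering
print(economic_status.min())         # ranking is meaningful for ordinal data
print(economic_status.max())
print(list(economic_status.codes))   # integer positions on the scale
```

The integer codes only record position on the scale; adding or averaging them is still not meaningful, which matches the point that ordinal data support ranking but not arithmetic.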

Quantitative Data

Discrete Data

Examples of Discrete Data : 

  • Total numbers of students present in a class
  • Cost of a cell phone
  • Numbers of employees in a company
  • The total number of players who participated in a competition
  • Days in a week

Continuous Data

Examples of Continuous Data : 

  • Height of a person
  • Speed of a vehicle
  • “Time-taken” to finish the work 
  • Wi-Fi Frequency
  • Market share price

Difference between Discrete and Continuous Data

Discrete Data | Continuous Data
Discrete data are countable and finite; they are whole numbers or integers | Continuous data are measurable; they take fractional or decimal values
Discrete data are represented mainly by bar graphs | Continuous data are represented in the form of a histogram
The values cannot be subdivided into smaller pieces | The values can be subdivided into smaller pieces
Discrete data have gaps between the values | Continuous data form a continuous sequence
Examples: total students in a class, number of days in a week, size of a shoe, etc. | Examples: temperature of a room, weight of a person, length of an object, etc.
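The bar graph versus histogram distinction can be sketched as follows: discrete values are counted one by one, while continuous values are grouped into bins first. The sample numbers and the bin range are assumptions made for illustration.

```python
from collections import Counter

import numpy as np

# Discrete: number of students present in a class (hypothetical counts)
students_per_class = [30, 32, 30, 28, 32, 30]
counts = Counter(students_per_class)   # one bar per distinct value

# Continuous: heights in centimetres (hypothetical measurements)
heights_cm = [150.2, 161.7, 158.4, 172.9, 165.0, 169.3]

# Group the measurements into three 10 cm wide bins between 150 and 180
bin_counts, bin_edges = np.histogram(heights_cm, bins=3, range=(150, 180))

print("Discrete counts:", dict(counts))
print("Bin edges:", bin_edges)
print("Bin counts:", bin_counts)
```

The discrete counts could be passed to plt.bar, and the raw heights to plt.hist, to draw the two chart types named in the table.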

Tuesday 19 April 2022

DS Syllabus for AMET

Course Code: UEAI002

Course Title: Introduction to Data Science

Number of Credits: 3 (L: 2; T: 0; P: 2)

Course Category: DAS

 

Course Objective:

·         Provide the knowledge and expertise to become a proficient data scientist;

·         Demonstrate an understanding of statistics and machine learning concepts that are vital for data science;

·         Produce Python code to statistically analyse a dataset;

·         Critically evaluate data visualisations based on their design and use for communicating stories from data.

 

Course Contents:

Module 1: [ 7 Lectures]

Introduction to Data Science.

Different Sectors using Data Science:
https://mail.google.com/mail/u/0/?tab=rm&ogbl#inbox/FMfcgzGpFWQsDnsGDsLLmPxJHTLkDfsz

Purpose and Components of Python in Data Science:
https://towardsdatascience.com/top-10-reasons-why-you-need-to-learn-python-as-a-data-scientist-e3d26539ec00

Module 2: [ 7 Lectures]

Data Analytics Process, Knowledge Check, Exploratory Data Analysis (EDA), EDA- Quantitative technique, EDA- Graphical Technique, Data Analytics Conclusion and Predictions.

Module 3: [ 11 Lectures]

Feature Generation and Feature Selection (Extracting Meaning from Data)- Motivating application: user (customer) retention- Feature Generation (brainstorming, role of domain expertise, and place for imagination)- Feature Selection algorithms.

Module 4: [ 10 Lectures]

Data Visualization- Basic principles, ideas and tools for data visualization, Examples of inspiring (industry) projects- Exercise: create your own visualization of a complex dataset.

Module 5: [ 7 Lectures]

Applications of Data Science, Data Science and Ethical Issues- Discussions on privacy, security, ethics- A look back at Data Science- Next-generation data scientists.

Lab Work:

1.  Python Environment setup and Essentials.

2.  Mathematical computing with Python (NumPy).

3.  Scientific Computing with Python (SciPy).

4.  Data Manipulation with Pandas.

5.  Prediction using Scikit-Learn.

6.  Data Visualization in Python using Matplotlib.


Text Books/References:

 

1.      U Dinesh Kumar, Business Analytics: The Science of Data-Driven Decision Making, John Wiley & Sons.

2.      Davy Cielen, Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools, John Wiley & Sons.

3.      Joel Grus, Data Science from Scratch, Shroff Publishers/O’Reilly Media.

4.      Annalyn Ng, Kenneth Soo, Numsense! Data Science for the Layman, Shroff Publishers.

5.      Cathy O’Neil and Rachel Schutt, Doing Data Science: Straight Talk from the Frontline, O’Reilly Media.

6.      Jure Leskovec, Anand Rajaraman and Jeffrey Ullman, Mining of Massive Datasets, v2.1, Cambridge University Press.

7.      Jake VanderPlas, Python Data Science Handbook, Shroff Publishers/O’Reilly Media.

8.      Philipp Janert, Data Analysis with Open Source Tools, Shroff Publishers/O’Reilly Media.

 

Tutorials

https://realpython.com/tutorials/data-science/

https://www.w3schools.com/datascience/

https://learn.theprogrammingfoundation.org/getting_started/intro_data_science/

 

 

 

Making Prompts for Profile Web Site

  Prompt: Can you create prompt to craft better draft in a given topic. Response: Sure! Could you please specify the topic for which you...