Course Code: UEAI002
Course Title: Introduction to Data Science
Number of Credits: 3 (L: 2; T: 0; P: 2)
Course Category: DAS
Course Objective:
· To provide the knowledge and expertise needed to become a proficient data scientist;
· To demonstrate an understanding of the statistics and machine learning concepts that are vital for data science;
· To produce Python code to statistically analyse a dataset;
· To critically evaluate data visualisations based on their design and their use for communicating stories from data.
Course Contents:
Module 1: [ 7 Lectures]
Introduction to Data Science, Different Sectors Using Data Science, Purpose and Components of Python in Data Science.
https://towardsdatascience.com/top-10-reasons-why-you-need-to-learn-python-as-a-data-scientist-e3d26539ec00
Module 2: [ 7 Lectures]
Data Analytics Process, Knowledge Check, Exploratory Data Analysis (EDA), EDA - Quantitative Techniques, EDA - Graphical Techniques, Data Analytics Conclusions and Predictions.
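As an illustration of the quantitative EDA techniques listed in this module, the following is a minimal sketch using only Python's standard `statistics` module. The `orders` data and the `summarize` helper are hypothetical and invented for this example; in the lab sessions the same summaries would more typically come from NumPy or Pandas.

```python
import statistics

# Hypothetical sample: daily order counts (made-up values for illustration).
orders = [12, 15, 11, 14, 13, 15, 48, 12, 14, 13]


def summarize(values):
    """Return basic quantitative EDA statistics for a list of numbers."""
    # Quartile cut points (default "exclusive" method of statistics.quantiles).
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Flag points lying more than 1.5 * IQR outside the quartiles (Tukey's rule).
    outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "iqr": iqr,
        "outliers": outliers,
    }


summary = summarize(orders)
print(summary)
```

The gap between the mean and the median in this sample hints at a skewed distribution, which the outlier check then confirms. A graphical technique such as a box plot would show the same thing visually.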
Module 3: [ 11 Lectures]
Feature Generation and Feature Selection (Extracting Meaning from Data) - Motivating application: user (customer) retention - Feature Generation (brainstorming, role of domain expertise, and place for imagination) - Feature Selection algorithms.
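One simple family of feature selection algorithms is the filter method, which scores each feature independently against the target and keeps the top-scoring ones. The sketch below ranks features by absolute Pearson correlation with a retention label. It is an illustrative assumption, not the course's prescribed method; the feature names, the data, and the `pearson` and `select_features` helpers are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, k):
    """Filter-style selection: keep the k features most correlated with target."""
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical customer data (made-up values); retained is 1 if the user stayed.
features = {
    "logins_per_week": [1, 5, 2, 7, 0, 6, 3, 8],
    "support_tickets": [2, 2, 2, 2, 2, 2, 2, 2],  # constant, so uninformative
    "account_age_months": [3, 12, 7, 5, 9, 4, 11, 6],
}
retained = [0, 1, 0, 1, 0, 1, 0, 1]

selected, scores = select_features(features, retained, k=2)
print(selected)
```

Filter methods are cheap and easy to interpret, but they score features one at a time and can miss interactions between them. Wrapper and embedded methods trade more computation for the ability to capture such interactions.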
Module 4: [ 10 Lectures]
Data Visualization - Basic principles, ideas and tools for data visualization, Examples of inspiring (industry) projects - Exercise: create your own visualization of a complex dataset.
Module 5: [ 7 Lectures]
Applications of Data Science, Data Science and Ethical Issues - Discussions on privacy, security, ethics - A look back at Data Science - Next-generation data scientists.
Lab Work:
1. Python Environment setup and Essentials.
2. Mathematical computing with Python (NumPy).
3. Scientific Computing with Python (SciPy).
4. Data Manipulation with Pandas.
5. Prediction using scikit-learn.
6. Data Visualization in Python using Matplotlib.
Text Books/References:
1. U Dinesh Kumar, Business Analytics: The Science of Data-Driven Decision Making, John Wiley & Sons.
2. Davy Cielen, Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools, John Wiley & Sons.
3. Joel Grus, Data Science from Scratch, Shroff Publishers/O’Reilly Media.
4. Annalyn Ng, Kenneth Soo, Numsense! Data Science for the Layman, Shroff Publishers.
5. Cathy O’Neil and Rachel Schutt, Doing Data Science: Straight Talk from the Frontline, O’Reilly Media.
6. Jure Leskovec, Anand Rajaraman and Jeffrey Ullman, Mining of Massive Datasets, v2.1, Cambridge University Press.
7. Jake VanderPlas, Python Data Science Handbook, Shroff Publishers/O’Reilly Media.
8. Philipp Janert, Data Analysis with Open Source Tools, Shroff Publishers/O’Reilly Media.
Tutorials:
https://realpython.com/tutorials/data-science/
https://www.w3schools.com/datascience/
https://learn.theprogrammingfoundation.org/getting_started/intro_data_science/