Topics:
- What is Data Science?
- Analytics Landscape
- Life Cycle of a Data Science Project
- Data Science Tools & Technologies
Learning Objectives:
Get an idea of what data science is and why it is described as "Rosy", "Handy" or "Fascinating"
Get acquainted with various analysis and visualization tools used in data science
Delivery Type:
Theory
Hands-on workshop
No hands-on
Home Assignment
No
Topics:
- Python Basics
- Data Structures in Python
- Control & Loop Statements in Python
- Functions & Classes in Python
- Working with Data
- Analyze Data using Pandas
- Visualize Data
- Case Study
Learning Objectives:
- Learn how to install the Anaconda Python distribution. Learn basic data types, strings & regular expressions
- Data structures that are used in Python
- Learn all about loops and control statements in Python
- Write user-defined functions in Python. Learn about lambda functions
- Learn the object-oriented way of writing classes & objects
- Learn how to import datasets into Python. Also learn how to write output to files from Python
- Manipulate & analyze data using the Pandas library. Learn to generate insights from your data
- Use Python visualization libraries such as Matplotlib, Seaborn & ggplot for data visualization
- Hands-on session on a real-life case study
Skills
Python
Subskills
- Basics
- Container Objects
- Python Flow Control
- User Defined Functions
- File Handling
- Data Manipulation
- Visualization
Core Competencies
- Variables, Data Types
- List, Tuple, Set, Dictionary
- Looping Constructs, Conditional Statements
- Creating functions, Calling UDF, Function Arguments, Classes and Objects
- Reading, Writing to a File
- Dataframe Operations, EDA
- Univariate, Bivariate Plots, EDA
Delivery Type:
Theory + Workshop
Hands-on workshop
- Know how to install a Python distribution such as Anaconda, along with other libraries
- Write Python code to define your own functions, and learn the object-oriented way of writing classes and objects
- Write Python code to import a dataset into a Python notebook
- Write Python code to implement Data Manipulation, Preparation & Exploratory Data Analysis on a dataset
Home Assignment
Yes
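The competencies listed for this module (container objects, user-defined functions, lambda functions, classes and file handling) can be illustrated together in a few lines. The following is only a minimal sketch using the Python standard library; the Student class, the names and scores, and the file name scores.csv are hypothetical and chosen purely for illustration.

```python
import csv
import os
import tempfile


class Student:
    """Hypothetical record type used only for this illustration."""

    def __init__(self, name, score):
        self.name = name
        self.score = score


def average(values):
    """User-defined function: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Container objects: a list of objects and a dictionary built from it.
students = [Student("Asha", 82), Student("Ravi", 74), Student("Mei", 91)]
scores_by_name = {s.name: s.score for s in students}

# Lambda function used as a sort key (highest score first).
ranked = sorted(students, key=lambda s: s.score, reverse=True)

# File handling: write the records to a CSV file, then read them back.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "scores.csv")  # hypothetical file name
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "score"])
        for s in ranked:
            writer.writerow([s.name, s.score])
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))

# Values read from a CSV file are strings, so convert before averaging.
mean_score = average([int(r["score"]) for r in rows])
```

In the workshop itself the same read, write and summarise steps would typically be done with Pandas; the csv module is used here only so that the sketch runs without extra libraries.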
Topics:
- Measures of Central Tendency
- Measures of Dispersion
- Descriptive Statistics
- Probability Basics
- Marginal Probability
- Bayes Theorem
- Probability Distributions
- Hypothesis Testing
Learning Objectives:
- Revisit basics like mean (expected value), median and mode
- Describe the spread of data in terms of variance, standard deviation and interquartile range
- Basic summaries of the data and its measures, together with simple graphical analysis
- Basics of probability with daily life examples
- Marginal probability and its importance with respect to data science
- Learn Bayes' theorem and conditional probability
- Learn null and alternative hypotheses, Type I error, Type II error, power of the test, p-value
Skills
Statistics, Probability
Subskills
Basics/Intermediate
Core Competencies
- Mean, Median, Mode
- Variance, Standard Deviation
- Measure of Central Tendency and Dispersion
- Events, Trials, Likelihood
- Probability Interpretations, Conditional Probability
- Various Probability Functions and their Constructions
- Formulating and Testing Hypothesis
Delivery Type:
Theory + Workshop
Hands-on workshop
Write Python code to formulate hypotheses and perform hypothesis testing on a real production plant scenario
Home Assignment
Yes
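As a rough illustration of the ideas in this module, the sketch below computes measures of central tendency and dispersion, applies Bayes' theorem, and runs a two-sided one-sample z-test. It uses only the Python standard library. The fill weights, the defect and sensor rates, and the assumption that the population standard deviation is known (sigma = 2) are all made up for the example and do not come from the course material.

```python
from statistics import NormalDist, mean, median, mode, stdev, variance

# Hypothetical fill weights (grams) from a production line; made-up data.
weights = [498, 502, 499, 501, 497, 503, 500, 498, 502, 498]

# Measures of central tendency and dispersion (sample variance and stdev).
centre = {"mean": mean(weights), "median": median(weights), "mode": mode(weights)}
spread = {"variance": variance(weights), "stdev": stdev(weights)}


def bayes_posterior(prior, likelihood, false_alarm):
    """P(A|B) = P(B|A) P(A) / P(B), with P(B) from the law of total probability."""
    evidence = likelihood * prior + false_alarm * (1 - prior)
    return likelihood * prior / evidence


# Assumed rates: 2% of items are defective, the sensor flags 95% of
# defective items and wrongly flags 10% of good items.
p_defective_given_flag = bayes_posterior(prior=0.02, likelihood=0.95, false_alarm=0.10)


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, assuming sigma is known."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# H0: the mean fill weight is 500 g; H1: it is not.
z, p_value = one_sample_z_test(weights, mu0=500, sigma=2)
reject_h0 = p_value < 0.05  # 5% significance level, i.e. the Type I error rate
```

When the population standard deviation is not known, a t-test is the usual choice instead; the z-test is shown here only because it needs nothing beyond the standard library.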
Topics:
- ANOVA
- Linear Regression (OLS)
- Case Study: Linear Regression
- Principal Component Analysis
- Factor Analysis
- Case Study: PCA/FA
Learning Objectives:
- Analysis of Variance and its practical use
- Linear Regression with Ordinary Least Squares (OLS) estimation to predict a continuous variable. Covers core concepts, model building, evaluating model parameters, and measuring performance metrics on test and validation sets. Also covers improving model performance through steps such as feature engineering & regularization
- Real Life Case Study with Linear Regression
- Dimensionality Reduction Techniques with Principal Component Analysis and Factor Analysis. Covers techniques to find the optimum number of components/factors using the scree plot and the eigenvalue-one (Kaiser) criterion
- Real-Life case study with PCA & FA
Skills
Statistics, Data Science, Python
Subskills
- Basics
- Maths behind Linear Regression, statsmodels library, Case Study
- Linear Algebra, Case Study
Core Competencies
- Analysis of Variance
- Building and Evaluating Linear Regression Model using OLS with Python statsmodels
- Vectors, Matrices, Eigenvalues, Eigenvectors
Delivery Type:
Theory + Workshop
Hands-on workshop
- PROJECT 1
- TITLE - Predict House Pricing
- DESCRIPTION - With attributes describing various aspects of residential homes, you are required to build a regression model to predict the property prices.
- TITLE - Predict House Pricing
- DESCRIPTION - Reduce data dimensionality for a house attribute dataset for more insights & better modeling
Home Assignment
Yes
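To make the OLS idea concrete, here is a minimal sketch of simple (one-predictor) linear regression using the closed-form least-squares estimates, followed by R-squared as a performance metric. The module itself uses the statsmodels library; this pure-Python version is a stand-in so the arithmetic is visible, and the area and price figures are hypothetical.

```python
def fit_ols(x, y):
    """Closed-form OLS estimates for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: floor area (square metres) and price (thousands).
area = [50, 70, 90, 110, 130]
price = [150, 200, 260, 310, 360]

intercept, slope = fit_ols(area, price)
r2 = r_squared(area, price, intercept, slope)
predicted_price_100 = intercept + slope * 100  # prediction for a 100 sq m home
```

A real house-price model would use many predictors and a held-out test set; the single-predictor case is shown only because its estimates can be checked by hand.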
Topics:
- Logistic Regression
- Case Study: Logistic Regression
- K-Nearest Neighbor Algorithm
- Case Study: K-Nearest Neighbor Algorithm
- Decision Tree
- Case Study: Decision Tree
Learning Objectives:
- Binomial Logistic Regression for Binary Classification Problems. Covers evaluation of model parameters and model performance using metrics such as sensitivity, specificity, precision, recall, ROC Curve, AUC, KS Statistic, Kappa Value
- Real Life Case Study with Binomial Logistic Regression
- KNN Algorithm for Classification Problems. Covers techniques that are used to find the optimum value for K
- Real Life Case Study with KNN
- Decision Trees for regression & classification problems. Candidates learn about Entropy, Information Gain, Standard Deviation Reduction, Gini Index, CHAID
- Real Life Case Study with Decision Tree
Skills
Data Science, Python
Subskills
- Maths behind Logistic Regression, sklearn library, Case Study
- KNN Algorithm, Case Study
- Building Decision Trees for Regression and Classification problems with sklearn library, Case Study
Core Competencies
- Building and Evaluating Logistic Regression Model with Python sklearn
- Distance Metrics, Elbow Curve
- ID3, CART, CHAID, Entropy, Information Gain, Gini Index
Delivery Type:
Theory + Workshop
Hands-on workshop
- PROJECT 2
- TITLE - Predict credit card defaulter using Logistic Regression
- DESCRIPTION - With various customer attributes describing customer characteristics, build a classification model to predict which customers are likely to default on a credit card payment next month. This can help the bank be proactive in collecting dues
- PROJECT 3
- TITLE - Predict chronic kidney disease using KNN
- DESCRIPTION - Predict whether a patient is likely to develop chronic kidney disease based on health metrics
- PROJECT 4
- TITLE - Predict quality of Wine using Decision Tree
- DESCRIPTION - Wine comes in various styles. With the ingredient composition known, we can build a model to predict the Wine Quality using Decision Tree (Regression Trees)
Home Assignment
Yes
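The split criteria named in this module (Entropy, Information Gain, Gini Index) reduce to short formulas. The sketch below computes them for a candidate split using only the Python standard library. The class labels and the split are invented for illustration, and sklearn's own tree implementation is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with 5 defaulters ("yes") and 5 non-defaulters ("no").
parent = ["yes"] * 5 + ["no"] * 5

# Hypothetical split on some attribute into two child nodes.
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

parent_entropy = entropy(parent)
parent_gini = gini(parent)
gain = information_gain(parent, [left, right])
```

A tree-building algorithm evaluates many candidate splits this way and keeps the one with the highest gain (or the lowest weighted impurity).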
Topics:
- Understand Time Series Data
- Visualizing Time Series Components
- Exponential Smoothing
- Holt's Model
- Holt-Winters' Model
- ARIMA
- Case Study: Time Series Modeling on Stock Price
Learning Objectives:
- Understand Time Series Data and its components like Level Data, Trend Data and Seasonal Data
- Real Life Case Study with ARIMA
Skills
Data Science, Python
Subskills
- Time Component in Data
- Systematic and Non-Systematic Components
- Smoothing Methods
- Time Series Models
- Case Study in Python
Core Competencies
- Features of Time Series Data
- Level, Trend, Seasonality, Noise
- Time Constant, Types of Smoothing
- Basic Exponential Smoothing, Double Exponential Smoothing, Triple Exponential Smoothing
- Building Time Series Forecasting Model
- AR, MA, ARMA, ARIMA
Delivery Type:
Theory + Workshop
Hands-on workshop
- Write Python code to understand Time Series Data and its components like Level Data, Trend Data and Seasonal Data
- Write Python code to use Holt's and Holt-Winters' models when your data has Level, Trend and Seasonal components. Learn how to select the right smoothing constants
- Write Python code to use the Auto Regressive Integrated Moving Average (ARIMA) model for building a Time Series Model
Home Assignment
Yes
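The smoothing methods in this module can be sketched in a few lines. The example below shows basic (single) exponential smoothing and Holt's double exponential smoothing, which adds a trend term. It is a simplified illustration in pure Python: the initialisation (first value as the level, first difference as the trend), the smoothing constants and the data are assumptions made for the example, and library implementations may initialise differently.

```python
def simple_exponential_smoothing(series, alpha):
    """Basic exponential smoothing: each level blends the new value with the old level."""
    level = series[0]  # assumed initialisation: first observation
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double exponential) smoothing with level and trend."""
    level = series[0]              # assumed initialisation: first observation
    trend = series[1] - series[0]  # assumed initialisation: first difference
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Forecast h steps ahead by extending the last level along the last trend.
    return [level + (h + 1) * trend for h in range(horizon)]


# Hypothetical series with a steady upward trend and no seasonality.
sales = [10, 12, 14, 16, 18]

smoothed = simple_exponential_smoothing(sales, alpha=0.5)
forecast = holt_forecast(sales, alpha=0.5, beta=0.5, horizon=2)
```

Larger smoothing constants react faster to recent observations, while smaller ones give smoother output; Holt-Winters extends the same recursion with a third, seasonal component.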
Topics:
Industry-relevant capstone project under an experienced industry-expert mentor
Learning Objectives:
A group project, guided by an industry mentor, on a real-life problem. It is executed the same way you would run a data science project for any business problem
Delivery Type:
Workshop
Hands-on workshop
Project to be selected by candidates.
Home Assignment
Yes