Become a Data Science expert with huge employment opportunities.
Course Outline
Introduction to Data Science
- Need for Data Scientists
- Foundation of Data Science
- What is Business Intelligence
- What is Data Analysis
- What is Data Mining
- What is Machine Learning
- Analytics vs Data Science
- Value Chain
- Types of Analytics
- Lifecycle Probability
- Analytics Project Lifecycle
Data
- Basis of Data Categorization
- Types of Data
- Data Collection Types
- Forms of Data & Sources
- Data Quality & Changes
- Data Quality Issues
- Data Quality Story
- What is Data Architecture
- Components of Data Architecture
- OLTP vs OLAP
- How is Data Stored?
Big Data
- What is Big Data?
- 5 Vs of Big Data
- Big Data Architecture
- Big Data Technologies
- Big Data Challenge
- Big Data Requirements
- Big Data Distributed Computing & Complexity
- Hadoop
- MapReduce Framework
- Hadoop Ecosystem
Data Science Deep Dive
- What Data Science is
- Why Data Scientists are in demand
- What is a Data Product
- The growing need for Data Science
- Large Scale Analysis Cost vs Storage
- Data Science Skills
- Data Science Use Cases
- Data Science Project Life Cycle & Stages
- MapReduce Framework
- Hadoop Ecosystem
- Data Acquisition
- Where to source data
- Techniques
- Evaluating input data
- Data formats
- Data Quantity
- Data Quality
- Resolution Techniques
- Data Transformation
- File format Conversions
- Anonymization
Intro to R Programming
- Introduction to R
- Business Analytics
- Analytics concepts
- The importance of R in analytics
- R language community and ecosystem
- Usage of R in industry
- Installing R and other packages
- Perform basic R operations using command line
- Using the RStudio IDE and various GUIs
R Programming Concepts
- Data types in R and their uses
- Built-in functions in R
- Subsetting methods
- Summarize data using functions
- Using functions such as head() and tail() to inspect data
- Use-cases for problem solving using R
Data Manipulation in R
- Various phases of Data Cleaning
- Functions used in Inspection
- Data Cleaning Techniques
- Uses of functions involved
- Use-cases for Data Cleaning using R
Data Import Techniques in R
- Import data from spreadsheets and text files into R
- Importing data from statistical formats
- Installing packages for database import
- Connecting to RDBMS from R using ODBC and basic SQL queries in R
- Web Scraping
- Other concepts on Data Import Techniques
Exploratory Data Analysis (EDA) using R
- What is EDA?
- Why do we need EDA?
- Goals of EDA
- Types of EDA
- Implementing EDA
- Boxplots, cor() in R
- EDA functions
- Multiple packages in R for data analysis
- Some fancy plots
- Use-cases for EDA using R
Data Visualization in R
- Storytelling with Data
- Principal tenets
- Elements of Data Visualization
- Infographics vs Data Visualization
- Data Visualization & Graphical functions in R
- Plotting Graphs
- Customizing Graphical Parameters to improve the plots
- Various GUIs
- Spatial Analysis
- Other Visualization concepts
Statistics + Machine Learning
Statistics
- What is Statistics
- Descriptive Statistics
- Central Tendency Measures
- The Story of Average
- Dispersion Measures
- Data Distributions
- Central Limit Theorem
- What is Sampling
- Why Sampling
- Sampling Methods
- Inferential Statistics
- What is Hypothesis testing
- Confidence Level
- Degrees of freedom
- What is a p-value
- Chi-Square test
- What is ANOVA
- Correlation vs Regression
- Uses of Correlation & Regression
Machine Learning Introduction
- ML Fundamentals
- ML Common Use Cases
- Understanding Supervised and Unsupervised Learning Techniques
- Clustering
- Similarity Metrics
- Distance Measure Types: Euclidean, Cosine Measures
- Creating predictive models
- Understanding K-Means Clustering
- Understanding TF-IDF, Cosine Similarity and their application to Vector Space Model
- Case study
Implementing Association rule mining
- Case study
Understanding Process flow of Supervised Learning Techniques
- Decision Tree Classifier
- How to build Decision trees
- Case study
Random Forest Classifier
- What is Random Forest
- Features of Random Forest
- Out-of-Bag Error Estimate and Variable Importance
- Case study
Naive Bayes Classifier
- Case study
Project Discussion
- Problem Statement and Analysis
- Various approaches to solve a Data Science Problem
- Pros and Cons of different approaches and algorithms
Linear Regression
- Case study
Logistic Regression
- Case study
Text Mining
- Case study
Sentiment Analysis
- Case study