Lecture-1 Introduction to Business Analytics, R programming and RStudio
Live Lecture
· Introduction to Business Intelligence
· Introduction to Business Analytics
· Introduction to Data
· Introduction to Information
· How information hierarchy can be improved/introduced
· Understanding Business Analytics and R
· Knowledge about the R language
· Its community and ecosystem
· Understand the use of 'R' in the industry
· Compare R with other software in analytics
· Install R and the packages useful for the course
· Perform basic operations in R using the command line
· Learn to use the RStudio IDE and various GUIs
· Use the ‘R help’ feature in R
· Worldwide R community collaboration
· Practical Exercise
Lecture-2 Understanding Data
Live Lecture
· Importance of data in business analytics
· Differences between data, information and knowledge
· The various stages that an organization goes through in terms of data maturity
· Business Analytics, Business Intelligence and Data Mining
· Differences between Business Analytics and Business Intelligence
· Describe the two major components within Business Analytics and Business Intelligence
· How data mining techniques support both Business Intelligence and Business Analytics
· Analytical Decision-Making Process
· Analysing Business Problems
· Practical Exercise
Lecture-3 Introduction to R programming and RStudio
Live Lecture
· Installation of RStudio
· Implementing simple mathematical operations
· Logic using R operators
· Loops
· If statements
· Switch cases
· Practical Exercise
Lecture-4 Data Exploration
Live Lecture
· Introduction to data exploration
· Importing and exporting data to/from external sources
· What are exploratory data analysis and data importing?
· Dataframes
· Accessing individual elements
· Vectors
· Factors
· Operators
· In-built functions
· Conditional and looping statements
· User-defined functions
· Data types
· Practical Exercise
Lecture-5 Data Manipulation
Live Lecture
· Need for data manipulation
· Introduction to the dplyr package
· Selecting one or more columns with select()
· Filtering records on the basis of a condition with filter()
· Adding new columns with mutate()
· Sampling and counting
· Combining different functions with the pipe operator
· Implementing SQL-like operations with sqldf
· The various steps involved in Data Cleaning
· Functions used in Data Inspection
· Tackling the problems faced during Data Cleaning
· Uses of the functions
· Coerce the data
· Uses of the apply() functions
· Practical Exercise
Lecture-6 Data Import Techniques in R
Live Lecture
· Import data from spreadsheets and text files into R
· Import data from other statistical formats
· Installing the packages used for database import
· Connect to RDBMS from R using ODBC
· Basic SQL queries in R
· Basics of Web Scraping
· Practical Exercise
Lecture-7 Exploratory Data Analysis
Live Lecture
· Understanding Exploratory Data Analysis (EDA)
· Implementation of EDA on various datasets
· Boxplots
· Whiskers of Boxplots
· Understanding cor() in R
· EDA functions
· Multiple packages in R for data analysis
· Fancy plots such as the segment plot
· HC plot in R
· Practical Exercise
Lecture-8 Data Visualization
Live Lecture
· Introduction to visualization
· Different types of graphs
· The grammar of graphics
· The ggplot2 package
· Categorical distribution with geom_bar()
· Numerical distribution with geom_histogram()
· Building frequency polygons with geom_freqpoly()
· Making a scatterplot with geom_point()
· Multivariate analysis with geom_boxplot()
· Univariate analysis with barplot, histogram & density plot
· Multivariate distribution
· Creating barplots for categorical variables using geom_bar()
· Adding themes with the theme() layer
· Visualization with plotly
· Frequency plots with geom_freqpoly()
· Multivariate distribution with scatter plots and smooth lines
· Continuous distribution vs categorical distribution with box-plots
· Sub grouping plots
· Co-ordinates and themes
· Understanding plotly
· Various plots
· Visualization with ggvis
· Geographic visualization with ggmap()
· Building web applications with Shiny
· Practical Exercise
Lecture-9 Introduction to Statistics
Live Lecture
· Why do we need statistics?
· Categories of statistics
· Statistical terminology
· Types of data
· Measures of central tendency
· Measures of spread
· Correlation and covariance
· Standardization and normalization
· Probability and the types
· Hypothesis testing
· Chi-square testing
· ANOVA
· Normal distribution
· Binomial distribution
· Practical Exercise
Lecture-10 Machine Learning
Live Lecture
· Introduction to Machine Learning
· Practical Exercise
Lecture-11 Linear Regression
Live Lecture
· Introduction to linear regression
· Predictive modeling
· Simple linear regression vs multiple linear regression
· Concepts
· Formulas
· Assumptions
· Residuals in Linear Regression
· Building a simple linear model
· Predicting results
· Finding the p-value
· Practical Exercise
Lecture-12 Logistic Regression
Live Lecture
· Introduction to logistic regression
· Logistic regression concepts
· Linear vs logistic regression
· Math behind logistic regression
· Detailed formulas
· Logit function and odds
· Bivariate logistic regression
· Poisson regression
· Building a simple binomial model
· Predicting the result
· Making a confusion matrix for evaluating the accuracy
· True positive rate
· False positive rate
· Threshold evaluation with ROCR
· Finding out the right threshold by building the ROC plot
· Cross validation
· Multivariate logistic regression
· Building logistic models with multiple independent variables
· Real-life applications of logistic regression
· An introduction to logistic regression
· Comparing linear regression with logistic regression
· Comparing bivariate logistic regression with multivariate logistic regression
· Understanding the fit of the model
· Using qqnorm() and qqline()
· Understanding the summary results with null hypothesis & F-statistic
· Practical Exercise
Lecture-13 Decision Trees and Random Forest
Live Lecture
· What is classification?
· Different classification techniques
· Introduction to decision trees
· Algorithm for decision tree induction
· Building a decision tree in R
· Confusion matrix & regression trees vs classification trees
· Introduction to bagging
· Random forest and implementing it in R
· Computing probabilities
· Impurity function
· Entropy
· Gini index
· Information gain for the right split of node
· Overfitting
· Pruning
· Pre-pruning
· Post-pruning
· Cost-complexity pruning
· Pruning a decision tree and predicting values
· Finding out the right number of trees
· Evaluating performance metrics
· Practical Exercise
Lecture-14 Unsupervised Learning
Live Lecture
· What is Clustering?
· Its use cases
· What is k-means clustering?
· What is canopy clustering?
· What is hierarchical clustering?
· Introduction to unsupervised learning
· Feature extraction
· Clustering algorithms
· The k-means clustering algorithm
· Theoretical aspects of k-means
· K-means process flow
· K-means in R
· Implementing k-means
· Finding out the right number of clusters using a scree plot
· Dendrograms
· Understanding hierarchical clustering
· Implementing it in R
· Explanation of Principal Component Analysis (PCA)
· Implementing PCA in R
· Practical Exercise
Lecture-15 Association Rule Mining & Recommendation Engines
Live Lecture
· Introduction to association rule mining and market basket analysis (MBA)
· Measures of association rule mining
· Introduction to recommendation engines
· User-based collaborative filtering
· Item-based collaborative filtering
· Implementing a recommendation engine in R
· Recommendation engine use cases
· Practical Exercise
Lecture-16 Time Series Analysis
Live Lecture
· What is a time series?
· The techniques
· Applications
· Components of time series
· Moving average
· Smoothing techniques
· Exponential smoothing
· Univariate time series models
· Multivariate time series analysis
· ARIMA model
· Time series in R
· Sentiment analysis in R
· Text analysis
· Practical Exercise
Lecture-17 Support Vector Machine (SVM)
Live Lecture
· Introduction to Support Vector Machine (SVM)
· Data classification using SVM
· SVM algorithms for separable and inseparable cases
· Linear SVM for identifying the margin hyperplane
· Practical Exercise
Lecture-18 Naïve Bayes
Live Lecture
· What is Naïve Bayes?
· What is the Bayes theorem?
· What is a Naïve Bayes classifier?
· Classification Workflow
· How the Naïve Bayes classifier works
· Classifier building in Scikit-Learn
· Building a probabilistic classification model using Naïve Bayes
· The zero probability problem
· Practical Exercise
Lecture-19 Text Mining
Live Lecture
· Introduction to the concepts of text mining
· Text mining use cases
· Understanding and manipulating the text with ‘tm’ and ‘stringr’
· Text mining algorithms and the quantification of the text
· TF-IDF and after TF-IDF
· Practical Exercise
Case Studies