Python Basics:
Collections in Python:
◦ Introduction to List
◦ List operations
◦ List Comprehensions
◦ Introduction to Dictionary
◦ Dictionary operations
◦ Introduction to Set
◦ Set Operations
◦ Introduction to Tuples
◦ Tuple operations
Conditionals and Looping
Strings:
Functions, modules and packages:
Exception handling in Python:
IO in Python
Some useful packages in Python
◦ Joining path
◦ Creating new directory
◦ Absolute and relative path
◦ File size
◦ Getting content of a folder
◦ Copying files and directories
◦ Deleting files and directories
◦ Renaming files and directories
Classes and objects:
Debugging in Python
NumPy :
Matplotlib :
NLTK :
◦ Tokenization
◦ Stemming
◦ Stop words
◦ Part of speech tagging
◦ Lemmatization
Pandas :
Scikit-Learn
Introduction :
Exploratory analysis of data :
Linear regression
Classification
◦ Introduction to K Nearest neighbor
◦ K Nearest neighbor in Scikit Learn
◦ Strengths and weaknesses of K Nearest neighbor
◦ Introduction to logistic regression
◦ Use cases of logistic regression
◦ Mathematical description of logistic regression
◦ Logistic regression with Scikit Learn
◦ Introduction to Bayes theorem
◦ Bayes theorem in classification
◦ Bayes classifier with Scikit Learn
◦ Introduction to decision trees
◦ Use cases of decision trees
◦ Partition algorithms for decision trees
▪ ID3
▪ Gini Index
▪ CART
◦ Tree pruning
◦ Scikit Learn and decision tree
◦ Introduction to the brain
◦ Introduction to neural network
◦ Perceptron
◦ Back propagation algorithm
◦ MLP and Scikit Learn
◦ Confusion matrix
◦ Cohen's kappa
◦ Precision, recall and F-measures
◦ Receiver operating characteristic (ROC)
Clustering
◦ Different linkage types: Ward, complete and average linkage
◦ Adjusted Rand index
◦ Mutual Information based scores
◦ Homogeneity, completeness and V-measure
◦ Fowlkes-Mallows scores
◦ Silhouette Coefficient
Projects :
There will be three projects, each worked through end to end.
Project 1: Given sales data, participants have to implement day-to-day data science operations such as filtering, aggregation, and date and time manipulation, and plot charts to understand sales patterns across different stores.
Project 2: Given MovieLens data, participants have to implement data joining, aggregation and charting to find meaningful patterns and information.
Project 3: Given the Kaggle Titanic data, participants have to implement data preprocessing and data cleaning. After that, participants are required to run hypothesis tests on the data to validate their hypotheses.
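As a rough illustration of the kind of work Project 1 describes (filtering, aggregation and date manipulation), here is a minimal sketch. It uses only the Python standard library so that it runs anywhere; the course itself lists Pandas for this kind of task. The sample rows, the `monthly_totals` function and its `min_amount` parameter are hypothetical and are not taken from the course material.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (store, ISO date, amount). Not real course data.
sales = [
    ("store_a", "2020-01-15", 120.0),
    ("store_a", "2020-02-03", 80.0),
    ("store_b", "2020-01-20", 200.0),
    ("store_b", "2020-02-11", 50.0),
    ("store_a", "2020-02-25", 40.0),
]


def monthly_totals(rows, min_amount=0.0):
    """Drop rows below min_amount, then sum amounts per (store, year-month)."""
    totals = defaultdict(float)
    for store, date_text, amount in rows:
        # Filtering step: skip sales smaller than the threshold.
        if amount < min_amount:
            continue
        # Date manipulation step: parse the date and keep only year and month.
        month = datetime.strptime(date_text, "%Y-%m-%d").strftime("%Y-%m")
        # Aggregation step: accumulate per store and month.
        totals[(store, month)] += amount
    return dict(totals)


totals = monthly_totals(sales, min_amount=50.0)
for (store, month), total in sorted(totals.items()):
    print(store, month, total)
```

The per-store monthly totals produced this way are the sort of table one would then chart to compare stores.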