This course is a hands-on introduction to data analysis and machine learning in Python.

At this workshop you will learn:

  • How to process data with pandas
  • How to explore unfamiliar data
  • How to do machine learning with scikit-learn
  • How to convert data between different formats
  • And much, much more

Course Syllabus

  1. Tooling
    1. Python 3 vs Python 2
    2. Python 3.x Installation
    3. PyCharm – IDE
    4. Executing Python Scripts
    5. pip – Package Manager
    6. IPython – Interactive Console
    7. Jupyter Notebook
    8. virtualenv – Isolated Python Installations
    9. Tooling Summary
    10. Tooling for Data Science
  2. Data Visualization with matplotlib
    1. Basic Line Plots
    2. More Series Customization
    3. Log and Symlog Scale
    4. Multiple Plots
    5. Interactive Plots
  3. Python Crash Course
    1. Data Types
    2. Functions
    3. Useful Built-in Functions
  4. Data Processing with Pandas
    1. Importing and Exporting Data
    2. Basic Transformations
    3. Aggregation
    4. Filtering
    5. Split-Apply-Combine Pattern
    6. Rolling
    7. Processing Missing Values
  5. Introduction to Machine Learning
    1. What is Machine Learning?
    2. Basic Concepts
    3. Problem Types
    4. Basic Questions
    5. Common Workflow
    6. Links
    7. Use Case: Iris Classification
  6. Supervised Learning
    1. Underfitting and Overfitting
    2. k-Nearest Neighbors
      1. Classification
      2. Regression
    3. Linear Models
      1. Ordinary Least Squares
      2. Ridge Regression
      3. Lasso Regression
      4. Logistic Regression and Linear Support Vector Machines
      5. Multiclass Classification
      6. Naive Bayes Classifier
    4. Decision Trees
      1. Single Decision Trees
      2. Single Decision Tree for Regression
      3. Ensembles of Decision Trees
      4. Random Forests
      5. Gradient Boosted Regression Trees
    5. Kernelized Support Vector Machines
    6. Neural Networks
      1. Working Principle
      2. Parameters
      3. Use Case
    7. Classifier Uncertainty
    8. Classifier Comparison
  7. Unsupervised Learning
    1. Preprocessing and Scaling
    2. Unsupervised Transformations
      1. Principal Component Analysis (PCA)
      2. Feature Extraction with PCA
      3. Non-Negative Matrix Factorization (NMF)
      4. Signal Decomposition with NMF
      5. Manifold Learning with t-SNE
    3. Clustering
      1. k-Means Clustering
      2. Agglomerative Clustering
      3. Hierarchical Clustering and Dendrograms
      4. DBSCAN
      5. Evaluating Clustering with Ground Truth
      6. Evaluating Clustering without Ground Truth
      7. Comparing Clustering on Digits
    4. Semi-Supervised Learning
  8. Model Evaluation and Improvement
    1. Cross Validation
    2. Grid Search
      1. Naive Implementation
      2. Grid Search with Cross Validation
      3. Analysing Results of Cross-Validation
      4. Search Over Spaces That Are Not Grids
      5. Nested Cross Validation
    3. Evaluation Metrics for Classification
      1. Confusion Matrix
      2. Accuracy, Precision, Recall, F-score
      3. Taking Uncertainty into Account
      4. Precision-Recall Curve
      5. Receiver Operating Characteristic (ROC) and AUC
      6. Multiclass Classification
    4. Using Evaluation Metrics in Model Selection