This course is a great introduction to Data Analysis and Machine Learning in Python.

# At this workshop you will learn:

- How to process data with Pandas
- How to explore unknown data
- How to do machine learning with scikit-learn
- How to convert data between different formats
- And much, much more.

# Course Syllabus

- Tooling
- Python 3 vs Python 2
- Python 3.x Installation
- PyCharm – IDE
- Executing Python Scripts
- pip – Package Manager
- IPython – Interactive Console
- Jupyter Notebook
- virtualenv – Isolated Python Installations
- Tooling Summary
- Tooling for Data Science

- Data Visualization with matplotlib
- Basic Line Plots
- More Series Customization
- Log and Symlog Scale
- Multiple Plots
- Interactive Plots

- Python Crash Course
- Data Types
- Functions
- Useful Built-in Functions

- Data Processing with Pandas
- Importing and Exporting Data
- Basic Transformations
- Aggregation
- Filtering
- Split-Apply-Combine Pattern
- Rolling
- Processing Missing Values

- Introduction to Machine Learning
- What is Machine Learning?
- Basic Concepts
- Problem Types
- Basic Questions
- Common Workflow
- Links
- Use Case: Iris Classification

- Supervised Learning
- Underfitting and Overfitting
- k-Nearest Neighbors
- Classification
- Regression

- Linear Models
- Ordinary Least Squares
- Ridge Regression
- Lasso Regression
- Logistic Regression and Linear Support Vector Machines
- Multiclass Classification
- Naive Bayes Classifier

- Decision Trees
- Single Decision Trees
- Single Decision Tree for Regression
- Ensembles of Decision Trees
- Random Forests
- Gradient Boosted Regression Trees

- Kernelized Support Vector Machines
- Neural Networks
- Working Principle
- Parameters
- Use Case

- Classifier Uncertainty
- Classifier Comparison

- Unsupervised Learning
- Preprocessing and Scaling
- Unsupervised Transformations
- Principal Component Analysis (PCA)
- Feature Extraction with PCA
- Non-Negative Matrix Factorization (NMF)
- Signal Decomposition with NMF
- Manifold Learning with t-SNE

- Clustering
- k-Means Clustering
- Agglomerative Clustering
- Hierarchical Clustering and Dendrograms
- DBSCAN
- Evaluating Clustering with Ground Truth
- Evaluating Clustering without Ground Truth
- Comparing Clustering on Digits

- Semi-Supervised Learning

- Model Evaluation and Improvement
- Cross-Validation
- Grid Search
- Naive Implementation
- Grid Search with Cross-Validation
- Analysing Results of Cross-Validation
- Search Over Spaces That Are Not Grids
- Nested Cross-Validation

- Evaluation Metrics for Classification
- Confusion Matrix
- Accuracy, Precision, Recall, F-score
- Taking Uncertainty into Account
- Precision-Recall Curve
- Receiver Operating Characteristic (ROC) and AUC
- Multiclass Classification

- Using Evaluation Metrics in Model Selection