This shows you the differences between two versions of the page.
ep:labs:10 [2021/12/04 16:30] vlad.stefanescu [Resources] |
ep:labs:10 [2025/02/11 22:58] (current) cezar.craciunoiu [Lab 9 - Machine Learning Optimization] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Lab 10 - Machine Learning ====== | + | ====== Lab 10 - Machine Learning Optimization ====== |
===== Objectives ===== | ===== Objectives ===== | ||
- | * Understand basic concepts of machine learning | + | * TODO |
- | * Remember examples of real-world problems that can be solved with machine learning | + | |
- | * Learn the most common performance evaluation metrics for machine learning models | + | |
- | * Analyse the behaviour of typical machine learning algorithms using the most popular techniques | + | |
- | * Be able to compare multiple machine learning models | + | |
===== Resources ===== | ===== Resources ===== | ||
- | In this lab, we will study the basic performance evaluation in machine learning, covering elementary concepts such as classification, regression, data fitting, clustering and much more. | + | TODO |
- | + | ||
- | You will work in an easy-to-use environment that provides tools for manipulating data and visualizing results. We will use **Google Colab**, which comes with a variety of useful tools already installed. | + | |
- | + | ||
- | You can also check out these cheat sheets for fast reference to the common libraries: | + | |
- | + | ||
- | **Cheat sheets:** | + | |
- | + | ||
- | - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf)|python]] | + | |
- | - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]] | + | |
- | - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]] | + | |
- | - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]] | + | |
- | - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]] | + | |
- | - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Seaborn_Cheat_Sheet.pdf|seaborn]] | + | |
- | + | ||
- | <note>This lab is organized in a Jupyter Notebook hosted on Google Colab. There you will find some intuitions and applications for pandas and seaborn. Check out the Tasks section below.</note> | + | |
- | + | ||
- | The exercises will be solved in Python, using various popular libraries that are usually integrated in machine learning projects: | + | |
- | + | ||
- | * [[https://scikit-learn.org/stable/documentation.html|Scikit-Learn]]: fast model development, performance metrics, pipelines, dataset splitting | + | |
- | * [[https://pandas.pydata.org/pandas-docs/stable/|Pandas]]: data frames, csv parser, data analysis | + | |
- | * [[https://numpy.org/doc/|NumPy]]: scientific computation | + | |
- | * [[https://matplotlib.org/3.1.1/users/index.html|Matplotlib]]: data plotting | + | |
- | + | ||
- | + | ||
- | [[https://www.kaggle.com/uciml/pima-indians-diabetes-database/data|Classification Dataset]] | + | |
- | [[https://www.kaggle.com/zaraavagyan/weathercsv|Regression dataset]] | + | |
- | + | ||
- | <solution -hidden> | + | |
- | Solution: {{:ep:labs:lab_12_ml_revisited_solution.zip}} | + | |
- | </solution> | + | |
===== Tasks ===== | ===== Tasks ===== | ||
Line 47: | Line 13: | ||
==== Google Colab Notebook ==== | ==== Google Colab Notebook ==== | ||
- | For this lab, we will use Google Colab for exploring performance evaluation in machine learning. Please solve your tasks [[https://github.com/vladastefanescu/machine-learning-introduction/blob/main/Machine_Learning_Introduction.ipynb|here]] by clicking "**Open in Colaboratory**". | + | TODO |
- | + | ||
- | You can then export this Python notebook as a PDF (**File -> Print**) and upload it to **Moodle**. | + | |
+ | ===== Feedback ===== | ||
+ | Please take a minute to fill in the **[[https://forms.gle/NpSRnoEh9NLYowFr5 | feedback form]]** for this lab. | ||