This shows you the differences between two versions of the page.
ep:labs:10 [2021/12/04 16:24] vlad.stefanescu [Resources] |
ep:labs:10 [2022/09/24 14:46] (current) emilian.radoi |
||
---|---|---|---|
Line 11: | Line 11: | ||
===== Resources ===== | ===== Resources ===== | ||
- | The exercises will be solved in Python, using various popular libraries that are usually integrated in machine learning projects: | + | In this lab, we will study basic performance evaluation techniques used in machine learning, covering elementary concepts such as classification, regression, data fitting, clustering and much more. |
+ | |||
+ | You will work in an easy-to-use environment that provides tools for manipulating data and visualizing results. We will use a **Jupyter Notebook** hosted on **Google Colab**, which comes with a variety of useful tools already installed. | ||
+ | |||
+ | The exercises will be solved in Python, using popular libraries that are usually integrated in machine learning projects: | ||
* [[https://scikit-learn.org/stable/documentation.html|Scikit-Learn]]: fast model development, performance metrics, pipelines, dataset splitting | * [[https://scikit-learn.org/stable/documentation.html|Scikit-Learn]]: fast model development, performance metrics, pipelines, dataset splitting | ||
Line 18: | Line 22: | ||
* [[https://matplotlib.org/3.1.1/users/index.html|Matplotlib]]: data plotting | * [[https://matplotlib.org/3.1.1/users/index.html|Matplotlib]]: data plotting | ||
+ | As datasets, we will use some public corpora provided by the Kaggle community: | ||
- | [[https://www.kaggle.com/uciml/pima-indians-diabetes-database/data|Classification Dataset]] | + | * [[https://www.kaggle.com/uciml/pima-indians-diabetes-database/data|Classification Dataset]] |
- | [[https://www.kaggle.com/zaraavagyan/weathercsv|Regression dataset]] | + | * [[https://www.kaggle.com/zaraavagyan/weathercsv|Regression dataset]] |
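As a rough illustration of the kind of performance evaluation this lab covers, the following minimal sketch computes accuracy, precision and recall for a binary classifier in plain Python. The label lists and the helper names (`confusion_counts`, `binary_metrics`) are made up for illustration and are not taken from the lab or from either dataset. In the exercises you would typically use the metrics provided by Scikit-Learn instead.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred):
    """Return accuracy, precision and recall as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        # Fraction of all samples that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Fraction of true positives that the classifier found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground-truth labels and predictions (illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall next to accuracy matters when the classes are imbalanced, because accuracy alone can look good even if the classifier misses most positive cases.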
+ | |||
+ | You can also check out these cheat sheets for fast reference to the most common libraries: | ||
+ | |||
+ | **Cheat sheets:** | ||
+ | |||
+ | * [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]] | ||
+ | * [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]] | ||
+ | * [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]] | ||
+ | * [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]] | ||
+ | * [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]] | ||
<solution -hidden> | <solution -hidden> | ||
- | Solution: {{:ep:labs:lab_12_ml_revisited_solution.zip}} | + | [[https://colab.research.google.com/drive/1aeV9PGF_uxBA3FoKNMEzsiXMxjVSCcm4?usp=sharing|Solution]] |
</solution> | </solution> | ||
Line 34: | Line 49: | ||
You can then export this Python notebook as a PDF (**File -> Print**) and upload it to **Moodle**. | You can then export this Python notebook as a PDF (**File -> Print**) and upload it to **Moodle**. | ||
+ | ===== Feedback ===== | ||
+ | Please take a minute to fill in the **[[https://forms.gle/LWBWYsMiJq8FsYdN9 | feedback form]]** for this lab. | ||
Line 48: | Line 65: | ||
- | ===== References ===== | ||
- | |||
- | [[https://www.kaggle.com/uciml/pima-indians-diabetes-database/data|Classification Dataset]] | ||
- | |||
- | [[https://www.kaggle.com/zaraavagyan/weathercsv|Regression dataset]] | ||
- | {{namespace>:ep:labs:10:contents:tasks&nofooter&noeditbutton}} |