This shows you the differences between two versions of the page.
ewis:laboratoare:07 [2023/04/19 17:08] alexandru.predescu [Exercises] |
ewis:laboratoare:07 [2023/04/19 18:08] (current) alexandru.predescu [Exercises] |
||
---|---|---|---|
Line 253: | Line 253: | ||
==== Exercises ==== | ==== Exercises ==== | ||
- | === Task 1 (5p) === | + | === Task 1 (3p) === |
Download the {{:ewis:laboratoare:lab7:project_lab7.zip|project archive}} and unzip it on your PC. Install the requirements using pip (e.g. //py -3 -m pip install -r requirements.txt//). | Download the {{:ewis:laboratoare:lab7:project_lab7.zip|project archive}} and unzip it on your PC. Install the requirements using pip (e.g. //py -3 -m pip install -r requirements.txt//). | ||
- | The code sample (//task12.py//) uses linear regression to fit a sample of generated data. | + | The script (//task1.py//) uses linear regression to fit a sample of generated data. |
Run the program and solve the following scenarios: | Run the program and solve the following scenarios: | ||
* Experiment with different polynomial orders | * Experiment with different polynomial orders | ||
Line 266: | Line 267: | ||
* Q3: Explain the results based on the provided function that is used to generate the dataset. | * Q3: Explain the results based on the provided function that is used to generate the dataset. | ||
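As a rough illustration of the polynomial-order experiment in Task 1, the sketch below fits polynomials of several orders to generated data and prints the training RMSE for each. It does not reproduce //task1.py//: the generator function (a quadratic with Gaussian noise), the helper names and the pure-Python solver are all assumptions chosen so that the example runs without extra packages.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs, ys, order):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    size = order + 1
    # Normal equations (V^T V) w = V^T y, where V[i][j] = xs[i] ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(size)]
           for i in range(size)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(ata, aty)


def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def rmse(coeffs, xs, ys):
    squared = [(predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical generator (an assumption): a quadratic plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(-20, 21)]
ys = [1.0 + 2.0 * x - 0.5 * x ** 2 + rng.gauss(0, 0.1) for x in xs]

errors = {}
for order in (1, 2, 5):
    errors[order] = rmse(fit_polynomial(xs, ys, order), xs, ys)
    print(order, round(errors[order], 4))
```

On data like this, an order below that of the generating function should leave a clearly larger error, while orders above it mostly fit the noise, which is the pattern Q3 asks you to explain.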
- | === Task 2 (5p) === | + | === Task 2 (3p) === |
- | The code sample (//task3.py//) loads the Boston Housing Dataset and trains a linear model over multiple features. The prediction results (median housing prices in thousands of dollars) are shown in the plot and compared to the original dataset. | + | The script (//task2.py//) loads a dataset from a CSV file. Run an analysis similar to the one in Task 1 and present your results. |
+ | |||
+ | [[https://www.kaggle.com/datasets/meetnagadia/bitcoin-stock-data-sept-17-2014-august-24-2021|Bitcoin Price Dataset]] | ||
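One possible way to turn a CSV file into the ''x''/''y'' series needed for a fit like the one in Task 1 is sketched below. The column names (''Date'', ''Close''), the ISO date format and the helper ''load_series'' are assumptions, and the three sample rows are made-up placeholders rather than values from the dataset; check the header of the downloaded file and adjust accordingly.

```python
import csv
import io
from datetime import date

# Made-up placeholder rows in the assumed column layout.
# For a real file, pass open("<your file>.csv", newline="") instead.
SAMPLE = """Date,Close
2021-01-01,100.0
2021-01-02,110.0
2021-01-04,105.0
"""


def load_series(handle, x_column="Date", y_column="Close"):
    """Return (days since the first row, values) from a CSV with a header row."""
    rows = list(csv.DictReader(handle))
    dates = [date.fromisoformat(row[x_column]) for row in rows]
    xs = [(d - dates[0]).days for d in dates]
    ys = [float(row[y_column]) for row in rows]
    return xs, ys


xs, ys = load_series(io.StringIO(SAMPLE))
print(xs, ys)
```

Converting dates to a day offset gives a numeric input that a regression can use directly; gaps in the dates are preserved as gaps in ''xs''.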
+ | |||
+ | === Task 3 (3p) === | ||
+ | |||
+ | The script (//task3.py//) loads the Boston Housing Dataset and trains a linear model over multiple features. The prediction results (median housing prices in thousands of dollars) are shown in the plot and compared to the original dataset. | ||
Run the program and solve the following scenarios: | Run the program and solve the following scenarios: | ||
* [TODO 1] Change the size of the training dataset (percent) and evaluate the models that are obtained in each case using RMSE | * [TODO 1] Change the size of the training dataset (percent) and evaluate the models that are obtained in each case using RMSE | ||
Line 277: | Line 284: | ||
* Q2. What is the amount (percent) of training data that provides the best results in terms of prediction accuracy on validation data? | * Q2. What is the amount (percent) of training data that provides the best results in terms of prediction accuracy on validation data? | ||
* Q3. What happens if the amount of training data is small, e.g. 10%, with regard to the prediction accuracy and the over/underfitting of the regression model? | * Q3. What happens if the amount of training data is small, e.g. 10%, with regard to the prediction accuracy and the over/underfitting of the regression model? | ||
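The training-size experiment in [TODO 1] can be sketched as follows. This is a simplified stand-in and not the code in //task3.py//: it uses one synthetic feature instead of the Boston Housing Dataset, a closed-form single-variable fit instead of a multi-feature model, and hypothetical helper names (''fit_line'', ''split'', ''rmse'').

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(model, xs, ys):
    intercept, slope = model
    squared = [(intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def split(xs, ys, train_percent, seed=0):
    """Shuffle the indices once, then keep the first train_percent % for training."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = max(2, len(idx) * train_percent // 100)
    train, valid = idx[:cut], idx[cut:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in valid], [ys[i] for i in valid])


# Synthetic stand-in data (an assumption, not the Boston Housing Dataset).
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 + 1.5 * x + rng.gauss(0, 1.0) for x in xs]

results = {}
for percent in (10, 30, 50, 80):
    x_tr, y_tr, x_va, y_va = split(xs, ys, percent)
    model = fit_line(x_tr, y_tr)
    # Store (training RMSE, validation RMSE) for each split size.
    results[percent] = (rmse(model, x_tr, y_tr), rmse(model, x_va, y_va))
    print(percent, [round(v, 3) for v in results[percent]])
```

Comparing the two RMSE values per row is the point of the exercise: a training error well below the validation error suggests overfitting, which becomes more likely as the training share shrinks.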
- | |||
- | Submit your answers on Moodle as PDF report. | ||
==== Resources ==== | ==== Resources ==== | ||
- | * {{:ewis:laboratoare:lab7:project_lab7.zip|Project}} | + | * {{:ewis:laboratoare:lab7:lab7.zip|Project}} |
* {{:ewis:laboratoare:python_workflow.pdf|Python Workflow}} | * {{:ewis:laboratoare:python_workflow.pdf|Python Workflow}} | ||
* [[https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html|The Boston Housing Dataset]] | * [[https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html|The Boston Housing Dataset]] |