This shows you the differences between two versions of the page.
ewis:laboratoare:06 [2020/03/31 15:47] alexandru.predescu |
ewis:laboratoare:06 [2023/04/12 18:01] (current) alexandru.predescu [Exercises] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Lab 6. Databases. MySQL-Python. Charts ===== | + | ===== Lab 6. Relational Databases. MySQL-Python. Charts ===== |
+ | In this lab, we are introduced to more advanced SQL (functions, joins, subqueries) that are commonly used when querying databases. Then, we will connect to the database and run queries using Python and we will experiment with a plotting library to visualize the results. | ||
==== SQL. Functions ==== | ==== SQL. Functions ==== | ||
Line 108: | Line 109: | ||
</code> | </code> | ||
- | ==== SQL. Complex queries ==== | + | ==== SQL. Querying multiple tables ==== |
{{:ewis:laboratoare:schema_mysql.png?600|Database schema}} | {{:ewis:laboratoare:schema_mysql.png?600|Database schema}} | ||
Line 158: | Line 159: | ||
</code> | </code> | ||
- | == LEFT (OUTER) JOIN == | + | <spoiler Click here for other types of joins> |
+ | |||
+ | **LEFT (OUTER) JOIN** | ||
{{:ewis:laboratoare:left_join.png?200|}} | {{:ewis:laboratoare:left_join.png?200|}} | ||
Line 171: | Line 174: | ||
</code> | </code> | ||
- | == RIGHT (OUTER) JOIN == | + | **RIGHT (OUTER) JOIN** |
{{:ewis:laboratoare:right_join.png?200|}} | {{:ewis:laboratoare:right_join.png?200|}} | ||
Line 177: | Line 180: | ||
RIGHT JOIN is used to return rows from both tables based on the matching condition. The difference from INNER JOIN is that ALL the rows from the second table will be returned and if for some rows there is no matching record on the first table, the result for that column(s) will be NULL. | RIGHT JOIN is used to return rows from both tables based on the matching condition. The difference from INNER JOIN is that ALL the rows from the second table will be returned and if for some rows there is no matching record on the first table, the result for that column(s) will be NULL. | ||
- | == FULL (OUTER) JOIN == | + | **FULL (OUTER) JOIN** |
{{:ewis:laboratoare:full_join.png?200|}} | {{:ewis:laboratoare:full_join.png?200|}} | ||
Line 183: | Line 186: | ||
FULL JOIN combines LEFT JOIN and RIGHT JOIN. Where rows in the tables do not match, the result set will have NULL values for every column of the table that lacks a matching row. | FULL JOIN combines LEFT JOIN and RIGHT JOIN. Where rows in the tables do not match, the result set will have NULL values for every column of the table that lacks a matching row. | ||
- | == CROSS JOIN == | + | **CROSS JOIN** |
CROSS JOIN is a simplest form of JOIN which matches each row from one table to all rows of another table (similar to cartesian product in set theory). The result of a CROSS JOIN can be filtered by using a WHERE clause which may then produce the equivalent of an INNER JOIN. | CROSS JOIN is a simplest form of JOIN which matches each row from one table to all rows of another table (similar to cartesian product in set theory). The result of a CROSS JOIN can be filtered by using a WHERE clause which may then produce the equivalent of an INNER JOIN. | ||
Line 199: | Line 202: | ||
WHERE emp.position_id = pos.id | WHERE emp.position_id = pos.id | ||
</code> | </code> | ||
+ | |||
+ | </spoiler> | ||
=== Subqueries === | === Subqueries === | ||
Line 367: | Line 372: | ||
</code> | </code> | ||
- | For working with tables, the //pandas// package can be used which simplifies many operations such as extracting columns, CRUD operations (creating, reading, updating and deletion of data), filtering and data analysis. //Pandas// is often used for Machine Learning in the form of dataframes: //DataFrame//, allows to import data from various structures and file formats: lists, dictionaries, csv files, excel files, etc. | + | For working with tables, the //pandas// library can be used which simplifies many operations such as extracting columns, CRUD operations (creating, reading, updating and deletion of data), filtering and data analysis. //Pandas// is often used for Machine Learning in the form of dataframes: //DataFrame//, allows to import data from various structures and file formats: lists, dictionaries, csv files, excel files, etc. |
In the following example, data is retrieved from a cursor and added to a pandas //DataFrame//: | In the following example, data is retrieved from a cursor and added to a pandas //DataFrame//: | ||
Line 492: | Line 497: | ||
== 1. Databases (5p) == | == 1. Databases (5p) == | ||
- | <note important> | + | **Task 1.** Run the example queries presented in the lab (based on the company database example). |
- | For this part you only need MySQL Workbench and Server installed on your computer. | + | |
- | * Pick **5** exercises (ex. 0 is required) of your choice and write the SQL query that solves the task. | + | |
- | * Upload your results on Moodle as screenshots or CSV files for grading. | + | |
- | </note> | + | |
- | **0.** Run the example queries presented in the lab (that are related to the company database example). | + | **Task 2.** Solve 5 of the following queries. Present the SQL and results. |
- | **1.** Using aggregate functions, write an SQL query to calculate the sum of all salaries in the company database. | + | - Using aggregate functions, write an SQL query to calculate the sum of all salaries in the company database. |
- | + | - Using aggregate functions, write an SQL query to calculate the average salary in the company database. | |
- | **2.** Using aggregate functions, write an SQL query to calculate the average salary in the company database. | + | - Write an SQL query to return a list of employees with the following information about them: |
- | + | * the name | |
- | **3.** Write an SQL query to return a list of employees with the following information about them: | + | * the name of their department |
- | + | * the size of their team (number of employees within the same department) | |
- | * the name | + | - Write an SQL query to return a list of employees with the following information about them: |
- | * the name of their department | + | * the name |
- | * the size of their team (number of employees within the same department) | + | * the name of their position |
- | + | * the min/average/max salary of the employees on the same position | |
- | **4.** Write an SQL query to return a list of employees with the following information about them: | + | - Write an SQL query to return a list of employees with the following information about them: |
- | + | * the name in uppercase | |
- | * the name | + | * the location of their department |
- | * the name of their position | + | - Write an SQL query to return a list of employees (name, salary) having a higher salary than the average salary within the company. |
- | * the min/average/max salary of the employees on the same position | + | - Write an SQL query to return a list of employees (name, salary) having a higher salary than their manager. |
- | + | - Write an SQL query to return the employees (name, salary) having the highest salary within each department. | |
- | **5.** Write an SQL query to return a list of employees with the following information about them: | + | - Write an SQL query to return the managers and the number of employees in their teams. |
- | + | - Write an SQL query to list all employees and the year they were hired (hint: use the scalar function DATE_FORMAT(hire_date,'%Y')) | |
- | * the name in uppercase | + | |
- | * the location of their department | + | |
- | + | ||
- | **6.** Write an SQL query to return a list of employees (name, salary) having a higher salary than the average salary within the company. | + | |
- | + | ||
- | **7.** Write an SQL query to return a list of employees (name, salary) having a higher salary than their manager. | + | |
- | + | ||
- | **8.** Write an SQL query to return the employees (name, salary) having the highest salary within each department. | + | |
- | + | ||
- | **9.** Write an SQL query to return the managers and the number of employees in their teams. | + | |
- | + | ||
- | **10.** Write an SQL query to list all employees and the year they were hired (hint: use the scalar function DATE_FORMAT(hire_date,'%Y')) | + | |
== 2. MySQL-Python. Charts (5p) == | == 2. MySQL-Python. Charts (5p) == | ||
Line 536: | Line 525: | ||
For this part you also need Python installed on your computer. | For this part you also need Python installed on your computer. | ||
* Check out the requirements section (at the end of the page) to install the required tools and setup working with Python on your computer. | * Check out the requirements section (at the end of the page) to install the required tools and setup working with Python on your computer. | ||
- | * Upload your results on Moodle as image files or CSV files for grading. | ||
</note> | </note> | ||
- | **1.** Run the example code (**lab.py**) and check out the generated CSV file that contains the data about the employees in the example database. | + | Run the example code (**lab.py**) and check out the generated CSV file that contains the data about the employees in the example database. |
+ | * Change the example by using the //ex// variable (1/2/3) | ||
- | **2.** Run the example code (**plots.py**) to display the results of the functions as charts. Save the figures as image files. | + | Run the example code (**plots.py**) to display the results of the functions as charts. |
- | * Change the mode by using the //ex// variable (1/2) | + | * Change the example by using the //ex// variable (1/2) |
- | * Save the figures as shown in the image below | + | |
- | + | ||
- | {{:ewis:laboratoare:save_figure.png?400|}} | + | |
- | + | ||
- | **3.** The example code (**lab.py**) contains multiples examples. Run //example2// function to list all salaries in the company and plot them on a chart. Save the figures as image files. | + | |
- | * Change the mode by using the //ex// variable (1/2/3) | + | |
- | * Save the figures as shown in the image below | + | |
- | + | ||
- | {{:ewis:laboratoare:save_figure.png?400|}} | + | |
- | **4.** Based on the examples, create a function to display the results for one of the selected queries as CSV file. | + | **Task 3.** Based on the examples, create a function to display the results for one of the selected queries in CSV format. |
Hint: | Hint: | ||
Line 559: | Line 539: | ||
* Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | * Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | ||
- | **5.** Based on the examples, create a function to display the results for one of the selected queries as a plot. You may plot a single column, using either plot, stem, barchart or histogram. | + | **Task 4.** Based on the examples, create a function to display the results for one of the selected queries as a plot. You may plot a single column, using either plot, stem, barchart or histogram. |
+ | * Save the figures as shown in the image below | ||
- | Hint: | + | {{:ewis:laboratoare:save_figure.png?400|}} |
+ | |||
+ | Hints: | ||
* Check how the example in **lab.py** works (see ex. 3) | * Check how the example in **lab.py** works (see ex. 3) | ||
* Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | * Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | ||
Line 568: | Line 551: | ||
== 3. Online evaluation == | == 3. Online evaluation == | ||
- | * Upload all results (csv files, images) as a zip archive on Moodle. | + | * Include the results and discussion in a pdf as either text or screenshots along with the task description. |
- | * Add a readme file (readme.txt) to the archive where you describe your results | + | |
<note important> | <note important> | ||
Line 576: | Line 558: | ||
You will start from a Python project that is set up for connecting to the database, processing results and showing charts. | You will start from a Python project that is set up for connecting to the database, processing results and showing charts. | ||
- | * Install Python. | + | * Install Python. |
+ | * {{:ewis:laboratoare:python_workflow.pdf|Python Workflow}} | ||
* Download the Python project from the archive below. | * Download the Python project from the archive below. | ||
* Unzip the archive. | * Unzip the archive. | ||
Line 583: | Line 566: | ||
* Modify the code to solve the exercises. | * Modify the code to solve the exercises. | ||
- | {{:ewis:laboratoare:project_lab6.zip|Project Archive}} | + | {{:ewis:laboratoare:lab6:project_lab6.zip|Project Archive}} |
</note> | </note> | ||