This shows you the differences between two versions of the page.
ewis:laboratoare:06 [2020/03/31 15:32] alexandru.predescu |
ewis:laboratoare:06 [2023/04/12 18:01] (current) alexandru.predescu [Exercises] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Lab 6. Databases. MySQL-Python. Charts ===== | + | ===== Lab 6. Relational Databases. MySQL-Python. Charts ===== |
- | + | ||
- | <hidden> | + | |
- | https://www.guru99.com/python-mysql-example.html | + | |
- | + | ||
- | https://www.guru99.com/sql.html | + | |
- | + | ||
- | https://www.guru99.com/joins.html | + | |
- | + | ||
- | https://www.guru99.com/sub-queries.html | + | |
- | + | ||
- | https://www.w3resource.com/sql/subqueries/understanding-sql-subqueries.php | + | |
- | </hidden> | + | |
+ | In this lab, we are introduced to more advanced SQL (functions, joins, subqueries) that are commonly used when querying databases. Then, we will connect to the database and run queries using Python and we will experiment with a plotting library to visualize the results. | ||
==== SQL. Functions ==== | ==== SQL. Functions ==== | ||
Line 121: | Line 109: | ||
</code> | </code> | ||
- | ==== SQL. Complex queries ==== | + | ==== SQL. Querying multiple tables ==== |
{{:ewis:laboratoare:schema_mysql.png?600|Database schema}} | {{:ewis:laboratoare:schema_mysql.png?600|Database schema}} | ||
Line 171: | Line 159: | ||
</code> | </code> | ||
- | == LEFT (OUTER) JOIN == | + | <spoiler Click here for other types of joins> |
+ | |||
+ | **LEFT (OUTER) JOIN** | ||
{{:ewis:laboratoare:left_join.png?200|}} | {{:ewis:laboratoare:left_join.png?200|}} | ||
Line 184: | Line 174: | ||
</code> | </code> | ||
- | == RIGHT (OUTER) JOIN == | + | **RIGHT (OUTER) JOIN** |
{{:ewis:laboratoare:right_join.png?200|}} | {{:ewis:laboratoare:right_join.png?200|}} | ||
Line 190: | Line 180: | ||
RIGHT JOIN is used to return rows from both tables based on the matching condition. The difference from INNER JOIN is that ALL the rows from the second table will be returned and if for some rows there is no matching record on the first table, the result for that column(s) will be NULL. | RIGHT JOIN is used to return rows from both tables based on the matching condition. The difference from INNER JOIN is that ALL the rows from the second table will be returned and if for some rows there is no matching record on the first table, the result for that column(s) will be NULL. | ||
- | == FULL (OUTER) JOIN == | + | **FULL (OUTER) JOIN** |
{{:ewis:laboratoare:full_join.png?200|}} | {{:ewis:laboratoare:full_join.png?200|}} | ||
Line 196: | Line 186: | ||
FULL JOIN combines LEFT JOIN and RIGHT JOIN. Where rows in the tables do not match, the result set will have NULL values for every column of the table that lacks a matching row. | FULL JOIN combines LEFT JOIN and RIGHT JOIN. Where rows in the tables do not match, the result set will have NULL values for every column of the table that lacks a matching row. | ||
- | == CROSS JOIN == | + | **CROSS JOIN** |
CROSS JOIN is a simplest form of JOIN which matches each row from one table to all rows of another table (similar to cartesian product in set theory). The result of a CROSS JOIN can be filtered by using a WHERE clause which may then produce the equivalent of an INNER JOIN. | CROSS JOIN is a simplest form of JOIN which matches each row from one table to all rows of another table (similar to cartesian product in set theory). The result of a CROSS JOIN can be filtered by using a WHERE clause which may then produce the equivalent of an INNER JOIN. | ||
Line 212: | Line 202: | ||
WHERE emp.position_id = pos.id | WHERE emp.position_id = pos.id | ||
</code> | </code> | ||
+ | |||
+ | </spoiler> | ||
=== Subqueries === | === Subqueries === | ||
Line 380: | Line 372: | ||
</code> | </code> | ||
- | For working with tables, the //pandas// package can be used which simplifies many operations such as extracting columns, CRUD operations (creating, reading, updating and deletion of data), filtering and data analysis. //Pandas// is often used for Machine Learning in the form of dataframes: //DataFrame//, allows to import data from various structures and file formats: lists, dictionaries, csv files, excel files, etc. | + | For working with tables, the //pandas// library can be used which simplifies many operations such as extracting columns, CRUD operations (creating, reading, updating and deletion of data), filtering and data analysis. //Pandas// is often used for Machine Learning in the form of dataframes: //DataFrame//, allows to import data from various structures and file formats: lists, dictionaries, csv files, excel files, etc. |
In the following example, data is retrieved from a cursor and added to a pandas //DataFrame//: | In the following example, data is retrieved from a cursor and added to a pandas //DataFrame//: | ||
Line 505: | Line 497: | ||
== 1. Databases (5p) == | == 1. Databases (5p) == | ||
- | <note important> | + | **Task 1.** Run the example queries presented in the lab (based on the company database example). |
- | For this part you only need MySQL Workbench and Server installed on your computer. | + | |
- | * Pick **5** exercises (ex. 0 is required) of your choice and write the SQL query that solves the task. | + | |
- | * Upload your results on Moodle as screenshots or CSV files for grading. | + | |
- | </note> | + | |
- | + | ||
- | 0. Run the example queries. | + | |
- | + | ||
- | 1. Using aggregate functions, write an SQL query to calculate the sum of all salaries in the company database. | + | |
- | + | ||
- | 2. Using aggregate functions, write an SQL query to calculate the average salary in the company database. | + | |
- | + | ||
- | 3. Write an SQL query to return a list of employees with the following information about them: | + | |
- | + | ||
- | * the name | + | |
- | * the name of their department | + | |
- | * the size of their team (number of employees within the same department) | + | |
- | + | ||
- | 4. Write an SQL query to return a list of employees with the following information about them: | + | |
- | + | ||
- | * the name | + | |
- | * the name of their position | + | |
- | * the min/average/max salary of the employees on the same position | + | |
- | + | ||
- | 5. Write an SQL query to return a list of employees with the following information about them: | + | |
- | + | ||
- | * the name in uppercase | + | |
- | * the location of their department | + | |
- | + | ||
- | 6. Write an SQL query to return a list of employees (name, salary) having a higher salary than the average salary within the company. | + | |
- | + | ||
- | 7. Write an SQL query to return a list of employees (name, salary) having a higher salary than their manager. | + | |
- | + | ||
- | 8. Write an SQL query to return the employees (name, salary) having the highest salary within each department. | + | |
- | 9. Write an SQL query to return the managers and the number of employees in their teams. | + | **Task 2.** Solve 5 of the following queries. Present the SQL and results. |
- | 10. Write an SQL query to list all employees and the year they were hired (hint: use the scalar function DATE_FORMAT(hire_date,'%Y')) | + | - Using aggregate functions, write an SQL query to calculate the sum of all salaries in the company database. |
+ | - Using aggregate functions, write an SQL query to calculate the average salary in the company database. | ||
+ | - Write an SQL query to return a list of employees with the following information about them: | ||
+ | * the name | ||
+ | * the name of their department | ||
+ | * the size of their team (number of employees within the same department) | ||
+ | - Write an SQL query to return a list of employees with the following information about them: | ||
+ | * the name | ||
+ | * the name of their position | ||
+ | * the min/average/max salary of the employees on the same position | ||
+ | - Write an SQL query to return a list of employees with the following information about them: | ||
+ | * the name in uppercase | ||
+ | * the location of their department | ||
+ | - Write an SQL query to return a list of employees (name, salary) having a higher salary than the average salary within the company. | ||
+ | - Write an SQL query to return a list of employees (name, salary) having a higher salary than their manager. | ||
+ | - Write an SQL query to return the employees (name, salary) having the highest salary within each department. | ||
+ | - Write an SQL query to return the managers and the number of employees in their teams. | ||
+ | - Write an SQL query to list all employees and the year they were hired (hint: use the scalar function DATE_FORMAT(hire_date,'%Y')) | ||
== 2. MySQL-Python. Charts (5p) == | == 2. MySQL-Python. Charts (5p) == | ||
Line 549: | Line 525: | ||
For this part you also need Python installed on your computer. | For this part you also need Python installed on your computer. | ||
* Check out the requirements section (at the end of the page) to install the required tools and setup working with Python on your computer. | * Check out the requirements section (at the end of the page) to install the required tools and setup working with Python on your computer. | ||
- | * Upload your results on Moodle as image files or CSV files for grading. | ||
</note> | </note> | ||
- | 1. Run the example code (**lab.py**) and check out the generated CSV file that contains the data about the employees in the example database. | + | Run the example code (**lab.py**) and check out the generated CSV file that contains the data about the employees in the example database. |
+ | * Change the example by using the //ex// variable (1/2/3) | ||
- | 2. Run the example code (**plots.py**) to display the results of the functions as charts. Save the figures as image files. | + | Run the example code (**plots.py**) to display the results of the functions as charts. |
- | * Change the mode by using the //ex// variable (1/2) | + | * Change the example by using the //ex// variable (1/2) |
- | * Save the figures as shown in the image below | + | |
- | + | ||
- | {{:ewis:laboratoare:save_figure.png?400|}} | + | |
- | + | ||
- | 3. The example code (**lab.py**) contains multiples examples. Run //example2// function to list all salaries in the company and plot them on a chart. Save the figures as image files. | + | |
- | * Change the mode by using the //ex// variable (1/2/3) | + | |
- | * Save the figures as shown in the image below | + | |
- | + | ||
- | {{:ewis:laboratoare:save_figure.png?400|}} | + | |
- | 4. Based on the examples, create a function to display the results for one of the selected queries as CSV file. | + | **Task 3.** Based on the examples, create a function to display the results for one of the selected queries in CSV format. |
Hint: | Hint: | ||
Line 572: | Line 539: | ||
* Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | * Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | ||
- | 5. Based on the examples, create a function to display the results for one of the selected queries as a plot. You may plot a single column, using either plot, stem, barchart or histogram. | + | **Task 4.** Based on the examples, create a function to display the results for one of the selected queries as a plot. You may plot a single column, using either plot, stem, barchart or histogram. |
+ | * Save the figures as shown in the image below | ||
+ | |||
+ | {{:ewis:laboratoare:save_figure.png?400|}} | ||
- | Hint: | + | Hints: |
* Check how the example in **lab.py** works (see ex. 3) | * Check how the example in **lab.py** works (see ex. 3) | ||
* Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | * Change the SQL query provided to the database module (''db.run_query("SELECT * FROM employee")'') | ||
Line 581: | Line 551: | ||
== 3. Online evaluation == | == 3. Online evaluation == | ||
- | * Upload all results (csv files, images) as a zip archive on Moodle. | + | * Include the results and discussion in a pdf as either text or screenshots along with the task description. |
- | * Add a readme file (readme.txt) to the archive where you describe your results | + | |
<note important> | <note important> | ||
Line 589: | Line 558: | ||
You will start from a Python project that is set up for connecting to the database, processing results and showing charts. | You will start from a Python project that is set up for connecting to the database, processing results and showing charts. | ||
- | * Install Python. | + | * Install Python. |
+ | * {{:ewis:laboratoare:python_workflow.pdf|Python Workflow}} | ||
* Download the Python project from the archive below. | * Download the Python project from the archive below. | ||
* Unzip the archive. | * Unzip the archive. | ||
Line 596: | Line 566: | ||
* Modify the code to solve the exercises. | * Modify the code to solve the exercises. | ||
- | {{:ewis:laboratoare:project_lab6.zip|Project Archive}} | + | {{:ewis:laboratoare:lab6:project_lab6.zip|Project Archive}} |
</note> | </note> | ||
- | |||
- | <hidden> | ||
- | -- 4. -- | ||
- | SELECT * | ||
- | FROM employee as e | ||
- | WHERE salary > ( | ||
- | SELECT DISTINCT AVG(salary) | ||
- | FROM employee as emp | ||
- | ); | ||
- | </hidden> | ||
- | |||
- | <hidden> | ||
- | -- 5. -- | ||
- | SELECT * | ||
- | FROM employee as e | ||
- | WHERE salary > ( | ||
- | SELECT m.salary | ||
- | FROM employee as m | ||
- | WHERE m.id = e.manager_id | ||
- | ); | ||
- | </hidden> | ||
- | |||