Differences

This shows you the differences between two versions of the page.

ep:labs:061 [2019/09/27 06:30]
andreea.alistar
ep:labs:061 [2023/10/07 21:54] (current)
emilian.radoi [[10p] Feedback]
Line 1: Line 1:
-====== Lab 06 - Advanced plotting ====== +====== Lab 06 - Advanced plotting ​(seaborn & pandas) ​======
- +
-== You’ve got the basics, now let’s unleash the power! ​==+
  
 ===== Objectives ===== ===== Objectives =====
  
-  * Conditional plotting +  * Introduction to pandas 
-  * Time-based data when plotting in gnuplot +  * Easy data manipulations ​with pandas 
-  * Advanced plotting concepts: Histograms, animations, heatmaps, three-dimensional plots +  * Introduction ​to seaborn 
-  * Insertion of graphics in the .tex file +  * More types of cool-looking plots with seaborn
- +  * Apply what you learned to explore COVID data for Romania
-===== Contents ===== +
- +
-{{page>:​ep:​labs:​061:​meta:​nav&​nofooter&​noeditbutton}} +
- +
-===== Introduction ​===== +
- +
-A quick plot is enough when you are exploring a data set or a function. But when you present your results ​to others you need to prepare the plots much more carefully so that they give the information to someone who does not know all the background you do. +
- +
-**Using PostScript plots with LaTeX** +
- +
-  - Make sure all the individual image files are properly trimmed EPS files. +
-  - Create a LaTeX document. +
-  - Process this document using LaTeX. +
-  - Use the dvips utility with the -E flag to turn the resulting DVI file into Encapsulated PostScript. +
- +
-===== Summary from the previous laboratory ===== +
- +
-<​code>​ +
-scatter plot: +
-plot 'data.txt' using 1:2 +
-plot 'data.txt' using 1:2 with points +
- +
-example for the short format: +
-p 'data.txt' u 1:2 w p pt 1 lt 2 lw 2 notitle +
- +
-line plot: +
-plot 'data.txt' using 1:2 with lines +
- +
-multiple data series: +
-use replot or separate by commas +
-plot 'data.txt' using 1:2, 'data.csv' using 1:3 +
- +
-set key: +
-plot ’data.txt’ using 1:2 title "​key"​ +
-</​code>​ +
- +
-===== Tutorial ===== +
- +
-{{namespace>:​ep:​labs:​061:​contents:​tutorial&​nofooter&​noeditbutton}} +
- +
-===== Exercises ===== +
- +
-== Exercise 01. [10p] Tutorials == +
- +
-  * Go through tutorials. +
- +
-== Exercise 02. [10p] Conditional plotting== +
- +
-<note warning>​ +
-Datafile: {{:​ep:​labs:​conditional_plotting.txt|}} +
-</​note>​ +
- +
-Using Gnuplot, generate two separate bar graphs for the following:​ +
-  * **calories_consumed/​km-ran**. +
-  * **sugar_consumed/​km-ran**. +
-  * **ratio = too high? colour the ticks in red : colour the ticks in green**. +
-The ratio is considered too high when $6/$4 > 1. This will help you spot the people with less healthy habits. The graphs should be as complete as possible (title, axes names, etc.). +
- +
-<​solution -hidden>​ +
-    set multiplot +
-    plot "data.txt" using 1:($6 / $4 > 1? $5 : 1/0) lt rgb "​red"​ +
-    plot "​data.txt"​ using 1:($6 / $4 < 1? $5 : 1/0) lt rgb "​green"​ +
- +
-File generated ​with+
- +
-<code python>​ +
-import random +
-nume = ["​Cazan",​ "​Ionescu",​ "​Popescu",​ "​Mateescu",​ "​Pop",​ "​Stancu",​ "​Almas",​ "​Bucur",​ "​Ghelbea",​ "​Rusu",​ "​Toncu",​ "​Bogza",​ "​Avram",​ "​Nicolae",​ "​Bibescu"​] +
-prenume = ["​Andrei",​ "​George",​ "​Adrian",​ "​Alexandra",​ "​Mircea",​ "​Andreea",​ "​Ioana",​ "​Dana",​ "​Iulia",​ "​Horia",​ "​Vlad"​] +
-out = open("​data.txt",​ "​w"​) +
-out.write("​Idx\tName\tSurname\tKm_ran\tCalories_consumed\tSugar_consumed\n"​) +
-for i in range(0, 1000): +
-  numei = random.choice(nume) +
-  prenumei = random.choice(prenume) +
-  km = round(random.uniform(0,​ 40), 1) +
-  kcal = random.randint(1200,​ 7000) +
-  sugar = random.randint(0,​ 100) +
-  out.write(str(i) + "​\t"​ + numei + "​\t"​+ prenumei + "​\t"​ + str(km) + "​\t"​ + str(kcal) + "​\t"​ + str(sugar) + "​\n"​) +
-out.close() +
-</​code>​ +
-</​solution>​ +
- +
-== Exercise 03. [10p] Stats == +
- +
-<note warning>​ +
-Datafile: {{:​ep:​labs:​health.txt|}} +
-</​note>​ +
- +
-Use Gnuplot to generate the following graphs: +
-  * Using the '​stats'​ command, find out the mean and standard deviation value for the “Temperature” and “Heart Rate” columns. +
-  * Create a rectangle that contains all the data points considered ​to be in the average normal values (assume that the “normal” values should be in the interval [mean-stddev,​ mean+stddev]). +
-  * Create a multiplot containing 3 plots using the “Temperature” and “Heart Rate” columns: one for all genders, one for males and one for females. +
-  * The graphs should be as complete as possible (title, axes names, etc.) +
- +
-<​solution -hidden>​ +
-<code bash> +
-reset #flush all variables +
- +
-set size 1, 1 +
-set multiplot layout 2,2 rowsfirst +
- +
-stats '​health.txt'​ using 2:4 nooutput +
- +
-set object 1 rect from STATS_mean_x -STATS_stddev_x,​STATS_mean_y - STATS_stddev_y to STATS_mean_x + STATS_stddev_x,​ STATS_mean_y + STATS_stddev_y lw 2 +
- +
-set title 'All genders'​ +
-set xlabel '​Temperature(F)'​ +
-set ylabel 'Heart Rate'​ +
-unset key +
-plot '​health.txt'​ using 2:4 +
- +
-set title '​Male'​ +
-set xlabel '​Temperature(F)'​ +
-set ylabel 'Heart Rate'​ +
-unset key +
-plot '​health.txt'​ using (strcol(3) eq "​male"​ ? $2: 1/0):4 +
- +
-set title '​Female'​ +
-set xlabel '​Temperature(F)'​ +
-set ylabel 'Heart Rate'​ +
-unset key +
-plot '​health.txt'​ using (strcol(3) eq "​female"​ ? $2: 1/0):4 +
-</​code>​ +
-</​solution>​ +
- +
-== Exercise 04. [10p] Time-based data when plotting in gnuplot == +
- +
-<note warning>​ +
-Datafile: {{:​ep:​labs:​time_data.txt|}} +
-</​note>​ +
- +
-Using the code provided in “Tutorial 02. Time-based data when plotting in gnuplot”, use the histogram style, and format the xtic labels using strftime and timecolumn. +
- +
-<​code>​ +
-set timefmt "​%H:​%S"​ +
-set style fill solid 0.6 border -1 +
-set style data histogram +
-set style histogram clustered gap 1 +
-plot 'data.dat' using 2:xtic(strftime('%H', timecolumn(1))), \ +
-     '' using ($2*0.5), \ +
-     '' using ($2*0.7) +
-</​code>​ +
- +
-== Exercise 05. [10p] Plot histograms == +
- +
-<note warning>​ +
-Datafile: ​ {{:​ep:​labs:​histograms.txt|}} +
-</​note>​ +
- +
-== [5p] Task A - Multiple histograms == +
- +
-Using Gnuplot, create multiple histograms ​with '**set style histogram**'​ and '​**boxes**'​. +
- +
-== [5p] Task B - Bar graphs == +
- +
-Create a simple bar graph. Remember to make the lines solid. +
-  * Style your bars differently (set a different color for every bar). +
-  * Do multiple bars for each entry. +
-  * Use a function to pick the colors ​you want. Remember to set width and fill. +
- +
-== Exercise 05. [10p] Animations == +
- +
-<note warning>​ +
-Datafile: ​ {{:​ep:​labs:​animations.txt|}} +
-</​note>​ +
- +
-  * Create a script that animates a trajectory. Set a circle in the centre as a green filled circle. +
-  * **Hint:** Check the code from “Tutorial 04. Animations” and adjust. +
- +
-<​solution -hidden>​ +
-<code bash> +
-reset +
- +
-# Plot setting  +
-# ------------------ +
-set xrange [-1:1] +
-set yrange [-1:1] +
- +
-set xlabel "​x"​ font ", 18" +
-set ylabel "​y"​ font ", 18" +
-set ylabel "​z"​ font ", 18" +
- +
-unset key +
- +
-set pointsize 2                          # symbol size +
-set style line 2 lc rgb '#​0060ad'​ pt 7   # circle +
- +
-set object circle at first 0,0 size scr 0.01 fillcolor rgb '​green'​ fillstyle solid +
- +
-do for [ii=1:3762] { +
-   title = sprintf ("Step = %d",​ii) +
-   set title title +
-   plot '​data0.txt'​ using 2:3  every ::ii::ii linestyle 2 +
-   pause 0.02 +
-} +
-</​code>​ +
-</​solution>​ +
- +
-== Exercise 06. [20p] Heatmaps == +
- +
-<note warning>​ +
-Datafile: {{:​ep:​labs:​heatmaps.txt|}} +
-</​note>​ +
- +
-== [10p] Task A - With image/​pm3d/​dgrid3d == +
- +
-Using Gnuplot, create heatmaps using: +
-  * **“with image”** +
-  * **“pm3d/​dgrid3d”** and **“splot”** +
- +
-== [10p] Task B - Interpolation == +
- +
-Create heatmap WITHOUT interpolation;​ +
-  * By default, pm3d uses a color map which varies from black to yellow via blue and red. Change the palette! +
-  * Double the number of visible points. +
-  * Question: Have Gnuplot choose the correct number of interpolation points by itself. +
- +
-<​solution -hidden>​ +
-<​code>​ +
-set pm3d map +
-splot 'map_data.txt' matrix +
-set palette rgbformulae 33,13,10 +
-OR +
-set palette negative +
-OR +
-set palette grey +
-</​code>​+
  
-or 
-<​code>​ 
-set view map 
-set yrange [0.4:0.8] 
-set xrange [0.2:0.8] 
-set dgrid3d 100,100,4 
-splot '​map_data.txt'​ u 1:2:3 w pm3d 
-</​code>​ 
-</​solution>​ 
  
-<​solution --hidden>​ +===== Resources =====
-<​code>​ +
-set pm3d map interpolate 2,2 +
-splot 'map_data.txt' matrix +
-</​code>​ +
-</​solution>​+
  
-<​solution --hidden>​ +In this labwe will study the basic API of pandas for easier data manipulations,​ and seaborn for some more advanced and visually appealing plots that are also easy to produce. ​
-<​code>​ +
-set pm3d map interpolate 0,+
-</​code>​ +
-</​solution>​+
  
-== Exercise 07. [20p] Latex == +For the exercises, you will explore the evolution of the COVID pandemic in Romania, using what you learned in this lab.
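
As a rough idea of the kind of pandas manipulation that is useful when exploring time series such as daily case counts, here is a minimal sketch. The column names (''date'', ''new_cases'') and all numbers are made up for illustration; they are not taken from the lab notebook or from any real COVID data set.

```python
import pandas as pd

# Hypothetical toy data: column names and values are invented for illustration only.
df = pd.DataFrame(
    {
        "date": pd.date_range("2020-03-01", periods=6, freq="D"),
        "new_cases": [3, 5, 4, 10, 8, 12],
    }
)

# Use the date as the index so the rows are labelled by day.
df = df.set_index("date")

# Running total of cases and a 3-day rolling average of the daily counts.
df["total_cases"] = df["new_cases"].cumsum()
df["rolling_mean_3d"] = df["new_cases"].rolling(window=3).mean()

# Boolean indexing: keep only the days with at least 8 new cases.
busy_days = df[df["new_cases"] >= 8]

print(df)
print(busy_days.index.strftime("%Y-%m-%d").tolist())
```

The first two rows of the rolling average are ''NaN'' because a 3-day window is not yet full there; that is the default behaviour of ''rolling'' and not an error.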
  
-<note warning>​ +For scientific computing we need an environment that is easy to use, and provides a couple of tools like manipulating data and visualizing results. We will use Google Colab, which comes with a variety of useful tools already installed
-Datafile: {{:​ep:​labs:​heat_map_data.txt|}} +
-</​note>​+
  
-== [10p] Task A - 2D maps == +Check out these cheat sheets for fast reference to the common libraries:
  
-Use Gnuplot to create three 2D maps in a single 3D graph. Export the result as a .pdf file (using the gnuplottex package) and also include a \caption{Describe how you did the exercise}. Hint: You have to give the splot command 4 pieces of information: the x, y, and z coordinates, and the value for the color. +**Cheat sheets:**
  
-<code> +  - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]]
-set view 55,110 +  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]]
-splot "heat_map_data.txt" matrix u 1:2:(-0.5):3 w image, \ +  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]]
-      "" matrix u 1:(-0.5):2:3 w image, \ +  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]]
-      "" matrix u (-0.5):1:2:3 w image +  - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]]
-</code> +  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Seaborn_Cheat_Sheet.pdf|seaborn]]
  
-== [5p] Task B - Generate pdf == +<note>This lab is organized in a Jupyter Notebook hosted on Google Colab. There you will find some intuitions and applications for pandas and seaborn. Check out the Tasks section below.</note>
  
-Create myscript.tex and add the lines below. You should put in your '​begin{gnuplot}…end{gnuplot}'​ your solution for plotting. The main advantage for using gnuplottex is that you are allowed to use gnuplot directly inside the .tex file.+===== Tasks =====
  
-<​code>​ +==== Google Colab Notebook ====
-\documentclass[a4paper]{article} +
-\usepackage{gnuplottex} +
-  +
-\begin{document} +
-  +
-\begin{gnuplot}[terminal=pdf,​terminaloptions={font ",​10"​ linewidth 3}] +
-    plot sin(x), cos(x) +
-\end{gnuplot} +
-  +
-\begin{gnuplot}[scale=0.8] +
-    set grid +
-    set title '​gnuplottex test $e^x$'​ +
-    set ylabel '​$y$'​ +
-    set xlabel '​$x$'​ +
-    plot exp(x) with linespoints +
-\end{gnuplot} +
-  +
-\end{document} +
-</​code>​+
  
-== [5p] Task C - Compile == 
  
-Compile it! Your final result should look like this: myscript.pdf.+For this lab, we will use Google Colab for exploring pandas and seaborn. Please solve your tasks [[https://​github.com/​cosmaadrian/​ml-environment/​blob/​master/​EP_Plotting_II.ipynb|here]] by clicking "​**Open in Colaboratory**"​.
  
-<code> +You can then export this Python notebook as a PDF (**File -> Print**) and upload it to **Moodle**.
-#compile with +
-pdflatex --shell-escape myscript.tex +
-</​code>​+
  
-Observations:​ If gnuplottex is missing, here is gnuplottex.sty+==== [10p] Feedback ====
  
-=== 05 - Feedback ===+Please take a minute to fill in the **[[https://​forms.gle/​NpSRnoEh9NLYowFr5 | feedback form]]** for this lab.
  
-  * Please take a minute to fill in the **[[https://​docs.google.com/​forms/​d/​e/​1FAIpQLSfsMBl2EFu10jJG2qHEiSsR-qYr3wkzQPfDwjhChKnjRtDT_w/​viewform | feedback form]]** for this lab. 
ep/labs/061.1569555001.txt.gz · Last modified: 2019/09/27 06:30 by andreea.alistar
CC Attribution-Share Alike 3.0 Unported