Differences

This shows you the differences between two versions of the page.

isc:labs:11 [2022/04/20 11:00]
dan.sporici [Objectives]
isc:labs:11 [2024/01/08 13:27] (current)
florin.stancu
Line 1: Line 1:
-====== Lab 11 - Security and Machine Learning ====== +====== Lab 11 - Privacy Technologies ======
 
-===== Objectives ===== +===== Overview =====
  
-  * learn about the vulnerabilities of deep learning models to adversarial samples +Privacy is usually included in the larger security landscape, but it deals with aspects that concern people more than technologies and tries to answer a very tough question: "How to access/compute data without the owner knowing who you are?" While it is, like everything, a double-edged sword, it tries to let people own their data in the digital world and to provide anonymity while browsing the Internet.
-  * learn to craft adversarial samples ​that manipulate ​deep neural network into producing desired outputs +
-  * generate an image which tricks this deep neural network[[https://isc-lab.api.overfitted.io/]]+
  
-===== Background ​=====+===== Exercises ​=====
  
-This laboratory discusses the **security** aspect with regard to **Deep Neural Networks** (DNNs) and their robustness to specific attacks. +==== 00. [0p] Users ====
  
-As many of you already know, **DNNs** are popular nowadays and can efficiently solve a multitude of problems. However, these models can be seen as powerful function **approximators** that work with a **large feature space** (i.e., they use many parameters). This means that deep neural networks can extract highly specific details and can learn to approximate a function with a pretty good accuracy. +Create the following users: **//red//**, **//green//** and **//blue//**. Make sure that you can ssh into the VM as each of these users. For example, copy the ".ssh/" directory from student to the newly added users and "chown" it accordingly.
  
-//But...//+<​code>​ 
 +sudo useradd -m -s /bin/bash red  
 +sudo useradd -m -s /bin/bash green  
 +sudo useradd -m -s /bin/bash blue  
 +</​code>​
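The ''.ssh/'' copy mentioned above could look like the sketch below. It assumes that ''/home/student/.ssh'' exists and that the three users have home directories under ''/home''; ''copy_ssh_dir'' is a hypothetical helper name used only for illustration.

```shell
# Sketch: give a newly created user a copy of student's SSH configuration.
# Assumption: /home/student/.ssh exists and /home/<user> was created by useradd -m.
copy_ssh_dir() {
    user="$1"
    # Copy the whole directory (authorized_keys included) into the user's home.
    sudo cp -r /home/student/.ssh "/home/${user}/"
    # Hand ownership to the new user, otherwise sshd refuses to use the keys.
    sudo chown -R "${user}:${user}" "/home/${user}/.ssh"
}

# Usage on the lab VM (left commented out so the sketch has no side effects):
# for user in red green blue; do copy_ssh_dir "$user"; done
```

Nothing runs until the function is called, so it is safe to paste into a shell first and review.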
  
-==== Problem #1 ==== +==== 01. [50p] Pretty Good Privacy ====
  
-Since their training methodology relies on **minimizing the overall error** between the **generated outputs** and the **expected outputs** for a given dataset by employing **gradient descent**, they tend to also learn **unusual / purely numeric features** which might not always make sense to us. This happens because the whole optimization / training process works with purely numeric information (i.e., gradients) and doesn't have to "justify" specific decisions as long as they match the outputs in the training set. +Pretty Good Privacy (PGP) is an encryption standard that can be used to authenticate in a distributed manner. GNU Privacy Guard (GPG) is an open-source implementation of the PGP standard. In this exercise you are required to send an encrypted file from one user to another.
  
-Moreover, the training is usually performed on a discrete set of inputs while the actual input distribution is continuous. This is a fancy sentence so let's look at a more concrete case: +For the next exercises, you will need to be logged in as users red/green/blue via ssh in order to generate the gpg key.
 
-**Example:** you're training some approximator (such as a neural network) to predict the values of **log(x)** for a (discrete) set of 5 points in your training set. So your model takes **[1, 2, 4, 8, 16]** as inputs and must output **[0, 1, 2, 3, 4]** -- which it does! But when you start picking values from a specific interval, e.g., between 8 and 16, the results look pretty bad. +  * Unfortunately, gpg doesn't work when you switch users with ''su'' (TTY permission problems: the TTY stays owned by ''student''). If you want to do this, use either ''ssh'' or ''tmux'' after logging in: it allocates a new TTY ;)
 +  * Generate a private/public key pair using the gpg tool for each of the three users previously created. **Use <red|green|blue>@cs.pub.ro for the emails ;) **
  
 +<​hidden>​
 +<​code>​
 +su - blue 
 +gpg --gen-key ​
 +su - red 
 +gpg --gen-key ​
 +su - green 
 +gpg --gen-key ​
  
-{{ :isc:labs:​lab11_p1_approx.png?​nolink&​600 |}}+sudo apt-get install rng-tools 
 +sudo rngd -v -f -r /​dev/​urandom  
 +</​code>​ 
 +</​hidden>​ 
 +  * First, we are going to send **//​red//​**'​s public key to **//​green//​**. Export it into an ASCII file format and import it into **//​green//​**'​s account.  
 +<​note>​ After importing the key you should list it and double check that it was stored in the public ring. At this moment the key is not trusted yet, we will do this in a future step. </​note>​ 
 +  * You should see something similar (for red and green)<​code>​ 
 +green@isc:~$ gpg --list-keys 
 +/​home/​green/​.gnupg/​pubring.gpg 
 +------------------------------ 
 +pub   ​2048R/​13C73580 2019-04-23 
 +uid                  green <​green@cs.pub.ro> 
 +sub   ​2048R/​F1C1FF9A 2019-04-23
  
-**Conclusion:** during the training, the approximator's parameters were tuned to minimize the error for the points you provided; but this doesn't mean that it also captures all the specifics of the **log()** function. So you'd get **unexpected results** for some specific points. This is an easy example where the errors are discoverable by plotting but think of what happens when the approximator takes 1,000 inputs and uses thousands of parameters. +pub   2048R/860244A1 2019-04-23
 +uid                  red-student <red@cs.pub.ro> 
 +sub   ​2048R/​E7626ADD 2019-04-23 
 +</​code>​ 
 +<​note>​ The description ​of fields is available [[https://​github.com/​gpg/​gnupg/​blob/​master/​doc/​DETAILS#​field-1---type-of-record|here]]</​note>​
  
 +<​hidden>​
 +<​code>​
 +student@isc:~$ sudo cp /home/red/pub_red.asc /home/green/.
 +[sudo] password for student:
 +student@isc:~$ sudo chown green:green /home/green/pub_red.asc
 +</code>
 +</hidden>
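The export and import steps themselves can be sketched as follows, using the same ''gpg --export -a'' and ''gpg --import'' options that appear later in this lab. The function names are hypothetical; the file name ''pub_red.asc'' matches the one copied above.

```shell
# Sketch: export a public key as ASCII-armored text (run as the key owner, e.g. red).
export_pubkey() {
    # $1 = user ID (email), $2 = output file
    gpg --export -a "$1" > "$2"
}

# Sketch: import a received public key and list the keyring (run as green).
import_pubkey() {
    # $1 = ASCII-armored key file
    gpg --import "$1" && gpg --list-keys
}

# Usage (commented out so nothing runs when the sketch is pasted):
# red@isc:~$   export_pubkey red@cs.pub.ro pub_red.asc
# green@isc:~$ import_pubkey pub_red.asc
```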
 +  * Now, **//​green//​** can use **//​red//​**'​s public key to authenticate him and send an encrypted file. Create a file containing a secret message, encrypt it and send it to the other party.
 +<​hidden>​
 +<​code>​
 +green@isc:​~$ echo "this is a secret message"​ > secret_file.txt
 +green@isc:​~$ gpg --encrypt --recipient red@cs.pub.ro secret_file.txt
 +gpg: E7626ADD: There is no assurance this key belongs to the named user
  
-==== Problem #2 ====+pub  2048R/​E7626ADD 2019-04-23 red-student <​red@cs.pub.ro>​ 
 + ​Primary key fingerprint:​ 950D 2356 F2DB B4D7 F4FC  9BB2 EB86 5C35 8602 44A1 
 +      Subkey fingerprint:​ F07B EFBB 284A 99F3 10BF  D964 517A 10DE E762 6ADD
  
-Neural networks don't know how to say: //I don't know//. +It is NOT certain that the key belongs ​to the person named 
-The problem here is especially visible with **classifiers**; a classifier is a model which tries to map an input to a specific class. However, they're trained on a **limited number of classes** and therefore have a **limited number of possible outputs**. +in the user ID.  If you *really* know what you are doing,
 +you may answer the next question with yes.
  
-**Example:** you've trained a DNN which learns to identify Bob, Ben and Alice by looking at photos of their faces. It does the job. Now, someone comes up and provides as input a photo of a cat. The classifier must output something and can only pick between Bob, Ben and Alice (because it wasn't trained to acknowledge the existence of other photos). +Use this key anyway? (y/N) y
 +green@isc:~$ ls 
 +pub_red.asc ​ secret_file.txt ​ secret_file.txt.gpg 
 +</​code>​ 
 +</​hidden>​ 
 +  * Create a text file with some contents and encrypt it (''echo "text" > secret_file.txt'').
 +  * Send the encrypted file back to **//red//** and decrypt it. 
 +<​hidden>​ 
 +<​code>​ 
 +student@isc:​~$ sudo cp /​home/​green/​secret_file.txt.gpg /​home/​red/​. 
 +student@isc:​~$ sudo chown red:red /​home/​red/​secret_file.txt.gpg 
 +student@isc:​~$ su - red 
 +Password: 
 +red@isc:~$ ls 
 +pub_red.asc ​ secret_file.txt.gpg 
 +red@isc:~$ gpg --decrypt secret_file.txt.gpg 
 +gpg: encrypted with 2048-bit RSA keyID E7626ADD, created 2019-04-23 
 +      "​red-student <​red@cs.pub.ro>"​ 
 +this is a secret message 
 +</​code>​ 
 +</​hidden>​ 
 +  * The next step is to create a trust channel between **//blue//** and **//red//** using **//green//** as a trusted party. To do so, **//green//** must first sign **//red//**'s key and export both his own key and **//red//**'s to **//blue//**. Move the exported files into **//blue//**'s directory and import them. After the import is done, list the keys available to **//blue//**.
 +<​note>​ The signing process typically involves manually verifying the fingerprint ​of the key </​note>​ 
 +<​hidden>​ 
 +<​code>​ 
 +green@isc:​~$ gpg --sign-key red@cs.pub.ro 
 +green@isc:​~$ gpg --export -a green@cs.pub.ro > pub_green.asc 
 +green@isc:​~$ gpg --export -a red@cs.pub.ro > pub_red_signed_by_green.asc 
 +green@isc:​~$ exit 
 +logout 
 +student@isc:​~$ sudo cp /​home/​green/​pub_green.asc /​home/​blue/​ 
 +student@isc:​~$ sudo cp /​home/​green/​pub_red_signed_by_green.asc /​home/​blue/​ 
 +student@isc:​~$ su - blue 
 +blue@isc:~$ gpg --import pub_green.asc 
 +blue@isc:~$ gpg --import pub_red_signed_by_green.asc 
 +blue@isc:~$ gpg --list-key 
 +/​home/​blue/​.gnupg/​pubring.gpg 
 +----------------------------- 
 +pub   ​2048R/​C1CD918F 2019-04-23 
 +uid                  blue-student <​blue@cs.pub.ro> 
 +sub   ​2048R/​0F45CB72 2019-04-23
  
-{{ :​isc:​labs:​lab11_p2_nn.png?​nolink&​400 |}}+pub   ​2048R/​13C73580 2019-04-23 
 +uid                  green <​green@cs.pub.ro>​ 
 +sub   ​2048R/​F1C1FF9A 2019-04-23
  
-**Conclusion:​** while the classifier can output, besides the most probable class, a **confidence value** (which indicates how sure it is of its prediction),​ that value is not always very relevant because the model discriminates only between Bob, Ben and Alice.+pub   ​2048R/​860244A1 2019-04-23 
 +uid                  red-student <red@cs.pub.ro>​ 
 +sub   ​2048R/​E7626ADD 2019-04-23
  
 +</​code>​
 +</​hidden>​
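Before signing, the fingerprint check mentioned in the note could be done as in this sketch: both parties print the fingerprint of the same key and compare the two strings over a separate channel. ''show_fingerprint'' is a hypothetical helper; ''gpg --fingerprint'' is the underlying command.

```shell
# Sketch: print the fingerprint of a key so it can be compared out of band.
show_fingerprint() {
    # $1 = user ID (email) of the key to inspect
    gpg --fingerprint "$1"
}

# Usage (commented out; run once as the owner and once as the signer):
# red@isc:~$   show_fingerprint red@cs.pub.ro
# green@isc:~$ show_fingerprint red@cs.pub.ro
# Sign with "gpg --sign-key red@cs.pub.ro" only if both outputs match.
```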
 +  * Now, **//​blue//​** should mark **//​green//​**'​s key as trusted (by signing it). After this, as the **//red//** user, create a file with an important message and sign it (do not encrypt it for this step). Transfer the file to **//​blue//​**,​ read the file and verify the signature.
 +<​hidden>​
 +<​code>​
 +red@isc:~$ echo "this is an important message"​ > important_file.txt
 +red@isc:~$ gpg --sign important_file.txt
 +red@isc:~$ exit
 +student@isc:​~$ sudo cp /​home/​red/​important_file.txt.gpg /home/blue/
 +student@isc:​~$ sudo chown blue:blue /​home/​blue/​important_file.txt.gpg
 +student@isc:​~$ su - blue
 +Password:
 +blue@isc:~$ ls
 +important_file.txt.gpg ​ pub_green.asc ​ pub_red_signed_by_green.asc
 +blue@mihai-isc:​~$ gpg important_file.txt.gpg
 +gpg: Signature made Tue 23 Apr 2019 02:25:50 PM UTC using RSA key ID 860244A1
 +gpg: Good signature from "​red-student <​red@cs.pub.ro>"​
 +gpg: WARNING: This key is not certified with a trusted signature!
 +gpg:          There is no indication that the signature belongs to the owner.
 +Primary key fingerprint:​ 950D 2356 F2DB B4D7 F4FC  9BB2 EB86 5C35 8602 44A1
 +blue@isc:~$ cat important_file.txt
 +this is an important message
 +</​code>​
 +</​hidden>​
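The transcript above does not show **//blue//** signing **//green//**'s key. A minimal sketch of that step, followed by a signature check that does not write the file to disk, might look like this (helper names are hypothetical):

```shell
# Sketch: certify someone else's public key with your own key (run as blue).
certify_key() {
    # $1 = user ID (email) of the key to sign
    gpg --sign-key "$1"
}

# Sketch: check the signature on a signed file without extracting its contents.
verify_signed_file() {
    # $1 = file produced by "gpg --sign"
    gpg --verify "$1"
}

# Usage (commented out so nothing runs on paste):
# blue@isc:~$ certify_key green@cs.pub.ro
# blue@isc:~$ verify_signed_file important_file.txt.gpg
```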
 +  * In the default setup, the last step should have given a warning stating that the key is not trusted while the signature is still valid ("Good signature"). This is because GPG uses a more complex trust model. As a last step, log in as the **//blue//** user and change the trust level for **//green//**'s key to "I trust ultimately". After this, verify the previous file signature again.
 +<​note>​ The web of trust allows a more elaborate algorithm to be used to validate a key. A more flexible algorithm can now be used: a key K is considered valid if it meets two conditions: \\ 1. it is signed by enough valid keys, meaning \\ a. you have signed it personally, \\ b. it has been signed by one fully trusted key, or \\ c. it has been signed by three marginally trusted keys; and \\ 2. the path of signed keys leading from K back to your own key is five steps or shorter. [[https://​www.gnupg.org/​gph/​en/​manual.html#​AEN335|ref]]</​note>​
 +<​hidden>​
 +<​code>​
 +blue@isc:~$ gpg --edit-key green@cs.pub.ro
 +gpg> trust
  
-Now let's dive into the training part...+Please decide how far you trust this user to correctly verify other users' ​keys 
 +(by looking at passports, checking fingerprints from different sources, etc.)
  
-==== Gradient Descent ====+  1 I don't know or won't say 
 +  2 I do NOT trust 
 +  3 I trust marginally 
 +  4 I trust fully 
 +  5 I trust ultimately 
 +  m back to the main menu
  
-A DNN is pretty much a complex function; ​to optimize its parameters during training a technique called **gradient descent** is employed.+Your decision? 5 
 +Do you really want to set this key to ultimate trust? (y/N) y
  
-This technique tries to minimize the **loss** (let's name this function **E()**) between the output (**y_pred**) generated by your DNN (let's call it **f()**) and the known output (**y_true**). So, the training would go as follows: +gpg> quit
- +blue@isc:~$ gpg -v --verify-files important_file.txt.gpg
-  - use the DNN to generate a prediction from an input: **y_pred = f(x)** +gpg: original file name='important_file.txt'
-  - compute: **loss = E(y_pred, y_true)** +gpg: Signature made Tue 23 Apr 2019 02:44:00 PM UTC using RSA key ID 860244A1
-  - tweak the parameters of **f()** so the **loss** will be smaller the next time (so the output is more accurate) +gpg: using PGP trust model
-This is done by computing the derivative of **E(f(x), y_true)** with respect to each parameter (**w**) from your DNN (**f()**) while keeping the inputs fixed. Consider that **f()** does the following: **f(x) = w1 * x1 + w2 * x2 + ...**. Each **w** is adjusted using its derivative. +gpg: checking the trustdb
- +gpg: 3 keys cached (8 signatures)
-Why the derivative? Because it can indicate, with its sign, in which direction (either increase or decrease) you should change the value of **w** so that the error function **E()** will decrease. +gpg: 3 keys processed (3 validity counts cleared)
- +gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
-{{ :isc:labs:lab11_gd.png?nolink&400 |}} +gpg: depth: 0  valid:   2  signed:   1  trust: 0-, 0q, 0n, 0m, 0f, 2u
- +gpg: depth: 1  valid:   1  signed:   0  trust: 1-, 0q, 0n, 0m, 0f, 0u
-==== Generating Adversarial Inputs ==== +gpg: Good signature from "red-student <red@cs.pub.ro>"
- +gpg: binary signature, digest algorithm SHA1
-Now... what happens if you use **gradient descent** to... tweak inputs (**x**) instead of adjusting DNN's parameters (**w**)? +</code>
-You can pretty much generate an input that forces the DNN to generate a desired output. +</hidden>
- +
-  +
-===== Exercises ===== +
-This laboratory can be solved using **Google Colab** (so you don't have to install all the stuff on your machines). You'll have a concrete scenario in which you must fill some **TODO**s and generate fancy adversarial samples for a DNN. All you have to do is upload your final image on the Moodle assignment for this laboratory.+
  
-**Link to Google Colab:** https://​colab.research.google.com/​drive/​1qgzbG_2FRRXNO9ttvGnGYxoMFtqizc0d?​usp=sharing 
  
-* you'll have to clone / duplicate it in order to save changes. +==== 02. [40p] TOR ====
  
 +The Tor (The Onion Router) project is an implementation of the more generic "onion routing" idea that allows a user to gain network anonymity while surfing the Internet. The mechanism that allows private surfing is based on re-encryption and "random" routing of each packet through the routers within the network, so that each router only knows the previous and the next router in the route (not the source/destination of the packet) [[https://www.torproject.org/about/history/|ref]]. Accessing the Tor network can be done either through a local proxy or via a browser pre-configured with the proxy server.
  
-===== Feedback =====+  * First, please install `tor`: <​code>​ 
 +sudo apt update 
 +sudo apt install tor 
 +</​code>​ 
 +  * Enable the SOCKS proxy by editing /etc/tor/torrc and uncommenting ''SOCKSPort 9050'' ;)
 +<note> Tor only supports TCP traffic, so make sure your DNS queries are done over TCP.</note>
 +<​hidden>​ 
 +<​code>​ 
 +root@isc:/​etc/​tor#​ netstat -nltp 
 +Active Internet connections (only servers) 
 +Proto Recv-Q Send-Q Local Address ​          ​Foreign Address ​        ​State ​      ​PID/​Program name 
 +tcp        0      0 127.0.0.1:​3306 ​         0.0.0.0:​* ​              ​LISTEN ​     1276/​mysqld ​     
 +tcp        0      0 0.0.0.0:​22 ​             0.0.0.0:​* ​              ​LISTEN ​     25926/​sshd ​      
 +tcp        0      0 0.0.0.0:​9050 ​           0.0.0.0:​* ​              ​LISTEN ​     1414/​tor ​        
 +tcp6       ​0 ​     0 :::80                   :::​* ​                   LISTEN ​     3280/​apache2 ​    
 +tcp6       ​0 ​     0 :::22                   :::​* ​                   LISTEN ​     25926/​sshd ​      
 +</​code>​ 
 +</​hidden>​ 
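A quick way to confirm the configuration change before restarting the service is sketched below. ''socks_port_enabled'' is a hypothetical helper that only inspects the file; the default path ''/etc/tor/torrc'' and the use of ''systemctl'' are assumptions about the lab VM.

```shell
# Sketch: succeed only if the torrc file contains an uncommented SOCKSPort line.
socks_port_enabled() {
    # $1 = path to torrc (defaults to the assumed Debian/Ubuntu location)
    grep -Eq '^[[:space:]]*SOCKSPort[[:space:]]+[0-9]' "${1:-/etc/tor/torrc}"
}

# Usage (commented out; restarting assumes a systemd-based VM):
# socks_port_enabled && sudo systemctl restart tor
# sudo netstat -nltp | grep 9050
```

A line that is still commented out (''#SOCKSPort 9050'') does not match, so the restart is skipped in that case.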
 +  * //torsocks// is a tool that forces any opened program to use the Tor network for connectivity. Open a shell and find out your real IP address. Now, open a shell using //torsocks// and find out the IP address via the Tor network. Restart the **tor** service and discover your newly allocated IP address.
 +<note tip><​code>​dig TXT +tcp +short o-o.myaddr.l.google.com @ns1.google.com | awk -F'"'​ '{ print $2}'</​code></​note>​ 
 +<​hidden>​ 
 +<​code>​ 
 +root@isc:/​etc/​tor#​ torsocks --shell 
 +/​usr/​bin/​torsocks:​ New torified shell coming right up... 
 +root@isc:/​etc/​tor#​ dig TXT +tcp +short o-o.myaddr.l.google.com @ns1.google.com | awk -F'"'​ '{ print $2}' 
 +199.249.230.72 
 +root@isc:/​etc/​tor#​ exit 
 +exit 
 +root@isc:/​etc/​tor#​ dig TXT +tcp +short o-o.myaddr.l.google.com @ns1.google.com | awk -F'"'​ '{ print $2}' 
 +141.85.241.165 
 +</​code>​ 
 +</​hidden>​ 
 +  * You are going to configure your local Firefox browser to use the Tor proxy on the VM. First, use ssh local port forwarding to make port 9050 available to your machine: <​code>​ 
 +ssh -J <​username>​@fep.grid.pub.ro -L 9050:​localhost:​9050 student@<​VM_IP>​ 
 +</​code>​ 
 +  * Next, change the **Firefox** Network Settings to use a SOCKS5 proxy pointing at the forwarded port (127.0.0.1, port 9050, which the ssh tunnel maps to your VM). You can verify that your browser is using Tor by accessing the following [[https://check.torproject.org/|website]].
 +<​hidden>​ 
 +[[https://​1.bp.blogspot.com/​-b-MahPstRzA/​WvgwatvGq5I/​AAAAAAAAQiA/​e1rJp8RGKU08O-tV5W0oUA9kDGY5tEq5gCLcBGAs/​s1600/​proxy.png|Firefox Settings]] 
 +</​hidden>​
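The same proxy can also be checked from a terminal. The sketch below assumes ''curl'' is installed and that the check.torproject.org service answers on the ''/api/ip'' path with a small JSON document; treat that endpoint as an assumption and fall back to the web page above if it is unavailable. ''check_tor_proxy'' is a hypothetical helper name.

```shell
# Sketch: ask the Tor check service which IP it sees when going through the proxy.
check_tor_proxy() {
    # $1 = SOCKS5 proxy as host:port (defaults to the forwarded local port)
    # --socks5-hostname also resolves DNS through the proxy, avoiding local DNS leaks.
    curl --silent --socks5-hostname "${1:-localhost:9050}" \
        https://check.torproject.org/api/ip
}

# Usage (commented out so pasting the sketch makes no network request):
# check_tor_proxy
# check_tor_proxy 127.0.0.1:9050
```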
  
-We're in beta; help us improve this lab: https://forms.gle/BugCwG6GNkdq5DTg7 +==== 03. [10p] Feedback ====
  
 +Please take a minute to fill in the [[https://​forms.gle/​5Lu1mFa63zptk2ox9|feedback form]] for this lab.
  
  
isc/labs/11.1650441606.txt.gz · Last modified: 2022/04/20 11:00 by dan.sporici
CC Attribution-Share Alike 3.0 Unported