Line 130: Line 130:
 **T2 (1p)** Override the method //say_hi// to show the grade as well.
   *Hint: You can define (override) the method in the //Student// class and re-use the method defined in the parent class
-**T3 (2p)** **Polymorphism** represents a key principle of OOP. To understand this principle, create a list that contains multiple objects of class //Person// and //Student//. For each of the elements print the name using the method //say_hi//. Is there any difference between the two types of objects when we use them in the main program?
+**T3 (1p)** **Polymorphism** represents a key principle of OOP. To understand this principle, create a list that contains multiple objects of class //Person// and //Student//. For each of the elements print the name using the method //say_hi//. Is there any difference between the two types of objects when we use them in the main program?
 </note>
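The idea behind T2 and T3 can be sketched as follows. This is only a minimal illustration, not the reference solution: the constructor arguments, the greeting text, and the sample names are assumptions, since the page only names the classes //Person// and //Student//, the method //say_hi//, and the grade.

```python
class Person:
    def __init__(self, name):
        # Assumed constructor: the lab may define additional fields.
        self.name = name

    def say_hi(self):
        return f"Hi, I am {self.name}"


class Student(Person):
    def __init__(self, name, grade):
        super().__init__(name)
        self.grade = grade

    def say_hi(self):
        # Re-use the parent implementation, then append the grade.
        return f"{super().say_hi()} and my grade is {self.grade}"


# The same call works on every element; Python picks the method
# that belongs to the actual class of each object.
people = [Person("Ana"), Student("Dan", 9)]
greetings = [p.say_hi() for p in people]
for line in greetings:
    print(line)
```

The loop never checks the type of each element, which is the point of polymorphism: the calling code stays the same while the behavior depends on the object.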
Line 227: Line 227:
 <code python>
-from subprocess import Popen, PIPE
-from lxml import etree
-from io import StringIO
-user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
+import requests
+
+# url to scrape data from
+
 url = 'https://webscraper.io/test-sites/e-commerce/allinone/computers/laptops'
-print("fetching" + url)
-get = Popen(['curl', '-s', '-A', user_agent, url], stdout=PIPE)
-result = get.stdout.read().decode('utf8')
-tree = etree.parse(StringIO(result), etree.HTMLParser())
-str_tree = etree.tostring(tree, encoding='utf8', method='xml')
-str_data = str_tree.decode()
+
+print("fetching page")
+
+# get response object
+response = requests.get(url)
+
+# get byte string
+byte_data = response.content
+# get html source code
+html_data = byte_data.decode("utf-8")
 print("writing file")
 with open("index.html", "w", encoding="utf-8") as f:
-    f.write(str_data)
+    f.write(html_data)
 </code>
-<note tip>The Python script uses ''curl'', the command line tool that can request the web page from the HTTP server. You can find more about ''curl'' [[https://curl.se/docs/httpscripting.html|here]].</note>
+<note tip>The Python script makes an HTTP request to retrieve the web page from the server. You can find more about HTTP requests [[https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview|here]].</note>
 To parse the HTML file (separating the different tags in the HTML), we use the //etree// module from //lxml//
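As a rough sketch of this parsing step, the snippet below runs //etree// on a small in-memory HTML fragment instead of the downloaded file. The fragment, its tag and class names, and the variable names are made up for illustration and do not match the real page. Note that the parser object is passed to ''etree.parse'' explicitly; without it, //lxml// falls back to its XML parser, which may reject real-world HTML.

```python
from io import StringIO

from lxml import etree

# Hypothetical HTML fragment standing in for the downloaded index.html.
html = """
<html><body>
  <p class="name">Alpha</p>
  <p class="other">ignored</p>
  <p class="name">Beta</p>
</body></html>
"""

# Use the lenient HTML parser and hand it to etree.parse.
parser = etree.HTMLParser()
tree = etree.parse(StringIO(html), parser)

# Collect tag name, attributes and text for every element in the tree.
tags = [[elem.tag, dict(elem.attrib), elem.text] for elem in tree.iter()]

# Filter by tag and attribute to keep only the elements of interest.
names = [text for tag, attrib, text in tags
         if tag == "p" and attrib.get("class") == "name"]
print(names)
```

The same pattern (iterate, then filter by tag and attribute) applies to a file on disk by passing the file name instead of the ''StringIO'' object.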
Line 252: Line 257:
 filename = "index.html"
+parser = etree.HTMLParser()
 tree = etree.parse(filename)
 tags = [[elem.tag, elem.attrib, elem.text] for elem in tree.iter()]
Line 261: Line 267:
 <note>
-**T5 (1p)** Examine the downloaded HTML file. Extract the laptop names into a text file.
+**T5 (2p)** Examine the downloaded HTML file. Extract the laptop names into a text file.
   *Hint: filter the extracted tags by tag and attribute
   *Which combination of tag and attribute leads to the data that we want to extract (the laptop names)?
