<?xml version="1.0" encoding="utf-8"?>
<!-- generator="FeedCreator 1.7.2-ppt DokuWiki" -->
<?xml-stylesheet href="http://ocw.cs.pub.ro/courses/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://ocw.cs.pub.ro/courses/feed.php">
        <title>CS Open CourseWare ii:labs:03:tasks</title>
        <description></description>
        <link>http://ocw.cs.pub.ro/courses/</link>
        <image rdf:resource="http://ocw.cs.pub.ro/courses/lib/tpl/arctic/images/favicon.ico" />
        <dc:date>2026-04-04T10:14:33+03:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/01?rev=1731634317&amp;do=diff"/>
                <rdf:li rdf:resource="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/02?rev=1730973775&amp;do=diff"/>
                <rdf:li rdf:resource="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/03?rev=1731634956&amp;do=diff"/>
                <rdf:li rdf:resource="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/04?rev=1731670968&amp;do=diff"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="http://ocw.cs.pub.ro/courses/lib/tpl/arctic/images/favicon.ico">
        <title>CS Open CourseWare</title>
        <link>http://ocw.cs.pub.ro/courses/</link>
        <url>http://ocw.cs.pub.ro/courses/lib/tpl/arctic/images/favicon.ico</url>
    </image>
    <item rdf:about="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/01?rev=1731634317&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-11-15T03:31:57+03:00</dc:date>
        <title>01. [30p] Python environment</title>
        <link>http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/01?rev=1731634317&amp;do=diff</link>
        <description>01. [30p] Python environment

Python libraries are collections of reusable code that provide functionality for a wide range of tasks, from data analysis and machine learning to web development and automation. Libraries are often hosted on the Python Package Index (PyPI) and can be easily installed using package managers like pip.</description>
    </item>
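The task above describes installing libraries from PyPI with pip. A minimal sketch of checking for and installing the packages the later tasks rely on, assuming the package names `requests` and `beautifulsoup4` (the import name for the latter is `bs4`):

```python
# Sketch: install a PyPI package with pip if its module is not importable.
# The package/module pairs below are assumptions based on the later tasks.
import importlib.util
import subprocess
import sys

def ensure_installed(package: str, module: str) -> None:
    """Install `package` with pip only when `module` cannot be imported."""
    if importlib.util.find_spec(module) is None:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])

if __name__ == "__main__":
    ensure_installed("requests", "requests")
    ensure_installed("beautifulsoup4", "bs4")
```

Running the module with `python -m pip install …` (rather than a bare `pip`) ties the install to the interpreter that will import the library.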
    <item rdf:about="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/02?rev=1730973775&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-11-07T12:02:55+03:00</dc:date>
        <title>02. [20p] Making HTTP Requests</title>
        <link>http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/02?rev=1730973775&amp;do=diff</link>
        <description>02. [20p] Making HTTP Requests

Now that we have the requests library, we can easily send HTTP requests to any URL, prompting the server to respond with the information we need. When the request is successful, the server replies with the standard status code 200 (OK), indicating everything went smoothly. Simply replace the URL with the desired website (try a Wikipedia page), and you’re ready to go!</description>
    </item>
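The request described above can be sketched with the requests library; the Wikipedia URL here is an illustrative placeholder, not one taken from the task text:

```python
# Sketch: fetch a page with the requests library and check for 200 (OK).
import requests

def fetch_page(url: str):
    """Return the page's HTML, or None if the server did not reply 200 (OK)."""
    response = requests.get(url, timeout=10)
    if response.status_code == 200:   # 200 means the request succeeded
        return response.text
    return None

if __name__ == "__main__":
    html = fetch_page("https://en.wikipedia.org/wiki/Web_scraping")
    if html is not None:
        print(f"Fetched {len(html)} characters")
```

The `timeout` argument keeps the script from hanging indefinitely when a server never answers.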
    <item rdf:about="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/03?rev=1731634956&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-11-15T03:42:36+03:00</dc:date>
        <title>03. [40p] Parsing the HTML content</title>
        <link>http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/03?rev=1731634956&amp;do=diff</link>
        <description>03. [40p] Parsing the HTML content

Now, let’s parse through the HTML we just received. We’ll use BeautifulSoup, a powerful library commonly used in web scraping. BeautifulSoup helps us navigate and work with the HTML content of any page, making it easy to locate specific data we want to extract.</description>
    </item>
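A small sketch of the parsing step with BeautifulSoup (the `bs4` package); the sample HTML, tag names, and class name are illustrative assumptions, not content from the task:

```python
# Sketch: navigate HTML with BeautifulSoup and extract specific elements.
from bs4 import BeautifulSoup

html = """
<html><body>
  <h1>Lab 03</h1>
  <p class="intro">Web scraping basics.</p>
  <a href="/page/2">Next</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

title = soup.find("h1").get_text()                # locate a single tag
intro = soup.find("p", class_="intro").get_text() # filter by CSS class
links = [a["href"] for a in soup.find_all("a")]   # collect every link

print(title)   # Lab 03
print(links)   # ['/page/2']
```

In a real scraper, `html` would be the text of the response fetched in the previous task rather than a string literal.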
    <item rdf:about="http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/04?rev=1731670968&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-11-15T13:42:48+03:00</dc:date>
        <title>04. [20p] Handling multi-page websites</title>
        <link>http://ocw.cs.pub.ro/courses/ii/labs/03/tasks/04?rev=1731670968&amp;do=diff</link>
        <description>04. [20p] Handling multi-page websites

Most websites have multiple pages, so our scraper should be capable of handling this by navigating through pagination. Pagination is typically controlled through the URL, so we’ll need to make a separate request for each page. By identifying the pattern in the website’s subpage URLs, we can fetch each page in turn to gather all the data we need.</description>
    </item>
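The pagination loop described above can be sketched as follows; the base URL and the `page` query parameter are hypothetical assumptions, and the pattern should be adapted to the real site's URL scheme:

```python
# Sketch: request each subpage of a paginated site in turn.
import requests

BASE_URL = "https://example.com/articles"  # placeholder, not a real target

def scrape_all_pages(last_page: int) -> list:
    """Collect the raw HTML of pages 1..last_page, stopping on the first failure."""
    pages = []
    for page in range(1, last_page + 1):
        response = requests.get(BASE_URL, params={"page": page}, timeout=10)
        if response.status_code != 200:   # missing page: assume we are past the end
            break
        pages.append(response.text)
    return pages
```

Passing the page number via `params` lets requests build and percent-encode the query string (`?page=N`) instead of concatenating it by hand.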
</rdf:RDF>
