Differences

This shows you the differences between two versions of the page.

project-3-draft [2021/04/19 10:38]
pdmatei
project-3-draft [2021/04/25 16:07] (current)
roxana_elena.stiuca prerequisite for taskset3
Line 26: Line 26:
     | Cartesian (Row -> Row -> Row) [String] Query Query     | Cartesian (Row -> Row -> Row) [String] Query Query
     | Projection [String] Query     | Projection [String] Query
-    | forall a. Filter (FilterCondition a) Query +    | forall a. FEval a => Filter (FilterCondition a) Query 
-    | Graph EdgeOp ​query+    | Graph EdgeOp ​Query 
 +     
 +-- where EdgeOp is defined: 
 +type EdgeOp = Row -> Row -> Maybe Value
  </​code> ​   ​  </​code> ​   ​
-  +**Don'​t worry about Graph or Filter queries yet.** 
-Filtering operations are important, for which reason ...+ 
 +==== Prerequisite ==== 
 +Add the following lines at the beginning of your .hs files:
 <code haskell> <code haskell>
-data FilterCondition a = +{-# LANGUAGE ExistentialQuantification #-} 
-    Eq String a | +{-# LANGUAGE FlexibleInstances #-}
-    Lt String a | +
-    Gt String a | +
-    In String [a] | +
-    FNot (FilterCondition a) | +
-    FieldEq String String +
--- where EdgeOp is defined: +
-type EdgeOp = Row -> Row -> Value+
 </​code>​ </​code>​
-**Don't worry about Graph or Filter queries yet.** + 
-=== Query Evaluation === +The first line allows ''​forall a''. 
-Each query defined above should ​evaluate to either ​a String, a Table +The second allows ''​instance FEval String''​. 
-or a list of Strings. Thus we define QResult. + 
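The effect of the two pragmas can be seen in a small standalone sketch. The class ''Describe'' and the type ''Wrapped'' below are hypothetical names used only for illustration; they are not part of the homework skeleton.

<code haskell>
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical class, used only to illustrate the two extensions.
class Describe a where
    describe :: a -> String

instance Describe Float where
    describe x = "number " ++ show x

-- Needs FlexibleInstances: String is a synonym for [Char].
instance Describe String where
    describe s = "text " ++ s

-- Needs ExistentialQuantification: the type 'a' is hidden inside the constructor.
data Wrapped = forall a. Describe a => Wrap a

-- Values of different types can now live in the same list.
describeAll :: [Wrapped] -> [String]
describeAll ws = [describe x | Wrap x <- ws]
</code>

The ''Filter'' constructor of ''Query'' uses the same pattern: it hides the type ''a'' of its condition and only remembers that ''a'' is an instance of ''FEval''.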
-Enroll QResult in show. For Table, use write_csv. For List and String, use +===== Query Evaluation ​===== 
-default.+ 
 +While most queries take a ''Table'' and return a transformed ''Table'', some queries evaluate to a ''String'' or a list of ''String''. Thus we define the type ''QResult'', which describes any of these three possible query result types: 
 <code haskell> <code haskell>
 data QResult = CSV CSV | Table Table | List [String] data QResult = CSV CSV | Table Table | List [String]
 -- don't confuse first '​CSV'​ and second '​CSV':​ first refers to constructor name, -- don't confuse first '​CSV'​ and second '​CSV':​ first refers to constructor name,
 -- second refers to type CSV (defined in taskset 1); same with '​Table'​. -- second refers to type CSV (defined in taskset 1); same with '​Table'​.
 +</​code>​
 +
 +**Task 3.1.**: Enroll ''​QResult''​ in class ''​Show''​. For Table, use ''​write_csv''​ (see task set X). For ''​List''​ and ''​String'',​ use default.
 +
 +<code haskell>
 instance Show QResult where instance Show QResult where
     ...     ...
 </​code>​ </​code>​
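A minimal sketch of such an instance is given below. It assumes that ''CSV'' is a ''String'' and that ''Table'' is a list of rows of strings; ''writeCsvSketch'' is a hypothetical stand-in for ''write_csv'' from the earlier task set, not its actual definition.

<code haskell>
import Data.List (intercalate)

-- Assumed representations; the real ones come from the earlier task sets.
type CSV = String
type Value = String
type Row = [Value]
type Table = [Row]

data QResult = CSV CSV | Table Table | List [String]

-- Hypothetical stand-in for write_csv: comma-separated fields, one row per line.
writeCsvSketch :: Table -> CSV
writeCsvSketch = intercalate "\n" . map (intercalate ",")

instance Show QResult where
    show (CSV str)   = show str           -- default show for strings
    show (Table tbl) = writeCsvSketch tbl -- tables are printed as CSV text
    show (List strs) = show strs          -- default show for lists
</code>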
-We define class Eval, which offers function ​**eval**. Your job is to enroll Query + 
-in this class. For each constructor from Query, we will explain what eval should +To keep queries separate from their evaluation, we define class ''Eval'', which offers function ''eval''. Your job is to enroll ''Query'' in this class. 
-produce.+
 <code haskell> <code haskell>
 class Eval a where class Eval a where
     eval :: a -> QResult     eval :: a -> QResult
 +    ​
 instance Eval Query where instance Eval Query where
     ...     ...
 </​code>​ </​code>​
-**FromCSV str**: converts string str to a Table. +We explain below how each data constructor from ''​Query''​ should be evaluated:​ 
-**ToCSV query**: converts ​Table obtained from the evaluation of query to a +  ​''​FromCSV str''​: converts string ​''​str'' ​to a ''​Table''​
-string ​in CSV format. +  ''​ToCSV query''​: converts ​a table obtained from the evaluation of query to a ''​String'' ​in CSV format. 
-**AsList colname query**: returns values from column colname as a list. +  ''​AsList colname query''​: returns values from column ​''​colname'' ​as a list. 
-**Sort colname query**: sorts table by column colname. +  ''​Sort colname query''​: sorts table by column ​''​colname''​
-**ValueMap op query**: maps all values from table, using op. +  ''​ValueMap op query''​: maps all values from table, using ''​op''​
-**RowMap op colnames query**: maps all rows from table, using op. +  ''​RowMap op colnames query''​: maps all rows from table, using ''​op''​
-**VUnion query1 query2**: vertical union of the 2 tables obtained through +  ''​VUnion query1 query2''​: vertical union of the 2 tables obtained through the evaluations of ''​query1'' ​and ''​query2''​
-the evaluations of query1 and query2. +  ''​HUnion query1 query2''​: horizontal union of the 2 tables. 
-**HUnion query1 query2**: horizontal union of the 2 tables. +  ''​TableJoin colname query1 query2''​: table join with respect to column ​''​colname''​
-**TableJoin colname query1 query2**: table join with respect to column +  ''​Cartesian op colnames query1 query2''​: cartesian product.  
-colname. +  ''​Projection colnames query''​: extract specified columns from table. 
-**Cartesian op colnames query1 query2**: cartesian product. + 
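The evaluation rules above can be sketched for a few constructors. This is not the reference solution: it uses a reduced copy of ''Query'', assumes that a ''Table'' is a rectangular list of rows whose first row is the header, assumes that ''ValueMap'' leaves the header untouched, and relies on the hypothetical helpers ''readCsvSketch'', ''writeCsvSketch'', ''evalTable'' and ''project''.

<code haskell>
import Data.List (elemIndex, intercalate)
import Data.Maybe (mapMaybe)

type CSV = String
type Value = String
type Row = [Value]
type Table = [Row]

data QResult = CSV CSV | Table Table | List [String]

-- Reduced copy of Query with only the constructors used in this sketch.
data Query
    = FromCSV CSV
    | ToCSV Query
    | AsList String Query
    | ValueMap (Value -> Value) Query
    | Projection [String] Query

class Eval a where
    eval :: a -> QResult

-- Hypothetical stand-ins for the CSV helpers of the earlier task sets.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
    (field, [])       -> [field]
    (field, _ : rest) -> field : splitOn sep rest

readCsvSketch :: CSV -> Table
readCsvSketch = map (splitOn ',') . lines

writeCsvSketch :: Table -> CSV
writeCsvSketch = intercalate "\n" . map (intercalate ",")

-- Evaluate a sub-query and insist on a table result.
evalTable :: Query -> Table
evalTable q = case eval q of
    Table t -> t
    _       -> error "expected a query that evaluates to a table"

-- Keep only the named columns, in the requested order.
project :: [String] -> Table -> Table
project _ [] = []
project names tbl@(hdr : _) = map pick tbl
  where
    idxs = mapMaybe (`elemIndex` hdr) names
    pick row = [row !! i | i <- idxs]

instance Eval Query where
    eval (FromCSV str) = Table (readCsvSketch str)
    eval (ToCSV q) = CSV (writeCsvSketch (evalTable q))
    -- Project on one column, drop the header, flatten the one-cell rows.
    eval (AsList col q) = List (concat (drop 1 (project [col] (evalTable q))))
    eval (ValueMap op q) = case evalTable q of
        []         -> Table []
        hdr : rows -> Table (hdr : map (map op) rows)
    eval (Projection cols q) = Table (project cols (evalTable q))
</code>

Funnelling every sub-query through ''evalTable'' keeps each case short; the remaining constructors can follow the same shape.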
-**Projection colnames query**: extract specified columns from table.+===== Filters & filter conditions ===== 
 + 
 +You may have noticed that the ''Filter'' query is commented out. You can **uncomment** it at this stage. Filtering receives special treatment: because filter conditions are usually complex, it is better to build complex filter conditions than to perform successive filter queries. For this reason, we define type ''FilterCondition a'', illustrated below: 
 + 
 +<code haskell>​ 
 +data FilterCondition a = 
 +    Eq String a | 
 +    Lt String a | 
 +    Gt String a | 
 +    In String [a] | 
 +    FNot (FilterCondition a) | 
 +    FieldEq String String 
 +</​code>​ 
 + 
 +**Remark:** the type ''​FilterCondition a''​ is **polymorphic** because such conditions may be expressed over (in this homework) two types: 
 +  * ''​Float''​ and 
 +  * ''​String''​ 
 + 
 +We briefly explain what each condition expresses:​ 
 +  - ''​Eq colname ref'':​ checks if value from column ''​colname''​ is equal to ''​ref''​. 
 +  - ''​Lt colname ref'':​ checks if value from column ''​colname''​ is less than ''​ref''​. 
 +  - ''​Gt colname ref'':​ checks if value from column ''​colname''​ is greater than ''​ref''​. 
 +  - ''​In colname list'':​ checks if value from column ''​colname''​ is in list. 
 +  - ''​FNot cond'':​ negates condition. 
 +  - ''​FieldEq colname1 colname2'':​ checks if values from columns ''​colname1''​ and ''​colname2''​ are equal. 
 + 
 === FilterCondition Evaluation === === FilterCondition Evaluation ===
-Let's take a look at FilterCondition. It is used in a Filter query, in order + 
-to filter the entries (rows) in a table, based on condition. +A ''FilterCondition'' must evaluate to an actual filtering function, which has type: 
-We are going to define class FEval, which contains function feval, through +<code haskell> 
-which we evaluate a FilterCondition to a function of type FilterOp ​(defined +type FilterOp = Row -> Bool 
-also below). In order to do so, feval will also receive the column names +</​code>​ 
-(the table head).+ 
 +Since such filtering functions work differently for ''FilterCondition Float'' and ''FilterCondition String'', we need a class ''FEval'' which contains function ''feval''. The latter is used to evaluate a ''FilterCondition a'' to a function of type ''FilterOp''. In order to do so, ''feval'' needs information about the column names (the table head), hence its type is shown below.
 <code haskell> <code haskell>
 class FEval a where class FEval a where
     feval :: [String] -> (FilterCondition a) -> FilterOp     feval :: [String] -> (FilterCondition a) -> FilterOp
-type FilterOp = Row -> Bool 
 </​code>​ </​code>​
-- **Eq colname ref**: checks if value from column colname is equal to ref. + 
-**Lt colname ref**: checks if value from column colname is less than ref. +**Task 3.x.**: Your task is to write the instances for ''​(FEval Float)'' ​and ''​(FEval String)''​.
-- **Gt colname ref**: checks if value from column colname is greater than ref. +
-- **In colname list**: checks if value from column colname is in list. +
-- **FNot cond**: negates condition. +
-- **FieldEq colname1 colname2**: checks if values from columns colname1 and +
-colname2 are equal. +
-Your task is to write the instances for (FEval Float) and (FEval String).+
 <code haskell> <code haskell>
 instance FEval Float where instance FEval Float where
Line 104: Line 129:
     ...     ...
 </​code>​ </​code>​
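One possible shape for the ''Float'' instance is sketched below. It assumes that a ''Row'' is a list of ''String'' cells aligned with the column names, and that a missing or non-numeric cell simply fails the test; the helpers ''cell'' and ''onFloat'' are hypothetical and not part of the skeleton.

<code haskell>
import Data.List (elemIndex)
import Text.Read (readMaybe)

type Value = String
type Row = [Value]
type FilterOp = Row -> Bool

data FilterCondition a
    = Eq String a
    | Lt String a
    | Gt String a
    | In String [a]
    | FNot (FilterCondition a)
    | FieldEq String String

class FEval a where
    feval :: [String] -> FilterCondition a -> FilterOp

-- Look up the cell under a given column name, if the column exists.
cell :: [String] -> String -> Row -> Maybe Value
cell hdr col row = do
    i <- elemIndex col hdr
    if i < length row then Just (row !! i) else Nothing

-- Apply a numeric test to a cell; missing or non-numeric cells fail (assumption).
onFloat :: [String] -> String -> (Float -> Bool) -> FilterOp
onFloat hdr col test row = maybe False test (cell hdr col row >>= readMaybe)

instance FEval Float where
    feval hdr (Eq col ref)  = onFloat hdr col (== ref)
    feval hdr (Lt col ref)  = onFloat hdr col (< ref)
    feval hdr (Gt col ref)  = onFloat hdr col (> ref)
    feval hdr (In col refs) = onFloat hdr col (`elem` refs)
    feval hdr (FNot cond)   = not . feval hdr cond
    feval hdr (FieldEq c1 c2) = \row ->
        case (cell hdr c1 row >>= readMaybe, cell hdr c2 row >>= readMaybe) of
            (Just x, Just y) -> (x :: Float) == y
            _                -> False
</code>

For instance, with the column names ''["Name", "Grade"]'', the condition ''Gt "Grade" 8.5'' accepts the row ''["Ana", "9"]''. The ''String'' instance can follow the same structure, comparing cells directly instead of parsing them.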
-Now you can write the evaluation for Filter query (**eval**). + 
-=== Graph Query === +Now you can write the evaluation for the data constructor ''​Filter query'' ​(see function ​**eval** ​from the previous section). 
-We define a graph as a table with column names: ["​From",​ "​To",​ "​Value"​]. + 
-Each row defines a weighted edge between node "From" ​and node "To"+ 
-- **Graph edgeop query**: creates a graph starting from the table +===== Graph queries ===== 
-query evaluates to. + 
-The nodes are the rows in table T. +A **graph** is a special kind of table which has precisely the following column names: ''["From", "To", "Value"]''. Each row defines a **weighted edge** between node ''From'' and node ''To''. 
-The weight of an edge between 2 nodes is given by edgeop. ​We will only + 
-keep the edge between row1 and row2 if (edgeop row1 row2) > 0. +The query ''Graph edgeop query'' creates such a table starting from the result of the evaluation of ''query''. Suppose the query evaluates to a table **T**. 
-In the resulting table, ​row describes an edge between node_i and node_j + 
-and will have the values: +  * The nodes are the **rows** in table **T**
-"From" = first column from node_i +  * The weight of an edge between 2 nodes is given by ''edgeop'', which returns a ''Maybe Value''. If ''edgeop row1 row2'' returns ''Nothing'', then we don't have an edge between those 2 nodes. If it returns ''Just val'', then we have an edge between ''row1'' and ''row2'' of weight ''val''. 
-"​To"​ = first column from node_j +  ​* ​In the resulting table, ​each row describes an edge between node_i and node_j and will have the values: 
-"​Value"​ = edgeop node_i node_j +    ​* ​"​From"​ = first column from node_i 
-The edge node_i-node_j is the same as node_j-node_i,​ so it should only +    ​* ​"​To"​ = first column from node_j 
-appear once. "​From"​ value should be lexicographically before "​To"​. +    ​* ​"​Value"​ = edgeop node_i node_j 
-=== Similarities graph, using queries ===+ 
 +The edge //node_i-node_j// is the same as //node_j-node_i//, so it should only appear once (the graph is undirected). The "From" value should be lexicographically before the "To" value. 
 + 
 +**Example:​** Suppose **T** is the table shown below: 
 +<​code>​ 
 +Name      Grade      Class 
 +Mihai     ​9 ​         321 
 +Andrei ​   8          322 
 +Stefan ​   10         321 
 +Ana       ​9 ​         322 
 +</​code>​ 
 + 
 +If we would like to build a graph that connects all students in the same class, then: 
 +<code haskell>​ 
 +edgeop [_,_,z] [_,_,c]  
 +   | z == c = Just c 
 +   | otherwise = Nothing 
 +</​code>​ 
 + 
 +and the resulting graph will be: 
 +<​code>​ 
 +From      To      Value 
 +Mihai     ​Stefan ​ 321 
 +Ana       ​Andrei ​ 322 
 +</​code>​ 
 + 
 +If we would like to build a graph that connects students whose grades are equal or differ by **at most** one point, then: 
 +<code haskell>​ 
 +edgeop [_,x,_] [_,y,_]  
 +   | abs ((read x :: Int) - (read y :: Int)) <= 1 = Just "similar" 
 +   | otherwise = Nothing 
 +</​code>​ 
 + 
 +and the resulting graph is: 
 +<​code>​ 
 +From      To      Value 
 +Andrei ​   Mihai   ​similar 
 +Mihai     ​Stefan ​ similar 
 +Ana       ​Mihai ​  ​similar 
 +Ana       ​Andrei ​ similar 
 +Ana       ​Stefan ​ similar 
 +</​code>​ 
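The construction described in this section can be sketched as follows. It assumes that the first row of the table is the header, that the first column identifies the node, and that ''edgeop'' is symmetric; ''graphSketch'' and ''sameClass'' are hypothetical names.

<code haskell>
import Data.Maybe (maybeToList)

type Value = String
type Row = [Value]
type Table = [Row]
type EdgeOp = Row -> Row -> Maybe Value

-- Build the edge table from a table whose first row is the header (assumption).
graphSketch :: EdgeOp -> Table -> Table
graphSketch _ [] = []
graphSketch edgeop (_ : rows) = ["From", "To", "Value"] : edges
  where
    indexed = zip [0 :: Int ..] rows
    edges =
        [ [min a b, max a b, val]          -- "From" is lexicographically first
        | (i, r1) <- indexed
        , (j, r2) <- indexed
        , i < j                            -- visit each unordered pair once
        , (a : _, b : _) <- [(r1, r2)]     -- skip rows without a first column
        , val <- maybeToList (edgeop r1 r2) -- Nothing means no edge
        ]

-- The "same class" edge operation from the first example, made total.
sameClass :: EdgeOp
sameClass [_, _, z] [_, _, c]
    | z == c = Just c
    | otherwise = Nothing
sameClass _ _ = Nothing
</code>

Applied to the example table **T**, ''graphSketch sameClass'' yields the header followed by the edges ''Mihai-Stefan'' (321) and ''Ana-Andrei'' (322), matching the first result table above.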
 + 
 + 
 +==== Similarities graph, using queries ====
 We want to check the similarities between students' lecture points. We want to check the similarities between students' lecture points.
-For that, we want to obtain a graph where "​From"​ and "​To"​ are students'​ +  * For that, we want to obtain a graph where "​From"​ and "​To"​ are students'​ emails and "​Value"​ is the distance between the 2 students'​ points. 
-emails and "Value" is the distance between the 2 students' points. +  * We define the distance between stud1 and stud2 as ''the sum of questions where they both received the same points''. Keep only the rows with ''distance >= 5''. 
-We define the distance between stud1 and stud2 as the sum of questions +  * The edges in the resulting graph (the rows in the resulting table) should be sorted by the "​Value"​ column. If email is missing, ​don't include ​that entry. 
-where they both received the same points. + 
-Also, the edges in the resulting graph (the rows in the resulting table) +Your task is to write ''​similarities_query'' ​as a **sequence of queries**, that once evaluated results in the graph described above. 
-should be sorted by the "​Value"​ column. Keep only the rows with + 
-distance >= 5. If email is missing, ​ignore ​that entry. +**Note**: ''​similarities_query''​ is a Query. The checker applies ''​eval''​ on it. 
-Your task is to write **similarities_query** as a sequence of + 
-queries, that once evaluated results in the graph described above. +===== TL;DR Tasks ===== 
-=== TL;DR Tasks === +  ​- ​Enroll ​''​Query'' ​in class ''​Eval'' ​(without ​''​Filter'' ​or ''​Graph''​). **0.3p** 
-1. Enroll Query in class Eval (without Filter or Graph). 0.2p +  ​- ​Enroll ​''​FilterCondition'' ​in class ''​FEval'' ​and implement ​''​eval'' ​for ''​Filter'' ​query. ​**0.2p** 
-2. Enroll FilterCondition in class FEval and implement eval for Filter query. 0.2p +  - Implement ''eval'' for ''Graph'' query. **0.2p** 
-3. Implement eval for Graph query. 0.2p +  - Extract similarity graph. **0.3p** 
-4. Get graph for similarities. 0.3p + 
-=== Checker === +===== Checker ===== 
-=== Submit ===+ 
 +===== Submit ​=====
 **Deadline**:​ 16.05, 23:50. **Deadline**:​ 16.05, 23:50.
 **Vmchecker**:​ TBA. **Vmchecker**:​ TBA.