Differences
This shows you the differences between two versions of the page.
project-3-draft [2021/04/19 10:39] pdmatei |
project-3-draft [2021/04/25 16:07] (current) roxana_elena.stiuca prerequisite for taskset3 |
||
---|---|---|---|
Line 26: | Line 26: | ||
| Cartesian (Row -> Row -> Row) [String] Query Query | | Cartesian (Row -> Row -> Row) [String] Query Query | ||
| Projection [String] Query | | Projection [String] Query | ||
- | -- | forall a. Filter (FilterCondition a) Query | + | | forall a. FEval a => Filter (FilterCondition a) Query |
- | | Graph EdgeOp query | + | | Graph EdgeOp Query |
+ | |||
+ | -- where EdgeOp is defined: | ||
+ | type EdgeOp = Row -> Row -> Maybe Value | ||
</code> | </code> | ||
- | + | **Don't worry about Graph or Filter queries yet.** | |
- | Filtering operations are important, for which reason ... | + | |
+ | ==== Prerequisite ==== | ||
+ | Add the following lines at the beginning of your .hs files: | ||
<code haskell> | <code haskell> | ||
- | data FilterCondition a = | + | {-# LANGUAGE ExistentialQuantification #-} |
- | Eq String a | | + | {-# LANGUAGE FlexibleInstances #-} |
- | Lt String a | | + | |
- | Gt String a | | + | |
- | In String [a] | | + | |
- | FNot (FilterCondition a) | | + | |
- | FieldEq String String | + | |
- | -- where EdgeOp is defined: | + | |
- | type EdgeOp = Row -> Row -> Value | + | |
</code> | </code> | ||
- | **Don't worry about Graph or Filter queries yet.** | + | |
- | === Query Evaluation === | + | The first line allows ''forall a''. |
- | Each query defined above should evaluate to either a String, a Table | + | The second allows ''instance FEval String''. |
- | or a list of Strings. Thus we define QResult. | + | |
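To see what each pragma enables, here is a minimal sketch. The class ''Describe'' and the type ''Boxed'' are hypothetical toy names, not part of the homework skeleton; they only mirror the shape of ''FEval'' and of the ''Filter'' constructor.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical toy class, standing in for FEval.
class Describe a where
  describe :: a -> String

instance Describe Float where
  describe x = "float " ++ show x

-- String is a synonym for [Char]; an instance on it needs FlexibleInstances.
instance Describe String where
  describe s = "string " ++ s

-- The type variable 'a' does not appear in the type Boxed itself:
-- this constructor needs ExistentialQuantification.
data Boxed = forall a. Describe a => Boxed a

-- Values of different underlying types can now share one list.
describeAll :: [Boxed] -> [String]
describeAll = map (\(Boxed x) -> describe x)
```

In the same way, a ''Filter'' query can carry either a ''FilterCondition Float'' or a ''FilterCondition String'' without the type ''Query'' needing a type parameter.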
- | Enroll QResult in show. For Table, use write_csv. For List and String, use | + | ===== Query Evaluation ===== |
- | default. | + | |
+ | While most queries take a ''Table'' and return a transformed ''Table'', some queries evaluate to a ''String'' or a list of ''String''. Thus we define the type ''QResult'', which describes these three possible query result types: | ||
<code haskell> | <code haskell> | ||
data QResult = CSV CSV | Table Table | List [String] | data QResult = CSV CSV | Table Table | List [String] | ||
-- don't confuse first 'CSV' and second 'CSV': first refers to constructor name, | -- don't confuse first 'CSV' and second 'CSV': first refers to constructor name, | ||
-- second refers to type CSV (defined in taskset 1); same with 'Table'. | -- second refers to type CSV (defined in taskset 1); same with 'Table'. | ||
+ | </code> | ||
+ | |||
+ | **Task 3.1.**: Enroll ''QResult'' in class ''Show''. For ''Table'', use ''write_csv'' (see task set X). For ''List'' and ''CSV'' (the string result), use the default ''show''. | ||
+ | |||
+ | <code haskell> | ||
instance Show QResult where | instance Show QResult where | ||
... | ... | ||
</code> | </code> | ||
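One possible shape for this instance is sketched below. The type synonyms and the body of ''write_csv'' are assumptions made only so the example is self-contained; the real definitions come from the earlier task sets and may differ.

```haskell
import Data.List (intercalate)

-- Assumed definitions from the earlier task sets (not shown in this section).
type CSV = String
type Value = String
type Row = [Value]
type Table = [Row]

-- Hypothetical stand-in for write_csv: cells joined by commas,
-- rows joined by newlines.
write_csv :: Table -> CSV
write_csv = intercalate "\n" . map (intercalate ",")

data QResult = CSV CSV | Table Table | List [String]

instance Show QResult where
  show (CSV str)   = show str       -- default show for a String
  show (Table tbl) = write_csv tbl  -- tables are printed as CSV text
  show (List xs)   = show xs        -- default show for a list of Strings
```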
- | We define class Eval, which offers function **eval**. Your job is to enroll Query | + | |
- | in this class. For each constructor from Query, we will explain what eval should | + | To keep queries separate from their evaluation, we define class ''Eval'', which offers function ''eval''. Your job is to enroll ''Query'' in this class. |
- | produce. | + | |
<code haskell> | <code haskell> | ||
class Eval a where | class Eval a where | ||
eval :: a -> QResult | eval :: a -> QResult | ||
+ | | ||
instance Eval Query where | instance Eval Query where | ||
... | ... | ||
</code> | </code> | ||
- | - **FromCSV str**: converts string str to a Table. | + | We explain below how each data constructor from ''Query'' should be evaluated: |
- | - **ToCSV query**: converts Table obtained from the evaluation of query to a | + | - ''FromCSV str'': converts string ''str'' to a ''Table''. |
- | string in CSV format. | + | - ''ToCSV query'': converts a table obtained from the evaluation of query to a ''String'' in CSV format. |
- | - **AsList colname query**: returns values from column colname as a list. | + | - ''AsList colname query'': returns values from column ''colname'' as a list. |
- | - **Sort colname query**: sorts table by column colname. | + | - ''Sort colname query'': sorts table by column ''colname''. |
- | - **ValueMap op query**: maps all values from table, using op. | + | - ''ValueMap op query'': maps all values from table, using ''op''. |
- | - **RowMap op colnames query**: maps all rows from table, using op. | + | - ''RowMap op colnames query'': maps all rows from table, using ''op''. |
- | - **VUnion query1 query2**: vertical union of the 2 tables obtained through | + | - ''VUnion query1 query2'': vertical union of the 2 tables obtained through the evaluations of ''query1'' and ''query2''. |
- | the evaluations of query1 and query2. | + | - ''HUnion query1 query2'': horizontal union of the 2 tables. |
- | - **HUnion query1 query2**: horizontal union of the 2 tables. | + | - ''TableJoin colname query1 query2'': table join with respect to column ''colname''. |
- | - **TableJoin colname query1 query2**: table join with respect to column | + | - ''Cartesian op colnames query1 query2'': cartesian product. |
- | colname. | + | - ''Projection colnames query'': extract specified columns from table. |
- | - **Cartesian op colnames query1 query2**: cartesian product. | + | |
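The evaluation rules above can be sketched for a reduced ''Query'' type with only four constructors. The helpers ''splitOn'', ''readCsv'', ''writeCsv'', ''evalTable'' and ''project'' are hypothetical names introduced for this example, the type synonyms are assumptions, and ''AsList'' is assumed to skip the header cell.

```haskell
import Data.List (elemIndex, intercalate, transpose)
import Data.Maybe (mapMaybe)

-- Assumed definitions from the earlier task sets.
type CSV = String
type Value = String
type Row = [Value]
type Table = [Row]

data QResult = CSV CSV | Table Table | List [String]

-- Reduced, hypothetical version of Query: only four constructors.
data Query
  = FromCSV CSV
  | ToCSV Query
  | AsList String Query
  | Projection [String] Query

class Eval a where
  eval :: a -> QResult

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

readCsv :: CSV -> Table
readCsv = map (splitOn ',') . lines

writeCsv :: Table -> CSV
writeCsv = intercalate "\n" . map (intercalate ",")

-- Evaluate a sub-query and insist on a table; anything else is an error here.
evalTable :: Query -> Table
evalTable q = case eval q of
  Table t -> t
  _       -> error "expected a query that evaluates to a Table"

-- Keep the columns named in 'names', in the requested order.
-- Unknown column names are silently skipped.
project :: [String] -> Table -> Table
project _ [] = []
project names tbl@(header : _) =
  transpose [cols !! i | i <- mapMaybe (`elemIndex` header) names]
  where cols = transpose tbl

instance Eval Query where
  eval (FromCSV str)        = Table (readCsv str)
  eval (ToCSV q)            = CSV (writeCsv (evalTable q))
  eval (AsList colname q)   = List (concat (drop 1 (project [colname] (evalTable q))))
  eval (Projection names q) = Table (project names (evalTable q))
```

For instance, ''eval (AsList "Name" (FromCSV "Name,Grade\nAna,9"))'' yields ''List ["Ana"]'' under these assumptions.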
- | - **Projection colnames query**: extract specified columns from table. | + | ===== Filters & filter conditions ===== |
+ | |||
+ | You may have noticed that the ''Filter'' constructor is commented out. You can **uncomment** it at this stage. Filtering receives special treatment: because filter conditions are usually complex, it is better to build one complex condition than to perform successive filter queries. For this reason, we define type ''FilterCondition a'', illustrated below: | ||
+ | |||
+ | <code haskell> | ||
+ | data FilterCondition a = | ||
+ | Eq String a | | ||
+ | Lt String a | | ||
+ | Gt String a | | ||
+ | In String [a] | | ||
+ | FNot (FilterCondition a) | | ||
+ | FieldEq String String | ||
+ | </code> | ||
+ | |||
+ | **Remark:** the type ''FilterCondition a'' is **polymorphic** because such conditions may be expressed over (in this homework) two types: | ||
+ | * ''Float'' and | ||
+ | * ''String'' | ||
+ | |||
+ | We briefly explain what each condition expresses: | ||
+ | - ''Eq colname ref'': checks if value from column ''colname'' is equal to ''ref''. | ||
+ | - ''Lt colname ref'': checks if value from column ''colname'' is less than ''ref''. | ||
+ | - ''Gt colname ref'': checks if value from column ''colname'' is greater than ''ref''. | ||
+ | - ''In colname list'': checks if value from column ''colname'' is in list. | ||
+ | - ''FNot cond'': negates condition. | ||
+ | - ''FieldEq colname1 colname2'': checks if values from columns ''colname1'' and ''colname2'' are equal. | ||
+ | |||
=== FilterCondition Evaluation === | === FilterCondition Evaluation === | ||
- | Let's take a look at FilterCondition. It is used in a Filter query, in order | + | |
- | to filter the entries (rows) in a table, based on a condition. | + | A ''FilterCondition'' must evaluate to an actual filtering function, which has type: |
- | We are going to define class FEval, which contains function feval, through | + | <code haskell> |
- | which we evaluate a FilterCondition to a function of type FilterOp (defined | + | type FilterOp = Row -> Bool |
- | also below). In order to do so, feval will also receive the column names | + | </code> |
- | (the table head). | + | |
+ | Since such filtering functions work differently for ''FilterCondition Float'' and ''FilterCondition String'', we need a class ''FEval'' which contains function ''feval''. The latter is used to evaluate a ''FilterCondition a'' to a function of type ''FilterOp''. In order to do so, ''feval'' needs information about the column names (the table head), hence its type is shown below. | ||
<code haskell> | <code haskell> | ||
class FEval a where | class FEval a where | ||
feval :: [String] -> (FilterCondition a) -> FilterOp | feval :: [String] -> (FilterCondition a) -> FilterOp | ||
- | type FilterOp = Row -> Bool | ||
</code> | </code> | ||
- | - **Eq colname ref**: checks if value from column colname is equal to ref. | + | |
- | - **Lt colname ref**: checks if value from column colname is less than ref. | + | **Task 3.x.**: Your task is to write the instances for ''(FEval Float)'' and ''(FEval String)''. |
- | - **Gt colname ref**: checks if value from column colname is greater than ref. | + | |
- | - **In colname list**: checks if value from column colname is in list. | + | |
- | - **FNot cond**: negates condition. | + | |
- | - **FieldEq colname1 colname2**: checks if values from columns colname1 and | + | |
- | colname2 are equal. | + | |
- | Your task is to write the instances for (FEval Float) and (FEval String). | + | |
<code haskell> | <code haskell> | ||
instance FEval Float where | instance FEval Float where | ||
Line 104: | Line 129: | ||
... | ... | ||
</code> | </code> | ||
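A possible shape for the ''Float'' instance is sketched below. The helper ''cell'' is a hypothetical name, the type synonyms are assumptions, and the sketch assumes every inspected cell parses as a number (an empty cell would make ''read'' fail).

```haskell
import Data.List (elemIndex)

-- Assumed definitions from the earlier task sets.
type Value = String
type Row = [Value]
type FilterOp = Row -> Bool

data FilterCondition a
  = Eq String a
  | Lt String a
  | Gt String a
  | In String [a]
  | FNot (FilterCondition a)
  | FieldEq String String

class FEval a where
  feval :: [String] -> FilterCondition a -> FilterOp

-- Look up the cell of 'row' under column 'name', using the table head.
-- An unknown column is treated as an error in this sketch.
cell :: [String] -> String -> Row -> Value
cell header name row = case elemIndex name header of
  Just i  -> row !! i
  Nothing -> error ("unknown column: " ++ name)

instance FEval Float where
  feval hd cond = \row -> case cond of
      Eq col ref    -> num col row == ref
      Lt col ref    -> num col row < ref
      Gt col ref    -> num col row > ref
      In col refs   -> num col row `elem` refs
      FNot c        -> not (feval hd c row)
      FieldEq c1 c2 -> num c1 row == num c2 row
    where
      -- Parse the cell under 'col' as a Float.
      num col row = read (cell hd col row) :: Float
```

With such an instance, the rows of a table can be filtered with ''filter (feval hd cond) rows'', where ''hd'' is the header row. The ''String'' instance would follow the same pattern without the ''read'' step.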
- | Now you can write the evaluation for Filter query (**eval**). | + | |
- | === Graph Query === | + | Now you can write the evaluation for the data constructor ''Filter query'' (see function **eval** from the previous section). |
- | We define a graph as a table with column names: ["From", "To", "Value"]. | + | |
- | Each row defines a weighted edge between node "From" and node "To". | + | |
- | - **Graph edgeop query**: creates a graph starting from the table | + | ===== Graph queries ===== |
- | query evaluates to. | + | |
- | The nodes are the rows in table T. | + | A **graph** is a special kind of table which has precisely the following column names: ''["From", "To", "Value"]''. Each row defines a **weighted edge** between node ''From'' and node ''To''. |
- | The weight of an edge between 2 nodes is given by edgeop. We will only | + | |
- | keep the edge between row1 and row2 if (edgeop row1 row2) > 0. | + | The query ''Graph edgeop query'' creates such a table starting from the result of the evaluation of ''query''. Suppose the query evaluates to a table **T**. |
- | In the resulting table, a row describes an edge between node_i and node_j | + | |
- | and will have the values: | + | * The nodes are the **rows** in table **T**. |
- | "From" = first column from node_i | + | * The weight of an edge between 2 nodes is given by ''edgeop'', which returns a ''Maybe Value''. If ''edgeop row1 row2'' returns ''Nothing'', then we don't have an edge between those 2 nodes. If it returns ''Just val'' then we have an edge between ''row1'' and ''row2'' of weight ''val''. |
- | "To" = first column from node_j | + | * In the resulting table, each row describes an edge between node_i and node_j and will have the values: |
- | "Value" = edgeop node_i node_j | + | * "From" = first column from node_i |
- | The edge node_i-node_j is the same as node_j-node_i, so it should only | + | * "To" = first column from node_j |
- | appear once. "From" value should be lexicographically before "To". | + | * "Value" = edgeop node_i node_j |
- | === Similarities graph, using queries === | + | |
+ | The edge //node_i-node_j// is the same as //node_j-node_i//, so it should only appear once (graphs are undirected). The "From" value should be lexicographically before the "To" value. | ||
+ | |||
+ | **Example:** Suppose **T** is the table shown below: | ||
+ | <code> | ||
+ | Name Grade Class | ||
+ | Mihai 9 321 | ||
+ | Andrei 8 322 | ||
+ | Stefan 10 321 | ||
+ | Ana 9 322 | ||
+ | </code> | ||
+ | |||
+ | If we would like to build a graph that connects all students in the same class, then: | ||
+ | <code haskell> | ||
+ | edgeop [_,_,z] [_,_,c] | ||
+ | | z == c = Just c | ||
+ | | otherwise = Nothing | ||
+ | </code> | ||
+ | |||
+ | and the resulting graph will be: | ||
+ | <code> | ||
+ | From To Value | ||
+ | Mihai Stefan 321 | ||
+ | Ana Andrei 322 | ||
+ | </code> | ||
+ | |||
+ | If we would like to build a graph that connects students whose grades are equal or differ by **at most** one point, then: | ||
+ | <code haskell> | ||
+ | edgeop [_,x,_] [_,y,_] | ||
+ | | abs ((read x :: Int) - (read y :: Int)) <= 1 = Just "similar" | ||
+ | | otherwise = Nothing | ||
+ | </code> | ||
+ | |||
+ | and the resulting graph is: | ||
+ | <code> | ||
+ | From To Value | ||
+ | Andrei Mihai similar | ||
+ | Mihai Stefan similar | ||
+ | Ana Mihai similar | ||
+ | Ana Andrei similar | ||
+ | Ana Stefan similar | ||
+ | </code> | ||
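The construction described in this section can be sketched as follows. The function name ''graph'' and the values ''sameClass'' and ''students'' are hypothetical, the type synonyms are assumptions, and ''edgeop'' is assumed to be symmetric (it is called once per unordered pair of rows).

```haskell
import Data.List (tails)

-- Assumed definitions from the earlier task sets.
type Value = String
type Row = [Value]
type Table = [Row]
type EdgeOp = Row -> Row -> Maybe Value

-- Build the edge table from a table whose first row is the header.
-- A completely empty input yields an empty table.
graph :: EdgeOp -> Table -> Table
graph _ [] = []
graph edgeop (_header : rows) = ["From", "To", "Value"] : edges
  where
    edges =
      [ [min a b, max a b, val]           -- "From" is lexicographically first
      | (r1@(a : _) : rest) <- tails rows -- each row paired only with later rows
      , r2@(b : _) <- rest
      , Just val <- [edgeop r1 r2]        -- Nothing means: no edge
      ]

-- The first edge operation from the example above, with a fallback
-- clause for rows that do not have exactly three cells.
sameClass :: EdgeOp
sameClass [_, _, z] [_, _, c]
  | z == c = Just c
sameClass _ _ = Nothing

students :: Table
students =
  [ ["Name", "Grade", "Class"]
  , ["Mihai", "9", "321"]
  , ["Andrei", "8", "322"]
  , ["Stefan", "10", "321"]
  , ["Ana", "9", "322"]
  ]
```

Under these assumptions, ''graph sameClass students'' produces the header row followed by the two edges of the first example, ''Mihai - Stefan'' and ''Ana - Andrei''.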
+ | |||
+ | |||
+ | ==== Similarities graph, using queries ==== | ||
We want to check the similarities between students' lecture points. | We want to check the similarities between students' lecture points. | ||
- | For that, we want to obtain a graph where "From" and "To" are students' | + | * For that, we want to obtain a graph where "From" and "To" are students' emails and "Value" is the distance between the 2 students' points. |
- | emails and "Value" is the distance between the 2 students' points. | + | * We define the distance between stud1 and stud2 as ''the number of questions where they both received the same points''. Keep only the rows with ''distance >= 5''. |
- | We define the distance between stud1 and stud2 as the sum of questions | + | * The edges in the resulting graph (the rows in the resulting table) should be sorted by the "Value" column. If email is missing, don't include that entry. |
- | where they both received the same points. | + | |
- | Also, the edges in the resulting graph (the rows in the resulting table) | + | Your task is to write ''similarities_query'' as a **sequence of queries** that, once evaluated, results in the graph described above. |
- | should be sorted by the "Value" column. Keep only the rows with | + | |
- | distance >= 5. If email is missing, ignore that entry. | + | **Note**: ''similarities_query'' is a Query. The checker applies ''eval'' on it. |
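The distance itself could be expressed as an edge operation, as in the sketch below. The name ''distance'' and the row layout (email in the first cell, then one cell of points per question) are assumptions made for illustration; the real table may need a projection or other queries first to reach this shape. Applying the threshold inside the edge operation is one option; a ''Filter'' query is another.

```haskell
-- Assumed definitions from the earlier task sets.
type Value = String
type Row = [Value]

-- Hypothetical edge operation: counts the questions on which both students
-- received the same points, and keeps the edge only if that count is >= 5.
-- Assumed row layout: email first, then one cell per question.
distance :: Row -> Row -> Maybe Value
distance (email1 : points1) (email2 : points2)
  | null email1 || null email2 = Nothing  -- missing email: no edge
  | same >= 5                  = Just (show same)
  | otherwise                  = Nothing
  where
    same = length (filter id (zipWith (==) points1 points2))
distance _ _ = Nothing
```

Note that the resulting "Value" cells are strings, so sorting them as text places "10" before "5"; a numeric comparison may be needed when sorting by that column.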
- | Your task is to write **similarities_query** as a sequence of | + | |
- | queries, that once evaluated results in the graph described above. | + | ===== TL;DR Tasks ===== |
- | === TL;DR Tasks === | + | - Enroll ''Query'' in class ''Eval'' (without ''Filter'' or ''Graph''). **0.3p** |
- | 1. Enroll Query in class Eval (without Filter or Graph). 0.2p | + | - Enroll ''FilterCondition'' in class ''FEval'' and implement ''eval'' for ''Filter'' query. **0.2p** |
- | 2. Enroll FilterCondition in class FEval and implement eval for Filter query. 0.2p | + | - Implement ''eval'' for ''Graph'' query. **0.2p** |
- | 3. Implement eval for Graph query. 0.2p | + | - Extract similarity graph. **0.3p** |
- | 4. Get graph for similarities. 0.3p | + | |
- | === Checker === | + | ===== Checker ===== |
- | === Submit === | + | |
+ | ===== Submit ===== | ||
**Deadline**: 16.05, 23:50. | **Deadline**: 16.05, 23:50. | ||
**Vmchecker**: TBA. | **Vmchecker**: TBA. |