====== Homework 2. Sets as functions ======

In this homework, you will implement a **binary search tree**, which you will use to gather statistics about the **words** of a given text. Generally, in a [[https://en.wikipedia.org/wiki/Binary_search_tree| binary search tree]]:
  * each non-empty node contains **exactly one value** and **two children**
  * all values from the **left** sub-tree are smaller than or equal to that of the current node
  * all values from the **right** sub-tree are larger than or equal to that of the current node

In your project, the **value** of each node will be a ''Token'' object. The class ''Token'' is already implemented for you:
<code scala>
case class Token(word: String, freq: Int)
</code>

A token stores the **number of occurrences**, or **frequency** ''freq'', of a string ''word'' in a text.

Your binary search tree will use **frequencies** as an ordering criterion. For instance, the text ''All for one and one for one'' may be represented by the tree:
<code>
       for (2)
      /       \
  and (1)    one (3)
   /
all (1)
</code>
Notice that several different binary search trees can represent the same text; however, you do not need to take this into account in this homework.

Our tree is called ''WTree'', and is implemented by the following case classes:
<code scala>
case object Empty extends WTree
case class Node(word: Token, left: WTree, right: WTree) extends WTree
</code>

''WTree'' implements the following trait:
<code scala>
trait WTreeInterface {
  def isEmpty: Boolean
  def filter(pred: Token => Boolean): WTree
  def ins(w: Token): WTree
  def contains(s: String): Boolean
  def size: Int
}
</code>
The method ''ins'' is already implemented, but the rest must be implemented by you.

The project has two parts:
  * **building a WTree** from a text, and
  * **using a WTree** to gather information about that particular text.

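To see how the case classes fit together, the example tree for ''All for one and one for one'' can be written out by hand. This is only a sketch for experimenting in a separate file: the ''WTree'' below is a stripped-down stand-in (an assumption, not the template's class) that omits the methods of ''WTreeInterface'', and the names ''TreeByHand'' and ''rootFreq'' are made up for illustration.

<code scala>
// Stand-ins for the template's types. Assumption: the real WTree is an
// abstract class that also implements WTreeInterface; that is omitted here.
case class Token(word: String, freq: Int)

sealed trait WTree
case object Empty extends WTree
case class Node(word: Token, left: WTree, right: WTree) extends WTree

object TreeByHand {
  // Leaves first, then the inner nodes; ordering is by frequency.
  val all: WTree = Node(Token("all", 1), Empty, Empty)
  val and: WTree = Node(Token("and", 1), all, Empty)   // all (1) <= and (1)
  val one: WTree = Node(Token("one", 3), Empty, Empty)
  val tree: WTree = Node(Token("for", 2), and, one)    // 1 <= 2 <= 3

  // Pattern matching exposes the token stored at the root.
  val rootFreq: Int = tree match {
    case Node(Token(_, freq), _, _) => freq
    case Empty                      => 0
  }
}
</code>

Here ''TreeByHand.rootFreq'' evaluates to ''2'', the frequency of ''for''. In the homework itself, trees are not written by hand but built with ''ins''.
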
In the next section you will find implementation details for both parts.

===== Implementation =====

**1.** Write a function ''split'' which splits a text using the whitespace character as a separator. Consecutive whitespace characters should be treated as a single separator. If the text contains only whitespace, ''split'' should return the empty list. (//Hints: Your implementation must be recursive, but do not try to make it tail-recursive, as this will make your code unnecessarily complicated. Several patterns over lists, in the proper order, will make the implementation cleaner.//)
<code scala>
/* split(List('h','i',' ','t','h','e','r','e')) = List(List('h','i'), List('t','h','e','r','e')) */
def split(text: List[Char]): List[List[Char]] = ???
</code>

**2.** Write a function which computes a list of ''Token'' from a list of strings. Recall that tokens keep track of the frequency of each string. Use an auxiliary function ''insWord'' which inserts a new string in a list of tokens. If the string is already a token, its frequency is incremented; otherwise it is added as a new token. (//Hint: the cleanest way to implement ''aux'' is to use one of the two folds//).
<code scala>
def computeTokens(words: List[String]): List[Token] = {
  /* insert a new string in a list of tokens */
  def insWord(s: String, acc: List[Token]): List[Token] = ???

  def aux(rest: List[String], acc: List[Token]): List[Token] = ???

  ???
}
</code>

**3.** Write a function ''tokensToTree'' which creates a ''WTree'' from a list of tokens. Use the insertion function ''ins'', which is already implemented. (//Hint: you can implement it as a single fold call, but you have to choose the right one//)
<code scala>
def tokensToTree(tokens: List[Token]): WTree = ???
</code>

**4.** Write a function ''makeTree'' which takes a string and builds a ''WTree''. ''makeTree'' relies on all the previous functions you implemented. You should use ''_.toList'', which converts a ''String'' to ''List[Char]''.
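As a small illustration of this conversion (a sketch, not part of the template): ''toList'' turns a string into its characters, and the standard collection method ''mkString'' goes the other way. The second direction may be handy because ''split'' returns ''List[List[Char]]'' while ''computeTokens'' expects ''List[String]''. The object name ''ConversionDemo'' is hypothetical.

<code scala>
object ConversionDemo {
  // String -> List[Char]
  val chars: List[Char] = "hi there".toList

  // List[Char] -> String (mkString with no separator concatenates the characters)
  val word: String = List('h', 'i').mkString

  // Applied to every element of a list of character lists
  val words: List[String] =
    List(List('h', 'i'), List('t', 'h', 'e', 'r', 'e')).map(_.mkString)
}
</code>

Here ''chars'' equals ''List('h','i',' ','t','h','e','r','e')'', ''word'' equals ''"hi"'', and ''words'' equals ''List("hi", "there")''.
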
You can also use ''andThen'', which allows writing a concise and clear implementation. ''andThen'' is explained in detail in the next section.
<code scala>
def makeTree(s: String): WTree = ???
</code>

**5.** Implement the member method ''size'', which must return the number of non-empty nodes in the tree.

**6.** Implement the member method ''contains'', which must check if a string is a member of the tree (no matter its frequency).

**7.** Implement the ''filter'' method in the abstract class ''WTree''. ''filter'' will rely on the tail-recursive ''filterAux'' method, which must be implemented in the case classes ''Empty'' and ''Node''.

**8.** In the code template you will find a string: ''scalaDescription''. Compute the number of occurrences of the keyword "Scala" in ''scalaDescription''. Use word-trees and any of the previous functions you have defined.
<code scala>
def scalaFreq: Int = ???
</code>

**9.** Find how many programming languages are referenced in the same text. You may consider that a programming language is any keyword which starts with an uppercase character. To reference character ''i'' in a string ''s'', use ''s(i)''. You can also use the method ''_.isUpper''.
<code scala>
def progLang: Int = ???
</code>

**10.** Find how many words which are not prepositions or conjunctions appear in the same text. You may consider that a preposition or conjunction is any word whose length is less than or equal to 3.
<code scala>
def wordCount: Int = ???
</code>

**Note:** In order to be graded, exercises 5 to 9 must rely on a correct implementation of the previous parts of the homework.

===== Using andThen =====

Suppose you want to apply a **sequence** of transformations over an object ''o''. Some of them may be functions (''f'', ''g'') while others may be member functions (''m1'', ''m2'').
Instead of writing an expression such as ''g(f(o).m1).m2'', which applies the transformations ''f'', ''m1'', ''g'', ''m2'' to ''o'' in that order, you can use ''andThen'':
<code scala>
// Schematic: f, g, m1 and m2 are placeholders. In real code the first
// lambda needs an explicit parameter type so that the compiler can infer the rest.
val sequence = (x => f(x)) andThen (_.m1) andThen (x => g(x)) andThen (_.m2)
</code>
which is more legible and easier to debug.

====== Homework 1. Sets as functions ======

===== Problem statement =====

Sets are **unordered** collections of **unique** elements. There are several ways to store sets. One of them relies on **characteristic functions**. Such **functional sets** are especially useful if we expect many **insert/retrieve** operations and fewer **traversals** in our code.

A **characteristic function** of a set $math[A \subseteq U] is a function $math[f: U \rightarrow \{0,1\}] which assigns $math[f(x) = 1] for each element $math[x \in A] and $math[f(x) = 0] for each element $math[x \not\in A].

In our implementation, $math[U] will be the set of integers, so we shall encode only **sets of integers**. Hence, the type of a set will be:
<code scala>
type Set = Int => Boolean
</code>

For instance, the set $math[\{1,2,3\}] will be encoded by the anonymous function:
<code scala>
(x: Int) => (x == 1 || x == 2 || x == 3)
</code>
Also, the empty set can be encoded as:
<code scala>
(x: Int) => false
</code>
while the entire set of integers may be encoded as:
<code scala>
(x: Int) => true
</code>

**1.** Write a function ''singleton'' which takes an integer and returns **the set** containing only that integer:
<code scala>
def singleton(x: Int): Set = ???
</code>
Note that ''singleton'' could have been equivalently defined as ''def singleton(x: Int)(e: Int): Boolean = ???''. However, the previous variant is more legible, in the sense that it highlights the idea that we are returning **set objects**, namely **characteristic functions**.

**2.** Write a function ''member'' which takes a set and an integer and checks if the integer is a member of the set.
Note that ''member'' should be defined and called as a curried function:
<code scala>
def member(e: Int)(set: Set): Boolean = ???
</code>

**3.** Write a function ''ins'' which inserts a new element in a set. More precisely, given $math[x] and $math[set], ''ins'' returns a new set $math[\{x\} \cup set].
<code scala>
def ins(x: Int)(set: Set): Set = ???
</code>

**4.** Write a function ''fromBounds'' which takes two integer bounds ''start'' and ''stop'' and returns the set $math[\{start, start+1, \ldots, stop\}]. It is guaranteed that $math[start \leq stop] (you do not need to check this condition in your implementation).
<code scala>
def fromBounds(start: Int, stop: Int): Set = ???
</code>

**5.** Write the function which performs the union of two sets:
<code scala>
def union(set1: Set, set2: Set): Set = ???
</code>

**6.** Write a function which computes the complement of a set with respect to the set of integers:
<code scala>
def complement(s1: Set): Set = ???
</code>

**7.** Write a function which adds to the value ''b'' all elements of a set that lie within the given **bounds**. Use a tail-recursive helper function:
<code scala>
def sumSet(b: Int)(start: Int, stop: Int)(set: Set): Int = {
  def auxSum(crt: Int, acc: Int): Int = ???
  ???
}
</code>

**8.** Generalise the previous function such that we can **fold** a set using any binary commutative operation over integers. Make sure this is a **left** fold: folding the set ''{x,y,z}'' with ''b'' should produce ''( (b op x) op y) op z''.
<code scala>
def foldLeftSet(b: Int)     // initial value
  (op: (Int,Int) => Int)    // folding operation
  (start: Int, stop: Int)   // bounds (inclusive)
  (set: Set): Int = ???     // the set to be folded
</code>

**9.** Implement an alternative to the previous function, namely ''foldRightSet''. Applying ''foldRightSet'' on the set ''{x,y,z}'' with ''b'' should produce ''x op (y op (z op b))''. Use direct recursion instead of tail recursion.
<code scala>
def foldRightSet(b: Int)    // initial value
  (op: (Int,Int) => Int)    // folding operation
  (start: Int, stop: Int)   // bounds (inclusive)
  (set: Set): Int = ???     // the set to be folded
</code>

**10.** Implement the operation ''filter'' which takes a set and returns another one containing only those elements that satisfy the predicate:
<code scala>
def filter(p: Int => Boolean)(set: Set): Set = ???
</code>

**11.** Implement a function which **partitions** a set into two sets. The first contains those elements that satisfy the predicate, while the second contains those elements that do not satisfy the predicate. Use pairs. A pair is constructed with simple parentheses, e.g. ''(1,2)'' is a pair of two integers. Suppose ''val p: (Int,Int)'' is another pair of two integers. Then ''p._1'' is the first component of the pair while ''p._2'' is the second component.
<code scala>
def partition(p: Int => Boolean)(set: Set): (Set,Set) = ???
</code>

**12.** Implement a function ''forall'' which checks if all elements in a given range of a set satisfy a predicate (condition). (Such a condition may be that all elements within the given bounds are even numbers).
<code scala>
def forall(cond: Int => Boolean) // condition to be checked
  (start: Int, stop: Int)        // start, stop values (inclusive)
  (set: Set): Boolean = ???      // set to be checked
</code>

**13.** Implement a function ''exists'' which checks if a predicate holds for **some** element from the range of a set. Hint: it is easier to implement ''exists'' using the logical relation: $math[ \exists x. P(x) \iff \lnot \forall x.\lnot P(x)].

**14.** Implement the function ''setOfDivByK'' which returns the set of integers divisible by a value ''k''. Use the appropriate functions you have defined.
<code scala>
def setOfDivByK(k: Int): Set = ???
</code>

**15.** Implement the function ''moreDivs'' which checks whether ''set1'' contains more divisors of ''k'' than ''set2'', over the range ''[start,stop]''.
Use any combination of the previous functions you have defined for your implementation.
<code scala>
def moreDivs(k: Int)(start: Int, stop: Int)(set1: Set, set2: Set): Boolean = ???
</code>

===== Submission rules =====

  * Please follow the [[fp2023:submission-guidelines| Submission guidelines]], which are the same for all homework assignments.
  * To solve your homework, download the {{:fp2023:???|Project template}}, import it in IntelliJ, and you are all set. Do not rename the project manually, as this may cause problems with IntelliJ.