Tutorial 1 (week 2) Solutions · CS3217 Documents

Problem Set 1 Discussion

For Stack, one can use a normal Array with the append and popLast methods.

For Queue, there are several ways to do it:
- Using two stacks, see, e.g. Cracking The Coding Interview
- Using a linked list. If you are using a double linked list, there is the danger of retain cycles causing memory leaks.
- Using an ArraySlice, which has an efficient popFirst method
- Using a dictionary [Int: T]. Think of it as an array instead: to enqueue, you insert to the dictionary with a number greater than everything else, and to dequeue, you pop the smallest number.
One of the changes that is intended is to modify Node and Edge to be hashable instead of just equatable. To allow for this, you also need to modify the type variable T to be hashable as well. The reason for doing this is so that one can use an adjacency list to back the graph, i.e. [N: Set<E>] or [N: [E]].

Note that this actually violates the open-closed principle, as we are modifying the class instead of extending it. In particular, this means that if I have a complicated object that is only equatable, I cannot use this anymore.

One easy way to prevent this is to create a separate class, e.g. FastNode and FastEdge, that adopts the changes, instead of changing Node and Edge directly.
Several invalid reasons to not use checkRepresentation:
- It slows down performance. assert is turned off in production, so this does not affect performance in production.
- The listed invariants are already satisfied. This is invalid, as you still need to make sure that the invariants your specific implementation brings is also fulfilled. Specifically, if you are using an adjacency list in the style of [N: Set<E>], you need to make sure that e.g. [Node(1): Edge(from: 3, to: 4)] is not allowed.
Several valid reasons to not use checkRepresentation:
- Writing proper unit tests to ensure correctness instead. This is definitely an acceptable alternative to using checkRepresentation as a safety net.
- checkRepresentation slows down testing, and removing the assert statements speed up testing. Ideally you want to include some benchmarks, e.g. it took 2 hours to run, now it is only 2 minutes.
Recommended places to put checkRepresentation:
- End of constructors
- End of mutating methods
- Start of mutating methods
- End of non-mutating methods
Putting it at the start of constructors is useless as there is nothing to be checked, and putting it at the beginning of non-mutating methods is useless as well if the method is truly non-mutating. Putting at the start of mutating methods and end of non-mutating methods gives an additional safety net; for example, if a new developer joins your team, they might change things such that this safety net is useful.
Adjacency list is the most reasonable implementation, as it is fast specifically for the use case of doing BFS/DFS, and it is easier to implement.

It also takes up less space compared to adjacency matrix, only in the case of sparse graphs. Note that, for example, in complete graphs, an adjacency matrix is actually the most space-conserving implementation.
It would be recommended to write from scratch, as it will essentially only be 4 lines:
```
struct Tree<T> {
    var value: T
    var children: [Tree<T>]
}
```

Tutorial Questions

Here is one possible approach.

func createPerson(fromRow row: [String]) -> Person? {
    let person = row.split(separator: ",")
    guard person.count == 2,
          let name = String(person[0]),
          let age = Int(person[1]) else {
        return nil
    }
    return Person(name: name, age: age)
}

/// This function ignores rows that represent invalid input.
func deserialise(csvString: String) -> [Person] {
    let rowData = s.split(separator: "\n")
    return rowData.compactMap { createPerson($0) }
}

One can also write it in the original imperative style instead of functionally, and one can opt to null-coalesce the age variable (i.e. age ?? 0) instead of discarding it if it is invalid.

Special value is the least preferable as it is not communicative to the clients of the code -- an exception would be able to tell clearer things while serving as special values in some sense.

nil is preferable in these cases:
- cases where the client of the code can benefit from the language features, e.g. optional chaining instead of doing a series of try statements
- consistency with the language's features, e.g. failable initialisers
- where it is semantically correct for it to be a null value
Exceptions in preferable in these cases:
- when there is more than one way the code can "fail"
- when there is benefit in giving additional information
Analysis of the scenarios:

a. Throw an exception, so that one can include things like at which character the parser fails. In addition, null is a valid JSON input that should get parsed as nil, so we do not want nil to serve as invalid input as well.

b. Throw an exception, as opening a file might fail for various reasons like file does not exist, file is a directory, the user does not have sufficient permissions, and so on.

c. Trick question -- in many languages adding elements to a set returns nothing, e.g. in Python. This is consistent with the mathematical behaviour of sets. One might also opt to return a boolean value indicating whether the number exists in the set as well (i.e. return false if the number already exists).

d. Also trick question -- if the function checks that the array is not sorted, it will take O(n), defeating the purpose of binary search. In this case, performance constraints mean that one cannot afford to return nil or throw exceptions.

Here is a suggested answer.

/// Uses the Newton-Raphson method, set for 100 stages. It converges
/// for 1e-22 to 1e22.
/// It returns `nil` for invalid input, i.e. negative numbers.
func squareRoot(_ num: Double) -> Double? {
    if (num < 0) {
        return nil
    }

    var result: Double = 1
    for _ in 0..<100 {
        // One Newton-Raphson step
        result = (result + num / result) / 2
    }

    return result
}

One argument was made in favor of removing the "Newton-Raphson" description, as it is better placed in the documentation instead. Putting it in comments above the function this way makes it put inside Swift docs, hence it does serve that purpose in some sense.

The Swift way would be most preferable as it is the most descriptive. In addition, if we have our own objects, it would be best to construct an initialiser for the object that takes in a string so that one do not need to pollute the global namespace or monkey-patch things.

C++ has std::stoi to be consistent with C APIs such as atoi and atoll. Ruby is an expressive language, and having to_i, to_s, to_f, etc. allows the client of the code to chain things and write one-liners.