Nested List

Learning Objectives

At the end of this sub-unit, students should

know how to work with nested list.

Inductive Nested List

We first started with a general nested list that we defined inductively as follows.

Inductive Nested List

list :: []               # empty list
     :: [item] ++ list   # non-list followed by list
     :: [list] ++ list   # list followed by list

However, subsequently, we put restrictions on the structure such that we are dealing only with regular structure. This, of course, simplifies the problem significantly. But can we work with a general nested list?

Before we even answer that, is there even a use case for a nested list? Believe it or not, it can be used to model quite a number of things. The simplest example will be the directory structure in our computer. We can model with a nested list as shown below.

DirectoryAs Nested List

Nest01

Nest02

Counting

We have implemented a recursive function to count the number of element in a list, but we were only working with a non-nested list. Let us call such list as flat list. The code is reproduced below.

Counting

def count(lst):
  if len(lst) == 0:
    return 0
  else:
    return 1 + count(lst[1:])

Note that this corresponds to the inductive list definition below.

list :: []              # count is 0
     :: [item] ++ list  # count is 1 + count of list

But if we want to count all the non-list element in a nested list, this function will not give us the correct result. We can show this by running the code for a nested list. After all, the function is really just len(lst) implemented recursively.

>>> lst = [[1, 2], [[3, 4], 5], 6]
>>> count(lst)
3

So what can we do to make this code count all the non-list element? We do that in the same way we design the code for the inductive list. Based on the definition of inductive list, we managed to design a code that works for a flat list. It is instructive to compare the two design first.

Flat List

list :: []
     :: [item] ++ list
     # no nesting

Nested List

list :: []
     :: [item] ++ list
     :: [list] ++ list

We know the following from our earlier design.

The case [] is handled by len(lst) == 0.
The other case is handled by the else-block.

But this is because we only had two cases. Now that we have three cases, we need to split the last case. In particular, we need to check if the item is a list or not. How do we do that?

Recap that we can check the type of a value using the type(...) function. This will give us the type. We can use this to differentiate the two cases. If the item is a list, the following check must be true: type(item) == list. This gives us the basic structure of our code.

def count_nested(lst):
  if len(lst) == 0:        # []
    return 0
  else:
    item = lst[0]
    if type(item) != list: # [item] ++ list
      pass # todo
    else:                  # [list] ++ list
      pass # todo

We had similar problems before and we can remove the nesting with elif keyword. So let us do that first.

def count_nested(lst):
  if len(lst) == 0:          # []
    return 0
  elif type(lst[0]) != list: # [item] ++ list
    pass # todo
  else:                      # [list] ++ list
    pass # todo

Now the remaining question is what to do on the remaining two cases. As a side note, this way of writing code with pass # todo is called a stub. It is useful to see the structure of the code before writing the details.

Back to the problem. The easier case is when the item is not a list. In this case, the count is 1 plus whatever the count of the remaining list is.

def count_nested(lst):
  if len(lst) == 0:          # []
    return 0
  elif type(lst[0]) != list: # [item] ++ list
    return 1 + count_nested(lst[1:])
  else:                      # [list] ++ list
    pass # todo

But what about when the item is a list? In this case, we need to count the number of item in lst[0] and we need to count the number of item in lst[1:]. We know the latter part is count_nested(lst[1:]), what about the former? Since this list may yet still be nested, we need to use the function that can handle nested list. Luckily, we have such function. In fact, we are currently writing the code for one. So this is yet another recursion.

Counting

def count_nested(lst):
  if len(lst) == 0:          # []
    return 0
  elif type(lst[0]) != list: # [item] ++ list
    return 1 + count_nested(lst[1:])
  else:                      # [list] ++ list
    return count_nested(lst[0]) + count_nested(lst[1:])

>>> lst = [[1, 2], [[3, 4], 5], 6]
>>> count_nested(lst)
6

This is clearly a non-linear recursion as we have two calls to count_nested. How does this work? It is difficult to trace the execution without animation, so we will use reasoning instead. We separate the reasoning into the three cases.

Empty List

This is trivial, the number of element is simply 0.

Non-Nested First Element

An example case is the following. Note that lst[0] (i.e., highlighted in green) is the value 1. There is no recursion here and it corresponds to the value of 1 in 1 + .... On the other hand, lst[1:] (i.e., highlighted in red) can still be nested list. We do recursion (i.e., ... + count_nested(lst[1:])) and it should give us the total number of non-list element. Finally, the result is the sum of the green segment and the red segment.

Count01

Nested First Element

An example case is the following. In this case, even for lst[0] (i.e., highlighted in green), we need to already use recursion. While not shown, the element inside lst[0] may still contain an arbitrarily nested list. Similar as before, the case for lst[1:] is highlighted in red and will need to use recursion to compute the correct result. Also similar to before, the result is the sum of the green segment and the red segment.

Count02

Nested Map

The idea for solving nested counting is quite general for a nested list that we can immediately extend this concept to solve other problem. Take the case of multiplying all non-list element in the nested list by n. We did this for flat list before, but now we want to solve for nested list.

Since we are still using the inductive nested list view, the code structure will still be similar. So we write this down as stubs first.

def nested_map_n(lst, n):
  if len(lst) == 0:
    return []
  elif type(lst[0]) != list:
    pass # todo
  else:
    pass # todo

Now we need to figure out what to do in each case.

Non-List
- We simply have to multiply the first element (i.e., lst[0]) by n.
- Then we need to concatenate this with the result of multiplying the remaining parts of the list by n.
List
- Since the first element is a list, we need to multiply all non-list element by n.
- Then we need to concatenate this with the result of multiplying the remaining parts of the list by n.

Nested Map

def nested_map_n(lst, n):
  if len(lst) == 0:
    return []
  elif type(lst[0]) != list:
    return [lst[0] * n] + nested_map_n(lst[1:], n)
  else:
    return [nested_map_n(lst[0], n)] + nested_map_n(lst[1:], n)

>>> lst = [[1, 2], [[3, 4], 5], 6]
>>> nested_map_n(lst, 2)
[[2, 4], [[6, 8], 10], 12]

Flatten

There is an interesting property about nested map that we can exploit. To see this, we need to consider the code above. Recap that we can only concatenate list with list. That is why we need to wrap the first element with [lst[0] * n] in the case it is not a list.

Now we consider what will be the return type of nested_map(lst, n). If we look at each of the branches above, we can see that it will return a list. We first need to assume that the result of nested_map(lst, n) is a list.

Empty: We return an empty list [].
Non-List: We return a list concatenated with the result of nested_map(lst, n). Since we assume that the result of nested_map(lst, n) is a list, we have a list concatenated with a list, which gives us a list.
List: Same argument as non-list.

However, note that if the result of nested_map(lst, n) is already a list, then in the last case, we do not need to wrap this again in a list (i.e., [nested_map(lst[0], n)]). So why do we need this? The answer is to preserve the nested hierarchy of the original list.

But if the result of nested_map(lst, n) is already a list, we can easily concatenate it. So what will happen if we do not wrap this in another list. Well, we will lose the nested hierarchy of the original list. The result of nested_map(lst, n) is supposed to be nested back, but if we do not wrap this again in a list, we unwrap the list. The final result will be a flattened list. So to change from nested list to a flat list, we can use a modification of the nested map.

Flatten

def flatten(lst):
  if len(lst) == 0:
    return []
  elif type(lst[0]) != list:
    return [lst[0]] + flatten(lst[1:])
  else:
    return flatten(lst[0]) + flatten(lst[1:])

>>> lst = [[1, 2], [[3, 4], 5], 6]
>>> flatten(lst)
[1, 2, 3, 4, 5, 6]