This course has already ended.

The latest instance of the course can be found at: O1: 2024

Luet oppimateriaalin englanninkielistä versiota. Mainitsit kuitenkin taustakyselyssä osaavasi suomea. Siksi suosittelemme, että käytät suomenkielistä versiota, joka on testatumpi ja hieman laajempi ja muutenkin mukava.

Suomenkielinen materiaali kyllä esittelee englanninkielisetkin termit. Myös suomenkielisessä materiaalissa käytetään ohjelmaprojektien koodissa englanninkielisiä nimiä kurssin alkupään johdantoesimerkkejä lukuunottamatta.

Voit vaihtaa kieltä A+:n valikon yläreunassa olevasta painikkeesta. Tai tästä: Vaihda suomeksi.


Chapter 6.4: More Collections, More Programs

About This Page

Questions Answered: How about some more handy collection methods? How do I use these methods in combination with each other?

Topics: More methods on collections; more use cases for the previously introduced methods. We’ll also devote a few words to local functions and closures.

What Will I Do? Mostly program.

Rough Estimate of Workload:? Three or four hours.

Points Available: A95 + B30 + C30.

Related Projects: Election, FlappyBug, HigherOrder.

../_images/person09.png

Computing a Result from Elements

The foldLeft method

Chapter 6.1 ended with you implementing turnElementsIntoResult, a function for computing a result (such as a sum) by processing each element of a given collection in turn and repeatedly applying a given function.

A similar very generic method is also available on Scala collections. It’s called foldLeft. Here is one way to sum up the numbers in a collection:

val numbers = Vector(10, 5, 4, 5, -20)numbers: Vector[Int] = Vector(10, 5, 4, 5, -20)
numbers.foldLeft(0)( (sumSoFar, next) => sumSoFar + next )res0: Int = 4
foldLeft has two parameter lists (like tabulate did in Chapter 6.1). In the first list goes the initial value, which is also the end result in case the collection is empty. And in the second list...
... goes the function that combines each intermediate result with the next element. In this first example, we use a simple summing function.

Here is a re-run of the animation from Chapter 6.1, this time with an updated vocabulary.

There’s a simpler way to sum numbers with foldLeft; see below. An example of multiplying all the elements is also shown.

numbers.foldLeft(0)( _ + _ )res1: Int = 4
numbers.foldLeft(1)( _ * _ )res2: Int = -20000

In the following example, the function that we pass to foldLeft adds the square of its second parameter (the element) to the first parameter (the intermediate result). This yields the sum of the squares of all the elements:

numbers.foldLeft(0)( (sum, next) => sum + next * next )res3: Int = 566

Side note: an easier way to add and multiply elements

Because this chapter is supposed to teach you foldLeft, and because summing is a nice, simple first example, we just conveniently “forgot” something.

Summing the values in a collection is a common need, and there’s a bespoke method for it in the Scala API. You can just write numbers.sum. That method is available only on collections whose elements are numeric. There’s also a product method.

So that works for summing and multiplying, but not all specific needs have a ready-made solution in the form of a method. As you already saw, foldLeft can do either of those two things, and much more besides.

More folds: checking a collection’s elements

Above, foldLeft’s return value had the same type as the collection’s elements (Int, in this case). That doesn’t need to be the case. You can, for instance, fold over numbers but produce a Boolean.

The next examples of foldLeft search a collection for numbers that meet a criterion, much like the previously familiar exists method does.

This corresponds to numbers.exists( _ < 0 ):

numbers.foldLeft(false)( (foundYet, next) => foundYet || next < 0 )res4: Boolean: true
The initial value is false. This is what the fold will produce on an empty vector (in which a negative number does not exist).
The function literal’s first parameter is of type Boolean. This intermediate result indicates whether a negative element has been found so far.
The parameter function returns true if and only if a negative element had already been found or if the element at hand is negative.

Here’s a shorter version of the above as well as another example that fails to find a match:

numbers.foldLeft(false)( _ || _ < 0 )res5: Boolean = true
numbers.foldLeft(false)( _ || _ < -100 )res6: Boolean = false

More folds: combining pictures

The code below forms an image by combining multiple smaller images. What exactly does the code do and what kind of combined image does it produce? Read the code, make a prediction, and then try to see if you were right.

val base = triangle(50, 50, Yellow)base: Pic = triangle-shape
val circles = (1 to 255).map( n => circle(n, Color(n, 0, 255)) )circles: IndexedSeq[Pic] = Vector(circle-shape, circle-shape, circle-shape, circle-shape, circle-shape, [...])
val artwork1 = circles.foldLeft(base)( _.onto(_) )artwork1: Pic = combined pic
val artwork2 = (1 to 50).foldLeft(circle(50, Red))( (result, n) => result.leftOf(circle(n, Black)) )artwork2: Pic = combined pic

And perhaps you’ll recall the function placeStar that you wrote for the Stars app in Chapter 4.4? We didn’t discuss the matter then, but one of the given methods in the same project uses your placeStar function repeatedly in a fold. Here’s the code:

def create(skyData: StarMap, bgSize: Int) = {
  val darkSky = rectangle(bgSize, bgSize, Black)
  skyData.stars.foldLeft(darkSky)(placeStar)
}

Levels of abstraction revisited

Getting to know foldLeft

These five questions aren’t crucial for success in O1, but they can be educational nonetheless.

For this question and the next two, assume numbers is a variable of type Vector[Int] and refers to a vector that contains the integers 1, 2, and 3, in that order.

Consider the expression below. Notice that the strings contain space characters.

numbers.foldLeft("Numbers: ")( (result, next) => result + next + " " )

Which of the following does the expression evaluate to?

The options below have quotation marks around them in order to highlight the spaces. These quotation marks are not characters within the actual string.

What about this expression?

numbers.foldLeft("Numbers: ")( _ + _ + " " )

Here’s one more:

numbers.foldLeft("Numbers: ")( (result, next) => result + " " + next )

For this question, assume that data is a variable that refers to some collection of elements. Further assume that f is a function of type Int => Boolean.

For the sake of practice, let’s use foldLeft to implement an equivalent of a forall method call: does every element meet the given criterion? Here’s a start:

data.foldLeft(AAA)( (result, next) => result BBB f(next) )

What should we write instead of AAA and BBB so that the expression would return the same value as data.forall(f)?

Use an abbreviated function literal with anonymous parameters to rewrite the expression from the previous question. (Also replace AAA and BBB as appropriate.)

The reduceLeft method

The reduceLeft method is very similar to foldLeft. The differences are:

  • reduceLeft doesn’t need to be given an initial value. Instead, it uses the collection’s first element for that purpose.
  • reduceLeft fails on an empty collection, causing a runtime error.
  • reduceLeft’s return value has the same type as the collection’s elements (as do the intermediate results that lead up to the return value).

As an example, let’s use reduceLeft to find the smallest element:

val numbers = Vector(10, 5, 4, 5, -20)numbers: Vector[Int] = Vector(10, 5, 4, 5, -20)
numbers.reduceLeft( min(_, _) )res7: Int: -20

Or even more simply:

numbers.reduceLeft(min)res8: Int: -20

reduceLeft is convenient when 1) you can assume that there is at least one element in the collection; and 2) the first element is what you want as a result in case it is the only element.

This additional example uses images:

val pics = Vector(Pic("face.png"), Pic("horse.png"), Pic("obstacle.png"))pics: Vector[Pic] = Vector(face.png, glider.png, obstacle.png)
val columnOfPics = pics.reduceLeft( _.above(_) )columnOfPics: Pic = combined pic

Terminological note

The verb “fold” appears in the method name because, metaphorically, it “folds” a sequence of values together into a single result. The word “left” refers to the order in which the method traverses the elements: foldLeft starts from the “left” — the front of the collection — and advances “rightwards”.

As an optional activity, you may wish to look up the methods foldRight and reduceRight.

reduceLeft plus Option — and a new favorite

It was mentioned that reduceLeft relies on there being at least one element in the collection; otherwise its result isn’t defined.

On the other hand, you’ve already learned to use the Option class to represent situations where a result may or may not exist.

We can combine the two: let the result be None if there are no elements; otherwise, we’ll call reduceLeft and wrap its return value in a Some.

Let’s return for a moment to the Category class in project GoodStuff and its favorite method, which we’ve already implemented in different ways in Chapters 4.3 and 5.5. Here’s one more version:

def favorite =
  if (this.experiences.isEmpty) None else Some(this.experiences.reduceLeft( _.chooseBetter(_) ))

If we didn’t check the collection with the if, reduceLeft would cause the program to crash with a runtime error (cf. the NullPointerException we got in Chapter 4.2).

Combining reduceLeft with an Option wrapper is common enough that there’s a separate method for it. This method, reduceLeftOption gives us a succinct implementation for favorite:

def favorite = this.experiences.reduceLeftOption( _.chooseBetter(_) )

This is essentially shorthand for the slightly longer version above.

Further reading: MapReduce

This chapter and the previous one have introduced you to methods called map and reduce. Those methods, like the other ones we’ve discussed hereabouts, are typical of the functional programming paradigm (which we’ll discuss in Chapter 10.2).

Do an internet search and find out the basic principles of MapReduce, a way of structuring programs that Google, for instance, has used for processing vast amounts of data. How is MapReduce similar to the collection methods that we’ve discussed?

Assignment: Reimplementing Election (Part 2 of 2)

In the previous chapter you reimplemented a couple of methods from class District. Reimplement two more: the two totalVotes methods.

Instead of loops, use either foldLeft or combine map with sum. Or perhaps you’ll find it instructive to see if you can come up with both implementations.

If you did as Chapter 5.6 suggested and wrote a countVotes method as an auxiliary for the two totalVotes methods, you need to change only that method.

Reimplementing those methods is enough for now. We’ll return to add more methods to District in Chapter 9.2.

A+ presents the exercise submission form here.

Assignment: FlappyBug (Part 17 of 17: Prettier Code)

Let’s also revisit the methods you wrote for FlappyBug in Chapter 5.6 and see how we can use our higher-order equipment there.

Task description

Reimplement the following methods. They should do what they did before, but you should replace any loops with higher-order method calls. You work from either your own earlier implementation or the example solution.

  1. timePasses in class Game. Use foreach (Chapter 6.3).
  2. isLost in the same class. The implementation will be much simpler if you use a method from Chapter 6.3 to directly check whether there is an obstacle that’s fatally close.
  3. makePic in the GUI. Replace the loop with a foldLeft call that places each obstacle where it should be.

Submission form

A+ presents the exercise submission form here.

Afterword #1: levels of abstraction

In the Snake assignment of Chapter 6.3 you could use the placeCopies method since all the snake’s segments looked alike. You had this fairly specific method for doing just what was needed, highly abstracted away from implementation details.

In the preceding assignment, on the other hand, you hopefully used foldLeft to implement makePic, as requested. placeCopies wouldn’t have served you here, since the obstacles aren’t identical. Lacking a specific tool, you resorted to a more generic tool at a lower level of abstraction.

If the o1 package hadn’t happened to provide placeCopies, you could have used foldLeft in Snake, too. (Try it.)

As a matter of fact, the placeCopies method has been implemented with foldLeft, so you actually used it indirectly in any case.

Afterword #2: efficiency

You hopefully used exists (or perhaps forall) to implement isLost.

In an optional section of Chapter 5.6, we raised a question about efficiency. Did our earlier implementation do extra work (in principle) by traversing the whole vector even though a matching obstacle had already been found?

exists and forall stop traversing as soon as the return value is apparent, so using them is one way to avoid this issue (which is of course entirely negligible in this application anyway).

There won’t be any further official FlappyBug assignments in O1, but feel free to make your own additions and modifications.

Using the Methods in Combination

You will often find it useful to combine the collection methods that we’ve discussed.

Here’s one toy example of using a combination of methods to process a vector of numbers. The following command counts positive elements whose square is an even number.

val numbers = Vector(10, 5, 4, 5, -20)numbers: Vector[Int] = Vector(10, 5, 4, 5, -20)
numbers.filter( _ > 0 ).map( n => n * n ).count( _ % 2 == 0 )res9: Int = 2

This mini-assignment looks at combining methods on a more general level:

Suppose we have a variable data that refers to a collection. The code given below uses a loop, a couple of ifs and a var to process the collection and compute a result. The code uses three functions p, q, and r; these can be any effect-free functions of a compatible type.

var result = false
for (element <- data) {
  if (p(element)) {
    val temp = q(element)
    if (r(temp)) {
      result = true
    }
  }
}
println(result)

The snippet below is instead written in functional style. It’s missing a few key components, however:

println(data.AAA(p).BBB(q).CCC(r))

Which methods should we call where it now says AAA, BBB, and CCC so that this (much simpler!) program would do the same thing as the longer program above? Bonus question: can you think up a more practical scenario where you might use this combination of higher-order methods?

Where it says AAA we should use:
Where it says BBB we should use:
Where it says CCC we should use:
That assignment where we have turn imperative code into a functional form clearly showed how a mixture of conditionals [in the original loop-based code] messes with your head.

Assignment: Rain

Introduction

Imagine we have a device that generates reports of rainfall measurements in Vector[Int] form. More specifically:

  • Each zero and positive number in the vector represents a measured amount of rainfall.
  • The vector contains one or more series of measurements. Each such series ends with the special value 999999, which is not a measurement but simply marks the end of the series.
  • The vector may also contain negative numbers. Such additional data does not indicate rainfall and is superfluous to our purposes.

Task description

In rainfall.scala of project HigherOrder, write averageRainfall, a function that receives a single Vector[Int] parameter as described above. The function should return the average rainfall computed from the first data series in the vector. Here are some examples of the expected behavior:

import o1.rainfall._import o1.rainfall._
averageRainfall(Vector(40, 0, 50, 999999))res10: Option[Int] = Some(30)
averageRainfall(Vector(0, 0, 0, 0, 999999))res11: Option[Int] = Some(0)
averageRainfall(Vector(999999))res12: Option[Int] = None
The average is an Int and comes in an Option wrapper.
The number 999999 just terminates a data series and isn’t included in the average. The average is that of the preceding numbers.
If there are no measurements, there’s no average either.

In case the vector contains multiple data series, only the first one matters:

averageRainfall(Vector(50, 150, 100, 100, 999999, 20, 30, 90, 999999, 0, 0, 999999))res13: Option[Int] = Some(100)
averageRainfall(Vector(999999, 50, 100, 999999))res14: Option[Int] = None
The return value reflects the first data series only.
Any numbers beyond the first 999999 are irrelevant.
If the first series has no data, there’s no result.

Negative values must not impact on the return value at all:

averageRainfall(Vector(-10, 50, -100, 150, -1, -5, 100, 100, 999999, 20, 30, 90, 999999, 0, 0, 999999))res15: Option[Int] = Some(100)
averageRainfall(Vector(-100, -5, -10, 999999, 50, 100, 999999))res16: Option[Int] = None

Instructions and hints

  • Just divide the Ints like Scala divides Ints (Chapter 1.3). Don’t round the result to the nearest whole number or anything like that.
  • Use higher-order methods on collections. There are many ways you can combine them to solve this problem. Can you come up with multiple solutions?
    • For the sake of comparison, you could sketch out a solution based on a loop and ifs. Which solution captures the programmer’s intent better?
  • You may assume that the input vector always contains the number 999999 at the end.

Submission form

A+ presents the exercise submission form here.

Assignment: Lack of Rain

Task description

In the same file, write a function drySpell that:

  • takes an input vector of rainfall data just like averageRainfall;
  • takes as its second parameter a positive integer length that indicates the minimum length of the “dry spell” that the function should look for in the data;
  • searches the vector for the first subsequence of consecutive numbers that are between 0 and 5, inclusive;
  • returns the first index of the dry spell (if found) or a negative number (if not found); and
  • considers all the numbers in the vector. However, a sequence of numbers split by the series-terminating 999999 doesn’t count, nor does a sequence split by a negative number.

An example:

val exampleData = Vector(10, 0, 3, 999999, 2, -10, 0, 0, 2, 1, 20, 1, 0, 1, 999999)exampleData: Vector[Int] = Vector(10, 0, 3, 999999, 2, -10, 0, 0, 2, 1, 20, 1, 0, 1, 999999)
drySpell(exampleData, 3)res17: Int = 6
We’re looking for the first uninterrupted dry spell that is at least three measurements long...
... and find the first such sequence at index six.

Instructions and hints

Use the higher-order methods you already know. In addition, learn these two and use them:

  • indexWhere looks for the first element that matches a criterion:

    val myVector = Vector(10, 5, 0, 3, 0, 100, 50)myVector: Vector[Int] = Vector(10, 5, 0, 3, 0, 100, 50)
    myVector.indexWhere( _ <= 0 )res18: Int = 2
    myVector.indexWhere( _ < 0 )res19: Int = -1
    
  • sliding is given a length as a parameter and generates “slices” of that length from the original collection:

    myVector.sliding(4).foreach(println)Vector(10, 5, 0, 3)
    Vector(5, 0, 3, 0)
    Vector(0, 3, 0, 100)
    Vector(3, 0, 100, 50)
    myVector.sliding(5).foreach(println)Vector(10, 5, 0, 3, 0)
    Vector(5, 0, 3, 0, 100)
    Vector(0, 3, 0, 100, 50)
    myVector.sliding(5).toVectorres20: Vector[Vector[Int]] = Vector(Vector(10, 5, 0, 3, 0), Vector(5, 0, 3, 0, 100), Vector(0, 3, 0, 100, 50))
    

    To be more specific, sliding returns a so-called iterator that provides access to a “list of slices”; you can use the iterator to access each “slice” once (and only once). Many of the familiar collection methods are available on the iterator object, including foreach, toVector, and indexWhere. If you wish to apply multiple methods to each slice, you can first copy the slices from the iterator to a vector. (That isn’t actually necessary in this assignment though.)

Submission form

A+ presents the exercise submission form here.

Reflecting on Functions

We’ll round off this chapter with a few questions that challenge you to study the given snippets of Scala code and reflect on how they work (if indeed they do). These programs will also introduce new programming techniques that will be increasingly useful as we go on.

What we have covered so far in O1 is probably not enough to impart ironclad answers to the questions below. This is deliberate. On the other hand, the answers are rather intuitive. Remember to read the feedback you receive whether or not you answered correctly!

Topic #1: a function within a function

Here are a couple of example functions. They don’t do anything too meaningful but study them anyway.

object Experiment1 {
  def findBig(vector: Vector[Int]) = {
    def countsAsBig(number: Int) = number > 1000
    vector.filter(countsAsBig)
  }
}
object Experiment2 {
  def chat() = {
    import scala.io.StdIn._
    def promptForName() = {
      readLine("Enter a name: ")
    }
    def greeting(recipient: String) = "Ave, " + recipient
    val first = promptForName()
    val second = promptForName()
    println(greeting(first + " and " + second))
  }
}
Within these two functions...
... there are further function definitions.

Which of the following claims accurately describes these function definitions?

Here is some more code for you to assess. Each of the three snippets below attempts to use Experiment1, defined above.

// Attempt 1:
println(Experiment1.countsAsBig(5000))
// Attempt 2:
val bigOnes = Experiment1.findBig(Vector(2000, 300, 5000))
println(Experiment1.countsAsBig(5000))
// Attempt 3:
val bigOnes = Experiment1.findBig(Vector(2000, 300, 5000))
println(countsAsBig(5000))

Which of the following describes these three examples?

Topic #2: local variables

Here’s a variant of findBig.

def findBig(vector: Vector[Int], limit: Int) = {
  def countsAsBig(number: Int) = number > limit
  vector.filter(countsAsBig)
}
This function takes a limit parameter that specifies how big a number should be to count as big and be included in the output vector.
In the local function countsAsBig, we use not only its own variable number...
... but also the variable limit from the surrounding function.

The question is: can we do that? Can a local variable use names defined in the surrounding environment, such as limit?

How about this other aspect of our function?

def findBig(vector: Vector[Int], limit: Int) = {
  def countsAsBig(number: Int) = number > limit
  vector.filter(countsAsBig)
}
Our function calls the filter method on a vector. The Vector class and its methods have been defined elsewhere, as part of the standard Scala API. During a program run, filter runs in its own frame within the call stack.
As we call filter, we pass in a function that uses limit, which is defined as a local variable in the surrounding context. However, filter isn’t local to findBig.

Does that work?

A few words about closures

def findBig(vector: Vector[Int], limit: Int) = {
  def countsAsBig(number: Int) = number > limit
  vector.filter(countsAsBig)
}

Our example code is easy to understand: filter the vector for numbers whose value exceeds that of a particular variable. It’s “natural” that the code works.

That the code works is not self-evident, though. After all, filter runs in its own frame on the call stack; findBig has a separate frame below and, generally speaking, functions don’t have access to local variables in other frames (Chapter 1.8).

What we did there doesn’t work in all programming languages, but it works in others (including Scala). Why?

Since the code works, it’s clear that, behind the scenes, filter somehow receives not only a reference to the function countsAsBig but some construct that has access to limit. Such a combination of a function and one or more external variables is known as a closure (sulkeuma). (The function “closes over” variables defined outside of itself.)

The Scala compiler constructs closures automatically when you use a variable from the surrounding context within a function. As a consequence, using closures is very intuitive; we’re often barely aware that we do it.

Benefiting from closures is one reason for nesting a function within another. For example, countsAsBig can use limit precisely because it is nested within findBig; we need that in order to pass in the corresponding closure to filter.

In our example, we defined a named function countsAsBig, but you can also construct closures from function literals. Here’s another way to put our example code:

def findBig(vector: Vector[Int], limit: Int) =
  vector.filter( _ > limit )
We pass filter a comparison function defined by a literal
To be more precise: we pass filter a closure whose code is _ > limit and which has access to the limit variable that it needs.

Function literals like the above, which use a variable from the surrounding context (and automatically produce closures), are perfectly common in Scala programs.

You can learn more about closures in O1’s follow-on courses or by looking up the topic online. For O1, it’s enough to know that code such as the above simply works. Just go ahead and use the variables from the surrounding context where you need them. In later chapters, we’ll do that often and unceremoniously.

More nesting

You have seen that Scala lets you nest functions in functions. In fact, you can nest classes, objects, and functions within each other pretty much freely. For instance, you can define a class within a class, or a singleton object within a method. Try it out if you like.

Summary of Key Points

  • In addition to the convenient methods introduced in the preceding chapter, collections have two highly generic methods named foldLeft and reduceLeft. You can use them to compute an overall result by processing each element of the collection in turn.
  • It’s often unnecessary to detail the execution of a method with a loop that manages ifs and updates vars and juggles many things at once. You can instead express what you want as a chain of higher-order method calls so that each method operates on the collection returned by the previous step.
  • You can define a local function much as you can define a local variable.
  • As you define a function, you can use variables from the surrounding environment in the function body. Such a construct is called a closure.
  • Links to the glossary: higher-order function; collection; level of abstraction; local function, scope; closure.

Feedback

Please note that this section must be completed individually. Even if you worked on this chapter with a pair, each of you should submit the form separately.

Credits

Thousands of students have given feedback that has contributed to this ebook’s design. Thank you!

Weeks 1 to 13 of the ebook, including the assignments and weekly bulletins, have been written in Finnish and translated into English by Juha Sorva.

Weeks 14 to 20 are by Otto Seppälä. That part of the ebook isn’t available during the fall term, but we’ll publish it when it’s time.

The appendices (glossary, Scala reference, FAQ, etc.) are by Juha Sorva unless otherwise specified on the page.

The automatic assessment of the assignments has been developed by: (in alphabetical order) Riku Autio, Nikolas Drosdek, Joonatan Honkamaa, Jaakko Kantojärvi, Niklas Kröger, Teemu Lehtinen, Strasdosky Otewa, Timi Seppälä, Teemu Sirkiä, and Aleksi Vartiainen.

The illustrations at the top of each chapter, and the similar drawings elsewhere in the ebook, are the work of Christina Lassheikki.

The animations that detail the execution Scala programs have been designed by Juha Sorva and Teemu Sirkiä. Teemu Sirkiä and Riku Autio have done the technical implementation, relying on Teemu’s Jsvee and Kelmu toolkits.

The other diagrams and interactive presentations in the ebook are by Juha Sorva.

The O1Library software has been developed by Aleksi Lukkarinen and Juha Sorva. Several of its key components are built upon Aleksi’s SMCL library.

The pedagogy behind O1Library’s tools for simple graphical programming (such as Pic) is inspired by the textbooks How to Design Programs by Flatt, Felleisen, Findler, and Krishnamurthi and Picturing Programs by Stephen Bloch.

The course platform A+ has been created by Aalto’s LeTech research group and is largely developed by students. The current lead developer is Jaakko Kantojärvi; many other students of computer science and information networks are also active on the project.

For O1’s current teaching staff, please see Chapter 1.1.

Additional credits for this page

The rainfall assignment is a version of a classic programming problem by Elliot Soloway.

a drop of ink
Posting submission...