Chapter 1.5: Collections and References

Introduction: Collections of Elements

Most computer programs manipulate many pieces of information that are related to each other in some way. For instance, a program might need to keep track of multiple scientific measurements, multiple hotel experiences recorded by the user, multiple friends of the user, or multiple students enrolled in a course.

For such purposes, we use collections (kokoelma). A collection may contain, for instance, all the friends of a particular user, or a sequence of measurements.

A value stored in a collection is called an element (alkio) of the collection. An element could be a single measurement, a single experience, or a single person, for example.

Since the need for collections is so universal, programming languages provide a variety of collection types for the programmer to pick and choose from. Scala offers a plentiful selection, but we’ll start with just one type of collection known as a buffer (puskuri).

Creating a Buffer

In Scala, a buffer is a collection that stores elements in a specific order. You can add new elements to a buffer, remove elements that were previously added, or replace elements with new ones. Informally, you can think of a buffer as an editable list of items stored in computer memory.

To get started, let’s try creating a buffer that contains four strings as its elements:

Buffer("the first element", "second", "third", "and a fourth")
res0: scala.collection.mutable.Buffer[String] = ArrayBuffer(the first element, second, third, and a fourth)

As we create a buffer, we provide the buffer’s elements as parameters, enclosing them in round brackets. In this example, the elements are arbitrarily chosen strings. We could have chosen to use some other type as well; the elements could be numbers, for instance. Commas separate the parameter expressions.

The REPL reports that the result is of type Buffer[String]. In Scala, square brackets mark type parameters (tyyppiparametri) that further specify the data type. In our case, the type parameter is String, so what we have here isn’t just any buffer, but a buffer that contains strings. You can read Buffer[String] as “buffer of string”.

A description of the buffer and its elements appears in the REPL. As you can see, the buffer contains the strings in exactly the order in which we passed them to the command that created it.

The REPL further informs us that what we got is an ArrayBuffer: a buffer that is internally based on a particular construct (arrays). In Scala, buffers are ArrayBuffers by default, but that’s not something we need to concern ourselves with at this point.

Two more examples of buffer creation are shown below. We have a buffer with two String elements and another with five Doubles.

Buffer("ABC", "XYZ")
res1: scala.collection.mutable.Buffer[String] = ArrayBuffer(ABC, XYZ)
Buffer(2.40, 3.11, 4.56, 10.29, 8.11)
res2: scala.collection.mutable.Buffer[Double] = ArrayBuffer(2.4, 3.11, 4.56, 10.29, 8.11)

Note how the buffers have different type parameters, which Scala determines automatically from the elements you put in each buffer.
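
As a small sketch of that inference, you could try something like the following in the REPL or as a script. The variable names are made up for this illustration, and the import line is an assumption: it is included so that the snippet also runs in an environment where Buffer hasn’t been imported for you already.

```scala
import scala.collection.mutable.Buffer

// Scala infers the type parameter from the elements we pass in.
val labels = Buffer("ABC", "XYZ")                          // a Buffer[String]
val measurements = Buffer(2.40, 3.11, 4.56, 10.29, 8.11)   // a Buffer[Double]

// These lines compile only because the annotated types match what was inferred.
val checkedLabels: Buffer[String] = labels
val checkedMeasurements: Buffer[Double] = measurements
```

The explicit annotations aren’t required; they are there only so that the compiler confirms the inferred types for us.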

Side note about this ebook: The REPL reports the buffer’s type thoroughly, prefixing it with a package name as in scala.collection.mutable.Buffer[Double]. For your reading convenience, future examples in this ebook slightly simplify such outputs: you’ll see just Buffer[Double] instead of the longer type description.

Accessing a Buffer’s Contents

The examples that follow will use the following buffer and a numbers variable that refers to the buffer.

val numbers = Buffer(12, 2, 4, 7, 4, 4, 10, 3)
numbers: Buffer[Int] = ArrayBuffer(12, 2, 4, 7, 4, 4, 10, 3)

We’ll find this variable useful for accessing the collection after it’s been created, as you’ll see below.

Examining an element

Each of a buffer’s elements is stored at a particular location within the buffer. Each such location is identified by its position number, called an index (indeksi). When we manipulate buffers, we often use indices to target specific elements.

Here’s how you can access the value stored at a particular index. First indicate the buffer that you wish to examine, then put the desired index in round brackets:

numbers(0)
res3: Int = 12

Indices start from zero! The above expression accesses the buffer’s first element.

You can use expressions like the one above in lots of ways. For instance, you can access a buffer element and assign that value to a variable. In the example below, the buffer’s fourth element (that is, the value at index three) is assigned to the variable fourthNumber:

val fourthNumber = numbers(3)
fourthNumber: Int = 7
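
To see zero-based indexing at work, here is a minimal sketch that you can enter line by line. It reuses the numbers buffer from above; the other variable names are invented for the example. Note that size is a method this chapter hasn’t introduced; it is used here only to show which index is the last valid one.

```scala
import scala.collection.mutable.Buffer

val numbers = Buffer(12, 2, 4, 7, 4, 4, 10, 3)

val first = numbers(0)                   // index 0 is the first element: 12
val fourth = numbers(3)                  // index 3 is the fourth element: 7
val sumOfTwo = numbers(1) + numbers(6)   // elements work like any other Int: 2 + 10

// size gives the number of elements (8 here), so the last valid index is size - 1.
val last = numbers(numbers.size - 1)     // 3
```

An index outside the range from zero to size - 1 does not yield a value; the attempt fails with an error message instead.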

Practice on buffers

The questions below should help you build an understanding of the basic buffer commands. Work out what the given code fragments do. Experiment in the REPL.

Examine the following commands. What do they do with the buffer?

val testBuffer = Buffer(4, 10, 3, 10, 15, -2)
println(testBuffer(4))

Please enter the number that the last command prints out:

Examine the following commands. What do they do with the buffer?

val testBuffer = Buffer(4, 10, 3, 10, 15, -2)
val oneOfTheNumbers = testBuffer(0)
val sum = oneOfTheNumbers + testBuffer(3)

Please enter the number that gets stored in sum:

Examine the following commands. What do they do with the buffer?

val testBuffer = Buffer(4, 10, 3, 10, 15, -2)
val someIndex = 1
println(testBuffer(someIndex) + testBuffer(someIndex + 1))

Please enter the number that the last command prints out:

Assume a buffer and a myNumbers variable have been defined like this:

val myNumbers = Buffer(10.5, 10.3, 9.8, 7.9, 10.2, 9.7)

Below, enter a Scala command that accesses the buffer’s fifth element (which is 10.2 in our example) and prints it out. (Print out just the number on a separate line, nothing else.)

Buffers are Mutable

Replacing an element

You can replace an element with a new value by following the indexed expression with an equals sign and the new value. In this example, the fourth element in the buffer is replaced by the number one:

numbers(3) = 1

That command does not — in itself — produce any value of interest. It just instructs the computer to modify the buffer’s contents. This is why the REPL stays silent. However, we can request the value of the numbers variable and see that the fourth element has indeed changed in the computer’s memory:

numbers
res4: Buffer[Int] = ArrayBuffer(12, 2, 4, 1, 4, 4, 10, 3)

That change affected only the buffer, not the variable that we used earlier to store the old value:

fourthNumber
res5: Int = 7
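
The whole sequence can be replayed from scratch as a short script. This is just a sketch that gathers the commands shown above in one place; the import line is an assumption for environments where Buffer isn’t available by default.

```scala
import scala.collection.mutable.Buffer

val numbers = Buffer(12, 2, 4, 7, 4, 4, 10, 3)
val fourthNumber = numbers(3)   // copies the value 7 into the variable

numbers(3) = 1                  // replaces the element at index 3 inside the buffer

println(numbers)                // ArrayBuffer(12, 2, 4, 1, 4, 4, 10, 3)
println(fourthNumber)           // 7
```

The variable fourthNumber received a copy of the number 7 before the replacement, which is why it is unaffected by the later change to the buffer.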

Adding an element

The operator += adds a new element at the end of the buffer. This means that the buffer’s size increases.

numbers += 11
res6: Buffer[Int] = ArrayBuffer(12, 2, 4, 1, 4, 4, 10, 3, 11)

Using a buffer is similar to using a variable, with the addition that we need indices to target specific elements. In a sense, a buffer is like a numbered list of var variables.

Speaking of which: The variable that we used to refer to the buffer is a val. That does not prevent us from modifying the buffer’s contents, since being val is not an attribute of the buffer itself.

Since the numbers variable is defined as a val, that variable cannot be reassigned to later refer to some other buffer. Nevertheless, we can change the contents of the buffer that numbers refers to.
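
The difference between changing a buffer and reassigning a variable can be sketched as follows. The name changeable and the element values are invented for this illustration; the commented-out line is left as a comment precisely because it would not compile.

```scala
import scala.collection.mutable.Buffer

val numbers = Buffer(12, 2, 4)
numbers += 11        // fine: modifies the buffer that numbers refers to
numbers(0) = 100     // fine: also modifies that same buffer

// numbers = Buffer(1, 2)   // would not compile: a val cannot be reassigned

var changeable = Buffer(1, 2)
changeable = Buffer(3, 4, 5)   // fine: a var can be made to refer to another buffer
```

In other words, val and var describe what may happen to the variable, whereas the buffer is modifiable either way.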

Empty buffers

It’s often useful to create a buffer that starts out empty. For instance, when the GoodStuff application launches, there are initially no recorded experiences, even though experiences may be added later.

The following doesn’t quite accomplish what we want:

val wonderIfThisWorks = Buffer()
wonderIfThisWorks: Buffer[Nothing] = ArrayBuffer()

That command did make us a new empty buffer, but empty it will remain. The computer doesn’t know what we’re planning to store and gives the buffer the unhelpful type Buffer[Nothing]. Such a buffer won’t bring us any joy, since we can’t add anything to it.

We can, however, tell the computer what kind of buffer we want to create. Let’s try creating an empty buffer where we’ll be able to add strings. The type parameter — String in this case — goes in square brackets just as you saw in REPL outputs:

val words = Buffer[String]()
words: Buffer[String] = ArrayBuffer()

We can now add strings to this buffer:

words += "llama"
res7: Buffer[String] = ArrayBuffer(llama)

Actually, you’re allowed to write out the type parameter whenever you create a buffer, even where it’s not required. This works, too:

val numbers = Buffer[Int](2, -1, 10)
numbers: Buffer[Int] = ArrayBuffer(2, -1, 10)

We could have left out the type parameter [Int], because the Scala toolkit figures out that we want a Buffer[Int]. It does so by examining the expressions 2, -1, and 10, which determine the buffer’s initial elements.
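
Here is a brief sketch of an empty buffer being filled. The first half follows the words example above. The second half shows an alternative that this chapter doesn’t cover: annotating the variable’s type instead, which gives the computer the same information. The name moreWords and the added strings are made up for the example.

```scala
import scala.collection.mutable.Buffer

// The type parameter tells Scala what the empty buffer will hold.
val words = Buffer[String]()
words += "llama"
words += "alpaca"

// Alternative: state the type on the variable; Scala then infers the type parameter.
val moreWords: Buffer[String] = Buffer()
moreWords += "camel"

println(words)       // ArrayBuffer(llama, alpaca)
println(moreWords)   // ArrayBuffer(camel)
```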

More practice

The questions below should help you build an understanding of the basic buffer commands. Work out what the given code fragments do. Experiment in the REPL.

The following example adds elements to a buffer. What happens to the buffer at each step?

val words = Buffer[String]()
val word = "love"
words += "ents"
words += word
words += "elem" + words(0)
println(words(0) + " " + words(1) + " " + words(2))

Please enter the text printed out by the last line:

Examine the following commands. What do they do to the buffer?

val testBuffer = Buffer(4, 10, 3, 10, 15, -2)
var index = 0
testBuffer(index) = 0
index = index + 1
testBuffer(index) = 0
index = index + 1
testBuffer(index) = 0
index = index + 1
testBuffer(index) = 0

After executing these commands, what is the sum of the buffer’s elements?

(If you try the code in the REPL, please enter the commands one by one.)

Let’s assume the following lines of code have just been executed.

val myNumbers = Buffer(10.5, 10.3, 9.8, 7.9, 10.2, 9.7)
val numberOfElements = 6
val empty = Buffer[Double]()

Below are a few attempts to access an individual element of the buffer. Which of them result in an error message (and why)?

Say we’ve created a buffer as well as a variable myNumbers that refers to that buffer, exactly as in the previous question. In the field below, enter a Scala command that adds the number 9.9 as the last (seventh) element of the buffer.

References

Chapter 1.4 said that a variable stores just a single value. But didn’t we just have a buffer variable that contained multiple values?

Sort of, but not really.

To answer that question properly, we need to introduce the concept of reference (viittaus). Eventually, it will turn out that this concept is important in many aspects of programming, not just when working with buffers.

Did you notice that test is the name of a variable, not the name of a buffer? (Buffers don’t have names.) This fact will be important momentarily.

References have real effects on how programs behave. Take a look at another animation.

Student questions about the references in the animations

I didn’t quite get what the ref@ and id stuff was in the animations.

Is it possible to write ref@something in the REPL? I tried it, but got an error message.

The animations in this ebook show, as diagrams, what happens in the computer’s memory as it runs Scala programs. The animations leave out various details, but they do illustrate the relevant parts of what the computer does.

The ref that you see in the animations is not Scala code but part of these ebook illustrations. It means a reference to some information (here: a buffer) that is located in another location in the computer’s memory.

You can think of it like this: Whenever your program creates a new buffer, the computer automatically gives it a unique number, an id. A reference is a value that indicates a particular buffer: in the animations, ref@7 means a reference to the buffer whose id is 7. When a variable stores a reference, it stores such a number.

Keeping track of such id numbers is part of the internal “housekeeping” that the computer does in order to run your program correctly.

The computer deals with such things automatically. Under most circumstances, the programmer doesn’t need to think about the actual id numbers that their program generates in memory. (In O1, you’ll never need to.) The id numbers in the ebook animations are made-up and arbitrary — the specific numbers are not important. What matters are the general principles:

  • Each buffer has its own unique number that the computer uses to distinguish it from other memory contents.

  • Buffers and variables are separate entities in memory.

  • Multiple variables may refer to the same buffer. This is possible because a variable does not store the buffer itself but a reference to the buffer; you can think of this as storing the buffer’s id number. Multiple different variables may store identical references; in other words, the variables may “point to the same location”.

Scala programmers don’t write id numbers directly into code. For example, ref@7 is not a Scala command, and you’ll get an error message if you enter it in the REPL.
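
Even though id numbers never appear in code, the principles listed above can be observed from within a program. The following is a minimal sketch with invented variable names. It uses eq, an operation that this chapter doesn’t cover and that O1 doesn’t require here; it checks whether two references point to the very same object.

```scala
import scala.collection.mutable.Buffer

val original = Buffer(1, 2, 3)
val alias = original              // copies the reference, not the buffer
val lookalike = Buffer(1, 2, 3)   // a separate buffer with equal contents

alias += 4                        // modifies the one buffer that original and alias share

println(original)                 // ArrayBuffer(1, 2, 3, 4)
println(lookalike)                // ArrayBuffer(1, 2, 3)

println(original eq alias)        // true: both refer to the same buffer
println(original eq lookalike)    // false: two distinct buffers
```

Two buffers were created here, even though three variables refer to buffers.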

Practice on references

Think through the following code and answer the questions. If you find this assignment hard, you may want to draw a diagram of the buffers, variables, and references involved. You can also enter the code in the REPL. If you struggle to make progress, review the material on buffers and references above or ask for help.

var person = Buffer("Cai-Göran")
var another = Buffer("Cai-Göran")
var presidentOfFinland = another

How many buffers does this code create?

What about variables that store a reference to some buffer? How many of those does the code create?

Let’s continue by executing the following lines next.

presidentOfFinland += "Alexander"
another += "Stubb"

How many buffers exist now?

How many elements are now stored in the buffer that the variable presidentOfFinland refers to?

How many elements are now stored in the buffer that the variable another refers to?

Let’s continue with the previous example. In addition to the commands we already entered, let’s execute this one:

another = Buffer("Suzanne", "Innes-Stubb")

How many elements are there now in the buffer that the variable presidentOfFinland refers to?

How many elements are there now in the buffer that the variable another refers to?

And finally:

another = person
person = Buffer("Milo")

How many of the variables now refer to a buffer that contains the names "Suzanne" and "Innes-Stubb"?

How many of the variables now refer to a buffer that contains only the name "Cai-Göran" and nothing else?

On References and This Chapter

There are myriad things that you can do with buffers and other kinds of collections. We’ve hardly scratched the surface. You’ll see many examples of collections in the chapters that follow. As already mentioned, one such example is the experience categories in the GoodStuff app. We’ll be storing many different kinds of values in collections, not just numbers or strings.

Right now, what matters most is that you know how to create buffers and how to access and modify their individual elements. And that you know that you manipulate buffers via references.

You can expect the concept of reference to come up time and again. We’ll use references to point at many other things besides buffers.

Beginners often find references a hard concept to grasp. One of the reasons is language. When we humans talk or write about programming, we often take “shortcuts”. It’s rare for us to say that “the variable numbers contains a reference to a buffer”; instead, we often say “the variable numbers contains a buffer” or something imprecise like that. We may also speak of a “numbers buffer”, even though the variable really contains only a reference and numbers is the name of a variable, not of a buffer. (As you’ve seen, multiple differently named variables can store a reference to the same buffer.) Speaking imprecisely is natural and convenient, and we don’t need to strictly avoid it, but if we are to write programs that work, we do need to understand what the words really mean.

Psst! As a matter of fact...

... to be precise, the values of many standard Scala data types — including Strings — are also stored at the ends of references just like you’ve seen buffers to be. We didn’t bring this up before, and our animated diagrams generally don’t display references for certain common data types. But here’s one animation that is more “truthful” — that is, more detailed — than the earlier animations with strings, even if that truthfulness makes the animation a bit more cumbersome:

Generally, it’s more convenient to think in simplified terms. For example, we may think of the string “cat” itself as being stored in a variable (as in earlier animations) rather than the variable referring to another memory location that contains the string. We’ll continue to use the simplified representation in later animations.

The above simplification is “safe” for the String data type and won’t lead us to reason erroneously about program behavior. This is because — unlike a buffer — a string value never changes internally. Even when we combine a string with another, like in the preceding example, we don’t modify the existing strings but create a new string. That’s something we’ll return to in Chapters 5.2 and 11.2.
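
A tiny sketch of that point, with variable names chosen just for this illustration: combining strings produces a new string and leaves the originals as they were.

```scala
val animal = "cat"
val plural = animal + "s"   // builds a new string; the string in animal is untouched

println(animal)             // cat
println(plural)             // cats
```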

A Frequent Question from Students with Prior Experience

Surely it doesn’t work like that in Python, Java, etc.?

Some O1 students have prior experience with other programming languages, such as Python. When they reach this chapter, some of those students start to wonder things like this:

References to buffers had me lost for a while, since I haven’t seen anything like this in Python.

The fact that multiple variables can point to the same buffer via references is not very intuitive to me. Also, if I’ve understood correctly, this means that when we modify such buffers, both variables are affected. This is very confusing to me, especially since I have some background in Python and don’t recall such a feature there.

I found this topic really interesting, haven’t encountered the same in Java.

This reference stuff shows how different Scala is from Python.

However, references do actually work essentially the same way in Python as they do in Scala. Collections (and similar constructs) are manipulated through references in Python, too. And in Java, and in many other languages.

Consider this example from the Python REPL:

>>> first_list = [123, 456, 789]
(The Python REPL remains silent.)
>>> second_list = first_list
(The Python REPL remains silent.)
>>> first_list.append(1111)
(The Python REPL remains silent.)
>>> second_list.append(2222)
(The Python REPL remains silent.)
>>> first_list
[123, 456, 789, 1111, 2222]
>>> second_list
[123, 456, 789, 1111, 2222]

The code creates a collection and two variables that refer to it. The collection initially holds three elements, which are integers.

The append command adds an element. We can do it via either of the two variables...

... but we still have just the one collection. The changes are visible through either of the variables.

This example illustrates that the Python variables, too, hold references to collections, just like in our Scala examples.
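
For a side-by-side comparison, here is a rough Scala counterpart of the Python session, written as a sketch with a buffer standing in for the Python list. The variable names simply mirror the Python ones.

```scala
import scala.collection.mutable.Buffer

val firstList = Buffer(123, 456, 789)
val secondList = firstList    // a second variable that refers to the same buffer

firstList += 1111             // add an element via one variable...
secondList += 2222            // ...and another via the other

println(firstList)            // ArrayBuffer(123, 456, 789, 1111, 2222)
println(secondList)           // ArrayBuffer(123, 456, 789, 1111, 2222)
```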

Unfortunately, there are lots of introductory courses and tutorials out there that oversimplify things, don’t explain how that works, and leave students with misconceptions.

Summary of Key Points

  • A programmer can store multiple pieces of information — elements — in a single collection.

  • One kind of collection is the buffer.

    • Each of a buffer’s elements is associated with a number called an index.

    • You can replace individual elements in a buffer, and you can also add new elements.

    • In a sense, a buffer is similar to a bunch of var variables grouped together.

  • You manipulate buffers via references. A variable can store a reference that indicates where in the computer’s memory the actual buffer is.

    • Multiple variables may refer to the same buffer.

    • If you make a change to a buffer’s contents via one variable, you can also observe the change via any other variables that refer to the same buffer.

  • References are similarly used for data other than collections, as you’ll see later.

  • Links to the glossary: reference; collection, buffer, element, index; type parameter; package.

An updated concept map:

Credits

Thousands of students have given feedback and so contributed to this ebook’s design. Thank you!

The ebook’s chapters, programming assignments, and weekly bulletins have been written in Finnish and translated into English by Juha Sorva.

The appendices (glossary, Scala reference, FAQ, etc.) are by Juha Sorva unless otherwise specified on the page.

The automatic assessment of the assignments has been developed by: (in alphabetical order) Riku Autio, Kai Bukharenko, Nikolas Drosdek, Kaisa Ek, Rasmus Fyhrqvist, Joonatan Honkamaa, Antti Immonen, Jaakko Kantojärvi, Onni Komulainen, Niklas Kröger, Kalle Laitinen, Teemu Lehtinen, Mikael Lenander, Ilona Ma, Jaakko Nakaza, Strasdosky Otewa, Kaappo Raivio, Timi Seppälä, Teemu Sirkiä, Onni Tammi, Joel Toppinen, Anna Valldeoriola Cardó, and Aleksi Vartiainen.

The illustrations at the top of each chapter, and the similar drawings elsewhere in the ebook, are the work of Christina Lassheikki.

The animations that detail the execution of Scala programs have been designed by Juha Sorva and Teemu Sirkiä. Teemu Sirkiä and Riku Autio did the technical implementation, relying on Teemu’s Jsvee and Kelmu toolkits.

The other diagrams and interactive presentations in the ebook are by Juha Sorva.

The O1Library software has been developed by Aleksi Lukkarinen, Juha Sorva, and Jaakko Nakaza. Several of its key components are built upon Aleksi’s SMCL library.

The pedagogy of using O1Library for simple graphical programming (such as Pic) is inspired by the textbooks How to Design Programs by Flatt, Felleisen, Findler, and Krishnamurthi and Picturing Programs by Stephen Bloch.

The course platform A+ was originally created at Aalto’s LeTech research group as a student project. The open-source project is now shepherded by the Computer Science department’s edu-tech team and hosted by the department’s IT services; dozens of Aalto students and others have also contributed.

The A+ Courses plugin, which supports A+ and O1 in IntelliJ IDEA, is another open-source project. It has been designed and implemented by various students in collaboration with O1’s teachers.

For O1’s current teaching staff, please see Chapter 1.1.

Additional credits appear at the ends of some chapters.
