The latest instance of the course can be found at: O1: 2024
Luet oppimateriaalin englanninkielistä versiota. Mainitsit kuitenkin taustakyselyssä osaavasi suomea. Siksi suosittelemme, että käytät suomenkielistä versiota, joka on testatumpi ja hieman laajempi ja muutenkin mukava.
Suomenkielinen materiaali kyllä esittelee englanninkielisetkin termit.
Kieli vaihtuu A+:n sivujen yläreunan painikkeesta. Tai tästä: Vaihda suomeksi.
Chapter 12.1: Arrays and a Faulty Train
About This Page
Questions Answered: How about some more practice on loops, code-reading, and debugging? The word “array” has appeared here and there — what does it mean?
Topics: See above.
What Will I Do? Start by reading a bit. Then spend most of the chapter reading and debugging a given program. Some optional reading about efficiency at the end.
Rough Estimate of Workload:? Three or four hours? Remember to ask for help if you get stuck in the debugging assignment.
Points Available: B80.
Related Modules: Train (new).
Introduction: What’s an Array?
You’ve already seen the word “array” in O1:
val numbers = Buffer(1, 2, 3)numbers: Buffer[Int] = ArrayBuffer(1, 2, 3) val words = "one,two,three".split(",")words: Array[String] = Array(one, two, three)
The split
method (Chapter 5.2) divides a string and returns
the pieces — not in a vector or a buffer but an array.
Arrays (taulukko) are collections of elements:
Like buffers and vectors, arrays store elements at specific indices.
You can replace an array element with another. In this respect, an array is similar to a buffer and different from a vector.
However, the size of an array is fixed, just like a vector’s is. The size is set when the array is created and the number of indices in an array never changes.
A very familiar sort of construct, then. Let’s now look at a few examples of using an array before we discuss why you might want to use them.
Arrays in Scala
The Array
type
Arrays, like vectors and unlike buffers, are always available in Scala programs without
an import
.
Creating an array looks familiar:
val myArray = Array("first", "second", "third", "fourth")myArray: Array[String] = Array(first, second, third, fourth) val anotherArray = Array.tabulate(5)( index => 2 * index )anotherArray: Array[Int] = Array(0, 2, 4, 6, 8)
Indices and methods also work like you’d expect:
myArray(2)res0: String = third myArray(3) = "last"myArraymyArray: Array[String] = Array("first", "second", "third", "last") myArray.sizeres1: Int = 4 myArray.indexOf("third")res2: Int = 2 myArray.mkString(":")res3: String = first:second:third:last myArray.map( _.length )res4: Array[Int] = Array(3, 4, 6, 4)
Appending an new element and thereby increasing the size of the array is impossible:
myArray += "one more?"-- Error: myArray += "one more?" ^^^^^^^^^^ value += is not a member of Array[String]
Creating an unitialized array with ofDim
Sometimes, it’s convenient to create a collection of a specific size whose contents are
initialized only later. For that, you can use a method that isn’t available
for vectors or buffers, Array.ofDim
:
val myArray = Array.ofDim[Int](5)myArray: Array[Int] = Array(0, 0, 0, 0, 0)
ofDim
needs a type parameter. Here, we have elements of type
Int
.
As a regular parameter, we pass in a number that sets the array’s size. Here, we create an array with five elements.
The method creates and returns an array of the specified size. It contains copies of some default value; here, that value is zero.
Creating an array so reserves the desired number of memory slots for the actual array elements, which we haven’t really set yet but whose (maximum) number we know.
You can read Array.ofDim
as “array of dimensions”. Like that name implies, the
method works for creating “multidimensional” (nested) arrays. They are similar to the
“multidimensional” vectors of Chapter 6.1:
val twoDimensional = Array.ofDim[Int](2, 3)twoDimensional: Array[Array[Int]] = Array(Array(0, 0, 0), Array(0, 0, 0)) val threeDimensional = Array.ofDim[Double](2, 2, 2)threeDimensional: Array[Array[Array[Double]]] = Array(Array(Array(0.0, 0.0), Array(0.0, 0.0)), Array(Array(0.0, 0.0), Array(0.0, 0.0)))
ofDim
just creates the array, it doesn’t put any meaningful content in it. Still, any
array you create like this always contains something. The default value depends on the
element type:
for numerical types (
Char
included), it’s zero;for Booleans, it’s
false
; andfor all other types — including
String
, for instance — it’s the “non-existent value”null
(which is a spawning bed for errors; Chapter 4.3).
You can and almost always should replace the default values with something more meaningful sooner or later.
Careful with those default values!
Suppose you have a class Footballer
and you create this array:
val finnishTeam = Array.ofDim[Footballer](11)
It’s a common beginner’s error to forget that the above command does not actually create
even a single instance of the Footballer
class, even though it ostensibly creates “an
array of footballers”. What you get is an array that contains eleven copies of the null
reference. If you want to create Footballer
s and store them in the array, you’ll need
to do that separately:
finnishTeam(0) = Footballer("Tinja-Riikka Korpela")
finnishTeam(1) = // etc.
Unless there is an actual object reference at the appropriate index, commands like these
will result in a NullPointerException
at runtime:
finnishTeam(9).score()
val keeper = finnishTeam(0)
println(keeper.name)
Doesn’t Sound Too Useful?
Since an array’s size never changes, the Array
type supports fewer operations than the
Buffer
type; any piece of functionality that you might wish to implement using arrays
is possible to implement using buffers. (The reverse is also true.) If you want an
immutable, numerically indexed collection, you can use a vector. Uninitialized arrays
are infested with null
s.
Given all that, why bother with yet another collection type? Here are a few reasons:
Commonness: Arrays are part of a programmer’s general knowledge. They are a basic data structure that is used for implementing other collection types. Arrays are available in many programming languages; in quite a few of languages, they are the most common type of collection.
Efficiency: Arrays sometimes make a program more efficient, which is due to differences in the implementations of the various collection types. As far as Scala arrays are concerned, this is one of the more likely reasons why you might occasionally opt for an array, but — once again — we leave efficiency concerns for later courses to deal with. The Scala website has a table that compares the efficiency characteristics of different collections.
Natural usage scenarios: An array is an intuitive choice when you need a collection whose contents can change but whose size cannot. The
Grid
class of Chapter 8.1, for instance, has been implemented using a “two-dimensional” array. You can also consider an array if you need a collection whose size is capped at a preset limit.Incidence in libraries: The
Array
type crops up in some software libraries, including Scala’s core API. The aforementionedsplit
method is an example.
Arrays as an implementation tool
The word ArrayBuffer
reveals how Scala’s buffer class has been
implemented: each buffer object internally stores its elements in
an array of some fixed size. When that array runs out of capacity,
the buffer object swaps it for a bigger array where it copies the
old array contents and adds any new ones.
When you use such a buffer you therefore also use arrays, albeit indirectly and nearly unnoticeably.
Arrays have also been used (differently) for implementing Scala’s
Vector
class.
More array-based collections: IArray
, ArrayDeque
etc.
The Scala API also defines IArray
, where the I
stands for
“immutable”. An IArray
is similar to an array — in fact, Scala
internally treats it as one — but it doesn’t provide effectful
methods and thus guarantees immutability. An IArray
resembles
Vector
in that sense, but its efficiency characteristics are
different (and quite good for many use cases).
val unchangingArray = IArray(10, 4, 5)unchangingArray: IArray[Int] = Array(10, 4, 5) unchangingArray(1)res5: Int = 4 unchangingArray(1) = 5-- Error: unchangingArray(1) = 5 ^^^^^^^^^^^^^^^ value update is not a member of IArray[Int]
Yet another collection type is ArrayDeque
, a mutable collection
similar to a Buffer
. Adding elements to either end of an ArrayDeque
is fast, and element removals are similarly efficient. (“Deque”,
pronounced “deck”, is a common abbreviation of “double-ended queue”.)
Once again, for more information about these and other collections, see the Scala API, which has separate packages for immutable and mutable collections.
Debugging a Train
This assignment puts you in a position that is familiar to many professional programmers: you need to untangle a mess of code that someone else wrote.
Task description
The Train module contains classes that represent train cars and the seats and cabins
in those cars. Think of the classes as (poor) parts of an imaginary software system
where customers can reserve places on a train. The code is written in a heavily
imperative style: it is built on arrays, while
loops, and
effects on mutable state.
The code isn’t even close to exemplary. The worst thing is that it doesn’t work right.
Your task is to write a main function that tests the given classes and to use it to locate the errors in the program. You should also fix the errors, but that’s the easy part.
See the Scaladocs for how the classes should work.
Instructions and hints
Please don’t get stuck on this assignment. Use the lab sessions and the online forums for hints.
First, test the code thoroughly. Call the methods on different values and in different order. Work out what works and what doesn’t work. Then debug the code: find and fix what causes the problems.
Write a main function for testing. You may also wish to use IntelliJ’s debugger. Without the debugger, this assignment may be significantly harder than it needs to be.
There are eight bugs. Each of them is in a distinct location in the program code.
Various methods work peculiarly or crash the program if you pass in “obviously silly” values (e.g., if you set a negative number of cabins or if you add a
null
to the train instead of an actual train car). In this assignment, that does not count as a bug. Concentrate your efforts on finding behaviors that clearly contradict the specification even on valid parameters.SittingCar
is the most complex of the classes. Inspect the other classes first. Look atSittingCar
last, when you’ll have developed a better understanding of the given code and a better workflow for testing.Some of the classes have companion objects in the same file. Those objects are there just to store constants.
You might feel the urge to improve the programming style or efficiency of the given code. You’re not required to do that, though.
So-called desk checking (pöytätestaus) can help, as suggested by a former O1 student:
In its own way, this was a tough assignment, but in the end I solved it pretty easily once I printed out the code, picked up my pencil, and took the whole stack of papers with me for reading in the sauna. :)
Sauna is probably inessential in this method.
A+ presents the exercise submission form here.
Bonus Material on Efficiency
Efficiency, optimization, and profiling
When talking to programmers, “efficiency” keeps coming up. Does being efficient mean that you can make a program run without using as much power, memory, etc?
When programmers speak of efficiency (or “performance”), they usually mean the time it takes to run a program or algorithm and/or the amount of memory that a program or algorithm needs. There are other efficiency criteria, too, such as energy consumption.
This ebook has intermittently noted how optimizing a program’s running time or memory usage are largely a matter for O1’s follow-on courses (e.g., this chapter from Programming 2). Nevertheless and understandably, some of you have already been thinking about these interesting questions during O1 already.
Sometimes, programmers are even too enthusiastic about efficiency.
Here are the often-repeated three golden rules of efficiency optimization:
Rule #1: Don’t.Rule #2: Don’t. Yet.Rule #3: Profile first.
Profiling (profilointi) a program means estimating the efficiency of the program’s components by taking measurements while running the program. Profiling (in combination with other analyses) helps determine which of a program’s components have a significant impact on overall effiency and which don’t. It’s common that only a small part of the program code is decisive for efficiency:
There is a famous rule in performance optimization called the 90/10 rule: 90% of a program’s execution time is spent in only 10% of its code.
The standard inference from this rule is that programmers should find that 10% of the code and optimize it, because that’s the only code where improvements make a difference in the overall system performance.
But a second inference is just as important: programmers can deoptimize the other 90% of the code (in order to make it easier to use, maintain, etc.), because deterioration (of performance) of that code won’t make much of a diffence in the overall system performance.
—Richard E. Pattis
In application programming, efficiency is a primary concern when processing large amounts of data and/or running programs on mobile computers or the like — both of which are common scenarios, to be sure. Then again, other quality criteria such as readability and modifiability are often more problematic.
Here’s a famous phrase from a gentleman who knows a thing or two about efficiency:
Premature optimization is the root of all evil.
—Donald Knuth
Object creation and efficiency in functional programming
Chapter 11.2’s optional assignment had you implement the football
module’s Match
class in the functional
style.
Since functional programming keeps creating
new objects instead of updating existing
ones, what happens to the objects that become
obsolute? I mean, does the computer know to
automatically erase the Match
object with
two goals after I’ve used it to record a
third goal, creating a new Match
object?
Yes. Those old Match
objects aren’t referred to by any variable
in the program, and the virtual machine’s garbage collector
will promptly sweep them up.
Doesn’t functional programming generate a lot of data that soon becomes “outdated”, so to speak?
It’s true that if the program operates on large data structures, it isn’t efficient to make full copies of the structures, but this isn’t such a big problem in functional programming as you might first think.
There are techiques that avoid unnecessary copying of immutable data structures. Persistent data structures are an example: operations on such a structure don’t actually create slightly modified copy of the original structure but reuse parts of the original. Many of Scala’s standard collections are persistent in this sense, for example, but that’s something for follow-on courses to discuss further.
Memory allocation is something that’s also relevant for mutable collections such as buffers:
Reflections: If I create a new empty buffer, how many references [elements] does it have space for? And what happens if I add more references than that in the buffer? Does the buffer automatically claim more memory? It probably wouldn’t be efficient to reserve more memory every single time a reference is added?
A new empty Buffer
(more specifically: a buffer implemented using
arrays, an ArrayBuffer
) allocates, by default, an array of size
sixteen for storing the buffer’s future elements. And, as the
student correctly surmised, the buffer allocates more memory for
itself automatically as the need arises: when the present array
isn’t big enough, the buffer creates an array twice as big and
copies all the elements there.
(If you feel like it, you can go to the ArrayBuffer
’s source code
and spot the default size of sixteen
and the code that doubles the array
there.)
The text Benchmarking Scala Collections. compares the efficiency characteristics of Scala’s collection types. (It’s a very nice piece in general, although a bit dated by now and there are issues with some of the comparisons.) That link is best suited for students with prior programming experience beyond O1, but the general conclusion that it, too, underlines is easy enough to appreciate: which collection type is the most efficient depends crucially on what you need the collection for.
Summary of Key Points
Feedback
Please note that this section must be completed individually. Even if you worked on this chapter with a pair, each of you should submit the form separately.
Credits
Thousands of students have given feedback and so contributed to this ebook’s design. Thank you!
The ebook’s chapters, programming assignments, and weekly bulletins have been written in Finnish and translated into English by Juha Sorva.
The appendices (glossary, Scala reference, FAQ, etc.) are by Juha Sorva unless otherwise specified on the page.
The automatic assessment of the assignments has been developed by: (in alphabetical order) Riku Autio, Nikolas Drosdek, Joonatan Honkamaa, Antti Immonen, Jaakko Kantojärvi, Niklas Kröger, Kalle Laitinen, Teemu Lehtinen, Jaakko Nakaza, Strasdosky Otewa, Timi Seppälä, Teemu Sirkiä, Anna Valldeoriola Cardó, and Aleksi Vartiainen.
The illustrations at the top of each chapter, and the similar drawings elsewhere in the ebook, are the work of Christina Lassheikki.
The animations that detail the execution Scala programs have been designed by Juha Sorva and Teemu Sirkiä. Teemu Sirkiä and Riku Autio did the technical implementation, relying on Teemu’s Jsvee and Kelmu toolkits.
The other diagrams and interactive presentations in the ebook are by Juha Sorva.
The O1Library software has been developed by Aleksi Lukkarinen and Juha Sorva. Several of its key components are built upon Aleksi’s SMCL library.
The pedagogy of using O1Library for simple graphical programming (such as Pic
) is
inspired by the textbooks How to Design Programs by Flatt, Felleisen, Findler, and
Krishnamurthi and Picturing Programs by Stephen Bloch.
The course platform A+ was originally created at Aalto’s LeTech research group as a student project. The open-source project is now shepherded by the Computer Science department’s edu-tech team and hosted by the department’s IT services. Markku Riekkinen is the current lead developer; dozens of Aalto students and others have also contributed.
The A+ Courses plugin, which supports A+ and O1 in IntelliJ IDEA, is another open-source project. It has been designed and implemented by various students in collaboration with O1’s teachers.
For O1’s current teaching staff, please see Chapter 1.1.
Additional credits appear at the ends of some chapters.
ArrayBuffer
appears in the REPL output when we use buffers.