Luet oppimateriaalin englanninkielistä versiota. Mainitsit kuitenkin taustakyselyssä osaavasi suomea. Siksi suosittelemme, että käytät suomenkielistä versiota, joka on testatumpi ja hieman laajempi ja muutenkin mukava.

Suomenkielinen materiaali kyllä esittelee englanninkielisetkin termit.

Kieli vaihtuu A+:n sivujen yläreunan painikkeesta. Tai tästä: Vaihda suomeksi.


Chapter 12.1: Arrays and a Faulty Train

../_images/robot_fight.png

Introduction: What’s an Array?

You’ve already seen the word “array” in O1:

val numbers = Buffer(1, 2, 3)numbers: Buffer[Int] = ArrayBuffer(1, 2, 3)
val words = "one,two,three".split(",")words: Array[String] = Array(one, two, three)

ArrayBuffer appears in the REPL output when we use buffers.

The split method (Chapter 5.2) divides a string and returns the pieces — not in a vector or a buffer but an array.

Arrays (taulukko) are collections of elements:

  • Like buffers and vectors, arrays store elements at specific indices.

  • You can replace an array element with another. In this respect, an array is similar to a buffer and different from a vector.

  • However, the size of an array is fixed, just like a vector’s is. The size is set when the array is created and the number of indices in an array never changes.

A familiar sort of construct, then. Let’s now look at a few examples of using an array before we discuss why you might want to use them.

Arrays in Scala

The Array type

Arrays, like vectors and unlike buffers, are always available in Scala programs without an import.

Creating an array looks familiar:

val myArray = Array("first", "second", "third", "fourth")myArray: Array[String] = Array(first, second, third, fourth)
val anotherArray = Array.tabulate(5)( index => 2 * index )anotherArray: Array[Int] = Array(0, 2, 4, 6, 8)

Indices and methods also work like you’d expect:

myArray(2)res0: String = third
myArray(3) = "last"myArraymyArray: Array[String] = Array("first", "second", "third", "last")
myArray.sizeres1: Int = 4
myArray.indexOf("third")res2: Int = 2
myArray.mkString(":")res3: String = first:second:third:last
myArray.map( _.length )res4: Array[Int] = Array(3, 4, 6, 4)

Appending a new element and thereby increasing the size of the array is impossible:

myArray += "one more?"-- Error:
myArray += "one more?"
^^^^^^^^^^
value += is not a member of Array[String]

Creating an unitialized array with ofDim

Sometimes, it’s convenient to create a collection of a specific size whose contents are initialized only later. For that, you can use a method that isn’t available for vectors or buffers, Array.ofDim:

val myArray = Array.ofDim[Int](5)myArray: Array[Int] = Array(0, 0, 0, 0, 0)

ofDim needs a type parameter. Here, we have elements of type Int.

As a regular parameter, we pass in a number that sets the array’s size. Here, we create an array with five elements.

The method creates and returns an array of the specified size. It contains copies of some default value; here, that value is zero.

Creating an array so reserves the desired number of memory slots for the actual array elements, which we haven’t really set yet but whose (maximum) number we know.

You can read Array.ofDim as “array of dimensions”. Like that name implies, the method works for creating “multidimensional” (nested) arrays. They are similar to the “multidimensional” vectors of Chapter 6.1:

val twoDimensional = Array.ofDim[Int](2, 3)twoDimensional: Array[Array[Int]] = Array(Array(0, 0, 0), Array(0, 0, 0))
val threeDimensional = Array.ofDim[Double](2, 2, 2)threeDimensional: Array[Array[Array[Double]]] =
Array(Array(Array(0.0, 0.0), Array(0.0, 0.0)), Array(Array(0.0, 0.0), Array(0.0, 0.0)))

ofDim just creates the array, it doesn’t put any meaningful content in it. Still, any array you create like this always contains something. The default value depends on the element type:

  • for numerical types (Char included), it’s zero;

  • for Booleans, it’s false; and

  • for all other types — including String, for instance — it’s the “non-existent value” null (which is a spawning bed for errors; Chapter 4.3).

You can and almost always should replace the default values with something more meaningful sooner or later.

Careful with those default values!

Suppose you have a class Footballer and you create this array:

val finnishTeam = Array.ofDim[Footballer](11)

It’s a common beginner’s error to forget that the above command does not actually create even a single instance of the Footballer class, even though it ostensibly creates “an array of footballers”. What you get is an array that contains eleven copies of the null reference. If you want to create Footballers and store them in the array, you’ll need to do that separately:

finnishTeam(0) = Footballer("Tinja-Riikka Korpela")
finnishTeam(1) = // etc.

Unless there is an actual object reference at the appropriate index, commands like these will result in a NullPointerException at runtime:

finnishTeam(9).score()
val keeper = finnishTeam(0)
println(keeper.name)

Doesn’t Sound Too Useful?

Since an array’s size never changes, the Array type supports fewer operations than the Buffer type; any piece of functionality that you might wish to implement using arrays is possible to implement using buffers. (The reverse is also true.) If you want an immutable, numerically indexed collection, you can use a vector. Uninitialized arrays are infested with nulls.

Given all that, why bother with yet another collection type? Here are a few reasons:

  • Commonness: Arrays are part of a programmer’s general knowledge. They are a basic data structure that is used for implementing other collection types. Arrays are available in many programming languages; in quite a few of languages, they are the most common type of collection.

  • Efficiency: Arrays sometimes make a program more efficient, which is due to differences in the implementations of the various collection types. As far as Scala arrays are concerned, this is one of the more likely reasons why you might occasionally opt for an array, but — once again — we leave efficiency concerns for later courses to deal with. The Scala website has a table that compares the efficiency characteristics of different collections.

  • Natural usage scenarios: An array is an intuitive choice when you need a collection whose contents can change but whose size cannot. The Grid class of Chapter 8.1, for instance, has been implemented using a “two-dimensional” array. You can also consider an array if you need a collection whose size is capped at a preset limit.

  • Incidence in libraries: The Array type crops up in some software libraries, including Scala’s core API. The aforementioned split method is an example.

Arrays as an implementation tool

The word ArrayBuffer reveals how Scala’s buffer class has been implemented: each buffer object internally stores its elements in an array of some fixed size. When that array runs out of capacity, the buffer object swaps it for a bigger array where it copies the old array contents and adds any new ones.

When you use such a buffer you therefore also use arrays, albeit indirectly and nearly unnoticeably.

Arrays have also been used (differently) for implementing Scala’s Vector class.

More array-based collections: IArray, ArrayDeque etc.

The Scala API also defines IArray, where the I stands for “immutable”. An IArray is similar to an array — in fact, Scala internally treats it as one — but it doesn’t provide effectful methods and thus guarantees immutability. An IArray resembles Vector in that sense, but its efficiency characteristics are different (and quite good for many use cases).

val unchangingArray = IArray(10, 4, 5)unchangingArray: IArray[Int] = Array(10, 4, 5)
unchangingArray(1)res5: Int = 4
unchangingArray(1) = 5-- Error:
unchangingArray(1) = 5
^^^^^^^^^^^^^^^
value update is not a member of IArray[Int]

Yet another collection type is ArrayDeque, a mutable collection similar to a Buffer. Adding elements to either end of an ArrayDeque is fast, and element removals are similarly efficient. (“Deque”, pronounced “deck”, is a common abbreviation of “double-ended queue”.)

Once again, for more information about these and other collections, see the Scala API, which has separate packages for immutable and mutable collections.

Debugging a Train

This assignment puts you in a position that is familiar to many professional programmers: you need to untangle a mess of code that someone else wrote.

Task description

The Train module contains classes that represent train cars and the seats and cabins in those cars. Think of the classes as (poor) parts of an imaginary software system where customers can reserve places on a train. The code is written in a heavily imperative style: it is built on arrays, while loops, and effects on mutable state.

The code isn’t even close to exemplary. The worst thing is that it doesn’t work right.

Your task is to write a main function that tests the given classes and to use it to locate the errors in the program. You should also fix the errors, but that’s the easy part.

See the Scaladocs for how the classes should work.

Instructions and hints

  • Please don’t get stuck on this assignment. Use the lab sessions and the online forums for hints.

  • First, test the code thoroughly. Call the methods on different values and in different order. Work out what works and what doesn’t work. Then debug the code: find and fix what causes the problems.

  • Write a main function for testing. You can decide the contents of the function as you see fit. You can write the function in test.scala.

  • You may then wish to use IntelliJ’s debugger the given code (and the main function). Without the debugger, this assignment might be significantly harder than it needs to be.

  • There are eight bugs. Each of them is in a distinct location in the program code.

  • Various methods work peculiarly or crash the program if you pass in “obviously silly” values (e.g., if you set a negative number of cabins or if you add a null to the train instead of an actual train car). In this assignment, that does not count as a bug. Concentrate your efforts on finding behaviors that clearly contradict the specification even on valid parameters.

  • SittingCar is the most complex of the classes. Inspect the other classes first. Look at SittingCar last, when you’ll have developed a better understanding of the given code and a better workflow for testing.

  • Some of the classes have companion objects in the same file. Those objects are there just to store constants.

  • You might feel the urge to improve the programming style or efficiency of the given code. You’re not required to do that, though.

  • So-called desk checking (pöytätestaus) can help, as suggested by a former O1 student:

    In its own way, this was a tough assignment, but in the end I solved it pretty easily once I printed out the code, picked up my pencil, and took the whole stack of papers with me for reading in the sauna. :)

    Sauna is probably inessential in this method.

A+ presents the exercise submission form here.

Bonus Material on Efficiency

Efficiency, optimization, and profiling

When talking to programmers, “efficiency” keeps coming up. Does being efficient mean that you can make a program run without using as much power, memory, etc?

When programmers speak of efficiency (or “performance”), they usually mean the time it takes to run a program or algorithm and/or the amount of memory that a program or algorithm needs. There are other efficiency criteria, too, such as energy consumption.

This ebook has intermittently noted how optimizing a program’s running time or memory usage are largely a matter for O1’s follow-on courses (e.g., this chapter from Programming 2). Nevertheless and understandably, some of you have already been thinking about these interesting questions during O1 already.

Sometimes, programmers are even too enthusiastic about efficiency.

Here are the often-repeated three golden rules of efficiency optimization:

Rule #1: Don’t.

Rule #2: Don’t. Yet.

Rule #3: Profile first.

Profiling (profilointi) a program means estimating the efficiency of the program’s components by taking measurements while running the program. Profiling (in combination with other analyses) helps determine which of a program’s components have a significant impact on overall effiency and which don’t. It’s common that only a small part of the program code is decisive for efficiency:

There is a famous rule in performance optimization called the 90/10 rule: 90% of a program’s execution time is spent in only 10% of its code.


The standard inference from this rule is that programmers should find that 10% of the code and optimize it, because that’s the only code where improvements make a difference in the overall system performance.


But a second inference is just as important: programmers can deoptimize the other 90% of the code (in order to make it easier to use, maintain, etc.), because deterioration (of performance) of that code won’t make much of a diffence in the overall system performance.


—Richard E. Pattis

In application programming, efficiency is a primary concern when processing large amounts of data and/or running programs on mobile computers or the like — both of which are common scenarios, to be sure. Then again, other quality criteria such as readability and modifiability are often more problematic.

../_images/donald_knuth.png

Donald Knuth, a pioneer of program complexity analysis and one of the most legendary programmers of all time

Here’s a famous phrase from a gentleman who knows a thing or two about efficiency:

Premature optimization is the root of all evil.

—Donald Knuth

Object creation and efficiency in functional programming

Chapter 11.2’s optional assignment had you implement the football module’s Match class in the functional style.

Since functional programming keeps creating new objects instead of updating existing ones, what happens to the objects that become obsolute? I mean, does the computer know to automatically erase the Match object with two goals after I’ve used it to record a third goal, creating a new Match object?

Yes. Those old Match objects aren’t referred to by any variable in the program, and the virtual machine’s garbage collector will promptly sweep them up.

Doesn’t functional programming generate a lot of data that soon becomes “outdated”, so to speak?

It’s true that if the program operates on large data structures, it isn’t efficient to make full copies of the structures, but this isn’t such a big problem in functional programming as you might first think.

There are techiques that avoid unnecessary copying of immutable data structures. Persistent data structures are an example: operations on such a structure don’t actually create slightly modified copy of the original structure but reuse parts of the original. Many of Scala’s standard collections are persistent in this sense, for example, but that’s something for follow-on courses to discuss further.

Memory allocation is something that’s also relevant for mutable collections such as buffers:

Reflections: If I create a new empty buffer, how many references [elements] does it have space for? And what happens if I add more references than that in the buffer? Does the buffer automatically claim more memory? It probably wouldn’t be efficient to reserve more memory every single time a reference is added?

A new empty Buffer (more specifically: a buffer implemented using arrays, an ArrayBuffer) allocates, by default, an array of size sixteen for storing the buffer’s future elements. And, as the student correctly surmised, the buffer allocates more memory for itself automatically as the need arises: when the present array isn’t big enough, the buffer creates an array twice as big and copies all the elements there.

(If you feel like it, you can go to the ArrayBuffer’s source code and spot the default size of sixteen and the code that doubles the array there.)

The text Benchmarking Scala Collections. compares the efficiency characteristics of Scala’s collection types. (It’s a nice piece in general, although a bit dated now and there are issues with some of the comparisons.) That link is best suited for students with prior programming experience beyond O1, but the general conclusion that it, too, underlines is easy enough to appreciate: which collection type is the most efficient depends crucially on what you need the collection for.

Summary of Key Points

  • An array is a mutable collection of fixed size.

    • Arrays are common in many programing languages and programs.

    • Depending on the context, using arrays instead of other collections may bring benefits such as improved efficiency.

  • Links to the glossary: array; testing.

Feedback

Please note that this section must be completed individually. Even if you worked on this chapter with a pair, each of you should submit the form separately.

Credits

Thousands of students have given feedback and so contributed to this ebook’s design. Thank you!

The ebook’s chapters, programming assignments, and weekly bulletins have been written in Finnish and translated into English by Juha Sorva.

The appendices (glossary, Scala reference, FAQ, etc.) are by Juha Sorva unless otherwise specified on the page.

The automatic assessment of the assignments has been developed by: (in alphabetical order) Riku Autio, Nikolas Drosdek, Kaisa Ek, Joonatan Honkamaa, Antti Immonen, Jaakko Kantojärvi, Onni Komulainen, Niklas Kröger, Kalle Laitinen, Teemu Lehtinen, Mikael Lenander, Ilona Ma, Jaakko Nakaza, Strasdosky Otewa, Timi Seppälä, Teemu Sirkiä, Joel Toppinen, Anna Valldeoriola Cardó, and Aleksi Vartiainen.

The illustrations at the top of each chapter, and the similar drawings elsewhere in the ebook, are the work of Christina Lassheikki.

The animations that detail the execution Scala programs have been designed by Juha Sorva and Teemu Sirkiä. Teemu Sirkiä and Riku Autio did the technical implementation, relying on Teemu’s Jsvee and Kelmu toolkits.

The other diagrams and interactive presentations in the ebook are by Juha Sorva.

The O1Library software has been developed by Aleksi Lukkarinen, Juha Sorva, and Jaakko Nakaza. Several of its key components are built upon Aleksi’s SMCL library.

The pedagogy of using O1Library for simple graphical programming (such as Pic) is inspired by the textbooks How to Design Programs by Flatt, Felleisen, Findler, and Krishnamurthi and Picturing Programs by Stephen Bloch.

The course platform A+ was originally created at Aalto’s LeTech research group as a student project. The open-source project is now shepherded by the Computer Science department’s edu-tech team and hosted by the department’s IT services; dozens of Aalto students and others have also contributed.

The A+ Courses plugin, which supports A+ and O1 in IntelliJ IDEA, is another open-source project. It has been designed and implemented by various students in collaboration with O1’s teachers.

For O1’s current teaching staff, please see Chapter 1.1.

Additional credits appear at the ends of some chapters.

a drop of ink
Posting submission...