Laziness vs Eagerness in Dart

This article is inspired by some of the weirdness we faced when developing my and Marcelo's immutable collections package, package. Both the examples in the first section were actually created by Marcelo Glasberg, all I did was add some more explanations and rice them up.

What's discussed in this article is specially relevant if you happen to "mix" Iterable and List, Set, etc. Mixing those types can be useful, for example, when you have an object that could accept more than one type of collection. At any rate, many other reasons, as well as pros and cons, will be discussed throughout the article.

If I presented you with the following code, what would your expectations be? What would be your bets for each of the counters' values?

Note the .where method returns an Iterable, even when applied to a List or Set.

var lazyCounter = 0;
var eagerCounter = 0;

var lazyOddFilter = [1, 2, 3, 4, 5, 6, 7].where((i) {
  lazyCounter++;
  return i % 2 == 0;
});

var evenFilterEager = [1, 2, 3, 4, 5, 6, 7].where((i) {
  eagerCounter++;
  return i % 2 == 0;
}).toList();

print("\n\n---------- Init ----------\n\n");

lazyOddFilter.length;
lazyOddFilter.length;
lazyOddFilter.length;

evenFilterEager.length;
evenFilterEager.length;
evenFilterEager.length;

print("\n\n---------- Lazy vs Eager ----------\n\n");

print("Lazy: $lazyCounter");
print("Eager: $eagerCounter");

print("\n\n---------- END ----------\n\n");

Do the results below suprise you?

Lazy: 21
Eager: 7

How would you go about explaining them, even if you did get them right?

Quite frankly, I would bet 8:2 you got it wrong — as did I — at least on the first try. The language used in the example also contributes to the misunderstanding: who wouldn't simply trivially assume that the eager version would not be the one overdoing things?

The reason for what happened above comes from the fact that the lazy filter actually doesn't return a thing, it returns an object that carries the operation to return a thing. So, every time you really want to know the result of the operation, the lazy object has to (re)evaluate the operation, it has no memory and it doesn't evaluate the operation at first. The eager version, on the other hand, does everything right away and does remember the results.

Ok, that was weird. But, most of the time, performance isn't really the problem, specially since computers have got so fast. Debugging and readability tend to be much more relevant and recurrent problems during development. So why should we care?

The when things happen can affect the order in which things happen. Debugging will certainly be more difficult if you're not aware of what's happening when, and readability will suffer if what you write doesn't clearly represent 100% of what's going on.

Let's take a look at the following example. Try predicting the order in which each method will be called in each case:

int eagerCounter = 0;
int lazyCounter = 0;

List<int> removeOdd_eager(Iterable<int> source) {
  return source.where((i) {
    eagerCounter++;
    print("removeOdd_eager");
    return i % 2 == 0;
  }).toList();
}

List<int> removeLessThan10_eager(Iterable<int> source) {
  return source.where((i) {
    eagerCounter++;
    print("removeLessThan10_eager");
    return i >= 10;
  }).toList();
}

Iterable<int> removeOdd_lazy(Iterable<int> source) {
  return source.where((i) {
    print("removeOdd_lazy");
    lazyCounter++;
    return i % 2 == 0;
  });
}

Iterable<int> removeLessThan10_lazy(Iterable<int> source) {
  return source.where((i) {
    print("removeLessThan10_lazy");
    lazyCounter++;
    return i >= 10;
  });
}

var list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];

print("\n\n---------- Init ----------\n\n");

Iterable<int> eager = removeLessThan10_eager(removeOdd_eager(list));

Iterable<int> lazy = removeLessThan10_lazy(removeOdd_lazy(list));

print("\n\n---------- Lazy ----------\n\n");

print(lazy);

print("\n\n---------- Eager ----------\n\n");

print(eager);

Note that Iterables print () when logged, while Lists print [], and, Sets, {}.

Here is what the code above prints out:

---------- Eager ----------

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeOdd_eager

removeLessThan10_eager

removeLessThan10_eager

removeLessThan10_eager

removeLessThan10_eager

removeLessThan10_eager

removeLessThan10_eager

removeLessThan10_eager

---------- Lazy ----------

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

removeOdd_lazy

removeLessThan10_lazy

removeOdd_lazy

You can run all the code from , but, long story short, you will realize from the log above that the eager version of the compound operations above will first solve all of the inner operations and then pass everything to the outer function; while the lazy version will do things only whenever they're needed. All that means is: if you're going for the lazy version, when debugging, you might end up with logs and results of both functions mixed together, which will very likely make things more difficult to solve. Lazy functions and methods are much more easilly debugged atomically than when put together with other lazy abstractions, but that's only my opinion, maybe you will develop different intuition with respect to this topic.

Laziness seems to have been historically very appealing to academia and mathematicians. Typically, those individuals don't care much about ~~concrete results~~ numerical results, only that their shapes are correct. For instance, functional languages, like Haskell, are lazy by default.

The book Programming in Haskell, by Prof. Graham Hutton, features a whole chapter only on laziness (chapter 15). There he explains pros and cons of laziness not only with respect to Haskell but in general.

After all of that headache you might have had reading the last section, you might be thinking: why in the world would someone even use laziness, it complicates things so much... However, it has its use cases.

The simplest use case to justify laziness is obviously not having to calculate stuff up front. Imagine you want to deliver the user the UI, making only the necessary calculations. If your language doesn't programmatically support laziness, you will have to manually monitor every single entity to ensure it is so. But, if you were working with Iterables or in Haskell, you would be sure it had not been triggered unless totally necessary. Flutter isn't really technically a lazy framework, but its declarativeness does offer many of the same features.

This topic has also been discussed in this popular StackOverflow question.

Pagination is yet another use case. Through the pagination mechanism, you ensure that only the data that fits the user screen — or whatever UX you're delivering — will be queried and delivered. For that, overarching the design with Iterable might be the way to go.

That all being said, I really do think it is better to discourage the default use of Iterables. Since it is very unlikely that you're going to perform an operation on a sequence only once, it is much wiser to use List as the default than the other way around.

One of the beauties of laziness is that infinity becomes a feasible abstraction. For example, modeling a list that's tied to the database could become much more seamless because querying would then be done on-demand, there would be no need to recreate the iterable for every batch, i.e., the iterable itself — partially — represents the connection to the database.

In Dart, infinite Iterables are also called synchronous generators, but, if you're going for something asynchronous, you could use Streams. In fact, the on-demand, potentially infinite connection to the database in Dart is done through streams on top of socket layers.

/// On-demand generator.
Iterable<int> naturalsFunc() sync* {
  int k = 0;
  // Infinite loop!
  while (true) yield k++;
}

var naturalsIter = naturalsFunc();

print("\n\n---------- Init ----------\n\n");

print("The infinite list/iterable was created, but not evaluated.");

print("\n\n--------------------\n\n");

print("\n\n---------- takeWhile ----------\n\n");

print("It's possible to work with it,"
    "but it's necessary to add a method to "
    "stop the processing at some point");

var naturalsUpTo10 = naturalsIter.takeWhile((value) => value <= 10);

print("Naturals up to 10: $naturalsUpTo10");

print("\n\n---------- END ----------\n\n");

One of the most mundane but very practical benefits of implementing Iterable is the possibility of iterating over it through the very readable for (... in ...) {} pattern. It seems very minor, but readability improves dramatically with it, and it is totally worth it taking the time to implement Iterable or any of its subclasses, if you're at a low dependency level in your application that is.

That being said, if you end up having to implement any of Iterable's gigantic interfaces, it will surely come in handy to be aware of some built-in Dart mixins, like IterableMixin, ListMixin, SetMixin and MapMixin. Choose the most specific mixin out of them, because each of them feature appropriate implementations for each of their default respective targets.

for (value in values) {
  print(value);
}

Most of Iterable's core methods will internally create an Iterator to handle their logic. For example, the length method has to iterate over the whole iterable to finally compute the collection's length.

As you might have already realized, that isn't very efficient. That's why Dart features the EfficientLengthIterable abstract class, which is supposed to be implemented by classes that offer a more efficient way of calculating the length of collections, like List and Set.

One of the main problems with (ab)using recursion in most programming languages is that, when disassembling the recursion stack, the language will create whole layers for each of its stages, when most of them might just be plain return calls or very simple computations. This can easily cause memory leaks for complex computations, and some languages — specially functional ones, like Haskell, which doesn't even feature loops, only recursion — have implemented ways of compiling only the necessary operations to have the recursion going, and that feature or abstaction is typically called Tail Call Optimization (TCO).

So... does Dart feature TCO? I've asked about it on StackOverflow and fortunately got an answer from lrn himself, one of the main developers of the Dart language at Google.

Long story short, Dart doesn't support the feature and there are no plans of incorporating it. And that's apparently very much due to JavaScript. Nevertheless, there's an ongoing, open issue on Github about it.

Again, according to lrn, the answer seems to be that, mainly, having an overarching interface for many collections helps in satisfying design constraints, if the programmer ever faces them.

Besides that, I personally believe that a lazy layer is easier to implement underneath an eager one than the other way around.

Anyhow, I was still curious about why Dart chose this specific design. Is there something I'm missing here? I ended up asking another question on StackOverflow. This time I got some more info, like the fact that, had MapEntry been around before or at the creation of Map, then maybe Map would have been an Iterable<MapEntry>, I'm not sure. Anyhow, I'm still not 100% convinced I know the answer to the questions surrounding this topic — and I apparently don't even know how to ask the question either, given the feedback I've received so far.

The whole discussion on Reddit can be found here. To my surprise, one of Dart's team members, Robert Nystrom or @munificent read the post and gave me feedback. Please do read his comments, he points out many of my imprecisions.

However, two feedbacks I've received so far stand out:

As Marcelo pointed out to me later on in an indirect fashion, the real annoyance behind the Iterable topic is that programmers will expect it to only be an encompassing interface for multiple different collections. Much to their demise, after they've implemented a bunch of stuff, will they stumble upon the fact that there's "hidden", different behavior to it.
Nystrom accidentally answered one of the questions that was most puzzling me: if the fundamental operations are eager, there is no way to opt out of that behavior.