JavaScript Objects Considered Unsafe

Every year, I like to teach myself a new language by working through the Advent of Code problems. This year, I wanted to dive into TypeScript. I’ve written a fair amount of JS before, and I like having explicit typechecking in my code, so TS felt like a good fit.

Everything was going great until I hit Day 15, and my solution worked well for the small case but was painfully slow in the large case.

You should definitely click through and read the problem statement, but to summarize:

You start off with a comma-separated list of numbers. This is your puzzle input. The challenge is to continue the sequence. Each number in the sequence (after the starting set) depends on the number immediately before it. If that previous number has never appeared earlier in the sequence, our new number is 0. If it has appeared earlier, our new number is the gap between those two appearances: the previous number’s position minus the position where that same number last appeared before it.

For example, let’s say we start with 1,2,3 as our input. The next numbers would be:

  • To determine the fourth number, we’d look at the third number (3). 3 has never appeared in the sequence before, so the fourth number is 0.
  • To determine the fifth number, we’d look at the fourth number (0). 0 has never appeared in the sequence before, so the fifth number is 0.
  • To determine the sixth number, we look at the fifth number (0). 0 has appeared in the sequence before, as the fourth number, so we take the difference (position 5 - position 4 = 1) and that difference is our sixth number: 1.
  • To determine the seventh number, we look at the sixth number (1). 1 has appeared in the sequence before, as the very first number, so we take the difference (position 6 - position 1 = 5) and that difference is our seventh number: 5.

We can continue this pattern forever.

(Numberphile has a wonderful overview of this sequence that goes into a bit more detail)

Part 1 of the challenge is relatively simple: determine the 2020th number in the sequence. I decided to brute force it:
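In TypeScript, a minimal version of that brute force looks something like this (a simplified sketch, with `[1, 2, 3]` standing in for my real puzzle input):

```typescript
// A simplified sketch of the brute-force approach. `spoken` maps each number
// to the last (zero-based) index at which it was said.
function nthNumber(startingNumbers: number[], target: number): number {
  const spoken: Record<number, number> = {};

  // Seed the lookup with every starting number except the last one.
  for (let i = 0; i < startingNumbers.length - 1; i++) {
    spoken[startingNumbers[i]] = i;
  }

  let current = startingNumbers[startingNumbers.length - 1];
  for (let i = startingNumbers.length; i < target; i++) {
    // If `current` has been spoken before, the next number is the gap between
    // its two most recent appearances; otherwise it's 0.
    const next = current in spoken ? i - 1 - spoken[current] : 0;
    spoken[current] = i - 1; // record where `current` was just spoken
    current = next;
  }
  return current;
}

// Part 1: the 2020th number in the sequence.
console.log(nthNumber([1, 2, 3], 2020));
```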

The spoken object is a Record<number, number> where the keys are numbers and the values are the last index where that number appeared in the sequence. (Why "spoken"? See the Advent of Code description). On each iteration, we check if our current number has been spoken before, and set our next number accordingly. On the next iteration of the loop, current becomes next, and so we repeat this process for as many iterations as we need.

So far, so good, but let’s talk efficiency, because part 2 of the challenge is to calculate the 30 millionth number in the sequence. Performance-wise, I’d claim that this runs in O(n) time: the run time scales linearly with the number of values we want to compute, i.e. computing 2000 numbers takes twice as long as computing 1000 numbers. That’s because we use the spoken object to do fast lookups on the last index at which a number appeared.

After all, we generally treat insertions and lookups in an object as “cheap”, constant-time operations.

The drawback is that we’re using tons of memory. We’d describe it as using O(n) memory; the amount of space we use grows linearly with the number of values we want to compute. (I have a proof for this but it’s too large to fit in this margin… just kidding. As per the Numberphile video I referenced earlier, this sequence seems to grow linearly, but no one has proven that yet.)

Since we’re brute forcing the result, at the very least, we need to do one loop iteration per number we compute, so this O(n) runtime is the fastest we can get. If we had a magic math formula that could spit out the number at an arbitrary index, that would be much faster, but I don’t have one (and as per that Numberphile video, neither do the mathematicians!).

So, it sounds like we’re very close to our theoretical peak performance. Let’s try and compute the 30 millionth number then.

And…when I do, it hangs. For upwards of 5 minutes.

Sounds like something is wrong. Very very wrong. I’m not sure where, though, so let’s do some profiling.
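The timing harness itself is nothing fancy. Here’s a sketch of the kind of loop I used, with `process.hrtime.bigint()` for timing (the sizes mirror the ones discussed below, and the input is again a stand-in):

```typescript
// Time the brute-force solution for successively larger targets.
const startingNumbers = [1, 2, 3]; // stand-in for the real puzzle input

for (const target of [2_000, 20_000, 200_000, 2_000_000, 20_000_000]) {
  const start = process.hrtime.bigint();
  nthNumber(startingNumbers, target);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1_000_000;
  console.log(`${target}: ${elapsedMs.toFixed(1)}ms`);
}
```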

Basically, we’re timing how long this method takes as we ask it to generate successively larger counts of numbers. If our solution were O(n), as I claimed above, we should see times that grow linearly: if 2000 numbers take 1ms, then 20,000 should take 10ms and 200,000 should take 100ms.

Instead, these are the results I measured:

🤯🤯

The jump from 200,000 to 2,000,000 is particularly interesting: it took us 95 times longer to compute 10x the numbers! And going from 2,000,000 to 20,000,000 is also pretty bad: 10x the work took 41 times longer.

(Before I go any further, a quick note about performance metrics: all of these numbers were collected on my 2016 MacBook Pro running macOS 11.2, with a 3.1GHz i5. I’m using node version 14.15.1 and tsc 4.1.2. The absolute values of these numbers aren’t important; what matters is the proportions between them.)

So, what’s going on with these numbers? When I first wrote this code and saw these results, I was extremely surprised. We just argued that the algorithm should be O(n), so why are we seeing huge non-linear spikes? Spoiler alert: my code wasn’t O(n).

Yeah, that’s right. It’s not O(n). Why not, you ask?

Well, let’s find out. Time to do some more profiling:
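Here’s a sketch of that instrumentation, accumulating the total time spent on reads and on writes separately (the bookkeeping is illustrative, again using `process.hrtime.bigint()`):

```typescript
// Same brute force as before, but with the time spent reading from and
// writing to `spoken` accumulated separately across the whole run.
function nthNumberInstrumented(startingNumbers: number[], target: number): number {
  const spoken: Record<number, number> = {};
  for (let i = 0; i < startingNumbers.length - 1; i++) {
    spoken[startingNumbers[i]] = i;
  }

  let readTime = 0n;
  let writeTime = 0n;
  let current = startingNumbers[startingNumbers.length - 1];

  for (let i = startingNumbers.length; i < target; i++) {
    let t = process.hrtime.bigint();
    const lastIndex = current in spoken ? spoken[current] : undefined; // read
    readTime += process.hrtime.bigint() - t;

    const next = lastIndex === undefined ? 0 : i - 1 - lastIndex;

    t = process.hrtime.bigint();
    spoken[current] = i - 1; // write
    writeTime += process.hrtime.bigint() - t;

    current = next;
  }

  console.log(`reads:  ${Number(readTime) / 1_000_000}ms total`);
  console.log(`writes: ${Number(writeTime) / 1_000_000}ms total`);
  return current;
}
```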

We do two things with our spoken lookup object: we read a value out and write a value in. Here I've added code to time how long each part takes, in total, across the entire operation. The results:

Logs can be hard to read, so let’s dump this into a table:

As you can see, read performance is fine: roughly linear in total, as expected. Write performance, on the other hand, gets drastically worse as our object grows.

There isn’t much more data we can gather on our own here. We’re observing something about the behavior of the language and runtime we’re using, so now it’s time to see whether anyone else has observed similar behavior, or whether we’ve found a bug in node.

Armed with this information, off to the interwebs we go. Searching for “js object vs map performance” turns up a number of results from people profiling objects and maps, but I didn’t find anything along the lines of what I was seeing. Then, I came across GitHub user jngbng, who did similar experiments that suggest objects perform faster than Sets when there is a high number of common elements in the data, but much slower than Sets when most of the elements in the data are unique.

That led me to a more subtle insight about our code. Object performance is demonstrably worse when most of the keys are unique… and as we know, the values in the sequence we’re computing keep growing, which means we’re constantly seeing brand-new numbers and adding them to our object. In other words, many of our writes are to keys the object has never seen before.

This insight led me to an article by Camillo Bruni, an engineer working on the V8 JavaScript engine at Google, on how V8 handles objects gaining new properties. He also linked to an article by Vyacheslav Egorov, a compiler engineer at Google, on how V8 handles property lookups on objects.

I will confess, a lot of the details in those two posts went over my head, but the summary is this: the V8 runtime is optimized for cases where you have many objects that share the same set of properties. Constantly adding new keys to our object breaks V8’s property caches and forces it to rebuild them on every addition, which makes inserting new keys really slow. This lines up exactly with what jngbng found: objects with a small or rarely-changing set of keys outperform Sets, while objects whose keys keep changing fall far behind.
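To make that summary concrete, here’s a simplified illustration of the two access patterns being contrasted. This is just a sketch of the workload shapes, not a rigorous benchmark:

```typescript
// The pattern V8 is optimized for: many objects that all share one fixed
// set of properties, so lookups on them can be cached.
type Entry = { value: number; lastIndex: number };
const entries: Entry[] = [];
for (let i = 0; i < 1_000; i++) {
  entries.push({ value: i, lastIndex: i }); // same keys on every object
}

// The pattern our solution uses: a single object that keeps gaining keys it
// has never seen before (for us, the newly spoken numbers).
const lastSeen: Record<number, number> = {};
for (let i = 0; i < 1_000; i++) {
  lastSeen[i * 7919] = i; // a brand-new, mostly-unique key on each iteration
}
```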

In our scenario, we add new keys (the numbers we compute) to our object very frequently, which means we quickly end up in exactly the regime that defeats V8’s property lookup caches!

We can actually confirm this with some more profiling:
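Here’s a sketch of that version, splitting the write timing into inserts and updates and counting how many of each we perform:

```typescript
// Split the write timing into inserts (brand-new keys) and updates (keys
// we've written before), and count how many of each we do.
function nthNumberInsertVsUpdate(startingNumbers: number[], target: number): number {
  const spoken: Record<number, number> = {};
  for (let i = 0; i < startingNumbers.length - 1; i++) {
    spoken[startingNumbers[i]] = i;
  }

  let insertTime = 0n;
  let updateTime = 0n;
  let insertCount = 0;
  let updateCount = 0;
  let current = startingNumbers[startingNumbers.length - 1];

  for (let i = startingNumbers.length; i < target; i++) {
    const isInsert = !(current in spoken); // new key means insert, else update
    const next = isInsert ? 0 : i - 1 - spoken[current];

    const t = process.hrtime.bigint();
    spoken[current] = i - 1;
    const elapsed = process.hrtime.bigint() - t;

    if (isInsert) {
      insertTime += elapsed;
      insertCount++;
    } else {
      updateTime += elapsed;
      updateCount++;
    }
    current = next;
  }

  console.log(`inserts: ${insertCount} in ${Number(insertTime) / 1_000_000}ms`);
  console.log(`updates: ${updateCount} in ${Number(updateTime) / 1_000_000}ms`);
  return current;
}
```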

Now, instead of timing all of the write operations as one block, we measure the insert operations (i.e. the cases where we write to a brand-new key) separately from the updates to existing keys. We also need to track the number of inserts and updates we do, to make sure that if inserts are taking longer in total, it isn’t just because we did more of them.

Sure enough, our numbers confirm our theory:

And as a table:

We can actually watch the average insertion time grow steadily as our object grows. Interestingly, although we seem to consistently update existing values 10x more than we insert new ones, by the time we generate 200,000 values, we’re spending more time on inserts than on updates!

And notice how the average update time stays relatively constant? That’s V8’s property caching optimization in action: when the object doesn’t change “shape” (i.e. gain new keys), reads and writes are constant time.

So what’s our fix? Simple: swap the {} for a Map:
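A sketch of the Map-based version; it’s the same structure as before, just with `Map#get` and `Map#set`:

```typescript
// Same brute force, with the plain object swapped out for a Map.
function nthNumberWithMap(startingNumbers: number[], target: number): number {
  const spoken = new Map<number, number>();

  for (let i = 0; i < startingNumbers.length - 1; i++) {
    spoken.set(startingNumbers[i], i);
  }

  let current = startingNumbers[startingNumbers.length - 1];
  for (let i = startingNumbers.length; i < target; i++) {
    const lastIndex = spoken.get(current);
    const next = lastIndex === undefined ? 0 : i - 1 - lastIndex;
    spoken.set(current, i - 1);
    current = next;
  }
  return current;
}

// Part 2: the 30 millionth number in the sequence.
console.log(nthNumberWithMap([1, 2, 3], 30_000_000));
```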

This implementation performed much much better:

Now the 10x growth from 200,000 to 2,000,000 only took 7x longer, and the 10x growth from 2,000,000 to 20,000,000 took 15x longer. Both much more reasonable multipliers than what we were seeing before.

If we make a similar change to time the individual sections of the code, we get logs like this:

Add these new numbers to our table from before:

That performance on a Map is much, much better, and much more even. Notice how the average insertion time and the average update time barely differ on the Map, and stay flat no matter how large it gets? That’s the very definition of O(1) performance.

We can even go further with this — we could swap in a pre-sized array instead of a Map and get even faster performance. Leave a comment below if you know what this would look like :)
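Here’s my own guess at the rough shape of that change, if you want something to compare your comment against. It relies on the fact that every spoken number is either a starting number or a difference of two turn indices, so (as long as the starting numbers are smaller than the turn count) every value fits inside a typed array sized to the number of turns:

```typescript
// One possible array-based version: index directly by the spoken number.
function nthNumberWithArray(startingNumbers: number[], target: number): number {
  // 0 means "never spoken"; otherwise we store (last index + 1) to keep 0 free.
  const lastSpoken = new Int32Array(target);

  for (let i = 0; i < startingNumbers.length - 1; i++) {
    lastSpoken[startingNumbers[i]] = i + 1;
  }

  let current = startingNumbers[startingNumbers.length - 1];
  for (let i = startingNumbers.length; i < target; i++) {
    const prev = lastSpoken[current];
    const next = prev === 0 ? 0 : i - 1 - (prev - 1);
    lastSpoken[current] = i; // store (i - 1) + 1
    current = next;
  }
  return current;
}
```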

Anyways… what’s our takeaway here? What did we learn?

Number 1 is the most obvious: for lookup tables with large (or constantly growing) sets of keys, prefer Map over {}.

But it’s not that simple: the deeper takeaway is that it’s important to verify our assumptions. When I first wrote my function, the performance analysis suggested we had written Performant™️ code, but when we ran it, it choked. And despite the mental model I had around objects, one I think is pretty common among JS programmers, the authors of the JS implementation you’re using might make different tradeoffs that break that mental model.

In our case, the V8 runtime is optimized for the “many objects with the same keys” case instead of the “single object with many keys” case, and that tradeoff made our code much slower. So our second takeaway here is this: our mental models of the world aren’t always accurate, and it’s important to verify those models.

Third, everything is fast for small n. Our object-based algorithm ran just fine on the small case (computing ~2000 values) but fell apart on the larger case (~30 million values). And this was in a case where we were actively thinking about performance!

Have you run into cases like this before? Found any other accidentally n² algorithms? Let me know on twitter or in the comments below.

Originally published at https://dev.to on February 23, 2021.
