Functional JavaScript: Five ways to calculate an average with array reduce

James Sinclair
·May 30, 2019

This article was originally published at https://jrsinclair.com/articles/2019/five-ways-to-average-with-js-reduce/

Array iteration methods are like a 'gateway drug'. They get many people hooked on functional programming. Because they're just so darn useful. And most of these array methods are fairly simple to understand. Methods like .map() and .filter() take just one callback argument, and do fairly simple things. But .reduce() seems to give people trouble. It’s a bit harder to grasp.

I wrote an earlier article on why I think reduce gives people so much trouble. Part of the reason is that many tutorials start out using reduce only with numbers. So I wrote about the many other things you can do with reduce that don't involve arithmetic. But what if you do need to work with numbers?

A common application for .reduce() is to calculate the average of an array. It doesn't seem so hard on the surface. But it’s a little tricky because you have to calculate two things before you can calculate the final answer:

  1. The total of the items, and
  2. The length of the array.

Both are pretty easy on their own. And calculating averages isn’t that hard for an array of numbers. Here's a simple solution:

function average(nums) {
    return nums.reduce((a, b) => (a + b)) / nums.length;
}
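
One caveat is worth a quick sketch. Calling .reduce() without an initial value throws a TypeError on an empty array. The variant below is only an illustration, and the name averageOrNaN is hypothetical. It passes 0 as the initial value, so an empty array produces 0 / 0 (which is NaN) rather than an exception.

```javascript
// Hypothetical variant of average() that tolerates an empty array.
function averageOrNaN(nums) {
    // The initial value 0 means reduce() never throws on an empty array.
    return nums.reduce((a, b) => a + b, 0) / nums.length;
}

console.log(averageOrNaN([1, 2, 3, 6])); // 3
console.log(averageOrNaN([]));           // NaN
```

Whether NaN, 0, or an error is the right answer for an empty array depends on the application.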

Not that complicated, is it? But it gets harder if you have a more complicated data structure. What if you have an array of objects? And you need to filter out some objects? And you need to extract some numeric value from the object? Calculating the average in that scenario gets a little bit harder.

To get a handle on it, we’ll solve a sample problem (inspired by this Free Code Camp challenge). But, we’ll solve it five different ways. Each one will have different pros and cons. The five approaches show how flexible JavaScript can be. And I hope they give you some ideas on how to use .reduce() for real-world coding tasks.

A sample problem

Let's suppose we have an array of, say, Victorian-era slang terms. We'd like to filter out the ones that don't occur in Google Books and get the average popularity score. Here's how the data might look:

const victorianSlang = [
    {
        term: 'doing the bear',
        found: true,
        popularity: 108,
    },
    {
        term: 'katterzem',
        found: false,
        popularity: null,
    },
    {
        term: 'bone shaker',
        found: true,
        popularity: 609,
    },
    {
        term: 'smothering a parrot',
        found: false,
        popularity: null,
    },
    {
        term: 'damfino',
        found: true,
        popularity: 232,
    },
    {
        term: 'rain napper',
        found: false,
        popularity: null,
    },
    {
        term: 'donkey’s breakfast',
        found: true,
        popularity: 787,
    },
    {
        term: 'rational costume',
        found: true,
        popularity: 513,
    },
    {
        term: 'mind the grease',
        found: true,
        popularity: 154,
    },

];

Data sourced from Google Ngram Viewer

So, let's try 5 different ways of finding that average popularity score…

1. Not using reduce at all (imperative loop)

For our first attempt, we won't use .reduce() at all. If you're new to array iterator methods, then hopefully this will make it a little clearer what's going on.

let popularitySum = 0;
let itemsFound = 0;
const len = victorianSlang.length;
let item = null;
for (let i = 0; i < len; i++) {
    item = victorianSlang[i];
    if (item.found) {
        popularitySum = item.popularity + popularitySum;
        itemsFound = itemsFound + 1;
    }
}
const averagePopularity = popularitySum / itemsFound;
console.log("Average popularity:", averagePopularity);

If you're familiar with JavaScript, this shouldn't be too difficult to understand:

  1. We initialise popularitySum and itemsFound. The first variable, popularitySum, keeps track of the total popularity score. While itemsFound (surprise, surprise) keeps track of the number of items we've found.
  2. Then we initialise len and item to help us as we go through the array.
  3. The for-loop increments i until we've been around len times.
  4. Inside the loop, we grab the item from the array that we want to look at, victorianSlang[i].
  5. Then we check if that item is in the books collection.
  6. If it is, then we grab the popularity score and add it to popularitySum
  7. And we also increment itemsFound
  8. Finally, we calculate the average by dividing popularitySum by itemsFound

Whew. It may not be pretty, but it gets the job done. Using array iterators could make it a bit clearer. Let's see if we can clean it up…
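
One small cleanup is possible even before reaching for array methods. The sketch below uses for...of, which drops the index bookkeeping but is still imperative. The array slangSample is a shortened stand-in for victorianSlang, and the variable names are illustrative only.

```javascript
// Shortened stand-in for the victorianSlang array.
const slangSample = [
    { term: 'doing the bear', found: true, popularity: 108 },
    { term: 'katterzem', found: false, popularity: null },
    { term: 'bone shaker', found: true, popularity: 609 },
];

let sampleSum = 0;
let sampleCount = 0;
for (const item of slangSample) {
    if (item.found) {
        sampleSum += item.popularity;
        sampleCount += 1;
    }
}
// If nothing was found, this would be 0 / 0, which is NaN.
const sampleAverage = sampleSum / sampleCount;
console.log("Average popularity:", sampleAverage); // 358.5
```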

2. Easy mode: Filter, map, and sum

For our second attempt, let's break this problem down into smaller parts. We want to:

  1. Find the items that are in the Google Books collection. For that, we can use .filter().
  2. Extract the popularity scores. We can use .map() for this.
  3. Calculate the sum of the scores. Our old friend .reduce() is a good candidate here.
  4. And finally, calculate the average.

Here's how that might look in code:

// Helper functions
// ----------------------------------------------------------------------------
function isFound(item) {
    return item.found;
}

function getPopularity(item) {
    return item.popularity;
}

function addScores(runningTotal, popularity) {
    return runningTotal + popularity;
}

// Calculations
// ----------------------------------------------------------------------------

// Filter out terms that weren't found in books.
const foundSlangTerms = victorianSlang.filter(isFound);

// Extract the popularity scores so we just have an array of numbers.
const popularityScores = foundSlangTerms.map(getPopularity);

// Add up all the scores to get a total. Note that the second parameter tells
// reduce to start the total at zero.
const scoresTotal = popularityScores.reduce(addScores, 0);

// Calculate the average and display.
const averagePopularity = scoresTotal / popularityScores.length;
console.log("Average popularity:", averagePopularity);

Pay special attention to our addScores function and the line where we call .reduce(). Note that addScores takes two parameters. The first, runningTotal, is known as an accumulator. It tracks the running total. Each time around the loop, whatever addScores returns becomes the new value of runningTotal. The second parameter, popularity, is the individual array item that we're processing. But the first time around the loop, addScores hasn't returned anything yet, so there's no value for runningTotal. So, when we call .reduce(), we give it an initial value to set runningTotal at the start. This is the second parameter we pass to .reduce().
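
To make the accumulator concrete, here is a small trace. It's a sketch only: addScoresVerbose is a hypothetical logging copy of addScores, and scores holds the six popularity values of the found terms from the sample data.

```javascript
// The popularity scores of the six terms that were found.
const scores = [108, 609, 232, 787, 513, 154];

// Same logic as addScores, with a log line to show each step.
function addScoresVerbose(runningTotal, popularity) {
    const newTotal = runningTotal + popularity;
    console.log(`${runningTotal} + ${popularity} = ${newTotal}`);
    return newTotal;
}

const total = scores.reduce(addScoresVerbose, 0);
// Logs:
// 0 + 108 = 108
// 108 + 609 = 717
// 717 + 232 = 949
// 949 + 787 = 1736
// 1736 + 513 = 2249
// 2249 + 154 = 2403
console.log(total / scores.length); // 400.5
```

The first log line shows the initial value 0 standing in for runningTotal. Every later line starts with the value returned by the previous call.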

So, we've applied array iteration methods to our problem. And this version is a lot cleaner. To put it another way, it's more declarative. We're not telling JavaScript how to run a loop and keep track of indexes. Instead, we define small, simple helper functions and combine them. The array methods, .filter(), .map() and .reduce(), do the heavy lifting for us. This way of doing things is more expressive. Those array methods tell us more about the intent of the code than a for-loop can.

3. Easy mode II: Multiple accumulator values

In the previous version, we created a bunch of intermediate variables: foundSlangTerms, popularityScores. For this problem, there's nothing wrong with that. But what if we set ourselves a challenge? It would be nice if we could use a fluent interface. That way, we could chain all the function calls together. No more intermediate variables. But there's a problem. Notice that we have to grab popularityScores.length. If we chain everything, then we need some other way to calculate that divisor. Let's see if we could change our approach so that we do it all with method chaining. We'll do it by keeping track of two values each time around the loop.

// Helper functions
// ---------------------------------------------------------------------------------
function isFound(item) {
    return item.found;
}

function getPopularity(item) {
    return item.popularity;
}

// We use an object to keep track of multiple values in a single return value.
function addScores({totalPopularity, itemCount}, popularity) {
    return {
        totalPopularity: totalPopularity + popularity,
        itemCount:       itemCount + 1,
    };
}

// Calculations
// ---------------------------------------------------------------------------------

const initialInfo    = {totalPopularity: 0, itemCount: 0};
const popularityInfo = victorianSlang.filter(isFound)
    .map(getPopularity)
    .reduce(addScores, initialInfo);

// Calculate the average and display.
const {totalPopularity, itemCount} = popularityInfo;
const averagePopularity = totalPopularity / itemCount;
console.log("Average popularity:", averagePopularity);

In this approach, we've used an object to keep track of two values in our reducer function. Each time around the loop in addScores(), we update both the total popularity and the count of items. But we combine them into a single object. That way we can cheat and keep track of two totals inside a single return value.

Our addScores() function is a little more complex. But, it means that we can now use a single chain to do all the array processing. We end up with a single result stored in popularityInfo. This makes our chain nice and simple.

If you're feeling sassy, you could remove a bunch of intermediate variables. With some adjustment of variable names, you might even be able to stick everything on a single line. But I leave that as an exercise for the reader.
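
If you'd like something to compare your answer with, here's one possible sketch. It isn't the only way to do it. The short array slang stands in for victorianSlang, and the helper name toAverage is an assumption made for illustration. The reducer builds the same {totalPopularity, itemCount} object as before, and toAverage turns it into a single number.

```javascript
// Shortened stand-in for the victorianSlang array.
const slang = [
    { term: 'doing the bear', found: true, popularity: 108 },
    { term: 'katterzem', found: false, popularity: null },
    { term: 'bone shaker', found: true, popularity: 609 },
    { term: 'damfino', found: true, popularity: 232 },
];

// Hypothetical helper: turn the accumulator object into an average.
const toAverage = ({ totalPopularity, itemCount }) => totalPopularity / itemCount;

const chainedAverage = toAverage(
    slang
        .filter(item => item.found)
        .map(item => item.popularity)
        .reduce(
            ({ totalPopularity, itemCount }, popularity) => ({
                totalPopularity: totalPopularity + popularity,
                itemCount: itemCount + 1,
            }),
            { totalPopularity: 0, itemCount: 0 }
        )
);
console.log("Average popularity:", chainedAverage);
```

The trade-off is readability. With everything inlined, there are no names left to explain what each step does.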

4. Point-free function composition

Note: Feel free to skip this section if you're new to functional programming or find it at all confusing. It will help if you're already familiar with curry() and compose(). If you'd like to learn more, check out 'A Gentle Introduction to Functional JavaScript'. See part three in particular.

We're functional programmers. That means we like to build our complicated functions out of small, simple functions. So far, at each step along the way, we've been reducing intermediate variables. As a result, our code has become simpler. But what if we took that to an extreme? What if we tried to get rid of all the intermediate variables? And even some parameters too?

It's possible to build our average-calculation function using only compose(), with no variables. We call this style 'point-free', or 'tacit' programming. But to make it work, we need a lot of helper functions.

Seeing JS code written this way sometimes freaks people out. This is because it's a really different way of thinking about JavaScript. But I have found that writing in point-free style is one of the fastest ways to learn what FP is about. So try it on a personal project, but perhaps not on code that other people will need to read.

So, on with building our average calculator. We'll switch to arrow functions here to save space. Ordinarily, using named functions would be better, because they provide better stack traces when something goes wrong. (Kyle Simpson has a very good discussion about this in his article I Don't Hate Arrow Functions.)

// Helpers
// ----------------------------------------------------------------------------
const filter  = p => a => a.filter(p);
const map     = f => a => a.map(f);
const prop    = k => x => x[k];
const reduce  = r => i => a => a.reduce(r, i);
const compose = (...fns) => (arg) => fns.reduceRight((arg, fn) => fn(arg), arg);

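// The blackbird combinator. (Strictly speaking, the shape f(g(x))(h(x))
// is more often called the Phoenix, or S', combinator.)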
// See: jrsinclair.com/articles/2019/compose-js-fu…
const B1 = f => g => h => x => f(g(x))(h(x));

// Calculations
// ----------------------------------------------------------------------------

// We'll create a sum function that adds all the items of an array together.
const sum = reduce((a, i) => a + i)(0);

// A function to get the length of an array.
const length = a => a.length;

// A function to divide one number by another.
const div = a => b => a / b;

// We use compose() to piece our function together using the small helpers.
// With compose() you read from the bottom up.
const calcPopularity = compose(
    B1(div)(sum)(length),
    map(prop('popularity')),
    filter(prop('found')),
);

const averagePopularity = calcPopularity(victorianSlang);
console.log("Average popularity:", averagePopularity);

Now, if none of the above code made any sense to you, don't worry about it. I've included it as an intellectual exercise, not to make you feel bad.

In this case, we do all the heavy lifting in compose(). Reading from bottom up, we start by filtering on the found property. Then we extract the popularity score with map(). And then we use the magical blackbird (B1) combinator to make two calculations for the same input. To explain what's going on, we'll spell it out a little more.

// All the lines below are equivalent:
const avg1 = B1(div)(sum)(length);
const avg2 = arr => div(sum(arr))(length(arr));
const avg3 = arr => ( sum(arr) / length(arr) );
const avg4 = arr => arr.reduce((a, x) => a + x, 0) / arr.length;

Again, don't worry if this doesn't make sense yet. It's just demonstrating that there's more than one way to write JavaScript. That's part of the beauty of the language.
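
If it helps, the first two stages of the pipeline can be unrolled by hand to see what each one produces. This is a sketch on a shortened stand-in array (terms), and the names foundTerms, foundScores and getScores are illustrative only.

```javascript
// Helpers as in the listing above.
const filter  = p => a => a.filter(p);
const map     = f => a => a.map(f);
const prop    = k => x => x[k];
const compose = (...fns) => (arg) => fns.reduceRight((acc, fn) => fn(acc), arg);

// Shortened stand-in for victorianSlang.
const terms = [
    { term: 'doing the bear', found: true, popularity: 108 },
    { term: 'katterzem', found: false, popularity: null },
    { term: 'bone shaker', found: true, popularity: 609 },
];

// Stage 1: keep only the found terms.
const foundTerms = filter(prop('found'))(terms);
// Stage 2: extract the scores.
const foundScores = map(prop('popularity'))(foundTerms);
console.log(foundScores); // [ 108, 609 ]

// compose() applies its functions right to left, so this is the same
// as running stage 1 and then stage 2.
const getScores = compose(map(prop('popularity')), filter(prop('found')));
console.log(getScores(terms)); // [ 108, 609 ]
```

The last stage, B1(div)(sum)(length), then takes that array of numbers and divides its sum by its length.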

5. Single pass with cumulative average calculation

All the solutions above work fine (including the imperative loop). The ones using .reduce() have something in common. They all work by breaking the problem down into smaller chunks. Then they piece those chunks together in different ways. But you'll notice that we traverse the array three times in those solutions. That feels inefficient. Wouldn't it be nice if there was a way we could process the array just once and pop an average out at the end? There's a way to do that, but it involves a little bit of mathematics.

To calculate the average in one pass, we need a new approach. We need to figure out a way to calculate a new average, given the old average and a new number. So let's do some algebra. To get the average of n numbers, we use this formula:

$$a_n = \frac{1}{n} \sum_{i=1}^{n} x_i$$

To get the average of n + 1 numbers we use the same formula, but with a different notation:

$$a_{n+1} = \frac{1}{n+1} \sum_{i=1}^{n+1} x_i$$

But that's the same as:

$$a_{n+1} = \frac{1}{n+1} x_{n+1} + \frac{1}{n+1} \sum_{i=1}^{n} x_i$$

And also the same as:

$$a_{n+1} = \frac{1}{n+1} x_{n+1} + \frac{n}{n+1} a_n$$

With a bit of rearranging, we get:

$$a_{n+1} = \frac{x_{n+1} + n a_n}{n+1}$$

Don’t worry if that didn’t make sense. The summary is, with this formula, we can keep a running average. So long as we know the previous average and the number of items, we can keep updating each time around the loop. And we can move most of the calculations inside our reducer function:

// Average function
// ----------------------------------------------------------------------------

function averageScores({avg, n}, slangTermInfo) {
    if (!slangTermInfo.found) {
        return {avg, n};
    }
    return {
        avg: (slangTermInfo.popularity + n * avg) / (n + 1),
        n:   n + 1,
    };
}

// Calculations
// ----------------------------------------------------------------------------

// Calculate the average and display.
const initialVals       = {avg: 0, n: 0};
const averagePopularity = victorianSlang.reduce(averageScores, initialVals).avg;
console.log("Average popularity:", averagePopularity);

This approach gets us the average in a single pass through the array. The other approaches use one pass to filter, another to extract, and yet another to add the total together. With this approach, we do it all in a single traversal.
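
To check that the running average really does land on the same answer, here's a small sketch that applies the same update rule as averageScores to a plain array of numbers. The names runningAverage, scoreList and steps are illustrative only.

```javascript
// Apply a_{n+1} = (x_{n+1} + n * a_n) / (n + 1) one number at a time.
function runningAverage({ avg, n }, x) {
    return { avg: (x + n * avg) / (n + 1), n: n + 1 };
}

// The popularity scores of the six found terms.
const scoreList = [108, 609, 232, 787, 513, 154];

// Record the average after each item so we can watch it change.
const steps = [];
const result = scoreList.reduce((acc, x) => {
    const next = runningAverage(acc, x);
    steps.push(next.avg);
    return next;
}, { avg: 0, n: 0 });

// Compare with the plain sum-then-divide average.
const direct = scoreList.reduce((a, b) => a + b, 0) / scoreList.length;

console.log(steps.slice(0, 2));  // [ 108, 358.5 ]
console.log(result.avg, direct); // these agree up to floating-point rounding
```

Because the running version divides at every step, tiny floating-point differences from the direct calculation are possible. Compare with a tolerance rather than with ===.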

Note that this doesn't necessarily make the calculation more efficient. We end up doing more calculations this way. For each found item, we multiply and divide to keep the running average up to date, instead of doing a single divide at the end. But it is more memory efficient. Since there are no intermediate arrays, we only ever store an object with two values.

But this memory efficiency has a cost. We’re now doing three things in one function. We’re filtering, extracting the number and (re)calculating the average all together. This makes that single function more complicated. It’s harder to see at a glance what’s going on.
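
If the extra multiply and divide per item is a concern, there's a middle ground: a single-pass reducer that tracks a total and a count, then divides once at the end. This is a variation rather than one of the five approaches. The name sumAndCount and the shortened sample data are assumptions made for illustration.

```javascript
// Shortened stand-in for victorianSlang.
const sampleTerms = [
    { term: 'doing the bear', found: true, popularity: 108 },
    { term: 'katterzem', found: false, popularity: null },
    { term: 'bone shaker', found: true, popularity: 609 },
    { term: 'damfino', found: true, popularity: 232 },
];

// Hypothetical reducer: one pass, no intermediate arrays.
function sumAndCount({ total, count }, item) {
    if (!item.found) {
        return { total, count };
    }
    return { total: total + item.popularity, count: count + 1 };
}

const { total, count } = sampleTerms.reduce(sumAndCount, { total: 0, count: 0 });
// A single division at the end.
const singlePassAverage = total / count;
console.log("Average popularity:", singlePassAverage);
```

It still does filtering, extracting and adding in one function, so the readability cost is much the same.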


So which of our five approaches is better? Well, it depends. Maybe you have really long arrays to process. Or maybe your code needs to run on hardware that doesn’t have much memory. In these cases, using the single-pass approach makes sense. But if performance isn’t a problem, then the more expressive approaches are fine. You need to decide what works best for your application and what is appropriate for your specific circumstances.

Now… some clever people might be wondering: Is there a way we could have the best of both worlds? Could we break down the problem into smaller parts, but still do it in a single pass? And there is a way to do that. It involves using something called a transducer. But that’s a whole other article and will have to wait for next time…

Conclusion

So, we've looked at five different ways of calculating an average:

  1. Not using reduce at all;
  2. Easy mode I: Filter, map, and sum;
  3. Easy mode II: Multiple accumulator values;
  4. Point-free function composition; and
  5. Single pass with a cumulative average calculation

Which one should you use? Well, that’s up to you. But if you’re looking for some guidance, then here’s my opinion on how to decide:

  • Start by using the approach you understand best. If that works for you, then stick with it.
  • If there’s another approach you don’t understand, but you want to learn, then give it a go.
  • And finally, if you run into memory problems, try the single-pass approach.

Thanks for reading this far. If you found any of this helpful (or if you found it confusing), I’d love to know. Send me a tweet. It would be great to hear from you. And if you haven't done so already, grab your copy of the Civilised Guide to JavaScript Array Methods. It will help you work out when .reduce() is the right method to use.