Prefix Sum and Array Manipulation

#programming #dsa

In the HackerRank problem Array Manipulation, you’re presented with a challenge to perform a series of additions on a subsection of a 1-indexed array of zeroes. Each query specifies a starting index, an ending index, and the number to be added between those indices inclusively. For example, given an array of 7 zeroes, after a query of [2, 4, 3] the resulting state would be [0, 3, 3, 3, 0, 0, 0]. After another query of [4, 6, 2], the state would be [0, 3, 3, 5, 2, 2, 0].

The naive brute force approach is to maintain this array and loop over each range of cells, adding the specified number. Then you can determine the max value from the resulting array. However, this approach has a time complexity of O(n^2), which is not ideal given the constraints on the challenge.

An interesting solution to this is to leverage a rather simple prefix sum algorithm.

Prefix Sum

You might look at the prefix sum algorithm and think, “Is this really an algorithm?” It seems pretty obvious as an algorithm however combined with this sliding window technique as you’ll see in the next section, it’s actually a very powerful concept that verges into dynamic programming territory.

To perform a prefix sum over an array of numbers, simply initialize an array of the same length as the input array. Rather than iterate over the array for each sum (i.e. sum[i] = input[0] + input[1] + ... + input[i], the output value at each index is just the sum of the value at the input index and the value of the previous sum (sum[i] = sum[i - 1] + input[i]).

function prefixSum(input) {
  output = [];
  for (let i = 0; i < input.length; i++) {
    const prevSum = i > 0 ? output[i] : 0;
    output[i] = prevSum + input[i];
  }
  return output;
}

Optimization

Keeping prefix sum in mind, now instead of looping through the range between the start and end indices, let’s simply place modifications at two points in the array for each query, one at the start of the range, and one immediately following the end of the range. Why would we do that? Let’s look at the array again and take note of the changes.

[0, 0, 0, 0, 0, 0, 0]
   +3       -3
[0, 3, 3, 5, 2, 2, 0]
         +2       -2

If we can use the queries in a way that only gives us the deltas, then we only have two operations per query, and then a final prefix sum pass through the array of changes. You might be thinking, this doesn’t seem like it saves a ton of time, but imagine an array of a couple of million numbers and each query operates on hundred thousand number ranges. Reducing those passes down to two operations gives us a time complexity of O(n + k), k being the number of queries.