This article will go through the Codility training exercise Dominator. Please refer to the link for a detailed exercise description – I cannot reproduce it do to copyright. The gist of the exercise is as follows:
Given an array \(A\) of \(N\) integers, a dominator of \(A\) is the element that appears more than half of the time. Given an array of integers \(A\), find its dominator and return any index of the dominator, or return \(-1\) if no dominator exists.
A naive algorithm could for example work like this:
- For every element \(x\) of the array, traverse the whole array and count the number of occurrences of \(x\). Keep the most occurring value at every iteration. At the end, check whether the most occurring item indeed occurs more than half of the time.
The above algorithm has time complexity \(O(N^2)\) and space complexity \(O(1)\).
We can do better, by for exampling using a hashmap to store the count of every integer. This would change the complexity to \(O(N)\) time and \(O(N)\) space, which is a marked improvement over the last algorithm. We can still do better though.
1 Optimal Solution
The problem is known in the literature as the majority vote problem. The optimal solution for the majority vote problem has been proposed by Bob Boyer and J Moore 1; you can view the solution here. The gist of the algorithm is as follows:
- We sweep through the sequence by keeping a pointer to an element. Start off pointing to no element.
- As we sweep, we maintain a current candidate, and a counter. The counter is initialized to zero, and the current candidate is initialized to the first element.
- For each element \(e\), we do the following:
- If the element is the current candidate, increment counter;
- Otherwise, if the counter is 0, set it to 1 and change current candidate to \(e\). If the counter isn’t 0, decrement counter.
Read and re-read the algorithm outline above to make sure you get the general idea.
I will try to use an example to illustrate how this works. Imagine you have three groups of people, and each group is identified by its t-shirt color: We have the red, green, and blue groups. Our task is to determine which group has the most people, using \(O(N)\) time and \(O(1)\) space. Consider this:
Take all members of all groups, and make them stand on the side of a large field. Then, take the first person, and tell them to go to the middle of the field. Then, for every next person \(p\):
- If there is no one in the middle, let \(p\) come into the middle.
- If \(p\) is on the same team as group of people in the middle, let \(p\) join the group in the middle.
- Otherwise, take \(p\) and one person from the group in the middle, and make them go to another corner of the field. We will not need them again.
At the end of this procedure, the group with the largest members of players is left standing in the middle. This works for any number of groups and people. I hope that this exemplified well how this algorithm works.
The implementation is very straightforward. As a final piece for this Codility challenge, we also need to check whether the returned element is actually a dominator element. If it’s not, then there is no dominator element and we can return -1. The Java code might look like this:
The above scores 100/100 on the Codility website.
As a visualization aid for this algorithm, here is my attempt and showing the progression of the algorithm on a certain array. It is my first time using D3.js, but nevertheless I hope the result might of use to some of you.
Max Element: ?
Count: 0
Moore’s first name really is the one character letter J↩