You are given an array of 2n + 1 numbers. Out of these 2n + 1 numbers, 2n are duplicate. You are supposed to find the number which is not duplicate.
Clearly a naive brute-force algorithm is possible, which will scan through the list for each number and see whether its duplicate exists. This is a O(n2) algorithm and require only O(1) space. We can avoid O(n2) time complexity by using additional space in the form of a hash. Hence we will have O(n) time complexity and O(n) space complexity. However there exists a better approach -- one that has O(n) time complexity and O(1) space complexity.
Lets try to have an intuitive understanding of the problem. If we think deeply, O(n2) time complexity gives us lot of computing power and we can do lot more than this trivial stuff. After all we can sort the list in O(nlogn) and then do a linear pass over the list to discover the duplicate and this can be done in O(1) space complexity. (However this only makes sense if we don't mind the original array getting destroyed). Further we cannot go below O(n) as we have to examine each element at least once.
Let us try to concentrate on solving the problem in O(n). Clearly this implies that for each element, we can do some O(1) processing. Before we try to prove or disprove whether such a O(n) is possible or not, let us consider another problem given below-
Suppose we are given an array having 2n +1 elements such that among these 2n + 1 elements, we have n pairs of x and -x. Only one number y is there whose minus is not there. Find the number y.
Clearly for above problem, we can easily find the number by simply adding together all the numbers. Here when we add together all the elements the x and -x cancel out and what remains is the element y.
Now keeping above example in mind, we find that it is similar to our original problem. It differs only in the fact that we had to find a function f such that f(x,x) generates a number z suzh that for other some other number a f(z, a) = a. To put in more simple words, we want a function which when operated on two equal numbers reduces them to the identity element of the operation. Clearly one such operation is xor.
So the solution boils down, to
Just do the xor of all the elements in the array. Whatever remains is the number which has no duplicate.
Following is a python solution for the problem.
1 def find_non_duplicate(array):
2 result = array[0]
3 for i in array[1:]:
4 result ^= i
5 return result
And clearly a C version is not too difficult and it also follows:
1 int find_non_duplicate(int *array, int size)
2 {
3 int result = array[0];
4 int i;
5 for (i = 1; i < size; i++)
6 result ^= array[i];
7 return result;
8 }