I’ve been working on adding word search capabilities to the AJAX Bible, and in the process I had to come up with a way to find the shared values between two sets, or in my case with Javascript, two arrays. There aren’t many algorithms published online that I could find, so here is the most efficient method I found.
(You can view an example of the set compare algorithm here. View source to see the javascript.)
/// <summary>
/// Takes two javascript arrays and returns an array
/// containing a set of values shared by arrays.
/// </summary>
/// <param name="x" type="array" />
/// <param name="y" type="array" />
function ReturnSharedSet(x, y) {
// declare iterator
var i = 0;
// declare terminator
var t = (x.length < y.length) ? x.length : y.length
// sort the arrays
x.sort();
y.sort();
// in this loop, we remove from the arrays, the
// values that aren't shared between them.
while ( i < t) {
if (x[i]== y[i]) {
i++;
}
if (x[i]< y[i]) {
x.splice(i, 1);
}
if (x[i]> y[i]) {
y.splice(i, 1);
}
t = (x.length < y.length) ? x.length : y.length;
// we have to make sure to remove any extra values
// at the end of an array when we reach the end of
// the other.
if (t == i && t < x.length) { x.splice(i);}
if (t == i && t < y.length) { y.splice(i);}
}
// we could return y, because at this time, both arrays
// are identical.
return x;
}
The basic idea is to take two arrays and remove any values that aren’t shared by the other. I sort the arrays first, then iterate through them. If x > y, then I remove that value from y. If y > x, then I remove that value from x. When I get to the bottom of the shortes array, I chop the rest of the values from the longer array, and return one of the arrays as my result (it doesn’t matter which, because at this point, they are both the same).
Note: It works with strings and numbers. The javascript sort function can be customized, and javascript comparison operators work with both strings and numbers. If you impliment it in another language, make sure to be aware of how the comparison operators work, i.e. "a" may not be < "b" in every language.