From: Virgil on 22 Nov 2009 01:38

> >>> I was posed the following question in a technical interview for a
> >>> Software Engineering position by a major multinational NASDAQ company:
> >>>
> >>> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> >>> One int value x occurs 500,001 times or more in the array. Specify an
> >>> algorithm to determine x."

Find, to the nearest integer, the average value.
From: Willem on 22 Nov 2009 03:33

Virgil wrote:
)> >>> I was posed the following question in a technical interview for a
)> >>> Software Engineering position by a major multinational NASDAQ company:
)> >>>
)> >>> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
)> >>> One int value x occurs 500,001 times or more in the array. Specify an
)> >>> algorithm to determine x."
)
) Find, to the nearest integer, the average value.

Doesn't work.  Consider the case of 500,001 times 0 and 499,999 times
2^31-1.  Perhaps you are confused with 'find the median value'?

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
 made in the above text. For all I know I might be drugged or
 something..
 No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
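[A minimal C sketch of Willem's counterexample, scaled down from 1,000,000
elements to 11 so it can actually be run; the scaled-down array below is an
assumption, not from the thread.  The average lands on a value that does not
even occur in the array, while the middle element of the sorted array (the
median) is the majority value.]

    /* Scaled-down illustration: 6 copies of 0 (the majority), 5 copies
       of INT_MAX. */
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    int main(void)
    {
        int a[11] = { 0, 0, 0, 0, 0, 0,
                      INT_MAX, INT_MAX, INT_MAX, INT_MAX, INT_MAX };
        size_t n = sizeof a / sizeof a[0];
        double sum = 0.0;
        size_t i;

        for (i = 0; i < n; i++)
            sum += a[i];

        qsort(a, n, sizeof a[0], cmp_int);

        /* Average is roughly 976 million: neither 0 nor INT_MAX. */
        printf("average (rounded) = %.0f\n", sum / n);
        /* The median is 0, the true majority value. */
        printf("median            = %d\n", a[n / 2]);
        return 0;
    }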
From: Richard Heathfield on 22 Nov 2009 03:47

In <7mr6n5F3j1be1U1(a)mid.individual.net>, Axel Vogt wrote:

> Andrew Tomazos wrote:
>> I was posed the following question in a technical interview for a
>> Software Engineering position by a major multinational NASDAQ
>> company:
>>
>> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>> One int value x occurs 500,001 times or more in the array. Specify
>> an algorithm to determine x."
>>
>> The assumption being extra points for efficiency.
> <snipped>
>
> Being a bit stupid I first would ask "Why? What do you do with it?"
> and then I would pick on random. I am almost certain, that even at
> a low number of draws the chance to get the very integer is higher
> than implementing an algo without coding errors.

I thought of that too, but quickly saw the flaw - if for large N you
have (N/2)+1 occurrences of X, and (N/2)-1 occurrences of Y, you can't
settle the matter (is it X, or is it Y?) satisfactorily without reading
every element.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
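[A hedged C sketch of the pick-at-random idea together with the verification
pass that Richard's objection makes necessary; the function name and test
array are invented for illustration, not taken from the thread.  A randomly
chosen element is the majority with probability greater than 1/2, so on
average fewer than two candidates are tried, but each try still has to read
every element to be certain.]

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Returns the majority value of a[0..n-1], assuming one exists.
       Each random candidate is the majority with probability > 1/2, but
       confirming it requires a full counting pass over the array. */
    static int majority_by_sampling(const int *a, size_t n)
    {
        for (;;) {
            int candidate = a[(size_t)rand() % n];
            size_t count = 0, i;

            for (i = 0; i < n; i++)
                if (a[i] == candidate)
                    count++;

            if (count > n / 2)
                return candidate;
        }
    }

    int main(void)
    {
        int a[] = { 7, 3, 7, 7, 1, 7, 7 };
        srand((unsigned)time(NULL));
        printf("majority = %d\n",
               majority_by_sampling(a, sizeof a / sizeof a[0]));
        return 0;
    }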
From: Willem on 22 Nov 2009 03:50

Andrew Tomazos wrote:
) Dubuque is referring to the solution that Woeginger more lucidly
) described above.  Both it and the bit counting method are
) asymptotically equivalent solutions to the original problem.  I'm sure
) either of these solutions provided on the spot would have received
) "full marks".

I'm not so sure.  I wouldn't be surprised if you only get "full marks"
for the solution that the interviewer has in mind, even if there are
other solutions that are equivalent or better.

NB: The counting method is actually better, because you don't need
O(k * log n) space (where k is the size of an individual key).
Or, practically speaking, it also works if the items are strings.

A very simple proof of the 'other' method would be that whenever c
reaches 0, the part of the array you processed can't have more than
half of the 'majority' item.  (Exactly half of it is one single item,
and the other half is not that single item.)  So the remaining part
*must* have more than half, i.e. the problem is reduced to the same
problem for the remaining part.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
 made in the above text. For all I know I might be drugged or
 something..
 No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
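[For reference, a sketch of the candidate-and-counter method whose
correctness Willem argues above (often attributed to Boyer and Moore);
the variable names are mine, not from the thread.  Whenever c drops to 0,
the prefix scanned so far contains at most half occurrences of the majority
value, so the scan effectively restarts on the remaining suffix.]

    #include <stdio.h>

    static int majority_vote(const int *a, size_t n)
    {
        int candidate = 0;
        size_t c = 0, i;

        for (i = 0; i < n; i++) {
            if (c == 0) {            /* restart on the remaining suffix */
                candidate = a[i];
                c = 1;
            } else if (a[i] == candidate) {
                c++;
            } else {
                c--;
            }
        }
        /* If some value occurs more than n/2 times, it is 'candidate'
           here.  If a majority is not guaranteed, a second counting
           pass is needed to confirm it. */
        return candidate;
    }

    int main(void)
    {
        int a[] = { 4, 9, 4, 4, 2, 4, 4 };
        printf("majority = %d\n",
               majority_vote(a, sizeof a / sizeof a[0]));
        return 0;
    }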
From: Moi on 22 Nov 2009 06:32
On Sun, 22 Nov 2009 08:47:19 +0000, Richard Heathfield wrote:

> In <7mr6n5F3j1be1U1(a)mid.individual.net>, Axel Vogt wrote:
>
>> Andrew Tomazos wrote:
>>> I was posed the following question in a technical interview for a
>>> Software Engineering position by a major multinational NASDAQ company:
>>>
>>> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>>> One int value x occurs 500,001 times or more in the array. Specify an
>>> algorithm to determine x."
>>>
>>> The assumption being extra points for efficiency.
>> <snipped>
>>
>> Being a bit stupid I first would ask "Why? What do you do with it?" and
>> then I would pick on random. I am almost certain, that even at a low
>> number of draws the chance to get the very integer is higher than
>> implementing an algo without coding errors.
>
> I thought of that too, but quickly saw the flaw - if for large N you
> have (N/2)+1 occurrences of X, and (N/2)-1 occurrences of Y, you can't
> settle the matter (is it X, or is it Y?) satisfactorily without reading
> every element.

With a kind of weight-balanced tree ( := rotate in such a way that the
node with the higher reference count is closer to the root) you can
terminate reading and inserting the values once there is a clear
"winner".

I think something similar can be done for the bit-counting method:
if all bins have a value with (abs(value) > number_of_remaining_items)
you can terminate the inspection / insertion.

AvK
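[A possible reading of Moi's early-exit idea for the bit-counting method,
reconstructed as C; this is an assumption about what the 'bins' are, not
code from the thread.  One signed counter per bit position: +1 when the bit
is set, -1 when it is clear.  Because the majority value fills more than
half the array, the sign of each final counter gives the corresponding bit
of x, and once every |counter| exceeds the number of unread elements no
sign can change, so scanning may stop early.]

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static uint32_t majority_by_bits(const uint32_t *a, size_t n)
    {
        long count[32] = { 0 };
        size_t i;
        int b;

        for (i = 0; i < n; i++) {
            long remaining = (long)(n - i - 1);
            int all_decided = 1;

            for (b = 0; b < 32; b++) {
                count[b] += (a[i] >> b & 1u) ? 1 : -1;
                if (labs(count[b]) <= remaining)
                    all_decided = 0;
            }
            if (all_decided)
                break;          /* no counter can change sign: stop early */
        }

        {
            uint32_t x = 0;
            for (b = 0; b < 32; b++)
                if (count[b] > 0)
                    x |= 1u << b;
            return x;
        }
    }

    int main(void)
    {
        uint32_t a[] = { 5, 9, 5, 5, 7, 5, 5 };
        printf("majority = %u\n",
               (unsigned)majority_by_bits(a, sizeof a / sizeof a[0]));
        return 0;
    }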