Prev: Unification Algorithm
Next: Software Configuration Management: Continuous Integration Widely Adopted
From: A.G.McDowell on 21 Nov 2009 13:46 In article <6a8340b8-19b0-4fda-96d5-e744aead1bd7(a)m26g2000yqb.googlegroup s.com>, Andrew Tomazos <andrew(a)tomazos.com> writes >I was posed the following question in a technical interview for a >Software Engineering position by a major multinational NASDAQ company: > >[Paraphrasing] "You are given an array of 1,000,000 32-bit integers. >One int value x occurs 500,001 times or more in the array. Specify an >algorithm to determine x." > >The assumption being extra points for efficiency. > >I initially came up with a linear space, linear time solution. With >some prompting and hints from the interviewer we worked our way to a >smaller constant space and linear time algorithm. That being >basically: > This sort of problem is covered by articles on Data Stream Processing in CACM Oct 2009. (CACM is a lot more interesting these days than it was some years ago). There are some very neat ideas in there, of which the algorithm "MAJORITY" matches the question reasonably well. Proving that it works under interview conditions would be extremely impressive, though. This is not the first time that I have heard of interview questions that discuss issues recently covered in the computing literature. I am unable to tell whether these come from a desire to know if the candidate keeps themselves abreast of the subject, or from the interviewer grasping the first thing that comes to hand when they are trying to think up a question. The few times that I have posed interview questions, I have tried to find evidence in the candidate of a knowledge of basic theory or mathematics that I could show was relevant to the job for which we were trying to recruit. -- A.G.McDowell
From: Bill Dubuque on 21 Nov 2009 16:33 cri(a)tiac.net (Richard Harter) wrote: >Andrew Tomazos <andrew(a)tomazos.com> wrote: >> >>I was posed the following question in a technical interview for a >>Software Engineering position by a major multinational NASDAQ company: >> >>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers. >>One int value x occurs 500,001 times or more in the array. Specify an >>algorithm to determine x." >> >>The assumption being extra points for efficiency. [snip code] The idea >>is to count the frequency of each of the 32 bits in the array separately, >>knowing that these bit counts will be dominated by the mode. [...] >> >>My question about this is as follows. I, perhaps boldly, claim to >>employers to have a "masters-equivilant" experience/knowledge of >>computer science and math. Should I have been able to come up with >>this solution without prompting or help? >>Would you expect someone with a CompSci Masters or PhD from some major >>ivy league university to be able to come up with this solution without >>help? If so in what course or from which book would you expect to learn >>the required knowledge to come up with the above solution? > > That's an interesting question with an interesting presupposition. > The first thing to understand is that this is a puzzle rather than > a programming problem. I disagree. Saying that it's a puzzle rather than a programming problem seems to imply that the solution is ad-hoc, rather than a special case of some more universal technique. But that is not true here. Namely, the proposed solution follows immediately from the obvious fact that majority elts on product sets are preserved by component projections. So the solution is simply an instance of well-known divide-and-conquer techniques for product objects. Such reductions are ubiquitous in both mathematics and computer science. So I would expect a good student to find this solution given enough time. I'd also expect a student from a top-tier school to discover the more optimal solution, viz. bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S) via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic again, assuming enough time. But that assumption is highly problematic when it comes to designing tests that quickly measure various skills. It's impossible to test such in the short time span of an interview. It remains difficult even in much longer timed tests. For example many of the most successful mathematicians did not perform well on the Putnam competition, and some of those who did well didn't turn out to be exceptional mathematicians. Thus there isn't necessarily much correlation between intelligence and random test scores. Marilyn vos Savant is a prime example. The woman who supposedly has the "world's highest IQ" published a Parade magazine column [1] and book [2] on Wiles proof of FLT. Her nonsensical arguments proved beyond a doubt that she has absolutely no clue about higher math and, worse, doesn't have the intelligence to know that (even after many experts pointed out gaping flaws in her very naive arguments). So much for IQ tests. --Bill Dubuque [1] sci.math, 11/29/1993, Full Text of Savant FLT Parade Article http://google.com/group/sci.math/msg/8cb44534c63372ad http://google.com/groups?selm=WGD.93Nov29055015%40martigny.ai.mit.edu [2] Boston, Nigel; Granville, Andrew. Review of: The World's Most Famous Math Problem (The Proof of Fermat's Last Theorem and Other Mathematical Mysteries) by Marilyn vos Savant The American Mathematical Monthly, 102, 5, 1995, 470-473 http://www.dms.umontreal.ca/~andrew/PDF/VS.pdf http://www.jstor.org/stable/2975048
From: Alf P. Steinbach on 21 Nov 2009 16:53 * Bill Dubuque: > > I disagree. Saying that it's a puzzle rather than a programming problem > seems to imply that the solution is ad-hoc, rather than a special case > of some more universal technique. But that is not true here. Namely, > the proposed solution follows immediately from the obvious fact that > majority elts on product sets are preserved by component projections. > So the solution is simply an instance of well-known divide-and-conquer > techniques for product objects. Such reductions are ubiquitous in both > mathematics and computer science. So I would expect a good student > to find this solution given enough time. I'd also expect a student > from a top-tier school to discover the more optimal solution, viz. > > bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S) > > via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic Hiya. Could you please explain the above more optimal solution. I'm unfamiliar with the notation and not from a top-tier school, but in my experience anything that's not nonsense can be visualized or explained in simple terms (e.g., Albert Einstein did that beautifully with his book on special relativity, which except for one little proof in an appendix -- I didn't yet know anything about solving sets of equations with multiple variables -- I found eminently grokkable as a teenager, so should also be possible for the above, yes?). Cheers & TIA., - Alf
From: Axel Vogt on 21 Nov 2009 17:07 Andrew Tomazos wrote: > I was posed the following question in a technical interview for a > Software Engineering position by a major multinational NASDAQ company: > > [Paraphrasing] "You are given an array of 1,000,000 32-bit integers. > One int value x occurs 500,001 times or more in the array. Specify an > algorithm to determine x." > > The assumption being extra points for efficiency. <snipped> Being a bit stupid I first would ask "Why? What do you do with it?" and then I would pick on random. I am almost certain, that even at a low number of draws the chance to get the very integer is higher than implementing an algo without coding errors.
From: Andrew Tomazos on 21 Nov 2009 18:45
On Nov 21, 10:53 pm, "Alf P. Steinbach" <al...(a)start.no> wrote: > * Bill Dubuque: > > > > > I disagree. Saying that it's a puzzle rather than a programming problem > > seems to imply that the solution is ad-hoc, rather than a special case > > of some more universal technique. But that is not true here. Namely, > > the proposed solution follows immediately from the obvious fact that > > majority elts on product sets are preserved by component projections. > > So the solution is simply an instance of well-known divide-and-conquer > > techniques for product objects. Such reductions are ubiquitous in both > > mathematics and computer science. So I would expect a good student > > to find this solution given enough time. I'd also expect a student > > from a top-tier school to discover the more optimal solution, viz. > > > bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S) > > > via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic > > Hiya. Could you please explain the above more optimal solution. Dubuque is referring to the solution that Woeginger more lucidly described above. Both it and the bit counting method are asymptotically equivalent solutions to the original problem. I'm sure either of these solutions provided on the spot would have received "full marks". I guess what I am curious about is exactly what percentage of, say... CS PhD students at tier one universities, would be able to come up with either of these solutions on the spot. 80%, 50% or 20% ? I guess only someone that has interviewed many people with these sort of problems has the necessary data to answer my question. -Andrew. |