From: Ben Bacarisse on
gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:

> In comp.theory Ben Bacarisse <ben.usenet(a)bsb.me.uk> wrote:
> # gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:
> #
> # > There is an old analysis of this problem by Fischer and Salzberg.
> # > M.J. Fisher and S.L. Salzberg (1982)
> # > Finding a majority among n votes.
> # > Journal of Algorithms 3, pp 143-152.
> #
> # I can't track this down. Have I got the right Journal of Algorithms?
> #
> # http://www.informatik.uni-trier.de/~ley/db/journals/jal/jal3.html
> #
> # does not list that paper. Of course, this table of contents might be
> # wrong. The horrendous link:
> #
> # http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%236839%231982%23999969995%23518142%23FLA%23&_cdi=6839&_pubType=J&_auth=y&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2c8bc9013d6025d08b7d24d3207dfe18
> #
> # appears to be for the journal itself. The paper does not appear
> # there, either.
>
>
> I gave the wrong page numbers. It should be
> M.J. Fisher and S.L. Salzberg (1982)
> Finding a majority among n votes.
> Journal of Algorithms 3, pp 375-379
>
> And it is not an independent paper, but the discussion of
> problem 81-5 in the problems column of Leo Guibas.

Ah! I can't get access to the text (no library anymore) so I could
not find it. I thought it might be in the "Problems" column, but the
page numbers did stop me checking that.

[Rant: Is it reasonable to ask $20 for a PDF of a 27 year old paper?
It would cost me that to get to a library, so I suppose so. I'd be
happier if the authors, editors and reviewers got a cut of that, but I
know they won't.]

> I also messed up the statement of the main result of Fisher and Salzberg:
> The 3k-2 comparisons are for the problem where you have to decide
> whether there exists some majority element.

Ah (again). I now get it. Their algorithm has two phases. What you
describe is phase one. In the particular case where we know (from the
definition of the problem) that there *is* a majority we can stop
there. Because we don't need their phase two, there is no need for
the list they maintain in phase one.

I think their original problem is more interesting. Determining if
there is a majority at all (and what that majority element is) is more
subtle than the rather fake 500,001 of 1,000,000 element problem.

When you know a majority exists, you can find it in n-1 candidate
comparisons (that is, excluding the housekeeping tests like count ==
0). This is less than the 3n/2 - 2 (= 3k - 2 in the posted interview
problem) required if it is not known that there is a majority.

<snip>
--
Ben.
From: jbriggs444 on
On Nov 23, 5:27 am, James Kanze <james.ka...(a)gmail.com> wrote:
> On Nov 22, 4:51 pm, Chip Eastham <hardm...(a)gmail.com> wrote:
>
>
>
>
>
> > On Nov 21, 7:04 pm, Andrew Tomazos <and...(a)tomazos.com> wrote:
> > > On Nov 22, 1:01 am, Andrew Tomazos <and...(a)tomazos.com> wrote:
> > > > On Nov 21, 6:29 pm, gwo...(a)figipc78.tu-graz.ac.at (GJ Woeginger)
> > > > wrote:
> > > > > In comp.theory Andrew Tomazos <and...(a)tomazos.com> wrote:
> > > > > > I was posed the following question in a technical
> > > > > > interview for a Software Engineering position by a
> > > > > > major multinational NASDAQ company:
> > > > > > [Paraphrasing]  "You are given an array of 1,000,000
> > > > > > 32-bit integers.  One int value x occurs 500,001 times
> > > > > > or more in the array.  Specify an algorithm to
> > > > > > determine x."
> > > > > > The assumption being extra points for efficiency.
> > > > > There is an old analysis of this problem by Fischer and Salzberg.
> > > > >   M.J. Fisher and S.L. Salzberg  (1982)
> > > > >   Finding a majority among n votes.
> > > > >   Journal of Algorithms 3, pp 143-152.
> > > > > If 2k elements contain a majority element (= an element
> > > > > that occurs at least k+1 times), then you can find it
> > > > > with 3k-2 element comparisons (and some small overhead).
> > > > > The basic idea in their algorithm is that whenever you
> > > > > find two distinct elements, then you can destroy both
> > > > > without changing the majority element among the
> > > > > remaining elements.  This yields:
> > > > >  Run once through the array, and maintain a
> > > > >  majority-candidate and a counter.  The
> > > > >  majority-candidate is initialized as the first element,
> > > > >  and the counter (counting how often you have seen the
> > > > >  candidate) is initialized at 1.
> > > > >  If you hit the current candidate again, increment the counter.
> > > > >  If you hit another element, decrement the counter.
> > > > >  If the counter becomes 0, drop the current candidate and start from
> > > > >    scratch with the remaining part of the array.
> > > > > Fischer and Salzberg also show that this bound 3k-2 is
> > > > > best possible in the worst case (and that's the main
> > > > > part of their paper).
> > > > If I understand your description than it would look like:
> > > > int findmode(int* p, int n)
> > > > {
> > > >     int x = p[0];
> > > >     int c = 1;
>
> > > >     for (int i = 1; i < n; i++)
> > > >     {
> > > >        if (c == 0)
> > > >        {
> > > >           x = p[i];
> > > >           c = 1;
> > > >        }
> > > >        else if (p[i] == x)
> > > >            c++;
> > > >        else
> > > >            c--;
> > > >     }
> > > >     return x;
> > > > }
> > > > It seems that this could only produce at maximum (2k-1)
> > > > comparisions in the worst case, not (3k-2) as you claim?
> > > Oh wait... the (c==0) counts as a comparison.  I see it now.
> > > Ignore me.
> > I don't think the (c==0) counts as a comparison.
>
> Of course it does.  Why wouldn't it?

Because the statement in question was:

"[...] then you can find it with 3k-2 element comparisons"

Index comparisons and element comparisons are not at all the same
thing when one generalizes the problem away from arrays of 32 bit
values to arrays of arbitrary abstract values.
From: Bruce Wheeler on
On Mon, 23 Nov 2009 13:37:49 +0000, Ben Bacarisse
<ben.usenet(a)bsb.me.uk> wrote:

>gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:
>
>> In comp.theory Ben Bacarisse <ben.usenet(a)bsb.me.uk> wrote:
>> # gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:
>> #
>> # > There is an old analysis of this problem by Fischer and Salzberg.
>> # > M.J. Fisher and S.L. Salzberg (1982)
>> # > Finding a majority among n votes.
>> # > Journal of Algorithms 3, pp 143-152.
>> #
>> # I can't track this down. Have I got the right Journal of Algorithms?
>> #
>> # http://www.informatik.uni-trier.de/~ley/db/journals/jal/jal3.html
>> #
>> # does not list that paper. Of course, this table of contents might be
>> # wrong. The horrendous link:
>> #
>> # http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%236839%231982%23999969995%23518142%23FLA%23&_cdi=6839&_pubType=J&_auth=y&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2c8bc9013d6025d08b7d24d3207dfe18
>> #
>> # appears to be for the journal itself. The paper does not appear
>> # there, either.
>>
>>
>> I gave the wrong page numbers. It should be
>> M.J. Fisher and S.L. Salzberg (1982)
>> Finding a majority among n votes.
>> Journal of Algorithms 3, pp 375-379
>>
>> And it is not an independent paper, but the discussion of
>> problem 81-5 in the problems column of Leo Guibas.
>
>Ah! I can't get access to the text (no library anymore) so I could
>not find it. I thought it might be in the "Problems" column, but the
>page numbers did stop me checking that.
>
>[Rant: Is it reasonable to ask $20 for a PDF of a 27 year old paper?
>It would cost me that to get to a library, so I suppose so. I'd be
>happier if the authors, editors and reviewers got a cut of that, but I
>know they won't.]
>

Note: it is Fischer (sp) and Salzberg.

Doing a Google search on fischer salzberg majority produced the
following, a scanned version of the paper.

www.cs.yale.edu/publications/techreports/tr252.pdf

Regards,
Bruce Wheeler
From: Bruce Wheeler on
On Mon, 23 Nov 2009 17:20:15 +0100, Bruce Wheeler
<bswheeler39(a)hotmail.com> wrote:

>On Mon, 23 Nov 2009 13:37:49 +0000, Ben Bacarisse
><ben.usenet(a)bsb.me.uk> wrote:
>
>>gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:
>>
>>> In comp.theory Ben Bacarisse <ben.usenet(a)bsb.me.uk> wrote:
>>> # gwoegi(a)figipc78.tu-graz.ac.at (GJ Woeginger) writes:
>>> #
>>> # > There is an old analysis of this problem by Fischer and Salzberg.
>>> # > M.J. Fisher and S.L. Salzberg (1982)
>>> # > Finding a majority among n votes.
>>> # > Journal of Algorithms 3, pp 143-152.
>>> #
>>> # I can't track this down. Have I got the right Journal of Algorithms?
>>> #
>>> # http://www.informatik.uni-trier.de/~ley/db/journals/jal/jal3.html
>>> #
>>> # does not list that paper. Of course, this table of contents might be
>>> # wrong. The horrendous link:
>>> #
>>> # http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%236839%231982%23999969995%23518142%23FLA%23&_cdi=6839&_pubType=J&_auth=y&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2c8bc9013d6025d08b7d24d3207dfe18
>>> #
>>> # appears to be for the journal itself. The paper does not appear
>>> # there, either.
>>>
>>>
>>> I gave the wrong page numbers. It should be
>>> M.J. Fisher and S.L. Salzberg (1982)
>>> Finding a majority among n votes.
>>> Journal of Algorithms 3, pp 375-379
>>>
>>> And it is not an independent paper, but the discussion of
>>> problem 81-5 in the problems column of Leo Guibas.
>>
>>Ah! I can't get access to the text (no library anymore) so I could
>>not find it. I thought it might be in the "Problems" column, but the
>>page numbers did stop me checking that.
>>
>>[Rant: Is it reasonable to ask $20 for a PDF of a 27 year old paper?
>>It would cost me that to get to a library, so I suppose so. I'd be
>>happier if the authors, editors and reviewers got a cut of that, but I
>>know they won't.]
>>
>
>Note: it is Fischer (sp) and Salzberg.
>
>Doing a Google search on fischer salzberg majority produced the
>following, a scanned version of the paper.
>
>www.cs.yale.edu/publications/techreports/tr252.pdf
>
>Regards,
>Bruce Wheeler

I see that this is a reference to the technical report you (Ben
Bacarisse) mentioned a few messages before, so you can ignore my
previous post (but note the spelling).

Regards,
Bruce Wheeler
From: Paul E. Black on
On Saturday 21 November 2009 11:12, Andrew Tomazos wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>
> ... [PEB]
>
> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.

I often ask questions to understand an applicant's approach, not for
their specific knowledge. (I asked one candidate to write a
quicksort, and he replied he would look it up. I was not impressed.)

> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>
> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?
>
> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix
> and match the "tricks"?

I took an algorithms or analysis of algorithms class. We solved a ton
of problems like "design an O(n^2) algorithm for ..." (or n log n or
whatever). After a semester of this, we were pretty good at coming up
with stuff. Note: just "knowing" the complexity of a solution went a
long way as a hint to how it might be solved.

So yes, studying how to come up with algorithms should help people in
similar circumstances.

Did *I* solve the problem? hmm, uh, its been a coupla' decades, and
I'm, uh, kinda' busy ...

-paul-
--
Paul E. Black (p.black(a)acm.org)