From: Andrew Tomazos on
On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...(a)yahoo.co.uk>
wrote:
> > int findmode(int* p, int n)
> > {
> >     int count[32];
> >     for(int i = 0; i < 32; i++)
> >         count[i] = 0;
>
> >     for (int i = 0; i < n; i++)
> >          for (int j = 0; j < 32; j++)
> >                if (p[i] & (1 << j)) // if bit j is on
> >                     count[j]++;
> >                else
> >                     count[j]--;
>
> >     int x = 0;
> >     for (int i = 0; i < 32; i++)
> >         if (count[i] > 0)
> >             x = x | (1 << i);
>
> >     return x;
> > }
>
> > The idea here is to count the frequency of each of the 32 bits in the
> > array separately, knowing that these bit counts will be dominated by
> > the mode.
>
> What solution? The above exhibits undefined behaviour.

If you spot a bug you might bother saying what it is. I wrote this
quickly and haven't tested it. Are you referring to the assumption
that int is 32 bit, or is there a typo or some other problem?
-Andrew.
From: Danio on
On Dec 20, 11:08 pm, Andrew Tomazos <and...(a)tomazos.com> wrote:
> On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...(a)yahoo.co.uk>
> wrote:
>
>
>
>
>
> > > int findmode(int* p, int n)
> > > {
> > >     int count[32];
> > >     for(int i = 0; i < 32; i++)
> > >         count[i] = 0;
>
> > >     for (int i = 0; i < n; i++)
> > >          for (int j = 0; j < 32; j++)
> > >                if (p[i] & (1 << j)) // if bit j is on
> > >                     count[j]++;
> > >                else
> > >                     count[j]--;
>
> > >     int x = 0;
> > >     for (int i = 0; i < 32; i++)
> > >         if (count[i] > 0)
> > >             x = x | (1 << i);
>
> > >     return x;
> > > }
>
> > > The idea here is to count the frequency of each of the 32 bits in the
> > > array separately, knowing that these bit counts will be dominated by
> > > the mode.
>
> > What solution? The above exhibits undefined behaviour.
>
> If you spot a bug you might bother saying what it is.  I wrote this
> quickly and haven't tested it.  Are you referring to the assumption
> that int is 32 bit, or is there a typo or some other problem?
>   -Andrew.

I tried this with:
* int p[9] = {4, 4, 4, 5, 6, 7, 5, 5, 4};
* int mode = findmode(p, 9);
and it returns 5.
The bit-counting method correctly returns 4.

It looks to me like you missed Phase 1 from the Fischer&Salzberg
algorithm, namely
arrange the elements so that no two adjacent elements are the same.
http://www.cs.yale.edu/publications/techreports/tr252.pdf
From: Danio on
On Dec 22, 6:23 pm, jbriggs444 <jbriggs...(a)gmail.com> wrote:
> On Dec 22, 11:16 am, Danio <daniothef...(a)hotmail.com> wrote:
> > On Dec 20, 11:08 pm, Andrew Tomazos <and...(a)tomazos.com> wrote:
> > > On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...(a)yahoo.co.uk>
> > > wrote:
> > > > What solution? The above exhibits undefined behaviour.
>
> > > If you spot a bug you might bother saying what it is. I wrote this
> > > quickly and haven't tested it. Are you referring to the assumption
> > > that int is 32 bit, or is there a typo or some other problem?
> > > -Andrew.
>
> > It looks to me like you missed Phase 1 from the Fischer&Salzberg
> > algorithm, namely
> > arrange the elements so that no two adjacent elements are the same.http://www.cs.yale.edu/publications/techreports/tr252.pdf-Hide quoted text -
>
> > - Show quoted text -
>
> I think you missed the definition of the problem in this thread. It
> is one of the givens of the problem that the array has a strict
> "majority" value that appears in over half ot its entries.

Sorry my mistake - strict majority truned into modal value in my head
in between the original description and the latter discussions in this
thread.

> You go on to claim that the code quoted above gives an answer of 5 but
> that the "bit-counting method" returns 4. What you fail to realize is
> that the code above _is_ the bit-counting method
> and that it returns 4, not 5.

Again a mis-reading of Phil Carmody's quote - assumed that he would be
quoting
the most recent code posting of Andrew's not the initial one.

> I actually ran the code. Did you?
Yes

> In summary, in spite of your accusation of ill-defined behavior, in
> spite of your providing the algorithm with non-compliant input and in
> spite of your claim of incorrect output, the fact of the matter is
> that the algorithm got what you consider to be the right answer.

Well it wasn't my accusation of ill-defined behaviour: that was Phil
Carmody.
I was merely trying to find out what he didn't like about the
algorithm,
unfortunately with a horribly flawed test case and the wrong
algorithm...
From: jbriggs444 on
On Dec 22, 11:16 am, Danio <daniothef...(a)hotmail.com> wrote:
> On Dec 20, 11:08 pm, Andrew Tomazos <and...(a)tomazos.com> wrote:
>
>
>
>
>
> > On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...(a)yahoo.co.uk>
> > wrote:
>
> > > > int findmode(int* p, int n)
> > > > {
> > > > int count[32];
> > > > for(int i = 0; i < 32; i++)
> > > > count[i] = 0;
>
> > > > for (int i = 0; i < n; i++)
> > > > for (int j = 0; j < 32; j++)
> > > > if (p[i] & (1 << j)) // if bit j is on
> > > > count[j]++;
> > > > else
> > > > count[j]--;
>
> > > > int x = 0;
> > > > for (int i = 0; i < 32; i++)
> > > > if (count[i] > 0)
> > > > x = x | (1 << i);
>
> > > > return x;
> > > > }
>
> > > > The idea here is to count the frequency of each of the 32 bits in the
> > > > array separately, knowing that these bit counts will be dominated by
> > > > the mode.
>
> > > What solution? The above exhibits undefined behaviour.
>
> > If you spot a bug you might bother saying what it is. I wrote this
> > quickly and haven't tested it. Are you referring to the assumption
> > that int is 32 bit, or is there a typo or some other problem?
> > -Andrew.
>
> I tried this with:
> * int p[9] = {4, 4, 4, 5, 6, 7, 5, 5, 4};
> * int mode = findmode(p, 9);
> and it returns 5.
> The bit-counting method correctly returns 4.
>
> It looks to me like you missed Phase 1 from the Fischer&Salzberg
> algorithm, namely
> arrange the elements so that no two adjacent elements are the same.http://www.cs.yale.edu/publications/techreports/tr252.pdf- Hide quoted text -
>
> - Show quoted text -

I think you missed the definition of the problem in this thread. It
is one of the givens of the problem that the array has a strict
"majority" value that appears in over half ot its entries.

You gave a 9 member array whose mode appears in less than half of the
elements. That violates the givens of the problem. The algorithm is
required to deliver correct results for input data that is within
spec. It is permitted to engage in undefined behavior for input data
that is out of spec.

However, unless you're going to snipe about implementation defined
precision for "int", I don't see any undefined behavior even for the
spec-violating input that you provided.

You go on to claim that the code quoted above gives an answer of 5 but
that the "bit-counting method" returns 4. What you fail to realize is
that the code above _is_ the bit-counting method
and that it returns 4, not 5.

I actually ran the code. Did you?

$ ./a.out
mode is 4

In summary, in spite of your accusation of ill-defined behavior, in
spite of your providing the algorithm with non-compliant input and in
spite of your claim of incorrect output, the fact of the matter is
that the algorithm got what you consider to be the right answer.

What were you complaining about again?
From: Ostap S. B. M. Bender Jr. on
On Nov 21, 8:12 am, Andrew Tomazos <and...(a)tomazos.com> wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing]  "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array.  Specify an
> algorithm to determine x."
>
> The assumption being extra points for efficiency.
>
> I initially came up with a linear space, linear time solution.  With
> some prompting and hints from the interviewer we worked our way to a
> smaller constant space and linear time algorithm.  That being
> basically:
>
> int findmode(int* p, int n)
> {
>     int count[32];
>     for(int i = 0; i < 32; i++)
>         count[i] = 0;
>
>     for (int i = 0; i < n; i++)
>          for (int j = 0; j < 32; j++)
>                if (p[i] & (1 << j)) // if bit j is on
>                     count[j]++;
>                else
>                     count[j]--;
>
>     int x = 0;
>     for (int i = 0; i < 32; i++)
>         if (count[i] > 0)
>             x = x | (1 << i);
>
>     return x;
>
> }
>
> The idea here is to count the frequency of each of the 32 bits in the
> array separately, knowing that these bit counts will be dominated by
> the mode.
>
> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.
>
> My question about this is as follows.  I, perhaps boldly, claim to
> employers to have a "masters-equivilant" experience/knowledge of
> computer science and math.  Should I have been able to come up with
> this solution without prompting or help?
>
> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>
> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?
>
> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix
> and match the "tricks"?  If so, what is a similar commonly known
> algorithm(s) on which the above solution could have been based?
>
> Or, is the ability to invent such a solution simply a matter of IQ?
> Some people have the talent to solve the puzzle, see the pattern and
> come up with the solution - and some just don't?
>

How badly would the following simple algorithm work:

Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")

Then go to the second column. Determine which digit occurs more - 0 or
1 - among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

..
..
..

Then go to the i'th column. Determine which digit occurs more - 0 or 1
- among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

At the end, return any integer not marked by "R".