statistics folly [Design]

From: Michael Robinson on 16 Jul 2010 14:02

Newsgroups: sci.math
From: gearhead <nos...(a)billburg.com>
Date: Fri, 16 Jul 2010 10:24:41 -0700 (PDT)
Local: Fri, Jul 16 2010 1:24 pm
Subject: stats/probability question

I'm an engineering undergrad in an intro stats course. We had a
question in the book that's really dumb.

problem as stated:

Your candidate has 55% of the votes in the entire school. But only
100 students will show up to vote. What is the probability that the
underdog (the one with 45% support) will win? To find out, set up a
simulation.
a) Describe how you will simulate a component and its outcomes.
b) Describe how you will simulate a trial.
c) Describe the response variable.

The answer in the back of the book says using a two digit random
number to determine each vote (00-54 for your candidate, 55-99 for the
underdog) you would run a string of trials with 100 votes to each
trial.
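
For reference, the answer-key procedure described above can be sketched in a few lines. This is only an illustration: the function names, the trial count and the seed are assumptions, not from the book, and a 50-50 tie is counted as no win for the underdog.

```python
import random

def simulate_trial(rng, n_votes=100):
    """One trial: draw n_votes two-digit numbers; 55-99 is a vote for the underdog."""
    underdog = sum(1 for _ in range(n_votes) if rng.randrange(100) >= 55)
    # Strict majority required; a tie does not count as an underdog win.
    return underdog > n_votes - underdog

def estimate_underdog_win(n_trials=5000, seed=1):
    """Fraction of simulated 100-vote elections that the underdog wins."""
    rng = random.Random(seed)
    wins = sum(simulate_trial(rng) for _ in range(n_trials))
    return wins / n_trials

if __name__ == "__main__":
    print(estimate_underdog_win())
```

The response variable here is whether the underdog wins a trial, not the underdog's vote share, so the estimate is the fraction of trials won.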

Now, this is one misconceived exercise. Let me explain why.

Say the school has 1000 students. If all of them show up, the
underdog has 0% chance of winning. If exactly one voter shows up,
underdog has 45% chance of winning. In an election where 100 voters
show up, underdog's chance of winning the election HAS to lie
somewhere between 0% and 45%. No ifs, ands or buts.
The probability of a win for underdog can never exceed 45%. When the
exercise asks "how often will the underdog win" I interpret that as
meaning what are his chances, i.e., the probability that he will win.
But if you run a simulation, you can get anything, including results
above 45%. I don't think simulating has any validity here, at least
not with the procedure suggested in the answer key. That is a lot of
simulating to do by hand, 100 per trial, but it is nowhere close to
even starting to answer the actual question. You would first of all
have to know the population of the school and then do some very
demanding simulations that would only be practical on a computer.

Practical considerations aside, the question is meaningless unless you
know something about the size of the school population.
Consider: if the total population is 108, the underdog cannot win,
because he only has 49 (48.6 rounded up) supporters total. Chance of
winning 0%. Period. "Underdog" has NO CHANCE of winning the
election. But if you run a simulation the way the book suggests, he's
going to win some. In fact he wins about half.
I'm saying the book is wrong.
Back to our school of 1000 students, out of whom 450 would vote for
"underdog." If only 100 students vote, what are his chances of
winning? Simulation will send you on the wrong track here unless
you're ready for some head scratching and a big grind on the computer. I'm
sure this problem has a neat theoretical solution.
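
The finite-school case described above (a fixed number of supporters, voters drawn without replacement) is a hypergeometric problem, and the sum can be evaluated directly rather than simulated. A minimal sketch follows, assuming the voters are a uniformly random subset of the school and that a tie is not a win; the function name and parameter names are made up for illustration.

```python
from math import comb

def underdog_win_probability(population, supporters, voters):
    """P(underdog gets a strict majority) when `voters` students are
    drawn without replacement from `population`, of whom `supporters`
    back the underdog (hypergeometric tail sum)."""
    need = voters // 2 + 1  # smallest winning number of underdog votes
    total = comb(population, voters)
    favourable = sum(
        comb(supporters, k) * comb(population - supporters, voters - k)
        for k in range(need, min(supporters, voters) + 1)
    )
    return favourable / total

if __name__ == "__main__":
    print(underdog_win_probability(108, 49, 100))    # school of 108
    print(underdog_win_probability(1000, 450, 100))  # school of 1000
```

With 108 students and 49 supporters the sum is empty, so the result is exactly zero, matching the argument above. For larger schools the value can be compared against the answer-key simulation.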

In class today I saw this problem and just was mystified until I worked out
the implications, and now it's clear that it's just incredibly stupid. How
would you convince the teacher of that? If I point out that it's
impossible to get any answer above 45%, she might say, well this isn't
theoretical, we're just running simulations, which is the whole point of
the game. To convince her I might have to work out the actual correct
simulation methodology, which is likely a very big headache and something I
don't have time for. So I may just let it slide and not even bring it up.
But I'm still interested in the theoretical solution, if anybody can cough
it up. It's a probability problem now, not empirical statistics.

---------------------------------------
Posted through http://www.Electronics-Related.com

From: Joel Koltner on 16 Jul 2010 14:18

They're really just wording the question kinda poorly (and they're also
assuming the student population is very, very large -- as you point out, if
there are only 100 kids at the school, you can come up with very definitive
answers). What they really mean is something like:

-- You're performing sampling where 45% of the time you get answer A (someone
votes for the underdog), and 55% of the time you get answer B (a vote for the
other guy). If you perform 100 random samples, what's the likelihood that
you'll get more than 50 'A' answers?

This is a standard statistics question, along the lines of, "If you roll a
fair dice 100 times, what's the likelihood you'll get '3' 20 or more times?"
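
Both questions above are binomial tail probabilities and can be computed exactly. A minimal sketch, assuming independent draws with a fixed probability (i.e. a very large school); the function name `binomial_tail` is made up for illustration.

```python
from math import comb

def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

if __name__ == "__main__":
    # More than 50 'A' answers out of 100 when each is 'A' with probability 0.45.
    print(binomial_tail(100, 0.45, 51))
    # A '3' on 20 or more of 100 rolls of a fair die.
    print(binomial_tail(100, 1 / 6, 20))
```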

Part of engineering is figuring out what your "customer" really wants when
their own description is kinda flaky. :-)

---Joel

From: Joerg on 16 Jul 2010 14:27

Joel Koltner wrote:
> They're really just wording the question kinda poorly (and they're also
> assuming the student population is very, very large -- as you point out,
> if there are only 100 kids at the school, you can come up with very
> definitive answers). What they really mean is something like:
>
> -- You're performing sampling where 45% of the time you get answer A
> (someone votes for the underdog), and 55% of the time you get answer B
> (a vote for the other guy). If you perform 100 random samples, what's
> the likelihood that you'll get more than 50 'A' answers?
>
> This is a standard statistics question, along the lines of, "If you roll
> a fair dice 100 times, what's the likelihood you'll get '3' 20 or more
> times?"
>

Quite common in medical: "This new procedure has a 15% success rate!"
(applause) ... "How many candidates were in the patient pool for the
study?" ... "Twenty" (silence)

> Part of engineering is figuring out what your "customer" really wants
> when their own description is kinda flaky. :-)
>

One question I always pondered is, why are they teaching this in
engineering school anyhow? When I started at university it was all
engineering stuff. Plus math, chemistry, mechanical engineering, but all
pretty well geared towards us becoming EEs some day.

--
Regards, Joerg

http://www.analogconsultants.com/

"gmail" domain blocked because of excessive spam.
Use another domain or send PM.

From: Michael Robinson on 16 Jul 2010 15:13

>They're really just wording the question kinda poorly (and they're also
>assuming the student population is very, very large -- as you point out, if
>there are only 100 kids at the school, you can come up with very definitive
>answers). What they really mean is something like:
>
>-- You're performing sampling where 45% of the time you get answer A (someone
>votes for the underdog), and 55% of the time you get answer B (a vote for the
>other guy). If you perform 100 random samples, what's the likelihood that
>you'll get more than 50 'A' answers?
>
>This is a standard statistics question, along the lines of, "If you roll a
>fair dice 100 times, what's the likelihood you'll get '3' 20 or more times?"
>
>Part of engineering is figuring out what your "customer" really wants when
>their own description is kinda flaky. :-)
>
>---Joel

If the school population is many, many magnitudes larger than the number of
voters, the chance that underdog will win just reduces to 45% (the same as
the underdog's chance of winning if only one student votes).
And in the case where the school population is relatively small, the
simulation methodology suggested is so bad it's not even wrong. Sampling
will always return about 45%, and we have seen that the chances of the
underdog winning can range as low as zero. The exercise is meaningless.
I think I should go for a walk.


From: George Herold on 16 Jul 2010 15:45

On Jul 16, 3:13 pm, "Michael Robinson"
<kellrobinson(a)n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote:
> >They're really just wording the question kinda poorly (and they're also
> >assuming the student population is very, very large -- as you point out, if
> >there are only 100 kids at the school, you can come up with very definitive
> >answers). What they really mean is something like:
>
> >-- You're performing sampling where 45% of the time you get answer A (someone
> >votes for the underdog), and 55% of the time you get answer B (a vote for the
> >other guy). If you perform 100 random samples, what's the likelihood that
> >you'll get more than 50 'A' answers?
>
> >This is a standard statistics question, along the lines of, "If you roll a
> >fair dice 100 times, what's the likelihood you'll get '3' 20 or more times?"
>
> >Part of engineering is figuring out what your "customer" really wants when
> >their own description is kinda flaky. :-)
>
> >---Joel
>
> If the school population is many, many magnitudes larger than the number of
> voters, the chance that underdog will win just reduces to 45% (the same as
> the underdog's chance of winning if only one student votes).
> And in the case where the school population is relatively small, the
> simulation methodology suggested is so bad it's not even wrong. Sampling
> will always return about 45%, and we have seen that the chances of the
> underdog winning can range as low as zero. The exercise is meaningless..
> I think I should go for a walk.

Hmm that's not right, as long as there are more than 100 students at
the school the answer should be the same. The standard deviation from
a sample goes as the square root of the number of samples. 100
samples means 10% is about the error. Since 10% is about what the 45%
person needs to win, I would guess that this happens about one standard
deviation of the time, about 13%. What is the answer in the book?

George H.
(Of course this is a 'back of the envelope' calculation and there may
be factors of 2 or pi floating around)
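
The back-of-the-envelope figure above can be checked with a normal approximation to the binomial vote count. This is a rough sketch under stated assumptions: votes are independent (a very large school), the count has mean n*p and standard deviation sqrt(n*p*(1-p)), and a continuity correction of one half is applied; the function names are made up for illustration.

```python
from math import erfc, sqrt

def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

def normal_approx_win(n=100, p=0.45):
    """Approximate P(underdog gets more than n/2 of n votes)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))   # about 5 votes for n=100, p=0.45
    threshold = n / 2 + 0.5      # continuity correction: "at least 51" -> 50.5
    return normal_upper_tail((threshold - mean) / sd)

if __name__ == "__main__":
    print(normal_approx_win())
```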
