From: Alan Chalker on
"Alan Chalker" <alancNOSPAM(a)osc.edu> wrote in message <hs4710$4gk$1(a)fred.mathworks.com>...

>
> -There is no 1st place final entry according to the ranks. There is a 2nd place entry: #48, with a score of 166203. The 3rd entry of the actual contest (id 53) beat that score by a lot (119888 vs 166203), but ran in less than a second versus the 97 seconds it took for #48 to run. Thus I assume they were using a different test suite. I ran the entry against the sample suite and the results didn't match up either.
>

I just discovered that soon after the main contest started, Ned resubmitted an entry from the internal contest (#12) to the queue, which became entry #58. They manually removed it from view and from the stats, but it is in the data set I uploaded, and its score (33479.0) put it temporarily in the lead during darkness (at least for the first 40 or so entries).

IF YOU ARE DOING AN ANALYSIS OF THE DATASET, PLEASE ZERO OUT ENTRY #58! Otherwise it'll mess things up.
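For example, if you have loaded the data set into a struct array with id and score fields (the actual variable and field names in the upload may differ, so treat this only as a sketch):

% Hypothetical names -- adjust to match the data set you actually loaded
idx = [entries.id] == 58;          % locate the resubmitted internal entry
[entries(idx).score] = deal(0);    % zero its score ...
% entries(idx) = [];               % ... or drop it from the analysis entirely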

Ned: I did a search and couldn't find any other entries that you or the team snuck in during the main contest which might mess up stuff for us. Is that a correct assessment?
From: Helen Chen on
"Hannes Naudé" <naude.jj+matlab(a)gmail.com> wrote in message <hs3gd9$f7c$1(a)fred.mathworks.com>...
> Interestingly, if you create the links for earlier entries manually, e.g.
> http://www.mathworks.com/matlabcentral/contest/contests/2/submissions/1
> Then you get submissions from before the contest began. Looks like this was the internal Mathworks contest. Care to tell us who won that and what the best score was? :-)

Nice catch, Hannes! Yes, we run an internal version of each contest so that MathWorkers can test both our contest infrastructure and the game. That gives us the chance to tweak if necessary. This year, since the contest application is new, we ran not only a regular contest but also a contest for the most creative failures, to check the error trapping. As Alan notes, Teja (http://www.mathworks.com/matlabcentral/fileexchange/authors/66822) won the internal contest. He is an application engineer from Japan. :-)

Helen
From: Jan on
So I was able to upload the results of running the submissions against the test suite to the File Exchange as well: http://www.mathworks.com/matlabcentral/fileexchange/27554-matlab-sensor-contest-data-set-run-on-test-data

It uses a database similar to the one Alan uploaded, but it also contains the test-suite results so that they can be checked directly.

As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place during darkness and twilight, a mixture during daylight, and almost pure tweaking on the last day.

Cheers
Jan
From: Yi Cao on
"Jan" <thisis(a)notanemail.com> wrote in message <hsb2vn$hjs$1(a)fred.mathworks.com>...
> As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place during darkness and twilight, a mixture during daylight, and almost pure tweaking on the last day.

For any contest problem, no analytic solution exists. Therefore, no matter how good an algorithm is, tweaking is necessary. However, we wish the tweaking to be as general as possible. In computer modelling, we normally use three different data sets to ensure the genericity of the model being developed, i.e. training, validation, and testing sets. We can design a similar scheme to encourage code genericity:

1. The sample data set is transparent to all contestants, for them to develop algorithms and tune parameters.
2. The test data set is grey to contestants. Contestants do not know what the actual data set is, but they do know the results of their submitted code.
3. The generic data set is dark to contestants. All submissions will be tested on this data set. The results will be weighted by a coefficient and added to the final score used for the ranking. However, the actual results on this set and the coefficient will never be revealed. This 'dark' part of the total score makes probing the score coefficients, and hence any overfitting to the generic data set, impossible.

The final score should also include a term for weighted results on the sample data set. This will encourage contestants to test any tweaking on the sample data set before submission.

This scheme should encourage genericity in code development, and it may also discourage mass submissions.
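To make the idea concrete, here is a minimal sketch of such a combined score, assuming (purely for illustration) a linear weighting with made-up coefficients:

function total = combinedScore(sampleResult, testResult, genericResult)
% Illustrative weights only -- the real ones would be chosen by the admins,
% and the generic-set weight would never be published.
wSample  = 0.2;   % visible: rewards testing against the public sample set
wTest    = 0.6;   % visible: the usual contest score on the grey test set
wGeneric = 0.2;   % hidden: never revealed, so it cannot be probed
total = wSample*sampleResult + wTest*testResult + wGeneric*genericResult;
end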

Yi
From: srach on
"Yi Cao" <y.cao(a)cranfield.ac.uk> wrote in message <hsbbn4$n4a$1(a)fred.mathworks.com>...
> "Jan" <thisis(a)notanemail.com> wrote in message <hsb2vn$hjs$1(a)fred.mathworks.com>...
> > As already posted, Sergey Y. submitted some pretty good algorithms with his DeepCat9 series. My personal opinion is that generalization took place during darkness and twilight, a mixture during daylight, and almost pure tweaking on the last day.
>
> For any contest problem, no analytic solution exists. Therefore, no matter how good an algorithm is, tweaking is necessary. However, we wish the tweaking to be as general as possible. In computer modelling, we normally use three different data sets to ensure the genericity of the model being developed, i.e. training, validation, and testing sets. We can design a similar scheme to encourage code genericity:
>
> 1. The sample data set is transparent to all contestants, for them to develop algorithms and tune parameters.
> 2. The test data set is grey to contestants. Contestants do not know what the actual data set is, but they do know the results of their submitted code.
> 3. The generic data set is dark to contestants. All submissions will be tested on this data set. The results will be weighted by a coefficient and added to the final score used for the ranking. However, the actual results on this set and the coefficient will never be revealed. This 'dark' part of the total score makes probing the score coefficients, and hence any overfitting to the generic data set, impossible.
>
> The final score should also include a term for weighted results on the sample data set. This will encourage contestants to test any tweaking on the sample data set before submission.

Nice idea. How about leaving the "dark" data set entirely dark until the contest ends (i.e., no influence on the contest score via the coefficient) and then basing the grand prize entirely on the ranking with regard to the "dark" data set? (In other words: computing the final ranking on an entirely new test suite.)

This would remove any tweaking benefits from the grand prize, while leaving enough fun for tweakers during the mid-contest prizes.

If technically feasible, the contest machine could compute the results on the "grey" data set with priority and, whenever the queue is empty, compute the "dark" data set results.
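For illustration, a dispatcher loop along these lines could do it (all function and variable names here are hypothetical, just to show the idea):

while contestIsRunning()
    if ~isempty(greyQueue)
        entry = greyQueue(1);  greyQueue(1) = [];   % visible scoring has priority
        publishScore(entry, runSuite(entry, greySuite));
    elseif ~isempty(darkQueue)
        entry = darkQueue(1);  darkQueue(1) = [];   % use idle time for the dark suite
        storeHiddenScore(entry, runSuite(entry, darkSuite));
    else
        pause(1);                                   % both queues empty, wait a moment
    end
end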

Regards
srach