From: Alan Chalker on
"Hannes Naudé" <naude.jj+matlab(a)gmail.com> wrote in message <hs1m2r$4dl$1(a)fred.mathworks.com>...
> Alan, glad to see you're testing the placing of comments on the site. I take this to mean you are keen on the proposed idea. Are you going to go it alone or are you keen on volunteers to write modules for the system?
>
> P.S. Thanks for introducing me to cURL. Am playing with it now (which is how I noticed your test comments). Got going a lot quicker than I did with either WaTiN or System.Web. I'm guessing that's because it does not require me to learn a whole new language. ;-)
>
> Cheers
> H

Hannes: Yes I'm very keen and yes I'd like as much involvement from others as possible, both with regards to modules and suggestions.

Regarding cURL.. here are some tips:
- Make sure you get the SSL enabled version and have OpenSSL installed.
- The main command-line parameters I use are: -s -L -b cookies.txt -c cookies.txt -k
- Everything on the contest page is a 'post' style form, not a 'get' style form.
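
For anyone who wants to script this, here is a minimal sketch of how those flags fit together for a 'post' style form. It only builds the command line and does not run it. The URL and the form field names are made up for illustration; the real ones would have to be read from the contest page's HTML. The flags themselves are standard cURL: -s is silent mode, -L follows redirects, -b reads cookies from a file, -c writes them back, -k skips SSL certificate checks, and --data-urlencode sends a URL-encoded POST field.

```python
import shlex

# Flags quoted from the tips above.
BASE_FLAGS = ["-s", "-L", "-b", "cookies.txt", "-c", "cookies.txt", "-k"]


def build_curl_command(url, form_fields):
    """Return the argv list for a curl POST; nothing is executed here."""
    cmd = ["curl", *BASE_FLAGS]
    for name, value in form_fields.items():
        # Each --data-urlencode adds one URL-encoded field and makes curl POST.
        cmd += ["--data-urlencode", f"{name}={value}"]
    cmd.append(url)
    return cmd


# Hypothetical URL and field names, for illustration only.
example = build_curl_command(
    "https://example.com/contest/comment",
    {"entry_id": "12345", "comment": "test comment"},
)
print(shlex.join(example))
```

Reading and writing the same cookies.txt is what keeps the login session alive between calls.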
From: Alan Chalker on
"Helen Chen" <helenc(a)mathworks.com> wrote in message <hs1mhc$4e2$1(a)fred.mathworks.com>...

> Shari beat me to this one. Your file has been published.

Thanks Helen. Here's the direct link to it for everyone:
http://www.mathworks.com/matlabcentral/fileexchange/27530
From: Hannes Naudé on
Jan: "So, inspired by Alan's description I will try to run all passed algorithms over the weekend on the testset (3000-4000 entries times ~1 minute makes 2-3 days running time) using urlopen and some hand-crafted MATLAB script."

Alan: "This will be an interesting experiment. To help you, I've already downloaded all of the entries and data that goes with them and uploaded a compressed version along with a small 'helper' function to the file exchange. I'm not sure when it will get approved (Helen and team perhaps you can expedite it?), but you can check my profile to see if it has been. Alternatively you can directly email me and I'll send it to you."

Hannes: Also grabbing it now, thanks. I see there are already six downloads in the few minutes since it was released, so there's a fair bit of interest in this.

Alan:"1. The comment field is currently limited to 1000 characters. Is there any way the Contest team can increase that? If not, we won't be able to post the full code but could post other valuable info. However the system also allows for multiple comments, so I could break longer pieces into smaller chunks and post them."
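
To make the chunking option concrete, here is a rough sketch of splitting a long piece of text into comment-sized parts. The 1000-character limit comes from the quote above; the function name and the choice to break at line ends are my own assumptions, not anything the site or the bot actually does.

```python
def split_into_comments(text, limit=1000):
    """Split text into chunks of at most `limit` characters.

    Breaks at line ends where possible; a single line longer than the
    limit is cut mid-line.
    """
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # Oversized line: flush what we have, then cut the line itself.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk if this line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the chunks back together gives the original text, so nothing is lost, but a reader still has to paste several comments in the right order.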

Ooh bummer. I wouldn't be keen on breaking the annotated code into pieces, since that would make copy-pasting tricky. But maybe this is a blessing in disguise. Maybe the best solution is for the bot to just post a link to a site where all the annotated code appears. This opens up three possibilities.

Firstly, the annotated code could be dynamic. I've wondered before about how to handle the case where new comments are received after an annotated entry has already been posted. You could just place another comment, but we don't want to completely flood the comments either. With a link to an externally hosted page that problem goes away, since the link remains valid but the page changes.

Secondly, we now have the opportunity to have graphics associated with each entry. One example would be some of the scatterplots from the stats page, but with this specific entry shown in another colour, giving an immediate impression of where it sits on the result/runtime tradeoff for example. I believe the code used to generate the stats page is on the file exchange, so we won't be starting from scratch. Another example is contest-specific diagrams. For example in this contest, it might have been useful to see the solver's reconstruction of one of the smaller images from the sample testsuite. Personally I also modified my runcontest function to show me the blocks used to get to this reconstruction and a diff image showing where the largest residuals are. While the former is a little specific to the block-based approach, one could easily roll such code from competitors into a module during the contest and deploy it. This also makes the contest friendlier for spectators.

Lastly, it could allow users to choose which annotations they want/don't want and download a "customized" code version. As an example I mentioned that the annotated code could credit each line to its creator. While I'd like to have the option to see this, I don't want the code I paste into my editor to be littered with comments in this way.

Lastly (I'm serious this time) the site could have a facility for users to mark a specific submission as interesting, a hacking attempt, an obfuscated entry or anything else one might want to be alerted about.

Personally I'm interested in tracing the genealogy of functions in a little more detail than is currently possible and representing the submission cloud graphically with similar entries clustering together, hopefully allowing us to identify families of solvers. I'll be using the data you posted, so if you keep your internal data format the same this should be fairly simple to roll into a module.
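
As a starting point for the genealogy idea, here is a minimal sketch that guesses each entry's parent as the most similar earlier entry. It assumes the data can be read as a list of (entry_id, code) pairs in submission order, which is a guess and not necessarily the format of the posted dataset. Similarity is a plain line-by-line ratio from Python's difflib, chosen only because it is simple; a real module might want something smarter.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Line-based similarity between two source texts, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def likely_parents(entries):
    """Guess a parent for each entry.

    entries: list of (entry_id, code) in submission order (assumed format).
    Returns {entry_id: (parent_id, score)}; parent_id is None when no
    earlier entry shares any lines.
    """
    parents = {}
    for i, (entry_id, code) in enumerate(entries):
        best = (None, 0.0)
        for parent_id, parent_code in entries[:i]:
            score = similarity(code, parent_code)
            if score > best[1]:
                best = (parent_id, score)
        parents[entry_id] = best
    return parents
```

The same pairwise scores could feed a clustering step to pick out families of solvers. Comparing every entry with every earlier one is quadratic, so a few thousand entries would take a while.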

I'd like to encourage others who are looking at this to declare what they intend working on so we don't duplicate too much effort. Also, when you've got something working, please announce it here so we can all admire your handiwork.

H
From: Robert Macrae on
Alan

> I know people have varying views on my approaches to the contest.. but as I've said before I like to be involved in a certain way since I can't compete with the 'algorithm experts'. But I do always abide by the explicit rules the contest team puts in place.
....
> For the last day of the competition, I had planned on developing code to auto create new accounts and automatically switch to the new accounts when the 10 minute limit was reached, in order to be able to 'tweak bomb' during the final rush
....
> I ended up submitting about 185 entries in that 17 minute window

8-)

I've enjoyed watching (and reading about) your optimisation against the constraints, but I do think those constraints should be designed so the resultant delays for other competitors are not excessive. I agree with your comment on the importance of providing a challenge interesting to a wide range of Matlab users, and long delays don't help.

I would suggest a more explicit "Not sockpuppets" rule; also 10 submissions in 10 minutes / 20 in 60 minutes / 30 in 180 minutes, but even then only 2 people would be able to tie up 100% of the BW so I think we also need to modify the queue concept. I'd suggest that time priority is used, but that after executing each code all of that user's remaining queued programs are dropped to the back of the queue. The result would be to moderate the delays for occasional users even when there are several autosubmitters in play.
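
A small sketch of both suggestions, in case it helps the discussion. The limits and the drop-to-the-back rule are as described above; the function names, the use of minutes as the time unit and the (user, entry) tuples are assumptions made for illustration and say nothing about how the real contest queue is implemented.

```python
from collections import deque

# (max submissions, window in minutes), as proposed above.
LIMITS = [(10, 10), (20, 60), (30, 180)]


def may_submit(past_times, now):
    """True if one more submission at `now` stays within every limit.

    past_times: times (in minutes) of the user's earlier submissions.
    """
    for max_count, window in LIMITS:
        recent = [t for t in past_times if now - t < window]
        if len(recent) >= max_count:
            return False
    return True


def run_order(queue):
    """Return the execution order for (user, entry) pairs given in arrival order."""
    pending = deque(queue)
    order = []
    while pending:
        user, entry = pending.popleft()
        order.append((user, entry))
        # Drop this user's remaining entries to the back, keeping their order.
        others = [item for item in pending if item[0] != user]
        same = [item for item in pending if item[0] == user]
        pending = deque(others + same)
    return order
```

With arrivals A1, A2, A3, B1, C1 this runs A1, B1, C1, A2, A3, so B and C each wait for one run instead of three.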

Maybe also cut maximum runtime to 1 minute and reduce the test set size; there is nothing magic about 3 minutes. On which...


Sergey

> I suspect that currently tweaking results in overfitting of the most valuable examples at the expense of less valuable.

I also suspect this. The scoring is such that high-contrast images with large numbers of pixels dominate, making small and smooth images largely irrelevant (and I suspect the same applies to previous competitions). I'd suggest that the competition would be just as interesting with a smaller set of problems but more equal weighting on their importance to the final score.
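
To illustrate with toy numbers (not contest data): assume, purely for the sake of argument, that the score is the sum of absolute residuals over all pixels of all images. The exact contest formula is not restated here, so treat this as an assumption. The image sizes and residuals below are invented.

```python
# Toy numbers only: (name, pixel count, mean absolute residual per pixel).
images = [
    ("large high-contrast", 250_000, 20.0),
    ("small smooth", 2_500, 2.0),
]


def summed_shares(images):
    """Share of the total when the score sums residuals over all pixels."""
    totals = {name: pixels * err for name, pixels, err in images}
    grand = sum(totals.values())
    return {name: total / grand for name, total in totals.items()}


def equal_weight_score(images):
    """Alternative: average each image's per-pixel residual, one vote per image."""
    return sum(err for _, _, err in images) / len(images)


print(summed_shares(images))
print(equal_weight_score(images))
```

Under the summed score the small smooth image contributes under 0.1% of the total here, so improving it is almost invisible. Under the per-image average it carries one vote in two.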


Hannes

> In previous contests I developed a mechanism for copy protection that would harm performance (typically by dropping out of the local minimum) if the code was edited or even if it was resubmitted identically.

I stand in awe and amazement 8-)

And echo Alan's comment that it would be against the rules >-(


Yi Cao

> We have to consider a workable scheme to make future contests more general. Contesters against contesters is a good idea.

I agree.


Nathan

> I doubt if there is any practicable way the organisers can forcibly eliminate the various methods that many competitors dislike, even if they were willing to. However there is a community ethos here that's stronger than our individual competitive streaks. There's lots of evidence for this, notably the respect for the ban on wholesale probing, and the fact that code scrambling has not had much impact.

Hear, hear! I don't think the problem is in enforcement, but in making clear rules... as illustrated by


Alan

> What I did was essentially delay the queue by ~30 minutes (10 entries * 3 mins per entry). I see the viewpoint of some that this might be 'overwhelming the queue', but in my opinion it's not since it's no more of a delay

So a good area for a clearer rule?


Alan

> I think a more telling stat is that out of 151 accounts that submitted entries, 111 of them submitted less than 10 entries (I removed my alternate accounts from these totals).
....
> Why do more than 2/3rds of people just submit a couple times instead of being involved more?

I am included in that <10 entries category, but with many 10s of hours and around 100 hand-coded solutions I do feel quite involved! You cannot measure involvement by number of submissions.

> I think a good question to ask is why do we always see the same general set of names showing up as winners and what can be done to level the playing field for novices and experts alike so that the contest is more inclusive?

That is an easy one. To win you have to be extremely smart, and extremely motivated to do just fractionally better than the other 10s of extremely smart people who are looking at exactly the same problem. I remember The Cyclist's account from a past problem, hunting through the code for a function call he could tweak to shave a vital few seconds. This time we have Hannes with a lovely scheme for both orthogonalising the tweaking *and* masking his results, and of course both his and Alan's web wizardry. It takes a very particular set of skills and outlook to win. Most Matlab users (certainly I) don't have them.

Fortunately, to make the contest more inclusive, we don't need new winners -- just more interested participants. Set an interesting challenge and make it reasonably easy to understand how it is being solved and many people will continue to look in and have a go.


Amitabh

> At the end of Twilight the top 5-10 contestants be given the chance to form a team with each one as a mentor to their team. If someone has time issues, he/she could delegate a willing successor.

What a great idea. I do aim to comment my code, if only because I am easily confused, and am delighted to discuss it if anyone is interested -- I'll try to remember to add a header to that effect if I write anything next session.


Robert Macrae
From: Alan Chalker on
One thing I didn't do is capture all the existing submission comments in the entry dataset I uploaded. If anyone thinks those would be of value I'd be happy to rerun the collection.