How to? Peer-reviewed homebrew contest. Calling maths and CS


megaman

Any insight on how to hold a peer-reviewed homebrew beer contest?
Assume a single category: pale ale.
Assume 100 entries (hence 100 judges, so show that it can scale)

Obviously, the judges can't each drink 100 beers.

How to do it? How to score it?
How many beers are required for an entry?

The judges can't be expert judges - so ideally they would be given two beers at a time and asked which one is better.
 
This sounds like you want to do some kind of random lottery, like a Secret Santa system. Each judge is randomly assigned one or more entries (however many would work, up to # entrants - 1) to judge, none of which is their own beer.
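If you want to script that lottery, here is a rough Python sketch (the function name and the retry approach are mine, just for illustration): it reshuffles until no judge draws their own beer or a repeat.

```python
import random

def assign_entries(judges, per_judge=1, seed=None):
    """Randomly hand each judge `per_judge` entries, never their own.

    Assumes one entry per judge, with each entry identified by the
    judge's name. Rejection-samples shuffles until nobody draws their
    own beer or a beer they already hold; fine for small per_judge,
    but slow as per_judge approaches len(judges) - 1.
    """
    rng = random.Random(seed)
    assignments = {j: [] for j in judges}
    for _ in range(per_judge):
        while True:
            entries = judges[:]
            rng.shuffle(entries)
            if all(e != j and e not in assignments[j]
                   for j, e in zip(judges, entries)):
                break
        for j, e in zip(judges, entries):
            assignments[j].append(e)
    return assignments

print(assign_entries(["alice", "bob", "carol", "dave"], per_judge=2, seed=1))
```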


If you are looking at completely novice judges who are just picking their favorite of N beers, that's a bit tougher, because you would then want each entry to have enough independent judges to form a statistically valid result, and for that, sample size is pretty important.

Personally, I think sending 10+ beers for judging would suck.


I suppose you could pull something off with maybe 2-5 judges sampling your beer IF they were just doing it a la BJCP and grading against a system that is supposed to be consistent across all entries, even though in reality it doesn't quite work out that way. At least it would be better than "I like beer A better than beer B". For that you'd definitely need a lot of samples.
 
Independent judges wouldn't work. Maybe have a judges' contest on the side - which judge most accurately picked the winners (i.e., if you picked the sour pale ale over the eventual winner, you were a poor judge).

I am thinking of it in terms of chess scores - chess players are ranked, but I haven't played the best player. It's different, though, since chess games are all the same. More like a war, with each judge being a battlefield.
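For what it's worth, the chess analogy maps onto a rating system like Elo almost directly: treat every "which of these two is better?" verdict as one game between two beers. A minimal sketch (the 1500 starting rating and K = 32 are the usual chess defaults, an assumption here, not something from this thread):

```python
def elo_update(r_winner, r_loser, k=32):
    """One Elo update: the winner takes rating points from the loser."""
    expected = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1 - expected)
    return r_winner + delta, r_loser - delta

# Each pairwise verdict from a judge is one "game".
ratings = {b: 1500.0 for b in ["pale_1", "pale_2", "pale_3"]}
verdicts = [("pale_1", "pale_2"), ("pale_3", "pale_2"), ("pale_1", "pale_3")]
for winner, loser in verdicts:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser])
print(sorted(ratings.items(), key=lambda kv: -kv[1]))
```

The nice property for scaling is that no judge ever needs to taste more than a handful of pairs; the ratings just need each beer to appear in enough pairings to stabilize.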
 
Well, I think you could work out a system if each entrant sends in three bottles. That way each beer gets judged by three different people, and every judge tries three beers.

Randomly assign each judge a unique ID number and tie this number to the beer that they send in (i.e. Judge 1 = Beer 1). Rather than try to randomize the beers that they get, send beers 2, 3, and 4 to judge 1, beers 3, 4, and 5 to judge 2, beers 1, 2, 3 to judge 100, etc. This will ensure that everyone gets a beer that is different from what they sent in and avoids the hassle of trying to randomize more than you have to.
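That rotation is a one-liner to generate. A quick sketch (IDs 1-based as above; the function name is just for illustration):

```python
def rotation_assignments(n, per_judge=3):
    """Judge j tastes beers j+1 .. j+per_judge, wrapping past n back to 1,
    so nobody gets their own beer and every beer is tasted exactly
    per_judge times."""
    return {j: [((j - 1 + k) % n) + 1 for k in range(1, per_judge + 1)]
            for j in range(1, n + 1)}

flights = rotation_assignments(100)
print(flights[1])    # [2, 3, 4]
print(flights[100])  # [1, 2, 3]
```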

Develop a consistent score sheet (or just use the BJCP sheets...) and have each judge score the beers they receive. Send the score sheets in to the coordinator, make a spreadsheet, and take the average/SD for each beer.

If you want to be super fancy, you can throw out harsh and gentle judges by finding the mean and SD of the category subscores. If the mean rating a judge gives for a category is more than 2 SD away from the mean for all beers as a whole, throw out that judge's scores for the category and replace them with the mean value from the other two judges. Since each beer is judged three times, that should still leave you with two independent judges per category per beer if you have to throw a few out.
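Here is roughly what that trimming rule looks like in Python, assuming the coordinator enters each score sheet as a dict (the "beer"/"judge" keys and the data layout are my invention, just to make it concrete):

```python
from statistics import mean, stdev

def trim_and_score(sheets, categories):
    """sheets: dicts like {"beer": 7, "judge": 12, "aroma": 9, ...}.
    Pools all scores per category, flags any judge whose per-category
    mean sits more than 2 SD from the pooled mean, then scores each
    beer from the unflagged judges."""
    stats = {c: (mean([s[c] for s in sheets]), stdev([s[c] for s in sheets]))
             for c in categories}

    flagged = {c: set() for c in categories}
    for j in {s["judge"] for s in sheets}:
        mine = [s for s in sheets if s["judge"] == j]
        for c in categories:
            m, sd = stats[c]
            if abs(mean(s[c] for s in mine) - m) > 2 * sd:
                flagged[c].add(j)

    totals = {}
    for beer in {s["beer"] for s in sheets}:
        per_beer = [s for s in sheets if s["beer"] == beer]
        total = 0.0
        for c in categories:
            kept = [s[c] for s in per_beer if s["judge"] not in flagged[c]]
            # Fall back to all three sheets if every judge got flagged.
            total += mean(kept) if kept else mean(s[c] for s in per_beer)
        totals[beer] = total
    return totals
```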

This would probably be easier to analyze in a statistics program (STATA/SAS/SPSS/R), but with a bit of doing it could work in Excel. Hope this helps!

Edit: Made an example spreadsheet, but can't attach it. The spreadsheet is a fake example with 5 beers/judges; it will take averages, find the score, and flag any judge who is consistently more than 1 SD out of range on a subscore. It's definitely possible to extend this spreadsheet to 100 entries, but I don't have the time or programming skills to do it quickly. PM me if interested.
 
10 flights of 10 beers. Have 5 judges for each flight. The top two in each flight advance to the next round.

2 flights of 10 beers. Again, 5 judges per flight. The top 3 from each flight advance.

Finally, 6 beers are left to judge and place 1st-6th. Have actual BJCP judges judge this round. Out of 100 brewers, you should have at least 3 BJCP-trained judges, I hope. If not, have your 3 most experienced and most knowledgeable brewers judge and fill out a BJCP form for the 6 beers. The final judges should refrain from judging the preliminary rounds.

You should be able to do this with 2 bottles from each entrant.
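The bracket arithmetic checks out (100 -> 20 -> 6), and the flight logic is easy to script. A toy sketch, with a random number standing in for what a real five-judge panel would produce:

```python
import random

def run_flights(entries, flight_size, advance, panel_score):
    """Split entries into flights and advance the top scorers from each.
    panel_score is any callable returning a beer's score - here just a
    placeholder for an actual judging panel."""
    flights = [entries[i:i + flight_size]
               for i in range(0, len(entries), flight_size)]
    return [beer
            for flight in flights
            for beer in sorted(flight, key=panel_score, reverse=True)[:advance]]

random.seed(0)
beers = list(range(1, 101))
panel = lambda beer: random.random()          # stand-in scoring
round1 = run_flights(beers, 10, 2, panel)     # 100 entries -> 20
round2 = run_flights(round1, 10, 3, panel)    # 20 -> 6 finalists
print(round2)
```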

Good luck.
 
I think it's a stats problem, with each beer having an unknown prior score and each judge having an unknown false-positive probability.
 
"I think it's a stats problem, with each beer having an unknown prior score and each judge having an unknown false-positive probability."

I'm not sure what you mean by each judge having an unknown false-positive probability - since the beers are judged on a scale, one judge could be consistently high or low, which is why I would recommend tossing any judge who is more than 2 SD away from the mean in any category. "False positive" implies saying "yes" when the answer is "no".

There could be false positives in the detection of specific compounds, but I'm not sure that there is any simple way to correct for this unless you plan on using laboratory confirmation to detect the presence of diacetyl (as an example).

Each beer does have an unknown "true" value, but having three judges for each beer should allow you to estimate the true score by taking the mean of their scores, assuming consistent and unbiased judging. It's not the greatest assumption in the world, but the only way to avoid it is to have the same panel of judges judge all of the beers.
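A two-minute simulation shows both the strength and the weakness of that assumption (all numbers invented): with three judges whose biases roughly cancel, the mean lands near the true score; if all three lean the same way, it doesn't.

```python
import random

random.seed(0)
true_score = 38.0                 # hypothetical "true" quality of one beer
judge_bias = [-3.0, 0.5, 2.0]     # one harsh, one fair, one generous judge
noise_sd = 1.5                    # tasting-to-tasting randomness

observed = [true_score + b + random.gauss(0, noise_sd) for b in judge_bias]
estimate = sum(observed) / len(observed)
print([round(x, 1) for x in observed], "->", round(estimate, 1))
# The mean recovers the true score only when the biases average out,
# which is exactly the "consistent and unbiased judging" assumption.
```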
 