I believe the p value should be discarded in these types of trials.
Sadly, there wasn't enough "blindness" in the testing process, and there were other confounding variables (I kegged; the other brewer bottle-conditioned), etc. So I cannot claim it is truly scientific, not anywhere near the level of Brulosophy.
But I tasted both beers and definitely perceived a difference lol...
Marshall is a dear friend and I respect and appreciate what he's done. But too many homebrewers take it as the last word, rather than a single data point. The key to science is repeatability. Someone does an experiment, then others do it to verify the results. If there's only one trial, then you can't really draw a conclusion. At Experimental Brewing, we try to get around that with multiple testers and a lot more tasters, but that has its own problems. In short, look at these experiments as a starting point for your own exploration. Trying to convince another brewer, whether homebrewer or commercial, that they're the last word is not only misleading, it's not how any of us intend the experiments to be used.
I think we do agree about qualifying the panel in most cases.
The thing that you don't seem able to grasp is that if you are trying to see how a proposed change in your beer will affect its sales, and assay to do that with a taste panel, then that panel had better reflect your market.
I will also point out, again, that noise is inevitable - even with a 'qualified' panel - and that the power of a triangle (or quadrangle or....) test is that it improves the signal-to-noise ratio. See my post on ROCs.
Because I did a similar experiment. I brewed 15 gallons of an IPA. I kept 10 gallons for myself, fermented in a temperature-controlled chamber. I gave 5 gallons to a fellow homebrew club member, which he fermented without any temperature control. We then presented the two beers to our homebrew club at a meeting, and had them evaluate them to BJCP guidelines.
The temp-controlled beer had an average score 11 points higher than the non-controlled beer.
Not only is that not clear from the way they do it, you point out one of the elemental difficulties with having a one-shot guess "qualify" tasters for the preference test.
Show me you can pick the odd-one-out three or more times in a row, and I'll believe you can detect a difference....and you are qualified to go to the next level.
Guessers cannot tell the difference; why would anyone want them judging preference, and guessing on that too?
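To put a number on that qualification idea: under a simple detect-or-guess model (an assumption for illustration, as is the 80% detection rate below), the chance of a pure guesser surviving several triangles in a row falls off fast, while most genuine discriminators survive. A quick sketch:

```python
def pass_prob(p_correct: float, rounds: int) -> float:
    """Probability of picking the odd sample in every one of `rounds`
    independent triangle tests."""
    return p_correct ** rounds

# A pure guesser picks the odd cup with probability 1/3 per triangle:
print(round(pass_prob(1/3, 3), 3))   # about 0.037 -- under 4% slip through
# A taster who genuinely detects the difference 80% of the time:
print(round(pass_prob(0.8, 3), 3))   # 0.512
```

The flip side, of course, is that a three-in-a-row screen also rejects nearly half of those assumed 80% discriminators, so longer qualification runs trade false positives for false negatives.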
Hi friend, that last post came off wrong and too aggressive, and I am sorry for that. I appreciate your opinions and your contributions to this forum very much.
How many tasters?
And I agree with this enthusiastically. As there was a benefit for the doctor in a higher false alarm rate (in that he can perform additional and potentially even more expensive tests and avoid a lawsuit), so is there a potential benefit for brewers in raising the acceptable level of p if it drives more investigation.

My argument remains that a few more false alarms might not be such a terrible thing, if it might encourage more of us to run even more xbmts on our own to support/refute/learn for ourselves.
I believe the p value should be discarded in these types of trials.
No worries. I love to debate. I don't take things personally... As a wise man [me] once coined a phrase:
Offense can never be given; it can only be taken.
We had about 10-11 tasters giving informal feedback. Only about 4 filled out BJCP sheets. So as I said, not exactly scientific, and the confounding variables were an issue. It wasn't anywhere near the level of what Brulosophy does.
Here it is:

You can have the last word.
I'll let the insult slide.
Offense can never be given; it can only be taken.
This is not the first time you've decided to take the low road, my friend.

If basing my position on sound principles, supporting my conclusions with examples and data (though it be simulated), and explaining them to the best of my ability be the low road, I'll take it. And probably get to Scotland a'fore ye too, with Scotland representing a fuller understanding of triangle testing than I've ever had before. That's why I find it so disappointing that you wish to withdraw based on what are clearly misunderstandings of my posts.
OK, I think we're done here. It's pretty much a universal truth that when your interlocutor decides to change the discussion to another argument, it's a sign he/she doesn't have much with which to respond to the original argument.

I have never changed the argument. The central theme of all my posts was stated in my first post in this thread (#61) as "...the selection of the panel which must be driven by what one is trying to measure" and this was repeated many times in others, perhaps phrased differently, but I feel it should have been clear that the design of the test depends greatly on the nature of the investigation. Whether a particular reader is able to grasp that or not is immaterial as long as most of the readers do.
Now you want this to be about the market? I noted this whole issue a long time ago when I pointed out the inability to know to whom the sample of tasters is generalizable.

My use of the market as an example of a case where we are interested in the verity of the null hypothesis, rather than its alternate, is hardly new to the recent posts. In No. 61 I said "Then we get into questions of how well these 20 guys represent the general public (or whatever demographic the experimenter is interested in - presumably home brewers)." I hope that you will grant me that a brewery's market is included in "whatever demographic".
Further, now you want this to be about the market and not about whether the beers are different?

No. I want it to be, as I have said all along, about whatever the investigator is interested in investigating.
<Snip silliness in the context of the argument>

If you think something silly, please say why it is silly.
If you want to argue that "noise is inevitable" without understanding that there are ways to reduce it and the desirability of doing so, then there's not much point in continuing.
This sounds awful and I'm glad the breweries near me aren't this way. We have 4 local breweries and all 4 of them are actively involved with the local homebrew club. They sell us grain at wholesale prices, they each host us at least once a year, they attend 1 or 2 meetings a year, and they sponsor homebrew competitions where the winning beer is brewed on their system.
Thus a quadrangle test with 20 panelists gives slightly better performance than a triangle test with 40 equally qualified panelists. . . It appears for this particular case adding the extra cup is slightly better than doubling the panel size!
It really is awful. I wish more local breweries here took more of a "hand in hand" approach with the local homebrew scene. Only 1 of them actively, year after year, stands with the homebrewers with contests and such.
I wish it were more. I have often thought that having a brewery with a brewery swag shop that also provides basic homebrew supplies and grain at bulk prices (and even clone kits of one or two of their beers), along with some "guest" homebrewer batches/classes, would be brewing utopia. I know I would be a loyal patron.
It's one of my game plans to incorporate this idea if I ever pull the trigger on my nano. <insert trademark here>
And the tester saves 40 cups of beer.
We had one of these in Denver. Dry Dock Brewing. I think they closed the Homebrew side.
Good point. And with respect to management of the samples, it's a question of fiddling with 20*4 = 80 vs. 40*3 = 120 cups of beer, which has got to be easier, so we wonder why there is no quadrangle test. Before we get too excited, let's keep in mind that this result represents one particular set of circumstances (panel size of 20, probability of qualification 50%, probability of preference 60%). Perhaps if we examined a wider range of circumstances we would not find the gain so great. Something to look into, though.
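One way to look into a wider range of circumstances: assume each taster either truly detects the odd beer (with some probability d) or guesses, compute the exact significance threshold for each test, and compare the chance that each panel reaches p < 0.05. The detect-or-guess model and the d values below are my assumptions for illustration, not figures from this thread:

```python
from math import comb

def tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical(n, p_guess, alpha=0.05):
    """Fewest correct answers needed before guessing alone
    explains them with probability < alpha."""
    return next(k for k in range(n + 1) if tail(n, k, p_guess) < alpha)

def power(n, p_guess, d, alpha=0.05):
    """Chance the panel reaches significance when each taster truly
    detects the odd beer with probability d and guesses otherwise."""
    p_correct = d + (1 - d) * p_guess
    return tail(n, critical(n, p_guess, alpha), p_correct)

for d in (0.2, 0.3, 0.5):
    tri = power(40, 1/3, d)   # triangle:   40 tasters, 120 cups
    quad = power(20, 1/4, d)  # quadrangle: 20 tasters, 80 cups
    print(f"d={d}: triangle(40)={tri:.2f}  quadrangle(20)={quad:.2f}")
```

Sweeping d (and the panel sizes) this way would show whether the extra cup keeps its edge over the doubled panel across circumstances, or only in the one case reported above.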
I wonder if it's partly due to how hard it is sometimes to even qualify the panel and achieve statistical significance with a triangle test. As has already been discussed, the tasting panels are not exactly perfectly chosen per your guidelines (i.e. if you're testing for diacetyl, pre-qualify the panel to determine who is sensitive to diacetyl).

Though I have apparently not made it clear, the main theme in my posts has been that what you do depends on what you are trying to measure. In cases where the object is to see if the process change has decreased diacetyl, then it seems that we would want panelists who are sensitive to diacetyl. If the object is to detect whether the process change affects preference for the beer among some group of people, then the panel does not need to be qualified other than to make sure that it is representative of the group you are trying to measure.
With a quadrangle test, yes, you'd require fewer testers correctly picking the odd beer to achieve significance, but I would worry that with a small panel you'd find even fewer experiments achieving significance. I don't know the math on this, but think of this as an example:
Triangle: 24 testers, you need 13 correct for p<0.05
Quadrangle: 24 testers, you need 11 correct for p<0.05
This seems better, but consider the guessing scenario: pure guessing would produce, on average, 8 correct tasters in a triangle test and 6 correct tasters in a quadrangle test.
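Those thresholds can be checked exactly with the binomial tail: the chance of k or more correct out of n when everyone is guessing. A quick check, assuming independent tasters:

```python
from math import comb

def tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_correct(n: int, p_guess: float, alpha: float = 0.05) -> int:
    """Smallest count of correct picks whose pure-guessing probability is below alpha."""
    return next(k for k in range(n + 1) if tail(n, k, p_guess) < alpha)

print(min_correct(24, 1/3))   # triangle:   13 (pure guessing averages 8 correct)
print(min_correct(24, 1/4))   # quadrangle: 11 (pure guessing averages 6 correct)
```

This reproduces the 13-of-24 and 11-of-24 figures above: 13 correct in a triangle has a guessing probability of about 0.028, while 12 correct is about 0.068, just over the 0.05 line.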
I think it would be really cool to see Brulosophy use the same experimental batches on two different panels of testers, run once as a triangle and once as a quadrangle.
My gut instinct (which isn't statistics, I know lol) is that although the implications of statistical significance would be stronger if the panel qualified at p<0.05 than it would in a triangle test, the likelihood of achieving significance is lower because a quadrangle is IMHO a more difficult selection than a triangle.

Because the quadrangle is a more stringent test, the probability of reaching a given number correct by random guessing is lower than it would be for a less stringent test (triangle). That's where the quadrangle attains its apparent advantage. The Monte Carlo test runs for your example confirm this: for the triangle test the average confidence level was p = 0.048 whereas for the quadrangle it was p = 0.016.
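For anyone who wants to run that kind of Monte Carlo comparison themselves, here is a minimal sketch. The detect-or-guess model and the 35% detection rate are my assumptions for illustration; the averages will not reproduce the exact p = 0.048 / 0.016 figures, which came from a different setup:

```python
import random
from math import comb

def tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the p-value of k correct out of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def simulate_mean_p(n_tasters, p_guess, p_detect, trials=20_000, seed=1):
    """Average p-value over many simulated panels in which each taster
    detects the odd beer with probability p_detect and guesses otherwise."""
    rng = random.Random(seed)
    p_correct = p_detect + (1 - p_detect) * p_guess
    total = 0.0
    for _ in range(trials):
        correct = sum(rng.random() < p_correct for _ in range(n_tasters))
        total += tail(n_tasters, correct, p_guess)
    return total / trials

# Hypothetical 35% detection rate -- an assumption, not a thread value.
print("triangle  (24):", round(simulate_mean_p(24, 1/3, 0.35), 3))
print("quadrangle(24):", round(simulate_mean_p(24, 1/4, 0.35), 3))
```

The direction of the result (a lower average p for the quadrangle at the same detection rate) is what the quoted Monte Carlo runs showed; the magnitudes depend on panel size and the assumed detection rate.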
Imo, the path to better brew is the water. Sitting around mathematically working and justifying the, to me, obvious results does not make sense to me. They don't "prove" anything, but to an eager and open mind they demonstrate great information for the commercial and home brewer.
If the object is to detect whether the process change affects preference for the beer among some group of people then the panel does not need to be qualified other than to make sure that it is representative of the group you are trying to measure.
Yes, these are all single experiments and I'd also prefer to see them repeated before changing tried and true processes. That said, all of them are presented with sufficient detail in the reports that any of us could take on the challenge of trying to repeat them.
Question: how many of the Brulosophy experiments have you read (more than the headline and results) - actually read the full write-up? Actually this question goes for other posters in this thread too... I see a lot of comments regarding the experimental design that don't seem to reflect having really read the reports.
In reading the vast majority of the experiments it seems to me that Marshall and crew are focused on detecting whether process or ingredient changes result in a perceptible difference. Everything else is intended to guide thinking about future experiments.

That is certainly a reasonable application. It has been suggested here that when a difference is 'detected' but with poor confidence, that sends the message that further testing is warranted. That is certainly a valid interpretation. Rather than comment on Brulosophy's selection of a particular confidence level for a declaration of 'detection', at this point I would rather emphasize that we can't detect that there is a difference but only estimate the probability of seeing data like these if there were no difference, and that the other equally important part of the question is "By whom?".
While I am a firm believer in fermentation temperature control, I do find it interesting that their experiments show that some other things I took as largely irrelevant may lead to perceptible changes in the beer that are easier for typical homebrew drinkers to detect than control of fermentation temperature.

Given this, if I saw a test that purported to test WLP001 vs US05 and compared beers brewed with them, one of which was done in glass and one in SS, I'd call foul on that test. Everything but the parameter of interest must be the same or masked. But it is not always possible to do that. These are definitely things that must be considered in planning and evaluating a triangle test.
Take for example: tasters were able to distinguish between beer brewed with WLP001 and US05, but were not able to distinguish between beer brewed with Galaxy and Mosaic hops. Tasters saw a difference between glass carboy and corny keg fermentation, but did not see a difference between chocolate malt and Carafa Special 2 in a Schwarzbier.
In all of these examples I am much more impressed by whether people could detect a difference than whether the qualified group preferred one over the other. When I design a recipe it is my preference that counts.

A key theme in all my posts has been that the test needs to be designed to reflect what the investigator is interested in. If you are not interested in the preferences of anyone but yourself, then there is little point in asking the preference question. Except that we noted that a second question helps to reduce p, thus increasing the confidence that the apparent difference is real.
I doubtless will read some of their reports in detail at some point in time, but thus far my posts are about triangle testing, not Brulosophy's skill in implementing them.
Interesting stuff, no doubt!