With all due respect, I don't know AJD or where his data comes from .....
Yes, who is this guy anyway?
The "data set" is my brewing logs. I have been checking pH with a meter for years and so when I say you do not need to add any chalk to most beers to establish proper mash pH, including dark ones, I have measurements to back that statement up. In fact the opposite is true. You usually need to add acid. I've seen that said in so many words in a professional brewing text but of course I can't remember which one. The only time in years of brewing that I have ever had to add chalk to a mash was when I got overly enthusiastic with the acid first.
As for where the data John used to develop his model came from, I have no idea. But I can give an example of the general technique I think he must have used, which illustrates the idea and some of its pitfalls. At http://www.pbase.com/image/127869369 I've put a picture of the Moody's Baa bond rate plotted against the LIBOR (London Interbank Offered Rate) for the same period. There is definitely a correlation. The more interest the Brits charge, the more we charge over here. But the correlation is not as strong as we might like. Historically you cannot look at the LIBOR and say what the Baa will be. With such a data set it is common to find the simplest model (a linear relationship) which best fits the data by minimizing the error between what the straight line predicts and what the actual data show. Graphically this means placing a straight edge through the data so that the dots appear to be about equally distributed on either side of the line. The straight line on the plot at the picture site is the best fit, and it says the trend is such that the Baa is 0.332 times the LIBOR rate + 6.32%. Thus, if the LIBOR is 4%, the Baa would be distributed around 7.6%. But as the data show, when the LIBOR has been 4% the Baa has been as low as 6% and as high as 9%. LIBOR is not a particularly good predictor of Baa, and the number r = 0.586 on the plot is a measure of how good (or in this case, bad) it is. r = 1 would mean that every data point falls right on the line. r = 0 means that there is no correlation between the two variables at all.
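For anyone who wants to try this kind of fit themselves, here is a minimal sketch in Python. It assumes the "best fit" is an ordinary least-squares line, which is the usual choice but is my assumption here. The sample points are made up purely for illustration and are not the Baa/LIBOR data; the names `fit_line` and `predict_baa` are hypothetical. Only the 0.332 slope and 6.32% intercept come from the plot.

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line through (xs, ys).

    Returns (slope, intercept, r), where r is the correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up scattered points (NOT real rate data), just to exercise the fit
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [6.5, 7.4, 6.9, 8.1, 7.6]
slope, intercept, r = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")

def predict_baa(libor_pct):
    """Trend line quoted from the plot: Baa = 0.332 * LIBOR + 6.32 (percent)."""
    return 0.332 * libor_pct + 6.32

print(f"Baa trend at 4% LIBOR: {predict_baa(4.0):.2f}%")
```

The thing to notice is that the fit always hands you a slope and an intercept, however scattered the points are. It is r that tells you how much to trust them.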
John has to have come up with a similar "scatter plot". How, I don't know, but he did mention in one post that he used a color model, rather than measured color data, for the SRM part. I suppose one could call breweries, ask them for a water report and a description of the beer they brew with it, calculate the color and make the plot that way. I suspect that this is what he may have done but can't confirm it. However he did it, it is plain that the data spread must resemble the Baa/LIBOR plot in its dispersion. Yes, there is a relationship between RA and SRM (just as there is between Baa and LIBOR) but it is not a strong enough one. r, the correlation coefficient, isn't large enough. Beyond this it is clear that the slope of the regression (that's what the straight line is called) is way too high in the SRM/RA model. This suggests either that RA was badly miscalculated for dark beers, that SRM was badly underestimated for dark beers, or that no dark beers were measured. Extrapolation outside the measured data interval can be done, but it is done at your peril. You would take a huge risk in predicting the Baa from a 10% LIBOR rate, for example, because the model is not based on data in that range. Again, I don't know how the SRM vs RA regression was obtained (or even that it is linear).
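To make the extrapolation point concrete, here is a small sketch that uses the Baa/LIBOR trend line from the plot and flags any prediction made outside the range of the data behind the fit. The 1% to 7% LIBOR range is a placeholder I have assumed for illustration, not something read off the plot, and `predict_with_range_check` is a hypothetical helper name.

```python
# Slope and intercept as quoted from the Baa/LIBOR plot
SLOPE = 0.332
INTERCEPT = 6.32

# ASSUMED range of LIBOR values behind the fit (placeholder, not from the plot)
LIBOR_MIN = 1.0
LIBOR_MAX = 7.0

def predict_with_range_check(libor_pct):
    """Return (predicted Baa, True if libor_pct lies inside the fitted range)."""
    inside = LIBOR_MIN <= libor_pct <= LIBOR_MAX
    return SLOPE * libor_pct + INTERCEPT, inside

for libor in (4.0, 10.0):
    baa, inside = predict_with_range_check(libor)
    note = "inside fitted range" if inside else "EXTRAPOLATED, treat with suspicion"
    print(f"LIBOR {libor:.1f}% -> Baa trend {baa:.2f}% ({note})")
```

The arithmetic works just as happily at 10% as at 4%, which is exactly the trap: nothing in the formula warns you that you have left the data behind.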
I think I'm on pretty solid ground here. I have published papers in peer-reviewed journals on RA and on beer color. More to the point, other people are beginning to report findings consistent with what I've seen in my own brewing.