I certainly am not looking for an argument here, but it is pretty clear that you are missing something: it is not possible to obtain a sequence of measurements that is consistently better than the capabilities of the measurement system, and yet you have done exactly that. As an engineer this should be very disquieting to you, and you should be trying to find out why you are getting impossible results.
Somewhere in engineering school you should have been introduced to the concept of an 'error budget', which attempts to quantify how much error each component of a system contributes to its overall accuracy. This isn't the place to go over the whole nine yards, but perhaps a couple of comments will trigger some memory.
You've got two major components here. The first is the pH value predicted by your algorithm and the second is the measured pH. Call them pP and pM. If you run the algorithm against a mash and measure its pH you will find a difference between the two, and this is the error for that run:
E = pM - pP. Obviously if pM and pP are identical the prediction was perfect, the measurement was perfect, and E = 0. But neither is perfect. pM = pMt + Em, where pMt is the pH we would measure if the measurement were perfect and Em is the measurement error. Also pP = pPt + Ep, where pPt is the correct predicted pH and Ep is the prediction error. Clearly pPt is numerically equal to pMt (both are the true mash pH), so when one differences a predicted and a measured pH the result is E = Em - Ep. Can Em ever equal Ep, so that E = 0 even though neither error is zero? Yes, that is possible, but the probability of it happening is very small.
Clearly, to assess the value of this (or any) spreadsheet we can't rely on one run. We must make many and publish statistics on E. The world will expect you to publish the rms value, which is the square root of the average square of E. Since for a single run E^2 = Em^2 + Ep^2 - 2*Em*Ep, the rms error will be Erms = sqrt(Ep_^2 + Em_^2), where Ep_ and Em_ represent, respectively, the rms errors in prediction and measurement. The cross term is gone because Ep and Em are both assumed to be 0 on average. If they aren't, either your measurements are biased (and you will want to fix that) or your algorithm is biased (and you need to fix that).
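If it helps to see the bookkeeping in code rather than algebra, here is a minimal sketch (Python, with made-up example numbers rather than your data) of how one would tabulate E over a set of runs and report the bias and the rms:

    import math

    # Each run gives a predicted pH (pP) and a measured pH (pM).
    # These pairs are purely illustrative, not real data.
    runs = [(5.42, 5.45), (5.38, 5.35), (5.51, 5.49), (5.44, 5.47)]

    errors = [pM - pP for pP, pM in runs]                         # E = pM - pP for each run
    bias = sum(errors) / len(errors)                              # should be near 0 if nothing is biased
    e_rms = math.sqrt(sum(e * e for e in errors) / len(errors))   # rms of E

    print("bias = %+.3f, Erms = %.3f" % (bias, e_rms))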
With careful study of pH measurement we determine that, with the best equipment and practices, the accuracy we can expect at mash pH is slightly better than the buffer tolerance, which for most of us is ±0.02 (the accuracy we can get does, in fact, depend on where the pH being measured sits relative to the buffer pH). Calling it ±0.02 allows a bit for imperfect practice and for the fact that most of us are not using meters that cost $1000 sitting on a lab bench.
That means that Erms = sqrt( (0.02)^2 + Ep_^2 ) > 0.02. Thus the lower bound on the measured error in attempting to assess the accuracy of your (or any) spreadsheet is 0.02 pH rms.
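A quick numerical check of that floor (Python again; the Ep_ values are just examples):

    import math

    for ep in (0.0, 0.01, 0.02, 0.05):          # assumed rms prediction errors
        erms = math.sqrt(0.02**2 + ep**2)       # 0.02 is the measurement floor
        print("Ep_ = %.2f  ->  Erms = %.3f" % (ep, erms))
    # Erms never drops below 0.02, no matter how good the prediction is.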
You consistently get error measurements better than that lower bound, so something is wrong. I don't know what it is, but you need to find out.
At this point we have shown that there is a measurement problem, but we have not addressed the accuracy of the pH estimation algorithm itself; doing that will give us a WAG at Ep_. To do this we note that the pH shift from an acid addition A is dpH = A/B, where B is the buffering capacity of the mash: m1*(a1*D1 + b1*D1^2 + c1*D1^3) + m2*(a2*D2 + b2*D2^2 + c2*D2^3) + ..., in which m1 is the mass of the first malt, a1, b1, c1 are its titration coefficients, and D1 is the difference between its DI mash pH and the target pH. This is clearly getting out of hand fast so I won't give more detail, but I will note that the error in the estimate from this addition alone is
m1*(Ea1*D1 + Eb1*D1^2 + Ec1*D1^3 + (a1 + 2*b1*D1 + 3*c1*D1^2)*ED1) + m2*(...)
in which Ea1 is the error in our knowledge of a1, etc., and ED1 is the error in our knowledge of the malt's DI mash pH, because D1 = pHz - pHdi1, where pHz is the target pH. As before, we square each term, average, and take the square root, with the result that the rms error is going to be
sqrt(m1^2*Ea1_^2*D1^2 + ... + m1^2*(a1 + 2*b1*D1 + 3*c1*D1^2)^2*ED1_^2 + ...)
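For anyone who would rather see that as code than as algebra, here is a rough Python sketch of the per-malt term and its first-order rms error. The masses, titration coefficients, and error figures are placeholders I made up for illustration; they are not measured malt data.

    import math

    def malt_term(m, a, b, c, D):
        # Per-malt contribution m*(a*D + b*D^2 + c*D^3) from the model above.
        return m * (a * D + b * D**2 + c * D**3)

    def malt_term_rms_error(m, a, b, c, D, Ea, Eb, Ec, ED):
        # First-order rms error of that contribution, given rms errors in a, b, c and in D.
        terms = (
            m * Ea * D,                           # from the error in a
            m * Eb * D**2,                        # from the error in b
            m * Ec * D**3,                        # from the error in c
            m * (a + 2*b*D + 3*c*D**2) * ED,      # from the error in the DI pH (through D)
        )
        return math.sqrt(sum(t * t for t in terms))

    # Purely illustrative numbers for one malt:
    print(malt_term(4.0, 40.0, -5.0, 0.5, -0.25))
    print(malt_term_rms_error(4.0, 40.0, -5.0, 0.5, -0.25, 4.0, 0.5, 0.05, 0.02))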
In addition to the errors introduced by all these malt factors, we need to recognize that if you think you are mashing 10 lbs of malt and you are actually mashing 10.1 lbs, that's going to affect the pH estimate too.
There's not much point in doing more algebra here. This should be enough to make it clear where the errors are coming from. What we can do at this point is put together a spreadsheet that predicts mash pH perfectly accurately if we give it perfect data and if the a, b, c model is perfect, then induce small errors in the various parameters and see how big an error in the pH prediction each makes for a particular grist. After doing a little of this it's pretty clear that the DI mash pH of the malts is a big contributor to estimated pH error. For a typical pale malt mash of 90% base and 10% 20L grains, the pH estimation error from this component alone is about equal to the DI pH error. Another big contributor is the combined error of measurement and knowledge of the strength of an acid addition; a 5% error there leads to about 0.01 error in the estimate. There are dozens of other errors from, for example, not including b and c or having the wrong value for a. A ten percent error in knowledge of a leads to about 0.005 rms error in the estimated pH.
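To show what I mean by inducing small errors and watching the result, here is a toy Monte Carlo along those lines. It uses a deliberately simplified, linear (a-only) version of the model and grist numbers I invented for the example, so treat it as a sketch of the method rather than a calibrated calculation.

    import math
    import random

    def predict_pH(malts, acid_mEq):
        # Linearized model: the acid added balances the malts' linear buffering,
        # m*a*(DI_pH - pH), summed over the grist. malts = [(mass, a, DI_pH), ...]
        num = sum(m * a * di for m, a, di in malts) - acid_mEq
        den = sum(m * a for m, a, di in malts)
        return num / den

    base = (4.5, 40.0, 5.75)    # ~90% base malt: (mass kg, a in mEq/(kg*pH), DI pH) - placeholders
    cara = (0.5, 60.0, 5.20)    # ~10% 20L-ish grain - placeholders
    acid = 20.0                 # acid addition in mEq - placeholder

    nominal = predict_pH([base, cara], acid)

    random.seed(1)
    errs = []
    for _ in range(20000):
        # Induce small random errors in the inputs (the rms values are illustrative):
        b = (base[0] * (1 + random.gauss(0, 0.01)),    # 1% mass error
             base[1] * (1 + random.gauss(0, 0.10)),    # 10% error in a
             base[2] + random.gauss(0, 0.02))          # +/-0.02 DI pH measurement error
        a_meq = acid * (1 + random.gauss(0, 0.05))     # 5% acid strength/measurement error
        errs.append(predict_pH([b, cara], a_meq) - nominal)

    print("rms pH prediction error: %.4f" % math.sqrt(sum(e * e for e in errs) / len(errs)))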
Now let's suppose we have the best possible knowledge of the DI mash pH of the base malt: the DI mash pH we measured ourselves. Since we know that we can't measure pH to better than ±0.02, we put that into my spreadsheet and find that it induces about 0.0175 error in the pH estimate. Along with some other findings:
Due to error of ±0.02 in measurement of base malt DI pH: 0.0175
Due to 10% error in a1: 0.005
Due to ignoring b1: 0.001
Due to ignoring c1: 0.006
Due to 5% error in strength/measurement of lactic acid: 0.011
There are lots of other errors, but they are all going to be small relative to 0.0175. The rms of these listed errors is
sqrt(0.0175^2 + 0.005^2 + 0.001^2 + 0.006^2 + 0.011^2) = 0.022119
clearly dominated by the error in measurement of the base malt DI mash pH. Yes, the others do add up, but the DI mash pH error rules, and that is why allowing the brewer to enter a measured DI mash pH is such a plus here.
Now if we go back to the original Erms = sqrt( (0.02)^2 + Ep_^2 ) > 0.02, we can put in 0.022 for the prediction rms and get Erms = sqrt( (0.02)^2 + (0.022)^2 ) ≈ 0.03 as an estimate of what we could reasonably expect from the experiments you are doing.
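If you want to check that arithmetic yourself (Python, with the numbers taken straight from the list above):

    import math

    components = [0.0175, 0.005, 0.001, 0.006, 0.011]   # listed prediction error components (pH)
    ep = math.sqrt(sum(x * x for x in components))      # ~0.022 prediction rms
    erms = math.sqrt(0.02**2 + ep**2)                   # fold in the 0.02 measurement floor
    print("Ep_ = %.4f, Erms = %.4f" % (ep, erms))       # about 0.0221 and 0.0298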
So if you get 0.01 as an rms error it is not to be believed. Something is rotten in the state of Denmark, and you should try to find out what it is. I hope the above rambling is helpful. Nonetheless the concept is demonstrably sound, to the point where you should release the spreadsheet to people and let them see how well its predictions compare to their measurements. They won't get ±0.01, but they should do better than they do with the other spreadsheets that estimate DI pH from color.