 Home Brew Forums > Beer Data Sets

01-12-2013, 01:28 AM   #21
ajdelange
Senior Member
Feedback Score: 0 reviews
Recipes

Join Date: Aug 2010
Location: McLean/Ogden, Virginia/Quebec
Posts: 4,708
Liked 382 Times on 309 Posts
Likes Given: 12

85% of them were commercial beers.

One could conceivably do what you suggest, and as an academic exercise you wouldn't need to have access to any of the beers, just the spectra. In fact this could be an excellent illustration of how to pick a basis set. Obviously you need a set of beers whose eigencharacterization spans the space of beers you wish to model/emulate, and if it doesn't you can only strive to match a projection onto that space.

To come up with an arbitrary spectrum by blending beers, not only must the spectrum you wish to simulate lie in the hyperplane spanned by the set of eigenspectra you choose, but these eigenspectra would have to be absorption spectra, not transmission spectra as I used. I chose transmission spectra because absorption spectra vary a lot from one another in regions that don't contribute much to their colors; I can get an accurate color description with fewer SDCs by using transmission eigenspectra. For the synthesis problem one needs absorption spectra because Beer's law works on absorption spectra:

A = (e1*c1 + e2*c2 + ...)*L

where e1, e2, etc. are the extinction coefficients, c1, c2, ... are the concentrations of the coloring materials and L is the path length. The spectrum for a mix of beers is

A = w1*SRM1*(I + SDC1,1*V1 + SDC1,2*V2 + ...) + w2*SRM2*(I + SDC2,1*V1 + SDC2,2*V2 + ...) + ...

where I is a vector of all 1's, SDCj,k is the k-th spectral deviation coefficient of beer j, and V1, V2, etc. are the eigenspectra. This can be put into a matrix form suitable for Moore-Penrose solution, which gives the best approximation, in the rms sense, to a specified target. Of course the best solution would not necessarily have weights which summed to 1, and, of course, any solution vector would have to be normalized so that the weights do sum to one. Lots of details like that last one should make this an excellent teaching opportunity. Just sort of musing here, so don't hold me to any of the details, but yes, interesting problem.
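As a concrete sketch of that last step (a Moore-Penrose/least-squares solve followed by renormalizing the weights to sum to 1), here is a minimal NumPy example. The matrix B of "absorption spectra" and the hidden blend are hypothetical stand-ins, not measured data, and the eigenspectra bookkeeping is skipped:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for measured absorption spectra: 81 wavelengths, 5 beers.
# (Hypothetical data; real spectra would come from a spectrophotometer.)
B = np.abs(rng.normal(size=(81, 5)))

# Target spectrum: a hidden blend of the first three beers.
true_w = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
target = B @ true_w

# Moore-Penrose (least-squares) solution, then renormalize so the
# weights sum to 1.
w, *_ = np.linalg.lstsq(B, target, rcond=None)
w = w / w.sum()

rmse = np.sqrt(np.mean((B @ w - target) ** 2))
```

Since the toy target lies exactly in the span of the columns of B, the least-squares weights recover the hidden blend and the rms error is essentially zero; with a real target the renormalization step would matter.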

__________________

01-14-2013, 04:07 PM   #22
bdh

Quote:
 Originally Posted by ajdelange 85% of them were commercial beers. One could conceivably do what you suggest, and as an academic exercise you wouldn't need to have access to any of the beers, just the spectra. In fact this could be an excellent illustration of how to pick a basis set. Obviously you need a set of beers whose eigencharacterization spans the space of beers you wish to model/emulate, and if it doesn't you can only strive to match a projection onto that space.
Sure, but the nice thing about having commercial beers available (and constructing the spectra using the raw spectra from the commercial beers as your basis) is that it gives a more 'interpretable' solution. An eigenspectrum is a nice mathematical construct, but you can't necessarily pour a beer with an eigenspectrum into a glass to see what it looks like. With the 'real beer' library of spectra, if someone really wanted to see what a particular beer looked like, you could mix the appropriate amounts of the commercial beers into a glass and recreate it.

Quote:
 To come up with an arbitrary spectrum by blending beers not only must the spectrum you wish to simulate lie in the hyperplane spanned by the set of eigenspectra you choose but these eigenspectra would have to be absorption spectra, not transmission spectra as I used.
Isn't absorption just the -log of transmission?

Quote:
 This can be put into a matrix form suitable for Moore-Penrose solution which gives the best, in the rms sense, approximation to a specified target.
Right, but Moore-Penrose will, in general, try to use all of the elements of a basis in the reconstruction. To get sparse solutions (using only a few elements of the basis in the reconstruction) you have to manually limit the rank of the basis (i.e. select which elements to use). If instead you solve the problem using something like L1 regularization (essentially just adding a penalty term equal to the absolute value of the spectral weights), the solution will automatically select which elements to use (from a potentially very large basis set, like every commercial beer you've measured) to reconstruct the spectrum to within a given rms error threshold.

For example, if your basis of spectra is the matrix S and you're trying to describe a target spectrum x, then you end up solving the minimization (with respect to A) of

min sum(abs(A)) such that |x-S*A|^2 <= (rms threshold)^2

where A is constrained to be non-negative.
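One minimal way to try that minimization numerically is with SciPy's general-purpose SLSQP solver rather than a dedicated Lasso routine (just a sketch; the dictionary S, the target, and the error budget here are all synthetic):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Synthetic dictionary: 20 wavelengths x 8 candidate spectra (hypothetical).
S = np.abs(rng.normal(size=(20, 8)))
# Target built from just two dictionary elements, so a sparse solution exists.
x = S @ np.array([0.7, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0])

eps = 1e-2  # squared-error budget is eps**2

res = minimize(
    lambda A: A.sum(),  # the L1 norm, since A is constrained non-negative
    x0=np.full(8, 0.1),
    method="SLSQP",
    bounds=[(0.0, None)] * 8,
    constraints=[{"type": "ineq",
                  "fun": lambda A: eps**2 - np.sum((x - S @ A) ** 2)}],
)
A = res.x
```

Minimizing the sum of the (non-negative) weights subject to the squared-error constraint is exactly the stated program; for large dictionaries a specialized Lasso solver would be faster, but for a handful of beers this is fine.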


01-14-2013, 07:53 PM   #23
ajdelange
Senior Member

Quote:
 Originally Posted by bdh Sure, but the nice thing about having commercial beers available (and constructing the spectra using the raw spectra from the commercial beers as your basis) is that it gives a more 'interpretable' solution. An eigenspectrum is a nice mathematical construct, but you can't necessarily pour a beer with an eigenspectrum into a glass to see what it looks like. With the 'real beer' library of spectra, if someone really wanted to see what a particular beer looked like, you could mix the appropriate amounts of the commercial beers into a glass and recreate it.
If you really wanted to do this you would have to be able to measure spectra in order to see how close you came. Rather than having me tell you that I used Prima Pils and here's the spectrum I measured for Prima Pils (3 years ago), it would be better for you to go buy a handful of beers, characterize them and use them as your basis. Given that my ensemble is representative, you should be able to characterize any beer you buy in terms of SRM plus 2 - 5 coefficients. If you can't, then either your beer is a real outlier or my eigenspectra don't span the space I think they do. When it comes to experimenting with blends you would want to use beers in hand so you can see the results of your experiments.

Quote:
 Originally Posted by bdh Isn't absorption just the -log of transmission?
Indeed it is, but the eigencharacterization of the absorption spectra cannot be obtained by taking the log of the transmission-spectra eigencharacterization. The point is that given the measured spectra you can come up with four eigencharacterizations: linear, log, linear-average and log-average. In terms of expressing visible color with the fewest numbers, linear-average is the best (I tried all four).

Quote:
 Originally Posted by bdh Right, but Moore-Penrose will, in general, try to use all of the elements of a basis in the reconstruction. To get sparse solutions (using only a few elements of the basis in the reconstruction) you have to manually limit the rank of the basis (i.e. select which elements to use). If instead you solve the problem using something like L1 regularization (essentially just adding a penalty term equal to the absolute value of the spectral weights), the solution will automatically select which elements to use (from a potentially very large basis set, like every commercial beer you've measured) to reconstruct the spectrum to within a given rms error threshold.
There is only one practical way that I can think of to come up with the pseudoinverse, and that is to use Singular Value Decomposition, in which case (at least with most routines) the singular values (and eigenvectors) will be sorted by the singular-value magnitudes. Thus if you want the mmse solution of rank m you simply turn off all but the first m singular values.
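In NumPy, for instance, that rank truncation looks like this (a sketch on random stand-in data; `np.linalg.svd` returns the singular values already sorted largest-first):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(81, 10))   # hypothetical design matrix
b = rng.normal(size=81)         # hypothetical target

# Thin SVD: A = U @ diag(s) @ Vt, with s sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def rank_m_solution(m):
    """Least-squares solution keeping only the m largest singular values."""
    return Vt[:m].T @ ((U[:, :m].T @ b) / s[:m])

x_full = rank_m_solution(10)   # ordinary pseudoinverse solution
x_rank4 = rank_m_solution(4)   # 'turn off' all but the first 4 singular values
```

With all singular values kept this reproduces the Moore-Penrose solution; truncating to rank m trades residual error for a better-conditioned, lower-rank fit.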

01-14-2013, 11:38 PM   #24
bdh

I don't think I'm explaining my proposal clearly. PCA is fine for raw dimensionality reduction (at least assuming the data is multivariate Gaussian). What I'm proposing is actually a representation in a potentially higher-dimensional space, but with the nice properties that the axes of this space are real beers (so it's more physically interpretable than an eigenspace) and, for any particular beer, the number of non-zero elements in the representation is small.

For example, you've got a library of spectra for 99 beers measured at 81 points. If you convert all these spectra into absorption so that you can add them and then pack them into the (81 x 99) matrix S you've now got an over-complete dictionary for representing beer spectra (assuming that at least 81 of the beers in your library are linearly independent).

Now, if there were no constraints on your representation coefficients then you could exactly represent any spectrum x by just solving for the (99-element) vector A

x = S*A

which has multiple solutions since S is over-complete.

If we're going for a solution that's physically realizable though, we need to constrain the coefficients in 'A' to be non-negative and summing to 1, so we get

min |x-S*A|^2 such that sum(abs(A)) = 1 and A>=0

Now that it's constrained we can exactly represent spectra that are in the convex hull of S and get close solutions for spectra that aren't. This is a (non-negatively constrained) version of L1 (or Lasso) regression in statistics (why I mentioned the idea for the OP). A nice property of L1 regression is that it promotes coefficient vectors with only a few non-zero elements, due to the non-differentiability of the absolute value at the origin (here the absolute value is redundant since we already have a non-negativity constraint, but the non-differentiability still holds). Also, since this problem is convex it can be solved efficiently.


01-15-2013, 10:04 PM   #25
ajdelange
Senior Member

I think we're overlooking (or at least I did in an earlier post) the fact that the problem isn't linear, because the absorption at wavelength l over a path L of a mix of beers in volumes v1, v2, ... with extinction coefficients e1, e2, ... at that wavelength and chromophore concentrations c1, c2, ... would be

Abs(l) = L*(e1*c1*v1 + e2*c2*v2 + ...)/(v1 + v2 + ...)

because beer 2 dilutes beer 1, etc. This assumes that Beer's law applies and that when v1 mL of beer 1 are added to v2 mL of beer 2 the resulting volume is v1 + v2. Neither of these is exactly true, but the Beer's law part is nearly so, and given that the beers have approximately equal true extracts the volume part should be pretty close. We can write Abs = B*v, where Abs is an 81-element vector of absorptions (i.e. the spectrum), v a vector whose elements are the desired volumes, and B a matrix with elements Abs(beer j at wavelength l)/(v1 + v2 + ...). If Abs is some desired spectrum then the set of volumes in the v vector which minimizes the norm of Abs - B*v is a 'solution'. There is only one best solution, but there may be many other solutions which give nearly the same result. Note there is no constraint that the elements of v sum to any particular value: we can add 10 mL of beer 1 to 20 mL of beer 2 and 30 mL of beer 3, or 100 mL of beer 1 to 200 mL of beer 2 and 300 mL of beer 3, and we'll get the same spectrum in either case, but the constraint that the volumes all be non-negative does apply. Thus the problem is quite nonlinear and we will have to linearize it to find solutions. The difficulty is that, because beer spectra are so similar in shape, there are many near-optimal solutions and they lie close together.
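The dilution arithmetic is easy to check numerically. A small sketch (the spectra are hypothetical; `blend_spectrum` is just the volume-weighted average described above), which also shows the scale invariance, i.e. that 10/20/30 mL and 100/200/300 mL give the same spectrum:

```python
import numpy as np

def blend_spectrum(A_cols, volumes):
    """Absorption spectrum of a blend: the volume-weighted average of the
    component spectra (assumes Beer's law and additive volumes)."""
    A_cols = np.asarray(A_cols, dtype=float)  # one column per beer
    v = np.asarray(volumes, dtype=float)
    return (A_cols @ v) / v.sum()

rng = np.random.default_rng(3)
spectra = np.abs(rng.normal(size=(81, 3)))  # hypothetical 81-point spectra

mix1 = blend_spectrum(spectra, [10, 20, 30])
mix2 = blend_spectrum(spectra, [100, 200, 300])  # same ratios, 10x the volume
```

Because only the volume ratios matter, any overall scaling of v leaves the blend spectrum unchanged, which is why the solver is free to normalize the volumes afterwards.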

Because B is a function of v we can't use Moore-Penrose to minimize |Abs - B*v|, but what we can do is hypothesize a solution, say 10 mL of each beer, find the gradient, use Newton's method to refine the hypothesized solution, and repeat with the refined solution until no further improvement in |Abs - B*v| is seen. The difficulty is that the problem is ill-conditioned: in matching the target (green) spectrum in the first figure below by the iterative technique, the condition number for the gradient matrix is 1.9e+16. The three columns of numbers below are three blends that try to match that green-curve spectrum. The first two were obtained with Newton's method and the quality of each is about the same: rmse = 0.12. The third column is the global solution; its quality is an order of magnitude better: rmse = 0.01. The second and third solutions are plotted on the second graph.

15.9%   11.1%   53.1%
34.6    29.6     0
 4.3     7.5    46.9
 0       0       0
41.7    32.7     0
23.8    16.4     0
 0       2.6     0

So now you know how I wasted my day.


01-16-2013, 02:08 PM   #26
bdh

I'm not following the argument as to why this needs to be non-linear. If instead of working with the component volumes you just solve the equation for the percentage of each beer in the final mix (and hence the sum-to-1 constraint), this is a purely linear equation, isn't it?


01-16-2013, 04:49 PM   #27
ajdelange
Senior Member

We are trying to find a set of numbers, v, which minimizes |Abs - B*v|, where B has elements whose values are absorptions divided by the sum of the v's, i.e. they depend on the things we are solving for. We could write |Abs - C*f|, where the f are fractions and the elements of C are just the absorptions, but if I solve |Abs - C*f| by Moore-Penrose there is no guarantee that the elements of f will sum to 1. Instead I have to solve |Abs - B*v| iteratively and, if the problem weren't so ill-conditioned, that would work very nicely. But it is, and that's where the potential value of LASSO comes in. The constraint that all the coefficients add to 1 is a constraint on their L1 norm (given that they are non-negative), and I suppose it would be very easy to add to all the other constraints on the L1 norm that, in my very limited understanding, are part of the solution process, at least by some methods.


01-16-2013, 06:43 PM   #28
bdh

Quote:
 Originally Posted by ajdelange We are trying to find a set of numbers, v, which minimizes |Abs - B*v|, where B has elements whose values are absorptions divided by the sum of the v's, i.e. they depend on the things we are solving for. We could write |Abs - C*f|, where the f are fractions and the elements of C are just the absorptions, but if I solve |Abs - C*f| by Moore-Penrose there is no guarantee that the elements of f will sum to 1. Instead I have to solve |Abs - B*v| iteratively and, if the problem weren't so ill-conditioned, that would work very nicely. But it is, and that's where the potential value of LASSO comes in. The constraint that all the coefficients add to 1 is a constraint on their L1 norm, and I suppose it would be very easy to add to all the other constraints on the L1 norm that, in my very limited understanding, are part of the solution process, at least by some methods.
Oh, well if it's just an implementation issue, the pseudo-code for this isn't too bad. Using your notation from above where,

Abs = spectrum we're trying to match (M x 1 vector)
C = matrix of spectra for all the various beers (M x N matrix)
f = fraction of each beer in C in our final mix (N x 1 vector)

pos(x) = function that sets any negative element of the vector x to 0.

Initialize f = 0 (vector of all zeros).
Calculate L = Lipschitz constant of the squared error term with respect to f.
In this case L = largest singular value of C'*C (where C' is the transpose of C).

Loop until convergence
\\calculate the gradient step of the solution
q = f - (1/L)*C'*(C*f-Abs)

\\Project this back onto the constraints.
\\Namely, project onto the positive L1 ball of radius 1, so the f's sum to 1.
\\See http://www.cs.berkeley.edu/~jduchi/projects/DuchiShSiCh08.html for more details

q = pos(q) \\set any elements of q that are less than 0 to 0
u = q sorted in descending order

r = largest value of k = 1, 2, ..., length(u) such that
u(k) > [(sum of first k elements of u) - 1]/k

T = [(sum of first r elements of u) - 1]/r

f = pos(q-T)
end loop

There are faster ways of solving this problem, but for the relatively small data set here this should be fine.

edit: realized I forgot a few 1/L terms in there
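For anyone who wants to run it, here is one way the pseudo-code might look in NumPy (a sketch on synthetic data; `project_simplex`, `match_blend`, and the demo matrices are my own hypothetical names and stand-in numbers, and I use a projection radius of 1 so the recovered fractions sum to 1, per the constraint discussed above):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {f : f >= 0, sum(f) = 1}, using the
    sort-based algorithm from the Duchi et al. paper linked above."""
    u = np.sort(v)[::-1]                      # sorted in descending order
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.max(np.nonzero(u * k > css - 1.0)[0]) + 1
    theta = (css[rho - 1] - 1.0) / rho
    return np.maximum(v - theta, 0.0)

def match_blend(C, target, iters=2000):
    """Projected gradient for: min |target - C*f|^2  s.t.  f >= 0, sum(f) = 1."""
    L = np.linalg.eigvalsh(C.T @ C).max()     # Lipschitz constant of the gradient
    f = np.full(C.shape[1], 1.0 / C.shape[1]) # start from the uniform blend
    for _ in range(iters):
        grad = C.T @ (C @ f - target)         # gradient of the squared error
        f = project_simplex(f - grad / L)     # step, then project onto simplex
    return f

# Hypothetical demo: 81-point spectra for 6 beers, target is a known blend.
rng = np.random.default_rng(4)
C = np.abs(rng.normal(size=(81, 6)))
f_true = np.array([0.5, 0.2, 0.3, 0.0, 0.0, 0.0])
f_hat = match_blend(C, C @ f_true)
```

On this well-conditioned toy problem the iteration recovers the hidden fractions; with real, nearly-collinear beer spectra convergence would be slower and the near-optimal solutions much closer together, as noted earlier in the thread.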

01-16-2013, 06:48 PM   #29
brettwasbtd
Awesomeness Award Winner

White Labs' quality control day has some really interesting statistics! http://www.whitelabs.com/qcday-2011.html

__________________
Shegogue Brew | Pronounced "Shuh·goo" | http://shegoguebrew.blogspot.com

01-16-2013, 06:51 PM   #30
ajdelange
Senior Member

If I'm reading you right, that's what I did to obtain the multiple solutions. I used annealing to obtain the global one.
