Hypothesis #1: A yeast pack close to/just past its "best by" date can still be used to ferment a challenging brew (OG > 1.050, 50F fermentation temperatures) if you use a starter. The same wort fermented just with the smack pack and no starter is not viable due to inability to start at 50F, and requiring intervention (higher temperature).
Let's try this:
Hypothesis: When pitching expired lager yeast at 50°F in high OG wort, making a starter keeps lag time under 48 hours, whereas a direct pitch without a starter lags longer than 48 hours.
Use the same lot of yeast, identical wort, and identical fermentation conditions. For both batches, measure the lag time (the time until gravity drops by 3 or more points, confirmed by hydrometer, with a reading at the 48-hour mark in particular) without making any interventions. Report both lag times.
If the direct pitch lags over 48 hours and starter lags under 48 hours, then the starter saved the yeast.
I use 48 hours as a typical acceptable lag time for lagers pitched cold.
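If you log hydrometer readings, here is a rough sketch of how the lag time as defined above could be pulled out of them. The function name, the reading format (hours since pitch, SG), and the sample numbers are all made up for illustration, not data from a real batch.

```python
def lag_time_hours(readings, og, drop_points=3):
    """Return the hours-since-pitch of the first reading where gravity has
    dropped by at least `drop_points` gravity points from OG, or None if
    no reading shows that drop.

    readings: list of (hours_since_pitch, specific_gravity) tuples.
    """
    for hours, sg in sorted(readings):
        # Work in whole gravity points (1.056 -> 1.053 is 3 points)
        # to avoid floating-point surprises.
        if round((og - sg) * 1000) >= drop_points:
            return hours
    return None


# Hypothetical readings, for illustration only.
starter_batch = [(12, 1.056), (24, 1.055), (36, 1.052), (48, 1.046)]
direct_pitch = [(24, 1.056), (48, 1.055), (72, 1.051)]

for name, readings in [("starter", starter_batch), ("direct pitch", direct_pitch)]:
    lag = lag_time_hours(readings, og=1.056)
    verdict = "no 3-point drop observed" if lag is None else f"drop first seen at {lag} h"
    print(f"{name}: {verdict}")
```

Keep in mind the result is only as precise as your sampling interval: the real lag ended somewhere between the previous reading and the one returned, which is why a reading right at 48 hours matters.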
I'd really try to get yeast from the same lot. They're made at the same time and have likely been handled/transported under the same conditions. Ritebrew lets you pre-order Wyeast and Omega products for example; I'm fairly sure they'd be the same lot.
Store the yeast yourself until it's as old as you want.
Hypothesis #2: Even a fermentation with an at/past "best by" yeast pack can ferment a challenging brew (50F fermentation temperatures) better than the same wort fermented by a yeast pack within its best by date used without a starter.
Let's try this:
Hypothesis: Pitching lager yeast at 50°F in high OG wort using a starter produces preferable flavor to directly pitching a non-expired yeast.
You have two experimental variables in your hypothesis (yeast age and starter vs. no starter), which is a little weird. I would definitely use yeast packs of the same age and get two packs from the same lot. See whether making a starter alone changes flavor. Use expired yeast for both batches and you can then test both hypotheses with a single experiment, since the experimental variable would be the same.
Besides the experimental variable(s), keep as much as possible the same between the control batch and the experimental batch.
Make sure wort composition is the same for both batches (e.g. split a batch after the boil and only transfer clear wort to the fermenters). Alternate between fermenters every quart if needed.
Ferment both batches under the same conditions, no matter what happens with lag time. Lag time is expected to be longer with lower cell count!
Base the temperature ramp/diacetyl rest on an SG level that you pre-determine.
Package both batches the same way, and within a reasonably similar timeframe after each batch completes primary and the diacetyl rest.
Collecting data
Measure flavor preference based on blind testing, with as many participants as you can find.
Triangle tests are good to objectively determine whether people can detect a difference. Provide stats if you do collect triangle test results.
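For the stats, one common approach is a one-sided exact binomial test: if tasters were purely guessing, each would pick the odd sample out with probability 1/3. A minimal sketch, using only the standard library (the function name and the taster counts are just for illustration):

```python
from math import comb


def triangle_test_p_value(n_tasters, n_correct, chance=1 / 3):
    """Probability of seeing at least `n_correct` correct picks out of
    `n_tasters` if everyone were guessing (one-sided exact binomial)."""
    return sum(
        comb(n_tasters, k) * chance**k * (1 - chance) ** (n_tasters - k)
        for k in range(n_correct, n_tasters + 1)
    )


# Hypothetical panel: 24 tasters, 13 picked the odd sample correctly.
p = triangle_test_p_value(24, 13)
print(f"p = {p:.3f}")
```

A small p-value (conventionally below 0.05) suggests tasters really could tell the beers apart. Report the number of tasters and correct picks along with it so others can check the math.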
Try to have participants cleanse palates before each sample. It's best if the server is also blinded. If possible, pre-determine the sample ordering to evenly distribute the experimental sample to remove ordering bias (fine-tuning brulosophy's method).
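One way to pre-determine the ordering for a triangle test is to cycle through all six possible arrangements of the two beers and then shuffle which taster gets which. This is only a sketch; the labels "A" and "B", the function name, and the fixed seed are my own choices for illustration.

```python
import random
from itertools import cycle, islice

# All six ways to serve three cups where one beer appears twice.
# "A" and "B" stand for the two batches (e.g. starter vs. direct pitch).
ARRANGEMENTS = ["AAB", "ABA", "BAA", "BBA", "BAB", "ABB"]


def serving_plan(n_tasters, seed=0):
    """Return one cup arrangement per taster, using each of the six
    arrangements as evenly as possible, in a shuffled order."""
    plan = list(islice(cycle(ARRANGEMENTS), n_tasters))
    # A fixed seed makes the plan reproducible, so it can be written
    # down before the tasting starts.
    random.Random(seed).shuffle(plan)
    return plan


for taster, cups in enumerate(serving_plan(12), start=1):
    print(f"Taster {taster}: {cups}")
```

With a multiple of six tasters, every arrangement is used equally often, so neither batch is favored as the odd sample or by position. Have someone other than the server generate the plan and label the cups if you want the server blinded too.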
Enter samples into competitions and/or ask a judge (blinded to the experiment) to provide tasting notes or scores.
Because the preference result is highly subjective, the more blinded and unbiased tasting data you can provide, the better.
Since you're trying to present a conclusion, you need to provide meaningful data to support it.
It doesn't need to be super scientific. A minimalistic approach is simply to take both kegs to a gathering (don't reveal the experimental variable) and see which one kicks first. This sort of result is still somewhat useful, even though it's not scientifically rigorous or necessarily conclusive.
Mongoose33's suggestion relates to external validity -- when possible, try to pick a process commonly used so that the results are applicable to as many other brewers as possible.
Hope this helps