District 2 Cabernet Sauvignon Update
Lake County Cabernet Sauvignon saw an enormous price increase of about 15% this past year, while yields and Brix levels held steady. With very little opportunity to increase vineyard acreage in Napa, many wineries are turning to Lake County for wine to blend into Napa SKUs. For this reason, Cabernet Sauvignon leads Lake County in terms of both pricing and total yields. It accounts for over one-third of all winegrapes grown in Lake County.
In 2013, Lake County Cabernet sold for an average of $1725.79 per ton. I had predicted that the average price per ton would remain essentially flat at $1730.20 in 2014. Prices actually rose to $1990.38. Yeah, I have been dreading writing this post. Of all the forecasts I have done, I am pretty upset that I chose to include this one in the subset I made public, since all of the rest look pretty darn good. During the three harvests that I have done these forecasts, I had never before been off by more than 2.5%. I used to farm in Lake County and was always right on with my Lake County predictions. At least I didn’t use this one for any clients.
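To put the size of the miss in perspective, here is a minimal sketch that works through the arithmetic using only the three prices quoted above. The function name `pct_change` and the choice to express the forecast error relative to the actual price are my own assumptions for illustration, not part of the forecasting model itself.

```python
# Figures quoted in the post (dollars per ton, Lake County Cabernet Sauvignon).
PRICE_2013 = 1725.79      # actual 2013 average
FORECAST_2014 = 1730.20   # the published 2014 forecast
ACTUAL_2014 = 1990.38     # actual 2014 average


def pct_change(new: float, old: float) -> float:
    """Percentage change from `old` to `new` (illustrative helper)."""
    return (new - old) / old * 100.0


predicted_move = pct_change(FORECAST_2014, PRICE_2013)   # forecast year-over-year move
actual_move = pct_change(ACTUAL_2014, PRICE_2013)        # realized year-over-year move
forecast_error = pct_change(FORECAST_2014, ACTUAL_2014)  # miss, relative to the actual price

print(f"Predicted change: {predicted_move:+.2f}%")
print(f"Actual change:    {actual_move:+.2f}%")
print(f"Forecast error:   {forecast_error:+.2f}% of the actual price")
```

Measured either way, the miss is several times larger than the 2.5% ceiling the other forecasts stayed under.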
So, what went wrong? I’ve got a few hypotheses that I will be testing:
Hypothesis 1: Nothing’s Wrong
I guess this is the first thing to look into. My model gave a 0.52% chance that prices would rise to this level or higher. If I had 200 other predictions and this were the only one off by such an amount, then maybe I could just chalk it up to the pseudo-randomness of the market. But I don’t. So I have to assume that the model failed, not that this result was an outlier.
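The reasoning here can be put in rough numbers. The sketch below assumes, as a simplification, that forecasts are independent and that each one carries the same 0.52% chance of a miss this large; neither assumption comes from the model itself. The track-record sizes are hypothetical, with 200 echoing the figure used above.

```python
TAIL_PROBABILITY = 0.0052  # the model's stated 0.52% chance of a price this high or higher


def prob_at_least_one_miss(n_forecasts: int, p: float = TAIL_PROBABILITY) -> float:
    """Chance that at least one of n independent forecasts lands in the tail.

    Assumes independence and an identical tail probability for every forecast.
    """
    return 1.0 - (1.0 - p) ** n_forecasts


# Hypothetical track-record sizes; 200 echoes the figure used in the text.
for n in (1, 10, 50, 200):
    chance = prob_at_least_one_miss(n)
    print(f"{n:>3} forecasts: {chance:.1%} chance of at least one such miss")
```

Under these assumptions, one miss like this somewhere in a couple of hundred forecasts would be unremarkable, while the same miss in a short track record is much harder to write off as bad luck.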
Hypothesis 2: A False Variable
As I noted in the initial post, one of my variables had a p-value of almost .20. Loosely, that means that even if the variable had no real effect on price, there would be about a 20% chance of seeing a relationship this strong in the data by chance alone. That is well above the usual 5% cutoff and a fair argument for discarding it. I had my reasons for keeping it in, including the fact that I was not using this prediction for a client. But this isn’t the reason for the failure: the variable was a measure of macroeconomic influences, and it actually pushed the predicted price higher than the model would have gone without it. So there goes that theory.
Hypothesis 3: Missing Variable
The model only claimed to explain 95% of price variation. About 5% of price variation, according to the model, was driven by factors the model did not account for. That 5%, however, would not be enough to explain the disparity here. Still, this points to an important issue: the model does not include Napa-specific market conditions (like prices for Napa grapes or new plantings in Napa). I did look into these, but did not find statistically significant correlations.
So, What Do I Do?
Well, first off, the model is getting put on ice for now. I’ve never had to do that before, but I have also never sold a model whose underlying methodology did not already have three years of reliable performance. So, back to the drawing board on this one. I think my first point of attack will be to look again at Napa correlations. I will also take a look at how the model performs with Table 10 numbers to see if transactions between related entities are throwing something off. Finally, I will be looking at correlations to certain retail statistics. If I find something interesting, I will be sure to post it here, with the caveat about the uncertainty surrounding the model.