In my previous post, we looked at whether package sizes followed Benford's Law. We found that for individual categories (such as beer, or yogurt) the first digit of package sizes did not follow Benford's Law, but when we aggregated across products we did get a decent fit, especially considering we weren't quite expecting one.
Today we'll look at an area where we would expect to see Benford's Law apply -- the unit and dollar sales of items. Once again, we'll use the IRI Marketing Data Set [1], but we'll just use two categories, beer and yogurt, so we don't show an overwhelming number of graphs. (Also because I have only looked at these two categories so far.)
Dollar sales in a week for a product (individual UPC, Universal Product Code) in a supermarket are easy to understand. Units are a bit trickier: one unit corresponds to one package purchased, so for beer, a 6-pack, a 12-pack and a 24-pack are each one unit.
At the level of UPC - store - week (how many units are sold in this store this week for each product) there is a good fit for both yogurt and beer:
The number of observations is 112 million for beer and 141 million for yogurt.
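For reference, Benford's Law says a first digit d appears with probability log10(1 + 1/d), so about 30.1% of values should start with 1 and only about 4.6% with 9. Below is a minimal Python sketch of the kind of comparison behind these fits. It is not the code used for this post, and the function names are made up for illustration.

```python
import math
from collections import Counter

def benford_expected():
    """Benford's Law: P(d) = log10(1 + 1/d) for first digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading nonzero digit of a positive number (e.g. 0.7 -> 7, 3400 -> 3).

    Scientific notation puts the leading digit first: 3400 -> '3.4...e+03'.
    Formatting to 15 significant digits sidesteps floating-point noise such
    as 0.7 being stored as 0.69999...; this is fine for sales-type data.
    """
    return int(f"{x:.14e}"[0])

def first_digit_shares(values):
    """Observed share of each first digit among the positive values."""
    counts = Counter(first_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Toy input (made-up numbers, not IRI data): weekly unit sales for a few items
toy_units = [1, 15, 23, 0.19, 0]
observed = first_digit_shares(toy_units)
expected = benford_expected()
```

With real data, `values` would be the units (or dollars) column at the UPC - store - week level, and the two dictionaries would be plotted side by side, one bar per digit.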
Aggregation doesn't always help!
We expected that if we aggregated across products to look at category - store - week (e.g. total units of yogurt, across all products, sold in a store in a week) we would get a similar, maybe even a better fit.
But that turns out not to be the case. The fit for beer on both dollars and units is poor -- apart from 1 being the most common first digit, the rest of the distribution is basically flat.
Yogurt is better, but still substantially different from the Benford's Law expectation.
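To make the aggregation step concrete, here is a small Python sketch of rolling UPC - store - week rows up to category - store - week totals. The row layout and the numbers are invented for illustration; the actual IRI files are organized differently.

```python
from collections import defaultdict

def aggregate_to_category(rows):
    """Sum units and dollars over all UPCs within each (store, week)."""
    totals = defaultdict(lambda: [0, 0.0])
    for upc, store, week, units, dollars in rows:
        totals[(store, week)][0] += units
        totals[(store, week)][1] += dollars
    return {key: tuple(value) for key, value in totals.items()}

# Hypothetical rows: (upc, store, week, units, dollars)
rows = [
    ("upc_a", "store_1", 1, 3, 8.97),
    ("upc_b", "store_1", 1, 12, 35.88),
    ("upc_c", "store_1", 1, 9, 26.91),
    ("upc_a", "store_2", 1, 1, 2.99),
    ("upc_b", "store_2", 1, 4, 11.96),
]

category_totals = aggregate_to_category(rows)
# store_1, week 1: item-level units 3, 12, 9 become a single total of 24
```

Note how the item-level first digits (3, 1, 9) are replaced by the first digit of a single, larger total (2). The aggregated series is a different set of numbers with a much narrower range, so there is no guarantee its first digits behave like the item-level ones.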
The number of observations is lower, because we've aggregated, but still high: 719 thousand for beer and 754 thousand for yogurt. With these sample sizes, significance tests are beside the point.
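A quick way to see why: for a fixed set of observed first-digit shares, the Pearson chi-square statistic grows in direct proportion to the number of observations. The sketch below uses a made-up deviation (half a percentage point moved from digit 1 to digit 9), not our results, and the function name is ours.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_square_stat(observed_shares, n):
    """Pearson chi-square statistic for first-digit shares vs. Benford's Law.

    observed_shares: shares for digits 1..9 (summing to 1)
    n: number of observations behind those shares
    """
    return n * sum((o - e) ** 2 / e for o, e in zip(observed_shares, BENFORD))

# Hypothetical shares: Benford, with 0.005 shifted from digit 1 to digit 9
shares = list(BENFORD)
shares[0] -= 0.005
shares[8] += 0.005

stat_small = chi_square_stat(shares, 1_000)     # a modest sample
stat_large = chi_square_stat(shares, 719_000)   # about our beer sample size

# 5% critical value for chi-square with 8 degrees of freedom
CRITICAL_5PCT_8DF = 15.51
```

The same small deviation falls well below the critical value at a thousand observations and far above it at 719 thousand, so at our sample sizes a test would reject almost any departure from Benford's Law, however slight. Looking at the size and shape of the deviation is more informative.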
As to why Benford's Law fits at the UPC-store-week level but not at the category-store-week level, we haven't figured that out yet. But maybe that's not really a consistent pattern. We look at more categories in our next post: http://www.truncatedthoughts.com/2015/11/benfords-law-in-consumer-packaged-goods.html
[1] Bronnenberg, B. J., Kruger, M. W., & Mela, C. F. (2008). Database Paper—The IRI Marketing Data Set. Marketing Science, 27(4), 745–748. http://doi.org/10.1287/mksc.1080.0450