
Friday, November 27, 2015

REI OptOutside Day: Walking to REI!

REI is closed today, Black Friday, the day after Thanksgiving. They are urging people to go outside today rather than shop. It's a good idea (and probably a good marketing stunt).

People spend much less time outdoors than they used to -- few of us are farmers, more of us work in offices, we live in climate-controlled houses, and kids play video games rather than pickup games at the playground.  So we forget. And we forget that there's outside, and there's OUTSIDE.

So, I decided to walk to REI today. It's about a five mile round trip.  I'm going to walk there the way that provides the most contact with nature, and walk back the way I'd drive there. Let's see the contrast.

Going there the natural way

So here I am in front of the condo. I'm in fleece with an REI windbreaker made for cycling.  The rain has just stopped and it's in the 30's. I hope REI orders better weather for OptOutside day next year.

I quickly get to a city park, with a path around the lake.
There's a bird's nest in a tree next to the path. I wonder where those birds are now.

There are few other people out.
 But there are more ducks! (and geese and seagulls). I don't see any blue herons or egrets today, though.
 I cross the tracks at the station (after the train passes).
 Just south of the station there's another path to the east.
 This is the Navy Ditch, which drains the lake above into the river.

The drain from the lake to the ditch is busy today, since we've had quite a bit of rain in the last 24 hours.

On the right, the path passes a light industrial area.

 Intersecting paths, arrows not quite pointing in the right direction, and no map.  The signage could use some improvement, but is adequate.
Here's why the sign above says "low water crossing only". Can't cross here, even with waterproof boots!

But I hack through the underbrush and find a crossing point.
 The path continues.
 Now, on the right, there's a golf course connected to a condo development.
 Even on this "woodland" path, there's a reminder that this is not really a natural environment. There's no real wilderness in the Chicago area. Even the natural areas require care to keep invasive species such as buckthorn under control.
 We're at the river, and we'll walk around a flood control basin.  This is 1.25 miles around and basically is a big hole in the ground. Ordinarily it's empty except for a small pond below the spillway.

The river ordinarily goes through culverts under the yellow fence.  But only so much river is allowed through.

The rest of the water is diverted passively over this spillway and into the big hole.

Here it is from the other side.  Does it ever fill up? Yes. What then? It overflows. How does the water get out? It gets pumped back into the river after the river levels go back down.

There's a recreational path around the basin, and a playground.  A trailer park is behind the playground.
Now we'll cut through a subdivision and cross to the shopping center REI is in.

 This might be called a "God and Mammon" shot.  This whole area used to be a farm for the Divine Word Missionary Priests, whose monastery is in the background.  But in order to generate funds, they are now leasing the land for corporate headquarters (Kraft, Crate and Barrel), subdivisions, and this shopping center.
 Lots of cars in the parking lot today -- except here.

Returning on the Roads

Now, we'll return home by walking next to the roads I'd use if I were driving here. The roads aren't particularly ugly -- this is a wealthier area that has paid a lot of attention to improving appearance.  But it's still a combination of strip malls, residential developments screened off from the road, some car dealers, and auto repair.

 This is the front side of the trailer park.
 Lots of salons, nail and pedicure, yoga, tutoring, and some fast food. Pretty much suburbia.
 The residential developments are screened off from this busy road, and can't be seen. The first one is the development with the golf course we saw earlier, from the other side.

Yep, two yoga studios in the same block.

 Crossing the tracks again.

And back home.

So what am I trying to say here?

First, that getting outdoors, even on a bad weather day, can be fun.

Second, that the choice of your route is important.  Of course, every cyclist is well aware of this. You usually don't cycle or walk using the same route you drive. It's a matter of safety and enjoyment to choose a route that's better suited to cycling or walking. Get out and explore your own neighborhood!

Thursday, November 26, 2015

Benford's Law in Consumer Packaged Goods: More Categories, More Patterns

In our previous posts in this series, on package sizes and product sales, we found that Benford's Law (in which a first digit of 1 is more common than a first digit of 2, etc.) held for unit and dollar sales at the UPC-store-week level, but not when we aggregated further to the category-store-week level for two categories, beer and yogurt.
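For readers who want the expectation in numbers: Benford's Law puts the share of values with first digit d at log10(1 + 1/d). Here is a minimal Python sketch of that formula. The function name is mine, not something from the IRI data or the earlier posts.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of values whose first digit is `digit` (1 through 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Expected share for each possible first digit.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, share in expected.items():
    print(f"{d}: {share:.1%}")
```

The shares fall from about 30% for a first digit of 1 to under 5% for a first digit of 9, and they add up to 1.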

I've taken the earlier graphs for beer and yogurt and combined them so we have one graph for each category.  The Benford's Law expectation is shown in the dashed blue line.

As we noted before, the first digits of units (UPC-store-week units) and dollars (UPC-store-week dollars) show a good fit. A UPC is a unique product with a specific Universal Product Code on the label, so it's very low level.  When we aggregate up to category level, to sales of the entire product category, the fit isn't as good.

I looked at two other categories in the IRI Marketing Data Set [7] to see if this pattern generalizes.

For household cleaners, we see a reversal.  The UPC level has an excess of first digit 1s and 2s for units, which isn't surprising since these items tend to have low weekly movement. The dollars fit at the UPC level is good, but shows a bit of divergence.

The fit at the category level for both units and dollars is excellent.
For frozen pizza, we see a result that's similar to beer and yogurt: the fit at the UPC level is good, but at the aggregate level has divergence.

Does this mean anything?

There are some pretty graphs here, but do they mean anything?

The generalization from this analysis is that the degree of fit of actual data to Benford's Law is variable, and depends on the degree of aggregation.  Aggregating the data to a higher level may produce a better fit. We found this with product sizes and with household cleaners. Or, it may degrade the fit. We found this with beer, yogurt and frozen pizza sales.
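One way to put a number on "degree of fit" is the mean absolute deviation between the observed first-digit shares and the Benford shares. This is a sketch of that idea, not the method behind the graphs in this post, and the two test series (powers of 2, and the integers 100 to 999) are made-up illustrations rather than sales data.

```python
import math
from collections import Counter

# Benford's Law expectation for first digits 1 through 9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value: float) -> int:
    """Leading nonzero digit of a positive number."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Scientific notation always starts with the leading digit.
    return int(f"{value:.15e}"[0])


def mean_abs_deviation(values) -> float:
    """Average gap between observed and Benford first-digit shares."""
    counts = Counter(first_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one positive value")
    gaps = (abs(counts.get(d, 0) / total - BENFORD[d]) for d in range(1, 10))
    return sum(gaps) / 9


# Illustrative inputs only: powers of 2 are known to track Benford closely,
# while the integers 100..999 have a flat first-digit distribution.
powers_of_two = [2 ** n for n in range(1, 200)]
flat = list(range(100, 1000))

print(f"powers of 2: {mean_abs_deviation(powers_of_two):.4f}")
print(f"100 to 999:  {mean_abs_deviation(flat):.4f}")
```

A smaller number means a closer fit, so the same function could be run on UPC-level and category-level sales to compare them on one scale.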

Benford's Law is mostly a curiosity, but is cited as one way to test for fraud. Examples of this are claimed in financial statements [1], Iranian elections [2,3,4,5,6] and other cases.

But we have to apply caution here. What fits in one context (e.g. one company) may not fit in another -- as we see here with household cleaners showing the opposite pattern from the other three categories.

In addition, there's the obvious implication that using Benford's Law patterns to test for fraud only works until the fraudsters get smarter.

[1] Amiram, Dan, Bozanic, Zahn and Rouen, Ethan. Financial statement errors: evidence from the distributional properties of financial statement numbers.  Rev. Account Stud (2015) 20:1540-1593.

[2] Roukema, Boudewijn F.  Benford's law anomalies in the 2009 Iranian presidential election. (2009) Submitted to the Annals of Applied Statistics.

[3] Gelman, Andrew. Unconvincing (to me) use of Benford’s law to demonstrate election fraud in Iran. Blog post, June 17, 2009. http://andrewgelman.com/2009/06/17/unconvincing_to/

[4] Gelman, Andrew. Combining findings at the Province and County Level from Iran's Election. Blog post, June 20, 2009. http://andrewgelman.com/2009/06/20/combining_findi/

[5] Gelman, Andrew. The Devil is in the Digits. Blog post, June 20, 2009. http://andrewgelman.com/2009/06/20/the_devil_is_in/

[6] Beber, Bernd and Scacco, Alexandra. The Devil is in the Digits: Evidence That Iran's Election was Rigged.  Washington Post, June 20, 2009 http://www.washingtonpost.com/wp-dyn/content/article/2009/06/20/AR2009062000004.html?hpid=opinionsbox1

[7] Bronnenberg, B. J., Kruger, M. W., & Mela, C. F. (2008). Database Paper—The IRI Marketing Data Set. Marketing Science 27(4), 745–748. http://doi.org/10.1287/mksc.1080.0450

Tuesday, November 24, 2015

Renaming Woodrow Wilson School of Public Policy

There's a current controversy about how much of Princeton should be named after the former president of Princeton, and president of the United States, Woodrow Wilson. Most notable is the Woodrow Wilson School of Public Policy, founded in 1930 and renamed for Wilson in 1948 (coincidentally the same year Truman issued orders to desegregate the military).

On the one hand, few historical figures are perfect -- particularly as measured by the lens of history, 100 years later. Do we want to expunge all of the people who did some good things, some bad things, from having anything named after them? Must everything in the U.S. be named after Fred Rogers? (It's doubtful he'd  have wanted that, anyway.)

On the other hand, Wilson wasn't just a racist in thought; he was a racist in action. He didn't just express racist thoughts, he did things like REsegregating the federal government. In short, he helped Jim Crow become even more entrenched as public policy.

So, what to do?  I think we take our cue from other institutions and add a name.  Carnegie Institute of Technology became Carnegie-Mellon. Dyche Stadium at Northwestern became Ryan Field (inside Dyche Stadium). National College of Education became National-Louis University.  We should add a name that honors someone who stood against some of the bad stuff Wilson stood for.

I'd nominate Harriet Tubman.

Harriet Tubman (born Araminta Ross c. 1822 – March 10, 1913) was an African-American abolitionist, humanitarian, and, during the American Civil War, a Union spy. Born into slavery, Tubman escaped and subsequently made some thirteen missions to rescue approximately seventy enslaved family and friends, using the network of antislavery activists and safe houses known as the Underground Railroad. She later helped abolitionist John Brown recruit men for his raid on Harpers Ferry, and in the post-war era struggled for women's suffrage.
We flip a coin to decide whether it's the Wilson-Tubman School of Public Policy or the Tubman-Wilson School of Public Policy.  Either way, it adds a person who certainly tried to influence public policy in a different direction than Wilson.


Monday, November 23, 2015

Benford's Law and Unit and Dollar Product Sales

In my previous post, we looked at whether package sizes followed Benford's Law. We found that for individual categories (such as beer, or yogurt) the first digit of package sizes did not follow Benford's Law, but when we aggregated across products we did get a decent fit, especially considering we weren't quite expecting one.

Today we'll look at an area where we would expect to see Benford's Law apply -- the unit and dollar sales of items.  Once again, we'll use the IRI Marketing Data Set [1], but we'll just use two categories, beer and yogurt, so we don't show an overwhelming number of graphs. (Also because I have only looked at these two categories so far.)

Dollar sales in a week for a product (individual UPC, Universal Product Code) in a supermarket is easy to understand. Units is a bit trickier but corresponds to buying a package: for beer, a 6 pack, a 12 pack and a 24 pack are each one unit.

At the level of UPC - store - week (how many units are sold in this store this week for each product) there is a good fit for both yogurt and beer:

The number of observations is 112 million for beer and 141 million for yogurt.

Aggregation doesn't always help!

We expected that if we aggregated across products to look at category - store - week (e.g. total units of yogurt, across all products, sold in a store in a week) we would get a similar, maybe even a better fit.

But that turns out not to be the case.  The fit for beer on both dollars and units is poor -- basically except for a first digit of 1 being the most common, the rest of the distribution is flat.

Yogurt is better, but still substantially different from the Benford's Law expectation.

The number of observations is lower, because we've aggregated, but still high: 719 thousand for beer and 754 thousand for yogurt. With these sample sizes, significance tests are beside the point.

As to why Benford's Law fits at the UPC-store-week level but not at the category-store-week level, we haven't figured that out yet.
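To make the two levels concrete, here is a rough Python sketch of the comparison described above: first digits taken row by row at the UPC-store-week level, then again after summing units to category-store-week. The rows, UPC names and store names are invented for illustration. They are not from the IRI data, and this is not the code used for the graphs.

```python
from collections import Counter, defaultdict


def first_digit(value: float) -> int:
    """Leading nonzero digit of a positive number."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Scientific notation always starts with the leading digit.
    return int(f"{value:.15e}"[0])


def first_digit_shares(values) -> dict:
    """Share of positive values starting with each digit 1 through 9."""
    counts = Counter(first_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Hypothetical rows: (category, upc, store, week, units).
rows = [
    ("yogurt", "upc_a", "store_1", 1, 12),
    ("yogurt", "upc_b", "store_1", 1, 9),
    ("yogurt", "upc_a", "store_2", 1, 18),
    ("yogurt", "upc_b", "store_2", 1, 3),
    ("beer", "upc_c", "store_1", 1, 25),
    ("beer", "upc_d", "store_1", 1, 140),
]

# UPC-store-week level: one value per row.
upc_level = [units for *_, units in rows]

# Category-store-week level: sum units across all UPCs in the category.
category_totals = defaultdict(int)
for category, _upc, store, week, units in rows:
    category_totals[(category, store, week)] += units
category_level = list(category_totals.values())

print("UPC level:     ", first_digit_shares(upc_level))
print("Category level:", first_digit_shares(category_level))
```

With only a handful of rows the shares mean nothing, of course. The point is just that the aggregated totals are different numbers with different first digits, so there's no guarantee the fit carries over.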

[1] Bronnenberg, B. J., Kruger, M. W., & Mela, C. F. (2008). Database Paper—The IRI Marketing Data Set. Marketing Science 27(4), 745–748. http://doi.org/10.1287/mksc.1080.0450