Category Archives: Data

A look at Minnesota State Fair Attendance

My enjoyment of looking at data led me to create a few charts of the attendance figures for the Minnesota State Fair. I found easily accessible attendance data going back to 2007. There might be figures going back further, but I have not found them yet. Still, eight years of attendance figures gives us a good picture to look at and a long enough time span to make some judgments.

You can drag to zoom on the charts.

This first chart shows the average attendance for each day of the fair over the eight-year span we have. If you think about it, what you see here should not be a huge surprise. Attendance peaks on the two weekends, and the smallest turnouts in the middle of the fair come on Tuesday and Wednesday. I am slightly surprised that the day with the lowest average attendance over the eight years is actually the first day of the fair. I would have thought that pent-up excitement, and the first chance to get a favorite food or visit a favorite building or attraction, would make people more likely to show up on the first day, but the data shows that assumption was wrong. The first day of the fair also has the lowest figure among the daily attendance records set for each day. In the end I should not be surprised, since the attendance figures are similar for the fair's two Thursdays, and the factors I described do not outweigh the demands of daily life that make people less likely to come to the fair Tuesday through Thursday.

You are able to de-select the years you don’t want to see.

This chart shows the daily attendance for each day of the fair over the last eight years. My biggest takeaway from this graph is how much the attendance figures for the midweek days, Tuesday through Thursday, vary over the eight years. The top three overall attendance years are 2014, 2009 and 2012. Across those three years there are nine Tuesday-through-Thursday days in total, and six of them outperform the average attendance for that given day. That lets those three years avoid the usual mid-fair dip in daily attendance and helps propel them into being the top three overall attendance years.

Looking at the 2013 numbers, if you remember the weather we had during the fair, you should understand why several days were at or below the daily average attendance. We had six straight days of temperatures above 90 degrees, from the first Saturday to the second Thursday of the fair. On that first Sunday we had a high of 97 and a dew point of 71, which led to the lowest first-Sunday attendance in the eight years on the graph. The attendance figure for that Sunday ended up being 145,706, which was 18,486 lower than the second-lowest first Sunday (2014 at 164,192).
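To put that first-Sunday gap in perspective, here is a quick sketch of the arithmetic in Python. The two attendance figures come from the paragraph above; the variable names are just my own for illustration.

```python
# First-Sunday attendance figures quoted in the post.
first_sunday_2013 = 145_706  # lowest first Sunday in the eight years
first_sunday_2014 = 164_192  # second-lowest first Sunday

# Absolute gap between the two lowest first Sundays.
gap = first_sunday_2014 - first_sunday_2013

# The same gap as a share of the 2014 figure.
gap_pct = gap / first_sunday_2014 * 100

print(f"Gap: {gap:,} people ({gap_pct:.1f}% below the 2014 first Sunday)")
```

So the heat-wave Sunday was not just the lowest, it was roughly a tenth below the next-lowest first Sunday.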

This third chart shows the total yearly attendance for the State Fair over the last eight years. As discussed above, the 90-degree heat wave helped make 2013 the lowest total attendance year since 2008. The 2014 total set a new yearly record and broke the 1,800,000 mark for the first time ever. I would say the 2014 total attendance record was driven in part by the record second Saturday of the fair, which came in at 252,092 and beat the previous record for that day, set in 2010, by 17,708. That 252,092 figure also set the all-time single-day attendance record, previously held by the second Sunday in 2013 at 236,197. So the 252,092 mark simply shattered both of those records. I know other factors contributed to 2014 being a record attendance year, but from a more statistical point of view, that is what I see.
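For anyone who wants to check the record margins, here is a small sketch in Python. It uses only the numbers quoted in the paragraph above; the 2010 second-Saturday figure is not stated directly in this post, so the value below is implied by the 17,708 margin rather than taken from the source data.

```python
# Figures quoted in the post.
second_saturday_2014 = 252_092   # new second-Saturday and all-time single-day record
margin_over_2010 = 17_708        # how far it beat the 2010 second-Saturday record
second_sunday_2013 = 236_197     # previous all-time single-day record

# Implied 2010 second-Saturday record (derived, not quoted in the post).
implied_2010_record = second_saturday_2014 - margin_over_2010

# Margin over the previous all-time single-day record, absolute and relative.
margin_all_time = second_saturday_2014 - second_sunday_2013
margin_all_time_pct = margin_all_time / second_sunday_2013 * 100

print(f"Implied 2010 second-Saturday record: {implied_2010_record:,}")
print(f"Margin over all-time record: {margin_all_time:,} ({margin_all_time_pct:.1f}%)")
```

Both margins land in the 15,000 to 18,000 range, which is why I say the new mark shattered the old records rather than edged past them.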


Data Dive: The Lost Art of the Baseball Complete Game

Cy Young, Career Complete Game Leader.

First off, a little definition. A complete game is one where the starting pitcher pitches the entire game and faces every batter without help from a relief pitcher. A pitcher can pitch the whole game and still lose.

I have started to research some parts of baseball history and found a few interesting things about complete games and how they have steadily decreased over the years. What first got me thinking about this subject was all the great pitching performances we have had in the playoffs leading up to this year’s World Series.

This post is more about looking at the numbers and how they have changed over time. Likely in a future post, I will delve into why there has been a decline in complete games.

As I started to look for the historical data on complete games for each season, I noticed that, for the most part, complete games were going down over time. In general, it was a fairly gradual decline in the number of complete games per season, with a few stretches of more dramatic decline and some where the numbers briefly spiked back up. To dig into this more, here are two charts to help show what I am talking about.

The charts are interactive. If you click on them, they will open up in a new window.

MLB Complete Games Yearly 1876-2013
% of games that have a CG

What you can see from both graphs, but more easily from graph two, is the steady decline in the number of complete games. From 1904 to 1913 there is a decline of 33%, the most dramatic period of decline in the entire graph. After that dramatic drop over those nine years, things start to level off and even come back a little for a few years. Starting in 1921, we reach a 25-year period, running to 1946, where things move on a much more gradual downward slope, losing 10 percentage points and going from 52% in 1921 to 42% in 1946.
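To make the difference between the two slopes concrete, here is a rough sketch in Python. The 52% and 42% figures and the years come from the paragraph above. The function name is my own, and treating the 33% drop from 1904 to 1913 as percentage points of games (like the later period) is an assumption on my part, not something the chart data here confirms.

```python
def decline_stats(start_pct, end_pct, years):
    """Return (points lost, relative decline in %, points lost per year)."""
    points = start_pct - end_pct
    relative = points / start_pct * 100
    per_year = points / years
    return points, relative, per_year

# 1921 to 1946: share of games with a complete game goes from 52% to 42%.
points, relative, per_year = decline_stats(52, 42, 1946 - 1921)
print(f"1921-1946: {points} points, {relative:.1f}% relative, {per_year:.2f} points/year")

# 1904 to 1913: ASSUMPTION that the 33% decline is also in percentage points.
steep_per_year = 33 / (1913 - 1904)
print(f"1904-1913: about {steep_per_year:.1f} points/year (under that assumption)")
```

Under that reading, the early drop runs several points per year while the 1921 to 1946 stretch loses well under one point per year, which matches how the two sections look on graph two.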

Another chart behind the link.

Continue reading Data Dive: The Lost Art of the Baseball Complete Game