Monday, June 7, 2010

Data: Why are you so difficult to track down?

My first order of business is to let everyone know that my friend Lindsay and I have been writing about our gastro-escapades at our 321 Eat! blog (click here to check it out). We've had a great time eating! We hope you have a great time reading...and then eating :).

Okay, moving on. The last few weeks have been rough at my internship, mainly because our data keeps changing, which is extremely frustrating because I have to keep updating our databases. Basically, this project has gotten a bit more complex. Initially, I was just creating a performance index, but I have since been working on a number of other metrics as well, including admissions per capita and cost per admission, using an adjusted media spend by title and country. To give you a taste of how much data we're talking about, I'm analyzing 11 films across 23 countries, with about 20 variables per film. That's a lot of numbers! At first I thought my job was done, because I had done a quality-control pass on the spreadsheet and found all of the kinks. Then I found out my boss wanted a fully updating document, which means rebuilding the entire spreadsheet to include source documents and a whole lot of linked cells. Great, I can totally do that. It will take time, but it's definitely something that I can handle (y'all will get so good with Excel. It's like, a miracle program!).
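For anyone curious what those two metrics boil down to, here is a rough sketch in Python rather than Excel. The formulas are my plain reading of the metric names (admissions divided by population, and adjusted media spend divided by admissions), and the field names and numbers are made up for illustration; they are not the real figures or the exact definitions used at work.

```python
# Hypothetical numbers for illustration only; these are not real film figures.
rows = [
    {"title": "Film A", "country": "Country X", "admissions": 500_000,
     "population": 10_000_000, "adjusted_media_spend": 250_000.0},
    {"title": "Film A", "country": "Country Y", "admissions": 120_000,
     "population": 4_000_000, "adjusted_media_spend": 90_000.0},
]


def admissions_per_capita(admissions, population):
    """Tickets sold divided by the country's population (None if population is 0)."""
    return admissions / population if population else None


def cost_per_admission(adjusted_media_spend, admissions):
    """Adjusted media spend divided by tickets sold (None if no tickets were sold)."""
    return adjusted_media_spend / admissions if admissions else None


# Add both metrics to every title/country row, much like two extra spreadsheet columns.
for row in rows:
    row["admissions_per_capita"] = admissions_per_capita(
        row["admissions"], row["population"])
    row["cost_per_admission"] = cost_per_admission(
        row["adjusted_media_spend"], row["admissions"])
    print(row["title"], row["country"],
          row["admissions_per_capita"], row["cost_per_admission"])
```

Returning None for a zero denominator is just one way to keep a missing population or admissions figure from crashing the whole run; a blank cell in the spreadsheet plays the same role.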

First, I need to gather all the source documents, most of which are at Fox and not readily available to us. So we start contacting people and I get the first round of data in. I'm a bit OCD, so I start checking this new data against the original data we had (I did not build the first spreadsheet, I just updated and QC'd it, so I wasn't sure if the numbers were good). Of course, I find tons of discrepancies. This is troubling. We want the data to be as up-to-date and reliable as possible. To make a long story short, I've gone through a number of data revisions because none of the data from the various databases matches up. I'm just crossing my fingers that this is the most up-to-date version of the data and I will no longer have to swap it out.
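The checking I describe above is basically a cell-by-cell comparison of two versions of the same table. Here is a minimal sketch of that idea, assuming each version is keyed by (title, country) and holds numeric fields. The function name, the field names, and the numbers are all hypothetical; I did the real comparison by hand in Excel.

```python
def find_discrepancies(original, revised, tolerance=0.0):
    """Return a list of (key, field, old_value, new_value) for every mismatch.

    Both arguments map a (title, country) key to a dict of numeric fields.
    """
    issues = []
    for key in sorted(set(original) | set(revised)):
        old_row = original.get(key)
        new_row = revised.get(key)
        # A row that exists in only one version is itself a discrepancy.
        if old_row is None or new_row is None:
            issues.append((key, "<missing row>", old_row, new_row))
            continue
        for field in sorted(set(old_row) | set(new_row)):
            old = old_row.get(field)
            new = new_row.get(field)
            if old is None or new is None:
                # The field is present in only one of the two versions.
                issues.append((key, field, old, new))
            elif abs(old - new) > tolerance:
                issues.append((key, field, old, new))
    return issues


# Hypothetical data for illustration only.
original = {
    ("Film A", "Country X"): {"admissions": 500_000},
    ("Film A", "Country Y"): {"admissions": 120_000},
}
revised = {
    ("Film A", "Country X"): {"admissions": 510_000},
    ("Film A", "Country Y"): {"admissions": 120_000},
    ("Film B", "Country X"): {"admissions": 90_000},
}

for issue in find_discrepancies(original, revised):
    print(issue)
```

The tolerance argument is there because figures pulled from different databases often differ by rounding alone; setting it above zero would hide those and leave only the differences worth chasing.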

The great thing about this spreadsheet is that it is going to be fully updatable in the future. All one has to do is swap in new data, as long as everything is in the same format. There are dozens of charts and formulas that will update every time the data is switched out or revised. Life will be so easy! Also, the spreadsheet has all of the original source materials in it, which makes it easier to track down where the numbers came from and check quality. There will also be two different ways to display the data: by territory and by title. This allows users not only to see how a film did across territories, but also to see how different films did within a single territory by benchmarking against territory-wide standards.
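The two displays are really the same records grouped two ways. Here is a small sketch of that, again in Python instead of Excel. I'm assuming the "territory-wide standard" is simply the average of a metric across the films in that territory, which is a stand-in for illustration and may not match the actual benchmark; the names and numbers are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (title, country, metric value) records for illustration only.
records = [
    ("Film A", "Country X", 0.05),
    ("Film B", "Country X", 0.03),
    ("Film A", "Country Y", 0.02),
    ("Film B", "Country Y", 0.04),
]


def view_by(records, index):
    """Group the records by 'title' or by 'territory'."""
    if index not in ("title", "territory"):
        raise ValueError("index must be 'title' or 'territory'")
    grouped = defaultdict(dict)
    for title, country, value in records:
        if index == "title":
            grouped[title][country] = value
        else:
            grouped[country][title] = value
    return dict(grouped)


def relative_to_territory_average(records):
    """Divide each film's value by the average value in its territory.

    Assumption: the benchmark is the plain territory average of the metric.
    """
    result = {}
    for country, films in view_by(records, "territory").items():
        benchmark = mean(films.values())
        for title, value in films.items():
            result[(title, country)] = value / benchmark
    return result


print(view_by(records, "title"))
print(view_by(records, "territory"))
print(relative_to_territory_average(records))
```

A ratio above 1 means the film beat the average for that territory, and a ratio below 1 means it fell short, which is the kind of comparison the by-territory display is meant to make easy.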

Anyway, it sounds tedious, and it is. But it's also really interesting to see the patterns that begin to emerge. U.S. films are doing really well in some of the emerging markets, including Russia, Brazil, and Malaysia. Europe is a more difficult region to compete in, especially because its domestic film markets are doing really well. I get to go back to work tomorrow. I'm hoping to finish the project up this week (I'm keeping my fingers crossed).
