Game of Thrones, Season 2 is getting illegally downloaded. A lot. In fact, it’s on track to be the most pirated show of 2012, and maybe 2011 as well:
Maybe that’s interesting to you, maybe not. Here’s what’s interesting to me: notice the horizontal axis of the graph. It represents the number of days since each show premiered. For instance, Dexter Season 6 started October 2, 2011, so day 10 is October 12, 2011 and day 30 is November 1, 2011. Game of Thrones Season 2 started April 1, 2012, so day 10 is April 11, and so on. So even though the two shows aired during completely different stretches of time, we can lay one trend on top of the other.
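If you want to check the day arithmetic yourself, here is a minimal Python sketch. It assumes the premiere counts as day 0 (the convention under which April 1 plus ten days lands on April 11); the function names are mine, made up for illustration.

```python
from datetime import date, timedelta

def date_for_day(premiere: date, day: int) -> date:
    """Calendar date that falls `day` days after the premiere (premiere = day 0)."""
    return premiere + timedelta(days=day)

def day_index(premiere: date, when: date) -> int:
    """Number of days elapsed between the premiere and `when`."""
    return (when - premiere).days

dexter_premiere = date(2011, 10, 2)  # Dexter Season 6
got_premiere = date(2012, 4, 1)      # Game of Thrones Season 2

print(date_for_day(dexter_premiere, 10))  # 2011-10-12
print(date_for_day(dexter_premiere, 30))  # 2011-11-01
print(date_for_day(got_premiere, 10))     # 2012-04-11
```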
This is good stuff!
Why? Well, what do we care about in this comparison? We care about how torrenting of Dexter compares to torrenting of Game of Thrones. We could, if we wanted, start the graph some time last year and run it to now (May), plotting Dexter (which aired last year) and Game of Thrones (which just started airing) against the calendar over the course of 12 months.
But that would hide what we are really interested in, which is not dates, but how torrenting persists over time. So we pick a separate zero-day for each show (in this case, the day each show premiered) and then plot behavior since that zero-day. Here we can see that Dexter torrenting peaked about 25 days after the premiere (three to four weeks in), and then settled at a more moderate and stable level. Game of Thrones torrenting, on the other hand, started on Day One and has stayed at a high level since. Looking at the graph this way, it is easier to see why people believe that Game of Thrones S02 will ultimately be more torrented than Dexter S06.
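The re-indexing step described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each series is a plain dictionary keyed by calendar date, the function name is hypothetical, and the download counts are made-up numbers, not real torrent data.

```python
from datetime import date

def align_to_zero_day(series, zero_day):
    """Re-key a {date: value} series by days elapsed since zero_day."""
    return {(d - zero_day).days: v for d, v in sorted(series.items())}

# Made-up values, purely for illustration.
show_a = {date(2011, 10, 2): 5, date(2011, 10, 3): 8, date(2011, 10, 4): 6}
show_b = {date(2012, 4, 1): 9, date(2012, 4, 2): 9, date(2012, 4, 3): 10}

aligned_a = align_to_zero_day(show_a, date(2011, 10, 2))
aligned_b = align_to_zero_day(show_b, date(2012, 4, 1))

# Both series now share the same axis: days since their own zero-day.
for day in sorted(set(aligned_a) & set(aligned_b)):
    print(day, aligned_a[day], aligned_b[day])
```

Once both series are keyed by day number instead of date, they can go on the same horizontal axis even though they were recorded six months apart.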
While this might seem trivial when talking about torrenting, the method gives us insights into many important issues. Take for example this graph:
This is a comparison of job loss patterns in recessions since 1948. Here our zero-day is set at the point in each recession where job loss peaked. With the zero-day comparison, it is easier to see how different in scale and shape the current recession is from previous ones, with both a longer duration and a sluggish, asymmetrical recovery.
The graph can’t tell you why that is, of course. But by replacing the distraction of dates with an aligned measure of duration, it makes the pattern much more evident.
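The zero-day does not have to be a start date. As a rough Python sketch of anchoring at a peak instead, assuming each recession is just a list of monthly job-loss figures (the function name and the numbers are hypothetical, not taken from the graph):

```python
def align_at_peak(losses):
    """Return {offset: value}, with offset 0 at the largest loss in the list."""
    peak = max(range(len(losses)), key=lambda i: losses[i])
    return {i - peak: value for i, value in enumerate(losses)}

# Made-up monthly job-loss percentages, purely for illustration.
short_recession = [0.5, 1.5, 2.0, 1.0, 0.2]
long_recession = [0.5, 2.0, 4.0, 6.0, 5.5, 5.0, 4.5]

print(align_at_peak(short_recession))
print(align_at_peak(long_recession))
```

Negative offsets are the months leading up to the peak and positive offsets are the recovery, so a slow recovery shows up as a long, shallow tail on the right.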
P.S. Did I make the term “zero-day comparison” up? Yes, I did! I’d like to use a more accepted term, but I can’t find a term that doesn’t sink into technical quicksand (“Longitudinal comparisons indexed from a comparable starting point” just doesn’t have the same ring, you know?). If you have a better term, let me know!