Checking out (of) Reality
DRAFT – NOT FINAL COPY
The Climate Reality Project’s recent 24 Hours of Reality: The Dirty Weather Report has produced some interesting numbers regarding the view counts of their 24 hour event. The event was on the 14th and 15th of November 2012. Currently, 7 days later, according to a Google search of the sites climaterealityproject.org and climatereality.com for the term “million” within the last month, they do not seem to say anything more specific than it was “watched by millions”.
Other sympathisers and collaborators were tweeting and posting a final score of 16M. I guess it was what they wanted to believe. For example, a trivial Google search provides these, mostly from a week ago:
https://www.facebook.com/TCPIndonesia (retrieved 2012-11-23)
http://www.skepticalscience.com/24-hours-climate-reality.html (retrieved 2012-11-22)
The @ClimateReality twitter account tweeted this
at 20:47:41 UTC on the 15th. (The content says “9.5mil views of 24Hrs of #Reality. Tune in & help us reach 15mil before @algore’s finale @ 8PM ET ow.ly/fkh1P Pls RT”). There has been no specific mention of a “final score” from the twitter account either, that I could find at the time of writing (if you have found it, from any “official outlet”, please let me know). You would have thought that if it really was successful, they would have been shouting the super-impressive number from the rooftops.
Climate Reality (@ClimateReality) November 15, 2012
The @ClimateReality tweet confirms my screen capture of about the same time.
So it is clear that they thought it was possible to stretch to beyond 15 million views by the end of the event: that it was possible to get 5.5 million new views in the next four and a quarter hours.
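To put that target into perspective, the implied rates can be worked out from the figures quoted so far. This is only a rough sketch, assuming the event ran from 01:00 UTC to 01:00 UTC the next day and that the tweeted 9.5M was accurate at 20:47:41 UTC; the helper and variable names are mine.

```python
# Assumptions: event ran 01:00 UTC to 01:00 UTC; tweet figures taken at face value.
def seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

elapsed = seconds("20:47:41") - seconds("01:00:00")   # seconds of show before the tweet
remaining = 24 * 3600 - elapsed                       # seconds left until the finale

rate_so_far = 9_500_000 / elapsed                     # average views per second before the tweet
rate_needed = (15_000_000 - 9_500_000) / remaining    # views per second needed to reach 15M

print(f"average so far: {rate_so_far:.0f}/s, needed: {rate_needed:.0f}/s")
print(f"ratio: {rate_needed / rate_so_far:.1f}x")
```

Under those assumptions, reaching 15M needed a rate roughly 2.7 times the average seen up to the time of the tweet.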
I get the feeling that they felt they could tweet with confidence.
I also get the feeling that somebody was instructed to turn off the “mechanical viewers” for an hour before the end of the show, to prevent an embarrassment of views. There also seems to be evidence of “mechanical viewers” in the early section of the show.
Take a look at this graph of views over time for the entire Reality stream, based on the “bonus graph” from my earlier teaser graph post. I have added a grid to make it easier to read, and better labelled the axes.
(TODO: replace with updated graph with grid etc.)
There is a time between about 23:07 and 00:23 (about 25 minutes before the End), where the view counter was “stuck” at 14.8M. The final data point was 16.3M at 01:07. These can be confirmed in my supporting video.
It is possible to work out a views-per-period figure from the above data. I have calculated the views-per-second, but you can easily convert the figure by multiplying by 60 to get views-per-minute, or 3600 to get views per hour at that moment. The next three graphs show the rate for all the show, and for the last few hours, and the last few hours again lightly averaged to make it easier to read the dark blobby area. The “steppiness” of the data after about 21:00 is to some extent due to the 30 second periodic sampling and the resolution of the reported numbers being reduced to one decimal place of millions rather than all the digits (explained TODO:here)
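For anyone wanting to repeat the calculation, the views-per-second figure is just the difference between consecutive samples divided by the time between them. Here is a minimal sketch; the function name and the three sample values are made up for illustration and are not the recorded data.

```python
def view_rates(samples):
    """Views per second between consecutive (seconds, cumulative_views) samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 > t0:                      # ignore duplicate or out-of-order samples
            rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates

# Hypothetical 30-second samples, for illustration only (not the recorded data).
samples = [(0, 13_000_000), (30, 13_021_000), (60, 13_042_000)]

for t, per_second in view_rates(samples):
    # Print the rate per second, per minute, and per hour.
    print(t, per_second, per_second * 60, per_second * 3600)
```

If the displayed counter only changes in steps of 0.1M, most 30 second differences come out as zero and an occasional one as 100,000 / 30, or about 3,333 per second, which is one way the steppiness could arise.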
(TODO: replace with nicer versions)
As you can see from the last graph, there is a view rate of slightly over 700 views per second (over 42,000 per minute or 2.5 million per hour) over a sustained period at least from 22:20 until 23:05, taking the total views up to 14.8M at 23:07. This is pretty close to the 15M suggestion in the tweet, but there are nearly two hours to go! The counter picks up again at 00:23, and is back to 700 views per second for the period 00:41 to 01:02, slowing after that time to end up with 16.3M at 01:09. 15M was hit at 00:25, and 16M at 01:00.
If the 700 views-per-second, or 42,000 views per minute, were “real”, there would be no reason for it to have paused for about 75 minutes between 23:07 and 00:23. At 42,000 views-per-minute, that would have worked out to about 3.1 million additional views by the end of the show. That would be “infeasibly wrong” based on the “optimistic” tweet, giving a total of over 19M at 01:00. That would have been about 10 million views in the four and a quarter hours after the tweet, very roughly the same number of views in the last four hours as in the first 20. Which doesn’t seem right to me.
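The arithmetic behind that paragraph, written out as a small sketch. All the inputs are the figures quoted in the text; the constant names are mine.

```python
RATE_PER_MIN = 42_000          # sustained rate quoted above (700 per second * 60)
PAUSE_MIN = 75                 # approximate length of the stuck counter, 23:07 to 00:23
TWEET_VIEWS = 9_500_000        # count in the 20:47 UTC tweet
REPORTED_AT_0100 = 16_000_000  # counter value reached at 01:00

missed = RATE_PER_MIN * PAUSE_MIN          # views "lost" while the counter was stuck
projected = REPORTED_AT_0100 + missed      # total at 01:00 had there been no pause
after_tweet = projected - TWEET_VIEWS      # views in the hours after the tweet

print(missed, projected, after_tweet)
```

That gives 3.15M missed views, a projected total of 19.15M, and 9.65M views after the tweet, which is where the "about 10 million" comes from.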
TODO: point out the straight line segments and constant views-per-period figures are “not natural”. Find some other examples on the web.
The Early Show
Now to look at the earlier part of the show, where we actually have “current viewer” numbers.
If you watch my 48 hours video between the start of the screenshot recording at 02:18 UTC and the break at 03:18 UTC, and look at the numbers for the Climate Reality show, you’ll see several things.
1) Some of the numbers are indistinct because of a pretty “scrolly mile-o-meter” effect on the counts in the Ustream video player. It’s a pity, but because of this a fair number of potential samples aren’t recorded as a data point.
2) The “total views” counter (the number to the right of the slash) is increasing by about 2 or 3 thousand views in each screenshot taken every 30 seconds. It is 356K in the first shot, at 02:18:12, and 630K one hour later at 03:18:01. That’s about 4,500 views per minute, and is pretty consistent. The count of current viewers of the stream (to the left of the slash) seems to hover in the region of 10,000 viewers over the same period. Are the numbers really telling me that every minute, about half of the current audience clicks off the stream?
This consistency is pretty obvious when you look at the graph of total views by time (described in the early view plots post), with a red line (originally drawn “for fun”) from the origin (zero views at the start of the event at 01:00 UTC), through the first 78 minutes that I failed to record, and through the first and last data points themselves, about an hour apart. The tiny stars along the bottom are the “current viewer” counts for that time.
It is highly unlikely for a random group of people to be this consistent in their viewing, given the steady rate of increase of views against the nearly straight, nearly horizontal count of viewers. I’m not surprised the viewers count was turned off.
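The straight line can be checked with a few lines of arithmetic. This sketch compares the rate over the recorded hour with the rate implied by drawing a line from zero views at 01:00 UTC to the first screenshot; the 10,000 is the rough “current viewer” level mentioned above, and the names are mine.

```python
def seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

t_start = seconds("01:00:00")                      # start of the event
t_first, v_first = seconds("02:18:12"), 356_000    # first screenshot
t_last, v_last = seconds("03:18:01"), 630_000      # last screenshot before the break

# Views per minute over the recorded hour, and from the origin to the first shot.
rate_recorded = (v_last - v_first) / (t_last - t_first) * 60
rate_from_origin = v_first / (t_first - t_start) * 60

# Fraction of a roughly 10,000 strong audience replaced each minute.
turnover = rate_recorded / 10_000

print(round(rate_recorded), round(rate_from_origin), round(turnover, 2))
```

Both rates come out within about 30 views per minute of each other, which is why the red line passes through the origin as well as the recorded points.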
3) Take a closer look at the “currently viewing” counts in the 48 hours video. Because the video is massively speeded up (about 1 hour of “live” per 30 seconds of video), another effect becomes apparent. The currently viewing count cycles repetitively up and down quite noticeably over a period of 5 minutes of “live”, or about every two seconds when watching the video. This cycle seems to be apparent for at least half an hour in that first segment.
Plotting the current viewers graph (again, from the early view plots post) gives
It does seem to have a bit of a wiggle on, doesn’t it? It could even be a sine wave, on top of a more slowly increasing signal. So let’s estimate the amplitude from the graph (I reckon about 300 viewer counts, using the “fingers on screen” estimating tool) and add a sine wave with a 5 minute period.
I added more grid. I also had to adjust the phase by 120 seconds, and iteratively guesstimated the slope and offset for a couple of tries, but I think that is a pretty decent match of the phase and frequency over most of the 30 minutes of reasonably decent data available. Even where the counts drop off after about 03:05, the peaks and troughs and the up/down slopes of the sine wave seem to match the data pretty well.
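The overlay is nothing more than a straight line plus a sine wave. A minimal sketch of that model follows, using the amplitude, period and phase given above. The slope and offset values here are placeholders rather than the ones I ended up with, and the sign convention for the phase is an assumption.

```python
import math

AMPLITUDE = 300.0   # viewers, estimated by eye from the graph
PERIOD = 300.0      # seconds (5 minutes)
PHASE = 120.0       # seconds of phase adjustment (sign convention assumed)

def model_viewers(t, offset, slope):
    """Linear trend plus sine wave; t is seconds from the start of the plot."""
    return offset + slope * t + AMPLITUDE * math.sin(2 * math.pi * (t - PHASE) / PERIOD)

# Hypothetical trend values; the guesstimated slope and offset are not given in the text.
OFFSET, SLOPE = 10_000.0, 0.1

for t in range(0, 601, 30):             # one value per 30 second screenshot
    print(t, round(model_viewers(t, OFFSET, SLOPE)))
```

With real data, the slope and offset could be fitted rather than guessed, but for a by-eye comparison a couple of manual tries is enough.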
I don’t think a mass of people on the internet are capable of organising themselves to that degree, and it is incredibly unlikely for that to happen randomly. Unfortunately that does seem to indicate a “mechanical” component rather than purely random human hands, since there is a very regular pattern.
The best case scenario (i.e. with minimal impact on the overall figures, signifying the least artificial inflation of video traffic) is that there is a single system that generates about 600 parallel “views” over 5 minutes, and somehow shapes the start times and durations of “watching” so that the count happens to approximate a sine or triangle wave. This would mean that only 600 views every 5 minutes are “automatic”, or 7,200 per hour.
The worst case scenario is that there is more than one system generating views, and that the bots post at slightly different rates, producing a heterodyne beat signal in the count of current viewers. This could easily account for nearly all the “current” and “total” views.
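To illustrate what a beat means here: two cycles with slightly different periods drift in and out of step, and the combined signal rises and falls at the difference of their frequencies. The sketch below uses invented cycle lengths purely to show the effect; nothing in my data gives the actual values.

```python
def beat_period(period_a, period_b):
    """Period of the beat between two cycles with different periods (same units)."""
    # The beat frequency is the difference of the two frequencies; periods must differ.
    return 1.0 / abs(1.0 / period_a - 1.0 / period_b)

# Hypothetical bot cycle lengths in seconds, chosen only to illustrate the effect.
print(beat_period(50.0, 60.0))
```

With those made-up numbers the beat repeats about every 300 seconds, so a 5 minute wiggle does not have to mean that anything is actually running on a 5 minute cycle.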
(TODO: factor in the calculated per-second views, which are around 75 views per second at this time from the first of the “steppy” graphs above, and is calculated as 4,500 per minute in the discussion of the “straight line” graph a few graphs up from here)
(TODO: also add aliasing/nyquist scenario: since my machine was sampling about every 30 seconds this means the highest frequency I can directly read is one cycle per minute, so 1/5 cycles per minute could be undersampling a significantly higher rate. Add fft of data, showing peak at about 300 seconds duration/cycle, or an alias of more exactly 16 seconds duration/cycle.)
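Until the FFT is done, the candidate alias periods can at least be listed. With samples every 30 seconds, any cycle whose frequency differs from the observed one by a whole multiple of the sampling frequency would produce the same data points. A sketch, with names of my own choosing:

```python
def alias_periods(observed_period, sample_interval, max_k=2):
    """Periods (seconds) of faster cycles that would look identical when sampled."""
    f_obs = 1.0 / observed_period       # observed frequency, cycles per second
    f_s = 1.0 / sample_interval         # sampling frequency, samples per second
    periods = []
    for k in range(1, max_k + 1):
        # Aliases sit either side of each multiple of the sampling frequency.
        for f in (k * f_s - f_obs, k * f_s + f_obs):
            periods.append(1.0 / f)
    return periods

print([round(p, 2) for p in alias_periods(300.0, 30.0)])
```

The third value in the list is 300/19, or roughly 15.8 seconds, which is the "about 16 seconds" alias mentioned above.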
Interesting drop after 03:05. Perhaps people were not interested in the Western United States segment?