Exploring towards the optimal video stats solution

Jim Highsmith wrote an insightful blog post about how waterfall project management is about "plan and do" whereas agile is about "envision and explore." This weekend I was working on a video statistics system for Tutr.tv. It ended up being another one of those ah-ha moments of why agile processes are so brilliantly effective and how "envision and explore" produces better results faster than the old "plan and do."

In my day job, I mostly do strategy and solution architecture work (e.g. requirements and planning) but still sometimes get to play developer. In the role of solution architect, I have always found creating detailed project requirements difficult, in large part because as a developer I am used to working through a problem by exploring. Only by exploring do I know what is possible and what can be done efficiently. Only then can I understand the detailed requirements for maximizing value.

Of course that is bass ackwards from traditional waterfall project management. In waterfall, you have to detail out the requirements by estimating (i.e. guessing) the logic and effort required, with little information other than past experience.

One of the beautiful dynamics of agile is that you don't sweat the small planning stuff. In a waterfall process, if I needed to specify a statistics system I would write use cases, wireframes, and maybe even interface and class diagrams. Everything would be spec'd out in detail before I started coding and before I truly knew what was possible.

In agile, I simply have a high-level need in the form of a user story: As a visitor I would like to know which videos are the most popular.

Time to go exploring

I knew what I ideally wanted: when someone pressed the play button on a video, the play would be tracked and then of course, reports galore. But first, I needed the tracking. The problem was that Tutr is embedding videos from 3rd party services such as YouTube, Vimeo, Blip.tv and Archive.org. There is no way to tell what is going on inside their players - or so I thought.

My first exploration step was to check out the MediaFront module. Travis had told me that MediaFront could act as a front-end player for YouTube and Vimeo and had a Javascript API where I could track play events. I got it installed, wrapped it around a YouTube video, and everything looked great. Then I tried a Vimeo video. Not so great. While MediaFront will do it, Vimeo does not allow other players to replace their built-in player. MediaFront does a rather clever Javascript control pass-through that puts the Vimeo player inside MediaFront's player. Clever idea, but not an ideal presentation.

But what was MediaFront doing? Tapping into Vimeo's Javascript API. Maybe I could just tap directly into the Javascript API to capture play events. Maybe other video providers had something similar.

So I tried interfacing with Vimeo's API. After a few hours of frustration trying to unsuccessfully bend Drupal to get froogaloop working, I gave up for plan B (or is it C by now?).

While Googling through copious amounts of Vimeo's API documentation, I wandered across documentation for their standard API. Ah-ha, it has view stats. While the stats from the providers wouldn't tell me how many views actually occurred on Tutr.tv, they would fulfill the user story of knowing which tutorials were the most popular.

So back to PHP, which I am much stronger at than Javascript. I quickly interfaced with the Vimeo, YouTube and Archive.org APIs to get views data. I hit a snag with Blip.tv. They don't include stats data in their APIs. Oh well, I was happy to have the data for the other three.

I set up a cron to run each night to download the data points for that day. There remained a significant unknown - how much can I poll the APIs before being flagged for abuse? The API docs were unclear on this, so time to learn by experiment. I set the crons to run and went to sleep.

The next day I had my data, except for those non-data-sharing Blip.tv folks. Problem solved? Yes, but we can do better.
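With the nightly numbers in hand, answering the "most popular" user story is little more than a sort. Here is a rough sketch of that step, not the code that runs on Tutr.tv. The row shape (`id`, `provider`, `views`) and the `mostPopular` helper are made up for illustration, and the view counts are invented sample values.

```javascript
// Hypothetical snapshot rows: one per video from a nightly run.
// `views` is null when the provider exposes no stats (Blip.tv in this story).
const snapshot = [
  { id: "a", provider: "youtube", views: 1200 },
  { id: "b", provider: "vimeo", views: 4500 },
  { id: "c", provider: "blip", views: null },
  { id: "d", provider: "archive", views: 300 },
];

// Return the ids of the `limit` most-viewed videos, skipping rows without stats.
function mostPopular(rows, limit) {
  return rows
    .filter((row) => typeof row.views === "number") // filter returns a new array
    .sort((x, y) => y.views - x.views)              // highest view count first
    .slice(0, limit)
    .map((row) => row.id);
}

console.log(mostPopular(snapshot, 3)); // [ 'b', 'a', 'd' ]
```

Videos with no provider stats simply drop out of the ranking instead of being treated as zero views, which seemed like the less misleading choice.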

Looking over the backlog of user stories, I noticed this one: As a member I would like to track the videos I have watched. We had implemented a watched flag to satisfy this story. It was a quick, cheap solution for this requirement. The problem is, most people were not using the flag buttons. They either didn't understand what it was or were used to the way video watches were automatically tracked on sites like YouTube and Vimeo.

If I did get the Javascript APIs working, then I could also auto-set the watched flag. The Vimeo Javascript API was a bear, so I decided to look at YouTube's API. Four lines of code, that's it - it worked. Click play on the YouTube player and the watched flag fires.
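The idea is roughly this, sketched under a few assumptions and not copied from the site. YouTube's player API reports state changes as numeric codes, with 1 meaning "playing". The `createPlayTracker` helper and its `onFirstPlay` callback are hypothetical names, and the wiring to the real player and to Drupal's flag is left out so the snippet runs on its own.

```javascript
// YouTube's player API reports state changes as numeric codes; 1 means "playing".
const YT_PLAYING = 1;

// Returns a state-change handler that calls onFirstPlay exactly once,
// the first time the player reports that it is playing.
// `onFirstPlay` is a hypothetical callback, e.g. one that sets the watched flag.
function createPlayTracker(onFirstPlay) {
  let fired = false;
  return function onStateChange(state) {
    if (state === YT_PLAYING && !fired) {
      fired = true;
      onFirstPlay();
    }
  };
}

// Usage: simulate the player buffering (3), playing (1), pausing (2), resuming (1).
let flagged = 0;
const handler = createPlayTracker(() => { flagged += 1; });
[3, 1, 2, 1].forEach((state) => handler(state));
console.log(flagged); // 1
```

Guarding with `fired` means pausing and resuming does not set the flag a second time.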

Now I got inspired to tackle Vimeo's API. This time it took an hour to work through it. Blip.tv was pretty straightforward, maybe 30 minutes. Drupal's Archive.org integration uses Flowplayer. The lack of solid examples for Flowplayer's API cost me some time, maybe an hour to get it working. Great, now I have stats for total network views and auto watched flagging. But wait, now I see the next step into awesomeness. With an ajax callback, I can track any video plays on Tutr, including by anonymous visitors. One last hurdle to jump: anonymous session handling to ensure play stats are unique. We are using Pressflow, so no Drupal session tracking for anonymous users. Well, so let's explore what we do have to work with. Echo out all $_SESSION and $_COOKIE data. Hmm, seems Google Analytics is doing some user session tracking. Would I have come up with leveraging GA for anonymous session tracking in standard up front planning? Probably not.

After half a day, I had the whole system working, piggy-backing on Google Analytics' session tracking. Went to push to live, session tracking broke. Darn Varnish. Okay, another couple hours and I found a hack to get the GA session past Varnish caching.

Finding new heights

In just two days I got done what I didn't even think was possible - or at least I would not spec with any certainty in waterfall big design up front planning. In the time it would take to do classic PMI (http://www.pmi.org/) inspired work breakdowns and UML modeling, I had half the problem solved by actually doing it.
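For the curious, the unique-play counting for anonymous visitors could be sketched along these lines. This is an illustration under assumptions, not the production code: it assumes the visitor id is read from the classic ga.js `__utma` cookie (the post does not say which GA cookie was used), the `gaVisitorId` and `recordPlay` helpers are hypothetical, an in-memory Set stands in for real storage, and the cookie value is made up. The Varnish workaround is not shown.

```javascript
// Pull the visitor id out of a raw Cookie header, assuming the classic ga.js
// `__utma` layout: domainHash.visitorId.firstVisit.lastVisit.currentVisit.sessionCount
function gaVisitorId(cookieHeader) {
  for (const part of cookieHeader.split(";")) {
    const [name, value] = part.trim().split("=");
    if (name === "__utma" && value) {
      const fields = value.split(".");
      return fields.length >= 2 ? fields[1] : null;
    }
  }
  return null; // no GA cookie present
}

// Count a play only the first time a given visitor plays a given video.
// `seen` is an in-memory Set standing in for whatever storage the site uses.
function recordPlay(seen, counts, visitorId, videoId) {
  if (visitorId === null) return false; // no id available, so cannot de-duplicate
  const key = visitorId + ":" + videoId;
  if (seen.has(key)) return false;      // repeat play by the same visitor
  seen.add(key);
  counts[videoId] = (counts[videoId] || 0) + 1;
  return true;
}

// Usage with a made-up cookie header.
const seen = new Set();
const counts = {};
const visitor = gaVisitorId("has_js=1; __utma=1234.5678.1300000000.1300000000.1300000000.1");
recordPlay(seen, counts, visitor, "video-42");
recordPlay(seen, counts, visitor, "video-42"); // ignored as a repeat
console.log(visitor, counts);
```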

When it was done, I picked up two bonus innovations - auto-watched flagging for members and network views stats from the providers.

If I had tried to "plan" this feature, I would have wasted time on a highly uncertain guess for the effort, spec-ed a more conservative version of the feature, and lost out on a couple of innovations. In the end, big design up-front would have meant more work and less value - and no real certainty in estimating.

The lesson: Stop "planning" what can't be planned with certainty (e.g. most website features). Quickly envision and get to exploring as soon as possible. Then document your agile wins. See how many you can rack up.

photo by Curious Expeditions
