Tuesday, 27 May 2014

The Roadmap to 'Hadoop in the Cloud'


The Twitter ball started rolling again just now. Matt Asay posed an interesting question about Forrester suggesting Hadoop isn't a great fit for the cloud. Even without context, Vijay Vijayasankar and I started firing off questions and answers, which inevitably led to my promise to write down the transition plan for it.

Here it is

Sunday, 8 September 2013

Influence tools: the devil is in the details


For those of you who haven't heard of Klout, let me give you a brief history: they started back in 2009 with a lot of marketing, a so-so product and non-existent service. They had two ways of handling criticism: either shower the critic with an increased Klout score, or ignore him (or her).
With criticism multiplying as Klout was not willing or able to tackle it, Klout decided to take away the cause for it: detailed data on the components making up the Klout score. If you look now, Klout consists of a single score - just a number. Surprisingly, Tweetlevel has travelled the other way - or have they?

The business case for obfuscating Klout details is strong: no longer will I be able to prove that their figures are statistically impossible:


The above pics show the @mention count of Klout's former marketing manager, Megan Berry, and the number of people mentioning her, over a straight 30-day period: every single day, the exact same number.



The two pics above show Megan's friend count according to Klout, and according to Twittercounter. I'm sure you can see a striking resemblance - between the 3 Klout pics.

That was back in October-November 2010. I notified Klout of my post several times, but never got a reaction. Not long after that, Klout decided to stop showing these stats and just put out a single number for "all-time" RTs, mentions, etc.:

That was back in May 2011. You can see the poor attempt at incorporating Facebook into their stats as well, but the most important point is that you can't spot a rotten trend anymore - or can you? I found it odd that no two daily scores were ever exactly the same, but maybe Klout did improve the quality of their code?

Fast forward 4 months: I investigated the so-called True Reach and found that it was basically your Twitter follower count multiplied by 2.6, give or take 10%:
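To make that rule of thumb concrete, here is a minimal sketch in Python. It is not Klout's actual formula, only the approximation described above. Reading "give or take 10%" as a band of plus or minus 10% around the product is my assumption, and the function names are made up for illustration.

```python
FACTOR = 2.6      # multiplier observed in this post
TOLERANCE = 0.10  # "give or take 10%", read as a +/-10% band (assumption)


def true_reach_band(followers):
    """Return (low, mid, high) for the approximate True Reach of an account."""
    mid = followers * FACTOR
    return (mid * (1 - TOLERANCE), mid, mid * (1 + TOLERANCE))


def fits_rule(followers, reported_true_reach):
    """Check whether a reported True Reach falls inside the estimated band."""
    low, _, high = true_reach_band(followers)
    return low <= reported_true_reach <= high
```

For a hypothetical account with 1,000 followers, the band runs from roughly 2,340 to 2,860, so a reported True Reach of 2,500 would fit the rule and one of 4,000 would not.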


Within a month, Klout decided to recalibrate True Reach into something even I couldn't recalculate, and to eliminate all other subscores except "Amplification" and "Network". That resulted in a few dramatic increases but mostly drops for pretty much everyone they kept a record on. Soon after, popular opinion forced Klout to enable opt-out for everyone, which resulted in a Klout o' Calypse where 2.5 million people opted out of Klout within a month.

Today, Klout is one measly single score, with nothing underneath to show what it's made up of (see the picture at the top of this post). Now, on to Edelman's Tweetlevel: I've always liked their service. They only had one single number with 3 subscores (yes, I know!) and those were steady. They revealed their scoring mechanism in good detail and appeared to be a rock in the rough sea of Online Influence. Until November 2012, when they revamped its layout and look:


That's my Word Cloud right there, plus some stats. The quick stats seem accurate-ish, but the Word Cloud certainly isn't. Thanks to Twitter allowing users to archive their tweets, I can guarantee you that I used the hashtag #Irene 5 times, the last one dating back to August 28, 2011. Needless to say, #e20 hasn't been on top of my tweets since roughly then either.
I contacted Johnny Bentwood of Edelman about this and other inconsistencies, and his complete response was:
Thanks – as we are in beta, we are implementing many code fixes so that would explain your 440 errors, please try again later
I'd say he's following Klout tactics there. For completeness' sake, that conversation took place almost a year ago, and this is today's picture...
So I notice that I'm not that happy about Tweetlevel anymore, simply because I can see that part of the data used is over 2 years old - so how can the rest be even close to accurate?

I wonder whether, or when, Tweetlevel will decide to pull detailed subscores like Klout did, in order to evade simple questions that have no pleasant simple answer. Looking at the buzz around both, however, I'm pleased to see that Tweetlevel is flatlining, and that not even Microsoft pumping money into Klout raised any eyebrows whatsoever: attention for Klout is back at the same level as when they still showed detailed subscores, no matter how ridiculously flawed those were.

The lesson learned? We're still very far away from measuring "online influence" or even Twitter use. As long as we don't fully master semantics (perfect translation machines would be a proper indication of that), only the quantity of interaction can be measured, not its quality - and as far as we can check, both Klout and Tweetlevel stink at even that.

Thursday, 7 February 2013

Speeding up hyperlinks: topics


In a conversation with Jon Husband earlier today, we discussed hyperlinks - and how they've changed this world. In my view, hyperlinks form zero-threshold access to any and all information just a single click away. Whenever I scavenge the Web for info, I open up links in new tabs until there are 20 or so of them, and then scan the results, greatly helped by search, maybe jumping back and forth or drilling down deeper and deeper.
Compare that with the old-fashioned way I had to gather information, which at best meant a day or so in one or more libraries where some or most books would be out on loan. I'd only have the full result set after a week or two, sometimes more - leaving me with a metre of paper books I had to plow through.

Scanning them was simple yet elaborate: read the index, pick the most appetising chapters, and from each of those carefully read the first and last paragraph. Mark in mind or on paper if worthwhile, and continue the search - I used to write 10-page papers in a single night doing so.

Now, we have hyperlinks - and I still miss something. I call it topics, and here is how I envision them to work