How to scale (with Big Data)

As our friends and families are scattered all over the world, we chose this unconventional approach to announce something big.

Big data: Buzzword for some, reality for others


I'm looking forward to my talk at WebExpo 2012 in Prague. Among many other things, I will talk about the famous Peter Harris of Adpac. Peter Harris had his own personal Silicon Valley before there was anybody at The Peninsula (the story of how he found his VC is legendary), and he's an icon for many of us at GoodData, as we shared an office with him in 2010.

You might wonder what the connection between Mr Harris and big data is. While doing my research for the talk, I found a copy of Computerworld from 1981. One article is about Peter Harris; it starts with the following amazing line:

PORTLAND, Ore. – “Structured programming,” one of the software buzzwords for the 1980s, has been a reality for more than a decade in the data processing offices of the Georgia-Pacific Corp. (GP) headquarters here.

Mr Harris is the one who invented the term structured programming. That was in the sixties. In 1981, he explains in a major IT periodical that the GO TO command is bad.

For me, that's insane. And it shows something unbelievable: what we see today as an important and indisputable trend was actually a war of many battles. In many of them, apparently, structured programming was belittled as yet another buzzword.

This is exactly where we are with big data. We feel the urge to discuss the buzzwordness of big data and we predict it will all settle down. Well, not all of us. Some have started to climb the ladder.

For many, the first rung is to realize big data is all around us. The next rungs are about collecting it and analyzing it.

I believe your first step into the world of big data should be different. First you should understand big data influences your business, and–if you're smart–big data can drive your business.

To analyze big data is futile if you don't take any action. That's why credit card companies have worked with big data for years without even calling it big data. Yes, the volume is epic but that's just one feature of big data (and definitely not the crucial one).

Dust is swirling, visibility is low. You can wait and read more and more definitions of big data. Or, you can be Peter Harris and do what it takes. His story shows the stakes can be incredibly high.

Define, explain, measure, share, act

Here are five steps to metric-driven business.

Step 1: Define

That’s easy, I was once told by a VP of Sales in a big company. We have only one KPI, and that’s the number of our clients.

Ok, I said, how do you measure it? The VP raised eyebrows. What do you mean, how? It’s just a number!

Well, I continued, so what’s the number today? In a minute, he realized it’s not a single number: Corporate clients should be separated, people with savings accounts should count more, etc. It actually took two weeks to discuss it in detail and define what they want to measure.

It’s hard to come up with a good definition, but it’s worth being as exact as possible.


Step 2: Explain

This is the most difficult step. It’s tricky because there are several questions hidden under the “why” stone.

Clients give us money, we need to measure the number of our clients! That’s a lazy answer. It’s our KPI! Even lazier.

First, with “why” you often realize your definition can be better. What does “we have 5,000 clients” tell you about the performance of your company? (A KPI should be about performance, right?) Not much, unless you know what the goal is (to have 10,000 clients) and what the same number was a month or a quarter ago (we had 4,000 clients a quarter ago).
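To make the point concrete, here is a minimal sketch in Python of what putting a number in context could look like. The function name and the two derived ratios are my own illustration, not a feature of any particular BI tool; the inputs are the figures from the paragraph above.

```python
def kpi_context(current, goal, previous):
    """Put a raw KPI value in context (hypothetical helper).

    Returns the share of the goal already reached and the relative
    change compared with the previous period.
    """
    return {
        "progress_to_goal": current / goal,
        "growth_vs_previous": (current - previous) / previous,
    }


# 5,000 clients today, a goal of 10,000, and 4,000 a quarter ago
context = kpi_context(current=5000, goal=10000, previous=4000)
print(context)
```

Read this way, the same 5,000 clients become two statements about performance: halfway to the goal, and up by a quarter since the last period.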

Second, there’s a difference between having a set of KPIs and running a metric-driven business. Knowing the values of your KPIs does not mean you drive by them. It’s good to know you’re driving 65 mph and your target is 130 miles away, but it does not tell you where to turn.

Hunt your “why”. It is good to measure your velocity: if you know you're driving 65 mph, your target was 130 miles away an hour ago, and it is now 100 miles away rather than the expected 65, you also know your direction is wrong.
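The same check as a back-of-the-envelope sketch in Python. The function name and the 90% tolerance are assumptions I made up for illustration; only the three numbers come from the example.

```python
def closing_speed(distance_before, distance_now, hours_elapsed):
    """Miles per hour by which the distance to the target actually shrank."""
    return (distance_before - distance_now) / hours_elapsed


speedometer = 65.0  # mph, what the dashboard says
closing = closing_speed(distance_before=130.0, distance_now=100.0, hours_elapsed=1.0)

# Assumed tolerance: question the direction if less than 90% of the
# speed turns into progress toward the target.
heading_ok = closing >= 0.9 * speedometer

print(f"closing at {closing} mph, heading ok: {heading_ok}")
```

The speedometer alone looks fine. Only the comparison with the distance to the target shows that most of the speed is going somewhere else.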

Answering “why” tells you a lot about your directions. Company momentum can be a great KPI.


Step 3: Measure

A lot of people do it the other way round: They look at what they are measuring or what they can measure easily at the moment, and they try to find their KPI there. That’s wrong. The goal has to be defined by the business. Don’t get dragged into numbers just because you have them. The important question is: Do you need them?

Not so long ago, the CEO of a GoodData client was very strict about it. He came to meetings with a single consistent request: give me this metric. The events in the application were not logged? He didn’t care. The internal database was not ready for it? He didn’t care. He insisted on his metric and he got it: the application was changed, the database was changed, and today his morning ritual is simple: drink coffee and watch the numbers.

If you know what you want to measure and why, IT will follow.


Step 4: Share

You’re either Louis XIV of France (enchanté) or you want everybody in your company to move toward the same goal (even bigger pleasure to meet you, and congratulations on this wonderful idea).

Nothing can get you closer to your goal than sharing the number you drive by. This is what we measure, here’s why, and you can see our up-to-date numbers every day.

Being secretive helps if your goal is to know more than others. If you want to achieve more, you need to share. To have the numbers visible to everybody on your intranet homepage is great. To let them shine from big LCDs in the office is better. To talk about them every day is... guess what.

When you share, you motivate. A shared number is our number; a shared goal is a common goal.


Step 5: Act

Imagine you measure your great metric. The number has grown steadily over the last two quarters and the value was 4,912 yesterday. Suddenly, it’s 4,073. And 2,885 the next day. And even lower the day after. You need to act, and you need to act quickly.
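If you want the numbers to tell you when to act instead of waiting until somebody notices, a simple day-over-day check is enough to start with. This is only a sketch in Python: the function name and the 10% threshold are my assumptions, and the values are the three days from the example above.

```python
def sharp_drops(values, threshold=0.10):
    """Return indexes of days whose value fell by more than `threshold`
    (as a fraction) compared with the previous day."""
    alerts = []
    for day in range(1, len(values)):
        previous, current = values[day - 1], values[day]
        change = (current - previous) / previous
        if change < -threshold:
            alerts.append(day)
    return alerts


daily_values = [4912, 4073, 2885]  # the three days from the example above
alerts = sharp_drops(daily_values)
print(alerts)
```

Both drops (roughly 17% and 29%) cross the assumed threshold, so the second and third day are flagged. What threshold makes sense depends entirely on how noisy your metric normally is.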

It’s great if you can dive into the numbers, break them down, analyze them, and understand the infamous “why”. It can be anything: wrong process alignment, a lack of marketing campaigns, a new competing product, bad support. You can save yourself a lot of time if you’re prepared.

Imagine it happened right now. It’s a great mental exercise. How would you start to analyze the failure? I bet you would first like to know what was responsible for that constant growth during the last two quarters–and that’s the point.

When your numbers are at 50% of what you expect them to be, it’s a great time to do a retrospective, a post mortem analysis to understand what went wrong. However, you need to learn faster. Post mortem can also mean it’s too late because the company is already dead.

In a metric-driven business, acting is an integral and continuous part of the process. Why? Because that’s the driving, nothing else. All the previous steps are helping but without acting you’re not getting anywhere (and if somewhere, it’s definitely not where you want to be).

To be able to react to a failure, you need to know what drives your success. If your numbers move, it’s time to act. If your numbers stall, it’s time to act.

Define, explain, measure, share, and act. And measure, share, and act again. And again. Every day.

You don’t need to spend two weeks identifying your first KPI. Start small, grow fast, and hold on to your steering wheel.

What to see at #cloudstock

Cloudstock puts us all into a very challenging situation. At every minute, there are 5,000 awesome developers to speak with, and up to 12 parallel sessions. The bottom line is–if you are from this world–you will miss most of them. So where to go? Here's my list of the sessions I want to be at.

9:30am - 10:15am
Amazon, Google, LinkedIn, mongoDB, force.com, and Engine Yard in the same slot? Oh my. Based on what I already know about cloud apps and infrastructure, I'd skip the Amazon Web Services and Google introductions. There's a lot of online stuff available about mongoDB and force.com, so LinkedIn is my winner.

10:30am - 11:15am
Jakub is talking about GoodData APIs and developer tools in his talk called One Stop Shop for Analytics. I would like to know more about Heroku but I'll go to Introduction to Building Apps with Twilio.

11:30am - 12:15pm
This is easy. I need to get some enlightenment before lunch, so Apigee it is: Your API Sucks.

1:15pm - 2:00pm
Apigee one more time? Or Box? Yahoo? This is not decided yet. Most probably Thinking outside the API hosted by Box. Interaction of platforms is something I want to hear about.

2:15pm - 3:00pm
This is very easy. I need to be at BI Platform as a Service because I talk there. Do you want to know what I'll be talking about? Check out my mindmap.

3:15pm - 4:00pm
Another difficult slot: Yahoo, CloudKick, Twilio, and Heroku together. After the whole day in Moscone, I would need some air, and Yahoo's Managing a Cloud: The View from 30,000 Feet will give me exactly that.

As you can see, Cloudstock will make my day. And–I will not be participating in Hackathon. I have no idea how the hackers among us will be able to code and listen to the sessions at the same time.

You can tell me on Monday.

We help you over troubled water

It can be hard to start with a new technology. And it's even harder to start with it and deliver an application in a few days. That's what Cloudstock Hackathon is about. You need to use at least two Cloudstock partners' APIs or services, and you need to deliver the application by Monday. See complete rules.

I mean: stop reading, start coding! Time is running fast and you don't want to spend another minute procrastinating. We want the hackathon to be fun, not pain, so we are here to help you. In the end, you have to build your hacking bridge over API water on your own. However, if you want to use GoodData in your app, we are here to help you understand how to work with our APIs and tools.



So start your keyboards now: MacBooks, iPads, iPods, Mac Minis, Kindles, and other prizes can't wait to be won!

Join our GoodData Developer Network. Start by reading our blog post about Cloudstock Hackathon. You should also follow @gooddata_dev on Twitter.

What integration is next? #inext2010

San Francisco goes crazy as the Giants are marching through the city. At the same time, I'm in Austin, Texas, at IntegratioNEXT 2010, Pervasive Integration User Conference.

I like the conference. It's very well organized (thanks, Lori), great people, excellent talks. If you ask people at other conferences, they tell you it's interesting. Here I hear it's useful, and useful is better than just interesting.

I'm here because we've just announced our partnership with Pervasive. With the GoodData connector, you can just take the integration stuff you already have and send your data directly to the GoodData platform.

Dashboard in 15 minutes? Not a dream. As Pervasive is now completely in the cloud, it's very easy to make your data actionable. It just makes sense. Data as a Service and Platform as a Service fit together. Read more about it on the GoodData blog.

The conference is almost over, and some people say I'd better be home, watching the World Series Champions parade. But believe me, this is not the last time the Champions parade is happening in San Francisco. I tell you that.

The best dashboard ever

There's a hunger for dashboards out there. You can buy books about creating dashboards, you can hire consultants to create dashboards, you can spend a lot of time and a lot of money hunting your dashboard dream. The truth is, the best dashboard is very simple.

The only dashboard that anyone would ever need has three pieces.



At the top, there's a line chart, and the line always goes up. There's no explanation of the axes or whatever (you can guess time is running to the right and money is running to the top but who knows and who cares). The only goal of the chart is to make you happy. The line's going up, hurray!

Then there are traffic lights on the left. The light tells you how you're doing. Is it green? Perfect. Is it yellow? Do something. Is it red? It's too late to do anything (however the line's going up, so you have enough time to pack your box and quit the job in a decent way).

Finally, there's a text box telling you what to do. If the light is green, it's telling you what to do to keep it green. If the light is yellow, it's telling you what to do to make it green again. (If the light is red, don't waste your time reading the text box, just go, go!)

Now there's more than a joke in this dashboard. A friend of mine who's running a very small business recently told me that BI tools are useless for her. "I don't need to slice and dice my sales, I don't need to measure how good my campaigns are," she complained. "I just need to know what to do next."

Her point was clear: It's good to know your sales have been decreasing for the last 6 months. And it's better to know that it's all because 85% of your existing customers in South Africa have declined your renewal package. Maybe you knew it without BI, maybe not. However, even if it's a new fact for you, it's not telling you how to deal with it, how to fix it.

Davenport, Harris, and Morison describe it in their recent book Analytics At Work as a shift from information to insight. Fully automated BI tools can help you with information: what happened, what is happening now, and what will happen.

To understand how and why it happened, what's the next best action, and what's the best/worst that can happen, you need something more. You need people who can make the shift: understand information, and get insight out of it.

These people will create the dashboard discussed above, and it will be the best dashboard ever.