"time series data" entries

Building self-service tools to monitor high-volume time-series data

The O'Reilly Data Show Podcast: Phil Liu on the evolution of metric monitoring tools and cloud computing.

One of the main sources of real-time data processing tools is IT operations. In fact, a previous post I wrote on the re-emergence of real-time, was to a large extent prompted by my discussions with engineers and entrepreneurs building monitoring tools for IT operations. In many ways, data centers are perfect laboratories in that they are controlled environments managed by teams willing to instrument devices and software, and monitor fine-grain metrics.

During a recent episode of the O’Reilly Data Show Podcast, I caught up with Phil Liu, co-founder and CTO of SignalFx, a SF Bay Area startup focused on building self-service monitoring tools for time series. We discussed hiring and building teams in the age of cloud computing, building tools for monitoring large numbers of time series, and lessons he’s learned from managing teams at leading technology companies.

Evolution of monitoring tools

Having worked at LoudCloud, Opsware, and Facebook, Liu has seen first hand the evolution of real-time monitoring tools and platforms. Liu described how he has watched the number of metrics grow, to volumes that require large compute clusters:

One of the first services I worked on at LoudCloud was a service called MyLoudCloud. Essentially that was a monitoring portal for all LoudCloud customers. At the time, [the way] we thought about monitoring was still in a per-instance-oriented monitoring system. [Later], I was one of the first engineers on the operational side of Facebook and eventually became part of the infrastructure team at Facebook. When I joined, Facebook basically was using a collection of open source software for monitoring and configuration, so these are things that everybody knows — Nagios, Ganglia. It started out basically using just per-instance instant monitoring techniques, basically the same techniques that we used back at LoudCloud, but interestingly and very quickly as Facebook grew, this per-instance-oriented monitoring no longer worked because we went from tens or thousands of servers to hundreds of thousands of servers, from tens of services to hundreds and thousands of services internally.

Read more…

Comment
Four short links: 16 June 2015

Four short links: 16 June 2015

Accessibility Testing, Time-Series Graphing, NO BUBBLE TO SEE HERE, and Technical Documentation

  1. axe — accessibility testing of web apps, so you can integrate accessibility testing into your continuous EVERYTHING pipeline.
  2. metrics-graphics — Mozilla Javascript library optimized for visualizing and laying out time-series data.
  3. US Tech Funding: What’s Going On? (A16Z) — deck eloquently arguing that this is no bubble.
  4. Teach Don’t Tellwhat I think good documentation is and how I think you should go about writing it. Sample common sense: This is obvious when you’re working face-to-face with someone. When you tell them how to play a C major chord on the guitar and they only produce a strangled squeak, it’s clear that you need to slow down and talk about how to press down on the strings properly. As programmers, we almost never get this kind of feedback about our documentation. We don’t see that the person on the other end of the wire is hopelessly confused and blundering around because they’re missing something we thought was obvious (but wasn’t). Teaching someone in person helps you learn to anticipate this, which will pay off (for your users) when you’re writing documentation.
Comment