Ann Spencer

On reading Mike Barlow’s “Real-Time Big Data Analytics: Emerging Architecture”

Barlow's distilled insights regarding the ever-evolving definition of real-time big data analytics

Reading Barlow on a Sunday Afternoon

During a break between offsite meetings that Edd and I were attending the other day, he asked me, “Did you read the Barlow piece?”

“Umm, no,” I replied sheepishly. Insert a sidelong glance from Edd that said much without saying anything aloud. He’s really good at that.

In my utterly meager defense, Mike Loukides is the editor on Mike Barlow’s Real-Time Big Data Analytics: Emerging Architecture. As Loukides is one of the core drivers behind O’Reilly’s book publishing program and someone whom I perceive to be an unofficial boss of my own choosing, I am not really inclined to worry about things that I really don’t need to worry about. Then I started getting not-so-subtle inquiries from additional people asking if I would consider reviewing the manuscript for the Strata community site. This resulted in my emailing Loukides for a copy and sitting in a local cafe on a Sunday afternoon to read through the manuscript.

Join me for the Strata Online Conference on data warfare on January 22nd

Learn more about potential attack vectors and how to defend against them

“Jeez, the days are flying by,” I muttered to myself the other day. The next Strata Online Conference on data warfare is just around the corner. I’ve been excited about this event for some time. How could I not be excited? There will be discussions on using data for evil, hacking cybersecurity, crowdsourcing identity theft, black hat data science, and more.

As I have mentioned before, I just love thought-provoking and candid discussions.

I first heard about the event when Kathy Yu, Alistair Croll, and I met at the SF Ferry Building to talk about Strata over breakfast. I’m not a morning person. It takes a few moments for the caffeine to take effect. Alistair is the opposite. I don’t know if Alistair had his dose of caffeine earlier that day or if he just generates his own energy. Whatever it is, it enables him to chair Strata, run his own business, keep up with his precocious two-year-old daughter, and co-author the forthcoming Lean Analytics. Yet, that morning, I was half-tuning Alistair out while I was sipping on my coffee and taking a picture of my crispy caramelized waffle. Yes, I’m that person. But when Alistair started talking about data warfare, he had my full attention. As we rely more upon data, we become more vulnerable to various attacks. It is important for us to learn more about what the potential attack vectors could be and how to defend against them. The speakers at the upcoming Strata Online Conference on data warfare will get us all thinking about this.

The speakers and the topics of their sessions include: Read more…

Improve your math skills

Practical advice for those considering a career in data science

When I was a youngster in college I found myself dissatisfied after I took a stats class from the math department. So I decided to take another stats class. Classmates thought I was crazy. Let’s be real, what precocious over-achieving teenager majoring in English lit seeks to retake a math class? And not because of a grade but because they were dissatisfied with what they didn’t get out of it? After a bit of research, I decided to take the stats class offered by the psych department.

It made a significant difference.

Thinking about math from the perspectives of research design methodology and how data can be used to manipulate people made quite an impact on my teenage worldview. This experience also reinforced my belief that education is what you decide it will be. There is always more than one way to learn, and education doesn’t necessarily have to happen in a physical classroom. Growing up in the San Francisco Bay Area, where friends and loved ones decided to forgo traditional higher ed completely to start their own companies or immediately work in jobs in technology, also contributed to this belief.

While full-time students who are looking at a career in data science may have the time to do seemingly nutty things like take overlapping math classes, this is not something that most people with full-time jobs are able to do. When people with full-time jobs ask me about what they need to do to move into data science, I probe them about the kind of job in data science they want and about their analytical and empathy skills. Then, I immediately follow up with “So, how are your math skills?” Interestingly enough, I get a lot of people saying how they don’t have time to physically go into a classroom or that it has been, like, forever since they’ve used statistics and/or linear algebra for data analysis. Even more interesting is how often people don’t realize just how many resources are available to learn math outside of the physical-attendance-in-a-classroom model.

Huh. Read more…

How do you become a data scientist? Well, it depends

My obsession with data and user needs is now focused on the many paths toward data science.

Thanksgiving 2012

Over Thanksgiving, Richie and Violet asked me if I preferred the iPhone or the Galaxy SIII. I have both. It is a long story. My response was, “It depends.” Richie, who would probably bleed Apple if you cut him, was very unsatisfied with my answer. Violet was more diplomatic. Yet, it does depend. It depends on what the user wants to use the device for.

I say, “It depends” a lot in my life.

Both in the personal life and the work life … well, because it really is all one life, isn’t it? With my work over the past decade or so, I have been obsessive about being user-focused. I spend a lot of time thinking about whom a product, feature, or service is for and how they will use it. Not how I want them to use it, but how they want to use it and what problem they are trying to solve with it.

Before I joined O’Reilly, I was obsessively focused on the audience for my data analysis. “C” level execs look for different kinds of insights than a director of engineering. A field sales rep looks for different insights than a software developer. Understanding more about who the user or audience was for a data project enabled me to map the insights to the user’s role, their priorities, and how they wanted to use the data. Because, you know what isn’t too great? When you spend a significant amount of time working on something that does not get used or is not what someone needed to help them in their job.

Approaching ethics and big data

What to do when facing the stoic expressions that pop up during ethics discussions.

The other day I clicked on a message posted to the O’Reilly editors’ email list and the message text filled up almost the entire monitor screen. I must admit that I thought, “Am I going to require another caffeine hit to read through this?”

I decided to take a chance, not take another break just then, and read the lengthy note. I didn’t need that caffeine hit after all. Apparently, neither did half a dozen other editors.

The note was about ethics.

In a previous life, I worked in the competitive intelligence field. I remember participating in a friendly confab at an industry event and then someone mentioned the word “e-t-h-i-c-s.” It was rather fascinating to see how that word elicited stoic faces. No one wanted to be the first person to say anything on that topic. Now that I work at ORM, mention the word “ethics!” and folks are not shy about saying exactly what they think. Not. At. All.

During the discussion, Ethics of Big Data by Kord Davis came up. While I was not the editor on this book, I did read it when I was in New York. It made my list of recommended books for people looking to jump into the world of big data. Why? Because I remembered the stoic poker faces from my previous life in competitive intelligence. Read more…

A change is gonna come

Join us in the data revolution.

When I told some of my friends and family that I was joining O’Reilly Media as an editor focusing on ORM’s Strata practice area, their responses reflected the diversity of my loved ones.

I’ve paraphrased some of the best ones here:

  • “That is great! I have a bunch of their books. Everyone I know has the animal books.”
  • “Bill O’Reilly owns a media company?”
  • “I don’t get you techie people. Didn’t you already do a bunch of weird ninja-y data type stuff?”
  • “Congrats! I have a lot of respect for ORM.”
  • “… wait a sec, didn’t you STOP being a Java editor years ago to go work at an assessment data startup?”


The people in my life have a few things in common. They are smart, articulate, really truly not afraid to say what they think, and seek to be the change they wish to see in the world. We don’t always agree [massive understatement]. Yet, our motivations are the same.

Why am I telling you this?

I believe that at our core, no matter how different we may seem, we do not actively seek to harm. Yet, everyone who works with data already faces, or will face, certain choices on what to do with data. Choices that are obviously for good or for evil. Choices that are neither completely for good nor completely for evil. Choices that we are reluctant to discuss because we do not want to implicate ourselves or the companies we work for. Yet, just because we are reluctant to discuss them does not mean we are not facing these challenges.

If you have the courage to speak out regarding the real everyday challenges that you experience while working with data, then I want to listen. If you have discovered solutions to these everyday challenges, then I want to publish your insight. If you engage with anything I publish, whether you agree or disagree, or have suggestions for how things could be different or better, then please say something.

You can reach me at Read more…