"training" entries

Training in the big data ecosystem

The O'Reilly Radar Podcast: Paco Nathan and Jesse Anderson on the evolution of the data training landscape.

Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.

In this week’s Radar Podcast, O’Reilly’s Ben Lorica talks to Paco Nathan, director of O’Reilly Learning, and Jesse Anderson, technical trainer and creative engineer at Confluent.

Their discussion focuses on the training landscape in the big data ecosystem, their teaching techniques and particular content they choose, and a look at some expected future trends.

Here are a few snippets from their chat:

Training vs PowerPoint slides

Anderson: “Often, when you have a startup and somebody says, ‘Well, we need some training,’ what will usually happen is one of the software developers will say, ‘OK, I’ve done some training in the past and I’ll put together some PowerPoints.’ The differences between a training thing and doing some PowerPoints, like at a meetup, is that a training actually has to have hands-on exercises. It has to have artifacts that you use right there in class. You actually need to think through, these are concepts, these are things that the person will need to be successful in that project. It really takes a lot of time and it takes some serious expertise and some experience in how to do that.”

Nathan: “Early on, you would get some committer to go out and do a meetup, maybe talk about an extension to an API or whatever they were working on directly. If there was a client firm that came up and needed training, then they’d peel off somebody. As it evolved, that really didn’t work. That kind of model doesn’t scale. The other thing too is, you really do need people who understand instructional design, who really understand how to manage a classroom. Especially when it gets to any size, it’s not just an afterthought for an engineer to handle.” Read more…


Design for success: Manage business and user goals

Laura Klein on what makes a successful designer and how we should measure the success of product designs.


Register for the UX Design for Growth — Improving User Conversion training session with Laura Klein. In this online, interactive training workshop, Klein, author of “UX for Lean Startups,” will teach you to design for product growth.

Designers have become more and more integral to the success of their organizations. This increase in visibility and responsibility requires new skills, a greater understanding of the goals of the business, the ability to work with a wider variety of stakeholders within the organization, and new ways to measure the success of design work. I recently spoke with Laura Klein, designer, researcher, engineer, and author of UX for Lean Startups and the popular design blog Users Know, about these topics.

Understanding the goals of the business

In discussing the essential skill set for designers today, Klein explains why designers need to understand what their organization is trying to accomplish and why they should get comfortable working with people outside of the design team:

I think nowadays we really have to understand what the business goals are and also what the user goals are, and how those two things can work together to make a great experience for the customer that also helps the business. … More and more, we’re really working on cross-functional teams, which I think is wonderful. It might mean that we’re working with a marketing person and an engineer or several engineers, and a product manager. We’re no longer just working off in our little silos with all the other designers, when all we have to do is talk design. We’re working with a really diverse group of people … I think it’s better for products, but it does mean we have to know how to communicate with more types of people. Learning how to do that can be incredibly important.

Read more…


Knowing when not to design

Don’t waste time on features that users don’t want.


Attend “UX Design for Growth,” a training session by Laura Klein that will give you the skills you need to design products that convert and retain users.

After many years as a designer, I’ve realized that some of the most important design decisions have nothing to do with what any of us consider design. Instead of designing the perfect version of a feature, sometimes the best thing we can do is learn that we shouldn’t build the feature in the first place.

In my all-day, online workshop on September 15, 2015, I’ll be talking about another aspect of building products: how to make them grow. Potentially fabulous products fail every day because product managers and UX designers don’t spend enough time thinking about how their product is going to be discovered by new users.

The following is an excerpt from my book, UX for Lean Startups, where I give one practical tip for learning whether or not you should build a specific feature for your product. If you’d like some practical tips on getting people to start using all those features you decide to build, please join me on September 15th for my UX Design for Growth training session. Read more…


Validating data models with Kafka-based pipelines

A case for back-end A/B testing.

Start the O’Reilly “Introduction to Apache Kafka” training video for free. In this video, Gwen Shapira shows developers and administrators how to integrate Kafka into a data processing pipeline.

A/B testing is a popular method of using business intelligence data to assess possible changes to websites. In the past, when a business wanted to update its website in an attempt to drive more sales, decisions on the specific changes to make were driven by guesses, intuition, focus groups, and ultimately, which executive yelled louder. These days, the data-driven solution is to set up multiple copies of the website, direct users randomly to the different variations, and measure which design improves sales the most. There are a lot of details to get right, but this is the gist of things.
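As a rough illustration of that gist, here is a minimal Python sketch of random assignment and per-variant measurement. The variant names, the `assign_variant` and `conversion_rates` helpers, and the shape of the visit records are all assumptions made for this example. A real test would also need consistent assignment per user and a significance check, which this sketch leaves out.

```python
import random
from collections import Counter

# Hypothetical labels for two copies of the website.
VARIANTS = ("A", "B")


def assign_variant(rng):
    """Pick one variant uniformly at random for a single visitor."""
    return rng.choice(VARIANTS)


def conversion_rates(visits):
    """Compute the share of visits that ended in a purchase, per variant.

    `visits` is an iterable of (variant, purchased) pairs, where
    `purchased` is True when the visit led to a sale.
    """
    shown = Counter()
    bought = Counter()
    for variant, purchased in visits:
        shown[variant] += 1
        if purchased:
            bought[variant] += 1
    return {variant: bought[variant] / shown[variant] for variant in shown}
```

The variant with the higher rate is the candidate to keep, subject to the details the paragraph above alludes to.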

When it comes to back-end systems, however, we are still living in the stone age. Suppose your business has grown significantly and you notice that your existing MySQL database is becoming less responsive as the load increases. If you consider moving to a NoSQL system, you need to decide which NoSQL solution to pick, and there are a lot of options: Cassandra, MongoDB, Couchbase, or even Hadoop. There are also many possible data models: normalized, wide tables, narrow tables, nested data structures, etc.

A/B testing multiple data stores and data models in parallel

It is surprising how often a company will pick a solution based on intuition or even which architect yelled louder. Rather than making a decision based on facts and numbers regarding capacity, scale, throughput, and data-processing patterns, the back-end architecture decisions are made with fuzzy reasoning. In that scenario, what usually happens is that a data store and a data model are somehow chosen, and the entire development team will dive into a six-month project to move their entire back-end system to the new thing. This project will inevitably take 12 months, and about 9 months in, everyone will suspect that this was a bad idea, but it’s way too late to do anything about it. Read more…
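The heading above suggests the alternative: feed the same stream of events to several candidate data models in parallel and compare them on measurements rather than intuition. The following is only a sketch of that idea, under stated assumptions. `EventLog` is an in-memory stand-in for a Kafka topic, not the Kafka client API, and the two store classes, the event fields, and the `replay` helper are hypothetical names invented for this example.

```python
import time


class EventLog:
    """In-memory stand-in for a Kafka topic: an append-only list of events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        """Return all events at or after `offset`, so each reader can replay."""
        return self._events[offset:]


class NormalizedStore:
    """Candidate model A: users and orders kept in separate collections."""

    def __init__(self):
        self.users = {}
        self.orders = []

    def write(self, event):
        self.users[event["user_id"]] = event["user_name"]
        self.orders.append((event["order_id"], event["user_id"], event["amount"]))

    def total_for(self, user_id):
        return sum(amount for _, uid, amount in self.orders if uid == user_id)


class NestedStore:
    """Candidate model B: each user's orders nested under the user record."""

    def __init__(self):
        self.users = {}

    def write(self, event):
        user = self.users.setdefault(
            event["user_id"], {"name": event["user_name"], "orders": []}
        )
        user["orders"].append({"id": event["order_id"], "amount": event["amount"]})

    def total_for(self, user_id):
        user = self.users.get(user_id)
        return sum(o["amount"] for o in user["orders"]) if user else 0


def replay(log, store):
    """Feed every logged event to one candidate; return elapsed seconds."""
    start = time.perf_counter()
    for event in log.read_from(0):
        store.write(event)
    return time.perf_counter() - start
```

Because each candidate reads the log independently, another store or data model can be added later and replayed from offset 0 without touching the others. The timings from a toy like this mean little on their own; the point is the shape of the comparison.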


Announcing Cassandra certification

A new partnership between O’Reilly and DataStax offers certification and training in Cassandra.

I am pleased to announce a joint program between O’Reilly and DataStax to certify Cassandra developers. This program complements our developer certification for Apache Spark and, just as in the case of Databricks and Spark, we are excited to be working with the leading commercial company behind Cassandra. DataStax has done a tremendous job growing and nurturing the Cassandra community, user base, and technology.

Once the certification program is ready, developers can take the exam online, in designated test centers, and at select training courses. O’Reilly will also be developing books, training days, and videos targeted at developers and companies interested in the Cassandra distributed storage system.

Cassandra is a popular component used for building big data and real-time analytic platforms. Its ability to comfortably scale to clusters with thousands of nodes makes it a popular option for solutions that need to ingest and make sense of large amounts of time series and event data. As noted in an earlier post, real-time event data are at the heart of one of the trends we’re closely following: the convergence of cheap sensors, fast networks, and distributed computation. Read more…


Wrap-up from FLOSS Manuals book sprint at Google

Mixtures of grassroots content generation and unique expertise have existed, and more models will be found. Understanding the points of commonality between the systems will help us develop such models.


FLOSS Manuals books published after three-day sprint

Joining the pilgrimage that all institutions are making toward wider data use, FLOSS Manuals is exposing more and more of the writing process.


Day two of FLOSS Manuals book sprint at Google Summer of Code summit

As a relatively conventional book, the KDE manual was probably a little easier to write (but also probably less fun) than the more high-level approaches taken by some other teams that were trying to demonstrate to potential customers that their projects were worth adopting.


Day one of FLOSS Manuals book sprint at Google Summer of Code summit

Four teams at Google launched into endeavors that will lead, less than 72 hours from now, to complete books on four open source projects.


FLOSS Manuals sprint starts at Google Summer of Code summit

Four free software projects have each sent three to five volunteers to write books about the projects this week. Along the way we'll all learn about the group writing process and the particular use of book sprints to make documentation for free software.