Data storage and management providers are becoming key contributors for insight as a service.
Contrary to what many believe, insights are difficult to identify and effectively apply. As the difficulty of insight generation becomes apparent, we are starting to see companies that offer insight generation as a service.
Data storage, management and analytics are maturing into commoditized services, and the companies that provide these services are well-positioned to provide insight on the basis not just of the data itself, but also of data access and other metadata patterns.
Companies like DataHero and Host Analytics [full disclosure: Host Analytics is one of my portfolio companies] are paving the way in the insight-as-a-service space. Host Analytics’ initial product offering was a cloud-based Enterprise Performance Management (EPM) Suite, but far more important is what they are now enabling for the enterprise: they have moved from being an EPM company to being an insight generation company. In this post, I will discuss a few of the trends that have enabled insight as a service (IaaS) and discuss the general case of using a software-as-a-service (SaaS) EPM solution to corral data and deliver insight as a service as the next level of product.
Insight generation is the identification of novel, interesting, plausible and understandable relations among elements of a data set that a) lead to the formation of an action plan and b) result in an improvement as measured by a set of KPIs. The evaluation of the identified relations to establish an insight, and the creation of an action plan associated with a particular insight or insights, need to be done within a particular context and necessitate the use of domain knowledge.
Revealing patterns in tools, tasks, and compensation through clustering and linear models.
Download the free “2015 Data Science Salary Survey” report to learn about tools, trends, and what pays (and what doesn’t) for data professionals.
Data scientists are constantly looking outward, tapping into and extracting information from all manner of data in ways hardly imaginable not long ago. Much of the change is technological (data collection has multiplied, as have our means of processing it), but an important cultural shift has played a part, too, evidenced by the desire of organizations to become “data-driven” and the wide availability of public APIs.
But how much do we look inward, at ourselves? The variety of data roles, both in subject and method, means that even those of us who have a strong grasp of what it means to be a data scientist in a particular domain or sub-field may not have a complete view of the data space as a whole. Just as data we process and analyze for our organizations can be used to decide business actions, data about data scientists can help inform our career choices.
That’s where we come in. O’Reilly Media has been conducting an annual survey of data professionals, asking questions primarily about tools, tasks, and salary, and we are now releasing the third installment of the associated report, the 2015 Data Science Salary Survey. The 2015 edition features a completely new graphic design of the report and our findings. In addition to estimating salary differences based on demographics and tool usage, we have given a more detailed look at tasks (how data professionals spend their workdays) and titles.
Building access policies into data stores.
Hadoop jobs reflect the same security demands as other programming tasks. Corporate and regulatory requirements create complex rules concerning who has access to different fields in data sets; sensitive fields must be protected from internal users as well as external threats; and multiple applications run on the same data and must grant different users different access rights. The modern world of virtualization and containers adds security at the software level, but tears away the hardware protection formerly offered by network segments, firewalls, and DMZs.
Furthermore, security involves more than saying yes or no to a user running a Hadoop job. There are rules for archiving or backing up data on the one hand, and expiring or deleting it on the other. Audit logs are a must, both to track down possible breaches and to conform to regulation.
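To make the field-level rules and the audit requirement concrete, here is a minimal sketch in plain Python. It is not a Hadoop API and not any particular product's mechanism: the `FIELD_POLICY` table, the role names, the `AuditLog` class, and the `read_record` function are all hypothetical names invented for illustration. The sketch assumes a deny-by-default policy keyed by field name and records every access decision, allowed or not.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical policy table: which roles may read which fields.
FIELD_POLICY = {
    "name": {"analyst", "auditor", "admin"},
    "zip_code": {"analyst", "admin"},
    "ssn": {"admin"},
}


@dataclass
class AuditLog:
    """In-memory stand-in for a real, tamper-resistant audit trail."""
    entries: list = field(default_factory=list)

    def record(self, user, role, field_name, allowed):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "field": field_name,
            "allowed": allowed,
        })


def read_record(record, user, role, log):
    """Return only the fields this role may see, logging every decision."""
    visible = {}
    for field_name, value in record.items():
        # Deny by default: a field with no policy entry is hidden from everyone.
        allowed = role in FIELD_POLICY.get(field_name, set())
        log.record(user, role, field_name, allowed)
        if allowed:
            visible[field_name] = value
    return visible


if __name__ == "__main__":
    log = AuditLog()
    row = {"name": "Ada", "zip_code": "02139", "ssn": "000-00-0000", "notes": "internal"}
    print(read_record(row, "jsmith", "analyst", log))
    print(len(log.entries), "audit entries written")
```

In this sketch the policy check sits inside the read path itself, so a caller cannot fetch a record without passing through it, and denied attempts are logged alongside granted ones so that a later review can see both.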
Best practices for managing data in these complex, sensitive environments implement the well-known principle of security by design. According to this principle, you can’t design a database or application in a totally open manner and then layer security on top if you expect it to be robust. Instead, security must be infused throughout the system and built in from the start. Defense in depth is a related principle that urges the use of many layers of security, so that an intruder breaking through one layer may be frustrated by the next.
The expanding role of data analytics in a trillion-dollar industry.
Download our new free report “Data Analytics in Sports: How Playing with Data Transforms the Game,” by Janine Barlow, to learn how advanced predictive analytics are impacting the world of sports.
Sports are the perfect playing field on which data scientists can play their game — there are finite structures and distinct goals. Many of the components in sports break down numerically — e.g., number of players; length of periods; and, taking a broader view, how much each player is paid.
This is why sports and data have gone hand-in-hand since the very beginning of the industry. What, after all, is baseball without baseball cards?
In a new O’Reilly report, Data Analytics in Sports: How Playing with Data Transforms the Game, we explore the role of data analytics and new technology in the sports industry. Through a series of interviews with experts at the intersection of data and sports, we break down some of the industry’s most prominent advances in the use of data analytics and explain what these advances mean for players, executives, and fans.
Piracy isn’t the threat; it’s centuries old. Music Science is the game changer.
Download our new free report “Music Science: How Data and Digital Content are Changing Music,” by Alistair Croll, to learn more about music, data, and music science.
In researching how data is changing the music industry, I came across dozens of entertaining anecdotes. One of the recurring themes was music piracy. As I wrote in my previous post on music science, industry incumbents think of piracy as a relatively new phenomenon — as one executive told me, “vinyl was great DRM.”
But the fight between protecting and copying content has gone on for a long time, and every new medium for music distribution has left someone feeling robbed. One of the first known cases of copy protection — and illegal copying — involved Mozart himself.
As a composer, Mozart saw his music spread far and wide. But he was also a performer and wanted to be able to command a premium for playing in front of audiences. One way he ensured continued demand was through “flourishes,” or small additions to songs, which weren’t recorded in written music. While Mozart’s flourishes are lost to history, researchers have attempted to understand how his music might once have been played. This video shows classical pianist Christina Kobb demonstrating a 19th-century technique.