Why the finance world should care about big data and data science

Roger Magoulas on data's potential to improve finance systems and create new businesses.

Finance experts already understand that data has value. It’s the lifeblood of their industry, after all. But as O’Reilly director of market research Roger Magoulas (@rogerm) notes in the following interview, some in the financial domain may not grasp all that data has to offer. Data science and big data have led to an expansion of data types, Magoulas says, and the associated influx of information could very well shape investment strategies and create new businesses.

How does big data apply to the financial world?

Roger Magoulas: There are two flavors of it. One is analyzing things like your investments, econometrics, trading activity, and longer-term data analysis. That’s clearly part and parcel of the finance business, and people in the space already have great familiarity with this side of data.

The second flavor is the integrated approach to data in all facets of how organizations do business. This involves understanding customers, understanding competitors, understanding behavior, taking advantage of the world of sensors, and using a computational and quantitative mindset to make sense of a very confusing world.

Is there a disconnect between the finance world and terms like “data science” and “big data”?

Roger Magoulas: Everyone is struggling with the semantics, so finance isn’t worse off than others. They’re actually making an effort to understand it. Adding to the semantic confusion, the terms “data science” and “big data” are sometimes co-opted by organizations trying to show how they embody these attributes. That’s fine, but the finance ecosystem has a responsibility to learn as much as it can about these areas. The best way to do that is directly from the data science practitioners: see the tools data scientists use and how they approach their work. That firsthand experience will help finance experts inform their investment strategies and see where the data space is heading.

What’s the relationship between data science and business intelligence?

Roger Magoulas: My background is in data warehousing, and the front-end access to the data warehouse was known as “business intelligence” in the ’90s. These early data warehouses were mostly constructed out of quantitative data from operational systems — things like order entry and customer service systems. “Business intelligence” tools were used to access the mostly well-understood operational data in the data warehouses.

What’s changed is that we’ve had an explosion of data types. For example, no one was doing analysis on search terms back in the ’90s because the tools to do that weren’t available. Now, we need new terms to help accommodate what analysts do: natural language processing, machine learning, etc. Moreover, the old business intelligence tools were based on operational things, like how many orders a customer placed. They weren’t built to tackle these new tasks.

Will data science and big data incrementally improve existing techniques with new tools? Or are we also talking about the creation of whole new industries?

Roger Magoulas: It’s going to do both. The analogy might be to when open source software became widely used. While there were open source business models and companies, the real growth of open source came from companies like Google, Yahoo and Amazon that based their core technologies on the open source stacks. There was this two-headed approach that came out of the adoption of open source.

LinkedIn is an example of this two-headed approach. The company is a social network, but it uses data science tools, techniques and processes to build products that make sense of the social network for LinkedIn’s clients. Would LinkedIn exist without data science? I think you can imagine a social network that just helps business people connect with each other, but the real monetization part — the thing that helped them go public — came from LinkedIn using the data they capture to identify and build products.

This interview was edited and condensed.

Strata Conference New York 2011, being held Sept. 22-23, covers the latest and best tools and technologies for data science — from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively.

Finance sessions at Strata New York

A number of sessions at the three Strata NY events (Sept. 19-23 in New York City) will examine the intersection of finance and data science. Here’s a selection:

Thin and Thick Value in a Transparent Environment

Presenter: Umair Haque, Havas Media Lab, HBR

Big data is a necessary part of a transition to an economy that’s not just more efficient and productive, but more efficient and productive in 21st century terms. Yet today we’re hyper-connected while operating in a relative data vacuum, which leaves us prone to large-scale crises and “too big to fail” thinking. In this session, Harvard’s Umair Haque looks at the future of thin and thick value in a data-driven world.

Next Best Action for MBAs

Presenter: James Kobielus, Forrester Research, Inc.

Leading-edge organizations have implemented “next best action” (NBA) technologies, such as big data analytics, within their multichannel customer relationship management programs. In this session, Forrester senior analyst James Kobielus will provide a vision, case studies, ROI metrics, and guidance for business professionals evaluating applications of NBA in their organizations.

Big Data: The Next Frontier

Presenter: Michael Chui, McKinsey Global Institute

McKinsey’s influential big data report has helped define and explain the opportunity created by the torrent of data flowing daily through business. Michael Chui outlines the big picture of data innovation, challenges and competitive advantage.

The New Corporate Intelligence

Presenter: Sean Gourley, Quid

What if corporate strategists could literally draw a map to find growth opportunities? A technique called semantic clustering analysis makes this possible. When applied to technology entities worldwide, this analysis can reveal not only which innovation areas are thick with competition, but also where in the market there are opportunities, or “white spaces,” ripe for innovation.

Creating a National Data Utility: Dodd-Frank Financial Reforms

Presenters: Donald F. Donahue, The Depository Trust & Clearing Corporation; Paul Sforza, U.S. Department of the Treasury

Donahue and Sforza will discuss America’s first public financial services data utility. This project is being incorporated into the United States’ existing information infrastructure to provide consistent, quality data to investors, institutions, and regulators.

Photo: ABOVE by Lyfetime, on Flickr
