ENTRIES TAGGED "data tools"

2013 Data Science Salary Survey

Tools, Trends, What Pays (and What Doesn't) for Data Professionals

There is no shortage of news about the importance of data or the career opportunities within data. Yet a discussion of modern data tools can help us understand what the current data evolution is all about, and it can also be used as a guide for those considering stepping into the data space or progressing within it.

In our report, 2013 Data Science Salary Survey, we make our own data-driven contribution to the conversation. We surveyed attendees of the Strata Conference in New York and Santa Clara, California, about tool usage and salary.

Strata attendees span a wide spectrum within the data world: Hadoop experts and business leaders, software developers and analysts.  By no means does everyone use data on a “Big” scale, but almost all attendees have some technical aspect to their role.  Strata attendees may not represent a random sample of all professionals working with data, but they do represent a broad slice of the population.  If there is a bias, it is likely toward the forefront of the data space, with attendees using the newest tools (or being very interested in learning about them).


An update on in-memory data management

In-memory data management brings data close to the computation.

By Ben Lorica and Roger Magoulas

We wanted to give you a brief update on what we’ve learned so far from our series of interviews with players and practitioners in the in-memory data management space. A few preliminary themes have emerged, some expected, others surprising.

Performance improves as you put data as close to the computation as possible. We talked to people in systems, data management, web applications, and scientific computing who have embraced this concept. Some solutions go to the lowest level of hardware (L1, L2 cache). The next generation of SSDs will have latency closer to that of main memory, potentially blurring the distinction between storage and memory. For performance and power consumption considerations, we can imagine a future where systems are sized primarily by the amount of non-volatile memory* deployed.

Putting data in-memory does not negate the importance of distributed computing environments. Data size and the ability to leverage parallel environments are frequently cited reasons. The same characteristics that make the distributed environments compelling also apply to in-memory systems: fault-tolerance and parallelism for performance. An additional consideration is the ability to gracefully spill over to disk when main memory is full.


Six ways data journalism is making sense of the world, around the world

Early responses from our investigation into data-driven journalism had an international flavor.

When I wrote that Radar was investigating data journalism and asked for your favorite examples of good work, we heard back from around the world.

I received emails from Los Angeles, Philadelphia, Canada and Italy that featured data visualization, explored the role of data in government accountability, and shared how open data can revolutionize environmental reporting. A tweet pointed me to a talk about how R is being used in the newsroom. Another tweet linked to relevant interviews on social science and the media.

Two of the case studies focused on data visualization, an important practice that my colleague Julie Steele and other editors at O’Reilly Media have been exploring over the past several years.

Several other responses are featured at greater length below. After you read through, make sure to also check out this terrific Ignite talk on data journalism recorded at this year’s Newsfoo in Arizona.


Health records support genetics research at Children's Hospital of Philadelphia

Michael Italia on making use of data collected in health care settings.

Michael Italia from Children's Hospital of Philadelphia discusses the tools and methods his team uses to manage health care data.



Everyone has a big data problem

MetaLayer's Jonathan Gosier on data tools and the data divide.

MetaLayer's Jonathan Gosier talks about the need to democratize data tools because everyone has a big data problem.


Why data visualization matters

The best data visualizations expose something new.

Effective data visualizations go beyond aesthetics; they also allow organizations to make quick and correct decisions from massive amounts of information.


Embracing the chaos of data

Pete Warden on the upside of unstructured data.

Data scientists, it's time to welcome errors and uncertainty into your data projects. In this interview, Jetpac CTO Pete Warden discusses the advantages of unstructured data.


Global Adaptation Index enables better data-driven decisions

The Global Adaptation Index combines development indicators from 161 countries.

Speed, accessibility and open data have come together in the Global Adaptation Index, a new data browser that rates a given country's vulnerability to environmental shifts.


When was the last time you mined your site's search data?

Lou Rosenfeld on the benefits of parsing and refining site search.

A gold mine is hiding in the data generated by website search engines, yet many site owners pay little attention to the analytics those engines yield. Author Lou Rosenfeld explains why site search is worth your time.
