"Big Data Culture" entries
Stories from women who are making a big impact on the field of big data.
Through a series of 15 interviews with women across the data field, we’ve uncovered stories we think you’ll find both interesting and inspiring. The interviews explore:
- Interviewees’ views about opportunities for women in the fields of science, technology, engineering, and math (STEM)
- Benefits of the data field as a career choice for women
- The changing attitudes of Millennials toward women working in data
- Remedies for continuing to close the gender gap in tech
Our findings reveal an important consensus among the women we interviewed: female mentors and role models working in STEM are extremely important for opening up the pathway for more women to enter these fields. In fact, the impact that mentors have had on our interviewees has inspired many of them to serve as mentors today, both to female colleagues and to younger generations of girls. Read more…
A look at the social and moral implications of living in a deeply connected, analyzed, and informed world.
We’ll now look at both the light and the shadows of this new dawn, the social and moral implications of living in a deeply connected, analyzed, and informed world. This is both the promise and the peril of big data in an age of widespread sensors, fast networks, and distributed computing.
Solving the big problems
The planet’s systems are under strain from a burgeoning population. Scientists warn of rising tides, droughts, ocean acidity, and accelerating extinction. Medication-resistant diseases, outbreaks fueled by globalization, and myriad other semi-apocalyptic Horsemen ride across the horizon.
Can data fix these problems? Can we extend agriculture with data? Find new cures? Track the spread of disease? Understand weather and marine patterns? General Electric’s Bill Ruh says that while the company will continue to innovate in materials sciences, the place where it will see real gains is in analytics.
It’s often been said that there’s nothing new about big data. The “iron triangle” of Volume, Velocity, and Variety that Doug Laney coined in 2001 has been a constraint on all data since the first database. Basically, you could have any two you want fairly affordably. Consider:
- A coin-sorting machine sorts a large volume of coins rapidly, but assumes a small variety of coins. It wouldn’t work well if there were hundreds of coin types.
- A public library, organized by the Dewey Decimal System, has a wide variety of books and topics, and a large volume of those books — but stacking and retrieving the books happens at a slow velocity.
What’s new about big data is that getting all three Vs has become so cheap it’s almost not worth billing for. A Google search happens with great alacrity, combs the sum of online knowledge, and retrieves a huge variety of content types. Read more…
The evolving marketplace is making new data applications and interactions possible.
Here’s a look at some options in the evolving, maturing marketplace of big data components that are making the new applications and interactions we’ve been looking at possible.
First used in social network analysis, graph theory is finding more and more homes in research and business. Machine learning systems can scale up fast with tools like Parameter Server, and the TitanDB project means developers have a robust set of tools to use.
Are graphs poised to take their place alongside relational database management systems (RDBMS), object storage, and other fundamental data building blocks? What are the new applications for such tools?
Inside the black box of algorithms: whither regulation?
It’s possible for a machine to create an algorithm no human can understand. Evolutionary approaches to algorithmic optimization can result in inscrutable, yet demonstrably better, computational solutions.
If you’re a regulated bank, you need to share your algorithms with regulators. But if you’re a private trader, you’re under no such constraints. And having to explain your algorithms limits how you can generate them.
As more and more of our lives are governed by code that decides what’s best for us, replacing laws, actuarial tables, personal trainers, and personal shoppers, oversight means opening up the black box of algorithms so they can be regulated.
Years ago, Orbitz was shown to be charging web visitors who owned Apple devices more money than those visiting via other platforms, such as the PC. Only that’s not the whole story: Orbitz’s machine learning algorithms, which optimized revenue per customer, learned that the visitor’s browser was a predictor of their willingness to pay more. Read more…
Salary insights from more than 800 data professionals reveal a correlation to skills and tools.
In the results of this year’s O’Reilly Media Data Science Salary Survey, we found a median total salary of $98k ($144k for US respondents only). The 816 data professionals in the survey included engineers, analysts, entrepreneurs, and managers (although almost everyone had some technical component in their role).
Why the high salaries? While the demand for data applications has increased rapidly, the number of people who set up the systems and perform advanced analytics has increased much more slowly. Newer tools such as Hadoop and Spark should have even fewer expert users, and correspondingly we found that users of these tools have particularly high salaries. Read more…
In this O'Reilly Radar Podcast: Dr. Gilad Rosner talks about data privacy, and Alasdair Allan chats about the broken IoT.
In this podcast episode, I catch up with Dr. Gilad Rosner, a visiting researcher at the Horizon Digital Economy Research Institute in England. Rosner focuses on privacy, digital identity, and public policy, and is launching an Internet of Things Privacy Forum. We talk about personal data privacy in the age of the Internet of Things (IoT), privacy as a social characteristic, an emerging design ethos for technologists, and whether or not we actually own our personal data. Rosner characterizes personal data privacy as a social construct and addresses the notion that privacy is dead:
“Firstly, it’s important to recognize the idea that privacy is not a regime to control information. Privacy is a much larger concept than that. Regimes to control information are ways that we as a society preserve privacy, but privacy itself emerges from social needs and from individual human needs. The idea that privacy is dead comes from the vulnerability that people are feeling because they can see that it’s very difficult to maintain walls between their informational spheres, but that doesn’t mean that there aren’t countercurrents to that, and it doesn’t mean that there aren’t ways, as we go forward, to improve privacy preservation in the electronic spaces that we continue to move into.”
As we move more and more into these electronic spaces and the Internet of Things becomes democratized, our notions of privacy are shifting on a cultural level beyond anything we’ve experienced as a society before. Read more…