ENTRIES TAGGED "strata"
Open source communities to help find the next blockbuster drug
Big drug companies are not what they used to be. It is harder to find new drug candidates, to test them, and to get them approved than ever before. Drugs that are “mere chemicals” are becoming more and more complex. Frequently, new drugs require DNA interaction, which requires them to be manufactured through a mostly automated cellular process rather than just mixing the right components in the right order. Just the changes to the refrigeration requirements for these new drugs represent a challenge to drug manufacturers, pharmacies and hospitals.
Combined, these difficulties create a combustible business environment that can be ignited by the pressure of expiring patents. Experts estimate that the approval process ensures that a drug company actually gets only about 12 years of exclusivity before a 20-year patent wears off. So in pharma-land, the march of popular medications to generic status forces the original developers into the famous Innovator’s Dilemma. Most companies face competition from the generic versions of their own previous work.
Barlow's distilled insights regarding the ever evolving definition of real time big data analytics
During a break in between offsite meetings that Edd and I were attending the other day, he asked me, “did you read the Barlow piece?”
“Umm, no,” I replied sheepishly. Insert a sidelong glance from Edd that said much without saying anything aloud. He’s really good at that.
In my utterly meager defense, Mike Loukides is the editor on Mike Barlow’s Real-Time Big Data Analytics: Emerging Architecture. As Loukides is one of the core drivers behind O’Reilly’s book publishing program and someone who I perceive to be an unofficial boss of my own choosing, I am not really inclined to worry about things that I really don’t need to worry about. Then I started getting not-so-subtle inquiries from additional people asking if I would consider reviewing the manuscript for the Strata community site. This resulted in me emailing Loukides for a copy and sitting in a local cafe on a Sunday afternoon to read through the manuscript.
The biggest problems will almost always be those for which the size of the data is part of the problem.
A recent VentureBeat article argues that “Big Data” is dead. It’s been killed by marketers. That’s an understandable frustration (and a little ironic to read about it in that particular venue). As I said sarcastically the other day, “Put your Big Data in the Cloud with a Hadoop.”
You don’t have to read much industry news to get the sense that “big data” is sliding into the trough of Gartner’s hype curve. That’s natural. Regardless of the technology, the trough of the hype cycle is driven by a familiar set of causes: it’s fed by over-aggressive marketing, the longing for a silver bullet that doesn’t exist, and the desire to spout the newest buzzwords. All of these phenomena breed cynicism. Perhaps the most dangerous is the technologist who never understands the limitations of data, never understands what data isn’t telling you, or never understands that if you ask the wrong questions, you’ll certainly get the wrong answers.
Big data is not a term I’m particularly fond of. It’s just data, regardless of the size. But I do like Roger Magoulas’ definition of “big data”: big data is when the size of the data becomes part of the problem. I like that definition because it scales. It was meaningful in 1960, when “big data” was a couple of megabytes. It will be meaningful in 2030, when we all have petabyte laptops, or eyeglasses connected directly to Google’s yottabyte cloud. It’s not convenient for marketing, I admit; today’s “Big Data!!! With Hadoop And Other Essential Nutrients Added” is tomorrow’s “not so big data, small data actually.” Marketing, for better or for worse, will deal.
Using data science to predict the Oscars
Sophisticated algorithms are not going to write the perfect script or crawl YouTube to find the next Justin Bieber (that last one I think we can all be thankful for!). But a model can predict the probability of a nominee winning the Oscar, and recently our model has Argo overtaking Lincoln as the likely winner of Best Picture. Every day on FarsiteForecast.com we’ve been describing applications of data science for the media and entertainment industry, illustrating how our models work, and updating the likely winners based on the outcomes of the Awards Season leading up to the Oscars. Just as predictive analytics provides valuable decision-making tools in sectors from retail to healthcare to advocacy, data science can also empower smarter decisions for entertainment executives, which led us to launch the Oscar forecasting project. While the potential for data science to impact any organization is as unique as each company itself, we thought we’d offer a few use cases that have wide application for media and entertainment organizations.
It's not about IT buying, but about making data work for you. Learn more in the Big Data in Enterprise IT program at Strata California.
In a world where technology and business are ever more intertwined, IT leaders aspire to key roles in their organizations. Sadly, industry conferences can lag behind, assuming IT is all about making the right buying decisions.
Not so at Strata.
Our approach is to take a view of data for business that centers around the problems you need to solve. The excitement around big data isn’t really about large volumes of data; it’s about smart use of data. It’s about using data to make your products better, help you be significantly more efficient, and create new products and businesses.
Getting the most from big data and data science is a lot more than a software choice. The business aims come first, along with a good understanding of the problems you want to solve. Then you need to understand the capabilities of the technology and where data science can be best applied. Finally, you need to know how to run successful data projects, and how to hire and manage data teams.
Working with analytics and BI expert Mark Madsen, I’ve compiled a day-long program at Strata called Big Data in Enterprise IT that will take you through big data strategy, the issues of managing data, and how data science can be used effectively in your organization.
Featured Strata Community Profile on Analytics Manager Kim Stedman
When Kim Stedman starts talking about the science of asking questions, I am all ears. As a reporter, I make a living asking questions. She goes on to explain the potential of data science to nudge us all in the direction of thinking about whether we could be asking better questions or making better use of the answers.
Take control of your identity and make sure your electronic footprint works for you.
Recently, the Wall Street Journal published an article discussing the explosion of tax-identity theft, which has ballooned to 1.1 million cases in 2011 from 51,700 in 2008. The Wall Street Journal mentioned that the Treasury Inspector General for Tax Administration reported an additional 1.5 million potentially fraudulent 2011 tax refunds totaling in excess of $5.2 billion.
What is the cause of this? Taxpayers now have the option to file their taxes online and receive their refunds directly deposited into bank accounts, and this tax filing method creates all kinds of opportunities for electronic fraud.
How does this affect you? Fraudsters who have certain pieces of your data can file a tax return under your identity and receive your refund. Fraudsters look for “at-risk” identities that they can use at scale to scam refunds out of the IRS. “At-risk” identities include those of deceased people, minors, and legitimate citizens. Fraudsters only need a valid name and social security number combination. How much time, effort and money do you think it would take you to convince the IRS that a fraudster stole your identity, filed your taxes and received your refund? Even though this trend is on the rise and the government has seen it, the onus would still be on you to prove that you were defrauded.
How the inevitable rise of software means cycle time trumps scale.
Exponential curves gradually, inexorably grow until they reach a limit. The function increases over time. Gravity shows a similar runaway behavior: it strengthens with the inverse square of the distance as objects with mass get closer to one another, which can eventually lead to a black hole. And at the middle of this black hole is a point of infinite density, a singularity, within which the rules no longer apply.
Financiers also like exponents. “Compound interest is the most powerful force in the universe” is a quote often attributed to Einstein; whoever said it was right. If you pump the proceeds of interest back into a bank account, the balance grows faster and faster, because each period’s interest is earned on a larger sum.
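To make the compounding point concrete, here is a minimal sketch that compares reinvesting interest with withdrawing it. The function names, the 1,000 starting balance, and the 5% annual rate are illustrative assumptions, not figures from the post.

```python
def compound_balance(principal, annual_rate, years, periods_per_year=1):
    """Balance when all interest is reinvested: P * (1 + r/n) ** (n * t)."""
    growth_per_period = 1 + annual_rate / periods_per_year
    return principal * growth_per_period ** (periods_per_year * years)


def simple_balance(principal, annual_rate, years):
    """Balance when interest is paid out instead of reinvested: P * (1 + r * t)."""
    return principal * (1 + annual_rate * years)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    principal, rate = 1000.0, 0.05
    for years in (10, 20, 30, 40):
        simple = simple_balance(principal, rate, years)
        compounded = compound_balance(principal, rate, years)
        print(f"{years:>2} years: simple {simple:10.2f}  compounded {compounded:10.2f}")
```

Without reinvestment the balance grows along a straight line. With reinvestment the gap between the two columns widens every decade, which is the exponential behavior the quote is about.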
Computer scientists like to throw the term “singularity” around, too. To them, it’s the moment when machines become intelligent enough to make a better machine. It’s the Geek Rapture, the capital-S-Singularity. It’s the day when machines don’t need us any more, and to them, we look like little more than ants. Ray Kurzweil thinks it’s right around the corner — circa 2045 — and after that time, to us, these artificial intelligences will be incomprehensible.
Businesses need to understand singularities, because they have one of their own to contend with.
A sneak peek at an upcoming visualization session from the 2013 Strata Conference in Santa Clara, Calif.
Strata Editor’s Note: Over the next few weeks, the Strata Community Site will be providing sneak peeks of upcoming sessions at the Strata Conference in Santa Clara. Nicolas’ sneak peek is the first in this series.
Last year was a great year for data visualization at Twitter. Our Analytics team expanded and created a dedicated data visualization team, and some of our projects were released publicly with great feedback.
Our first public interactive of 2012 was a fun way to expose how the Eurocup was experienced at Twitter. You can see in this organic visualization how people cheered for their teams during each match, and how the tension and volume of tweets increased towards the finals.
Join us in the data revolution.
When I told some of my friends and family that I was joining O’Reilly Media as an editor focusing on ORM’s Strata practice area, their responses reflected the diversity of my loved ones.
I’ve paraphrased some of the best ones here:
- “That is great! I have a bunch of their books. Everyone I know has the animal books.”
- “Bill O’Reilly owns a media company?”
- “I don’t get you techie people. Didn’t you already do a bunch of weird ninja-y data type stuff?”
- “Congrats! I have a lot of respect for ORM.”
- “… wait a sec, didn’t you STOP being a Java editor years ago to go work at an assessment data startup?”
The people in my life have a few things in common. They are smart, articulate, really truly not afraid to say what they think, and seek to be the change they wish to see in the world. We don’t always agree [massive understatement]. Yet, our motivations are the same.
Why am I telling you this?
I believe that at our core, no matter how different we may seem, we do not actively seek to harm. Yet, everyone who works with data has already faced or will face certain choices on what to do with data. Choices that are obviously for good or for evil. Choices that are neither completely for good nor completely for evil. Choices that we are reluctant to discuss because we do not want to implicate ourselves or the companies we work for. Yet, just because we are reluctant to discuss them does not mean we are not facing these challenges.
If you have the courage to speak out regarding the real everyday challenges that you experience while working with data, then I want to listen. If you have discovered solutions to these everyday challenges, then I want to publish your insight. If you engage with anything I publish, whether you agree, disagree, or have suggestions for how things could be different or better, then please say something.