"genomics" entries

Sequencing, cloud computing, and analytics meet around genetics and pharma

Bio-IT World shows what is possible and what is being accomplished

If your data consists of one million samples, but only 100 have the characteristics you’re looking for, and if each of the million samples contains 250,000 attributes, each of which is built of thousands of basic elements, you have a big data problem. This is the kind of challenge faced by the 2,700 Bio-IT World attendees, who discover genetic interactions and create drugs for the rest of us.

Often they are looking for rare (orphan) diseases, or for cohorts who share a rare combination of genetic factors that require a unique treatment. The data sets get huge, particularly when the researchers start studying proteomics (the proteins active in the patients’ bodies).

So last week I took the subway downtown and crossed the two wind- and rain-whipped bridges that the city of Boston built to connect to the World Trade Center. I mingled for a day with attendees and exhibitors to find what data-related challenges they’re facing and what the latest solutions are. Here are some of the major themes I turned up.

Read more…

Four short links: 25 February 2014

MtGox Go Boom, Flappy Bird, Air Hockey Hack, and Robo Lab

  1. Bitcoin Markets Down — value of bitcoins plunges amid market uncertainty after the largest bitcoin exchange goes insolvent, having lost over 750k bitcoins because it didn’t update its software after a flaw was discovered in the signing of transactions.
  2. Flappy Bird for the Commodore 64 — the 1980s games platform meets the 2014 game. cf. the machine learning hack where the flappy bird learns to play the game successfully.
  3. Air Hockey Robot — awesome hack.
  4. Run 30 Lab Tests on Only One Drop of Blood — automated lab processing to remove the human error in centrifuging, timing, etc. that added to variability of results.

Big Data systems are making a difference in the fight against cancer

Open source, distributed computing tools speed up an important processing pipeline for genomics data

As open source, big data tools enter the early stages of maturation, data engineers and data scientists will have many opportunities to use them to “work on stuff that matters”. Along those lines, computational biology and medicine are areas where skilled data professionals are already beginning to make an impact. I recently came across a compelling open source project from UC Berkeley’s AMPLab: ADAM is a processing engine and set of formats for genomics data.

Second-generation sequencing machines produce more detailed and thus much larger files for analysis (a 250+ GB file for each person). Existing data formats and tools are optimized for single-server processing and do not easily scale out. ADAM uses distributed computing tools and techniques to speed up key stages of the variant processing pipeline (including sorting and deduping):

Figure: Variant Calling Pipeline
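
To make the sorting and deduping stages a little more concrete, here is a rough single-machine sketch in Python. It is not ADAM's code, and ADAM runs these stages in a distributed fashion rather than in one process. The `Read` record, its field names, and the rule of keeping the highest-quality read per position and strand are simplifying assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified read record; real aligned reads carry many more fields.
Read = namedtuple("Read", ["name", "contig", "start", "reverse_strand", "quality_sum"])


def sort_reads(reads):
    """Order reads by reference position (contig name, then start coordinate).

    Contig names are compared as plain strings here, which is a simplification.
    """
    return sorted(reads, key=lambda r: (r.contig, r.start))


def dedupe_reads(reads):
    """Keep one read per (contig, start, strand) group: the one with the highest quality sum."""
    best = {}
    for read in reads:
        key = (read.contig, read.start, read.reverse_strand)
        if key not in best or read.quality_sum > best[key].quality_sum:
            best[key] = read
    return sort_reads(best.values())


reads = [
    Read("r1", "chr2", 500, False, 300),
    Read("r2", "chr1", 100, False, 250),
    Read("r3", "chr1", 100, False, 310),  # same position and strand as r2
    Read("r4", "chr1", 100, True, 200),   # same position, opposite strand: kept
]

deduped = dedupe_reads(reads)
print([r.name for r in deduped])  # ['r3', 'r4', 'r1']
```

On a 250+ GB file the same two steps have to be spread across many machines, which is the part a distributed engine takes care of.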

Very early on, the designers of ADAM realized that a well-designed data schema (that specifies the representation of data when it is accessed) was key to having a system that could leverage existing big data tools. The ADAM format uses the Apache Avro data serialization system and comes with a human-readable schema that can be accessed using many programming languages (including C/C++/C#, Java/Scala, PHP, Python, Ruby). ADAM also includes a data format/access API implemented on top of Apache Avro and Parquet, and a data transformation API implemented on top of Apache Spark. Because it’s built with widely adopted tools, ADAM users can leverage components of the Hadoop (Impala, Hive, MapReduce) and BDAS (Shark, Spark, GraphX, MLbase) stacks for interactive and advanced analytics.
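
As a small illustration of what a human-readable, schema-first format looks like, the sketch below declares a record schema in Avro's JSON schema syntax and checks plain Python dictionaries against it. The record name and fields are hypothetical and are not ADAM's actual schema, and the `conforms` helper is a toy written for this example using only the standard library, not part of any Avro library.

```python
import json

# Avro schemas are written in JSON. The record and field names below are
# hypothetical and are NOT ADAM's actual schema.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "AlignedRead",
  "fields": [
    {"name": "read_name", "type": "string"},
    {"name": "contig", "type": "string"},
    {"name": "start", "type": "long"},
    {"name": "mapping_quality", "type": "int"},
    {"name": "duplicate", "type": "boolean"}
  ]
}
"""

# Map the Avro primitive type names used above to Python types.
PYTHON_TYPES = {"string": str, "long": int, "int": int, "boolean": bool}


def conforms(record, schema):
    """Return True if record has exactly the schema's fields with matching primitive types."""
    fields = schema["fields"]
    if set(record) != {f["name"] for f in fields}:
        return False
    for f in fields:
        value = record[f["name"]]
        expected = PYTHON_TYPES[f["type"]]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if expected is int and isinstance(value, bool):
            return False
        if not isinstance(value, expected):
            return False
    return True


schema = json.loads(SCHEMA_JSON)
good = {"read_name": "r1", "contig": "chr1", "start": 100,
        "mapping_quality": 60, "duplicate": False}
bad = dict(good, start="100")  # start given as a string instead of a long

print(conforms(good, schema))  # True
print(conforms(bad, schema))   # False
```

Because the schema is just data, any tool or language that can read it knows how the records are laid out.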

Read more…

Ticking all the boxes for a health care upgrade at Strata Rx

What is needed for successful reform of the health care system?

Here’s what we all know: a data-rich health care future is coming our way, and we know in broad outline what it will look like. Health care reformers have learned that no single practice will improve the system. All of the following, which were discussed at O’Reilly’s recent Strata Rx conference, must fall into place.

Read more…

Denny Ausiello discusses phenotypes, pathways, and stratification

A video interview with Colin Hill

Last month, Strata Rx Program Chair Colin Hill, of GNS Healthcare, sat down with Dr. Dennis Ausiello, Jackson Professor of Clinical Medicine at the Harvard Medical School, Co-Director at CATCH, Pfizer Board of Directors Member, and Former Chief of Medicine at the Massachusetts General Hospital (MGH), for a fireside chat at a private reception hosted by GNS. Their insightful conversation covered a range of topics that all touched on or intersected with the need to create smaller and more precise cohorts, as well as the need to focus on phenotypic data as much as we do on genotypic data.

The full video appears below.

Read more…

Podcast: emerging technology and the coming disruption in design

Design's role in genomics and synthetic biology, robots taking our jobs, and scientists growing burgers in labs.

On a recent trip to our company offices in Cambridge, MA, I was fortunate enough to sit down with Jonathan Follett, a principal at Involution Studios and an O’Reilly author, and Mary Treseler, editorial strategist at O’Reilly. Follett is currently working with experts around the country to produce a book on designing for emerging technology. In this podcast, Follett, Treseler, and I discuss the magnitude of the coming disruption in the design space. Some tidbits covered in our discussion include:

And speaking of that lab burger, here’s Sergey Brin explaining why he bankrolled it:

Subscribe to the O’Reilly Radar Podcast through iTunes, SoundCloud, or directly through our podcast’s RSS feed.

Genomics and the Role of Big Data in Personalizing the Healthcare Experience

Increasingly available data spurs organizations to make analysis easier

This article was written with Ellen M. Martin and Tobi Skotnes. Dr. Feldman will deliver a webinar on this topic on September 18 and will speak about it at the Strata Rx conference.

Genomics is making headlines in both academia and the celebrity world. With intense media coverage of Angelina Jolie’s recent double mastectomy after genetic tests revealed that she was predisposed to breast cancer, genetic testing and genomics have been propelled to the front of many more minds.

In this new data field, companies are approaching the collection and analysis of data, and its conversion into usable information, from a variety of angles.
Read more…

Podcast: George Church on genomics

"Like a spaceship that was parked in our back yard"

A few weeks ago some of my colleagues and I recorded a conversation with George Church, a Harvard University geneticist and one of the founders of modern genomics. In the resulting podcast, you’ll hear Church offer his thoughts on the coming transformation of medicine, whether genes should be patentable, and whether the public is prepared to deal with genetic data.

Here’s how Church characterizes the state of genomics:

It’s kind of like ’93 on the Web. In fact, in a certain sense, it’s more sophisticated than electronics because we have inherited three billion years of amazing technology that was just like a spaceship that was parked in our back yard and we’re just reverse-engineering and probably not fully utilizing even the stuff that we’ve discovered so far.

A few other helpful links:

On this podcast from O’Reilly Media: Tim O’Reilly, Roger Magoulas, Jim Stogdill, Mike Loukides, and Jon Bruner. Subscribe to the O’Reilly Radar podcast through iTunes or SoundCloud, or directly through our podcast’s RSS feed.

Genomics and Privacy at the Crossroads

Would you let people know about your dandruff problem if it might mean a cure for Lupus?

Two weeks ago, I had the privilege to attend the 2013 Genomes, Environments and Traits conference in Boston, as a participant in Harvard Medical School’s Personal Genome Project. Several hundred of us attended the conference, eager to learn what new breakthroughs might be in the works using the data and samples we have contributed, and to network with the researchers and each other.

The Personal Genome Project (PGP) is a very different type of beast from the traditional research study model, in several ways. To begin with, it is an Open Consent study, which means that all the data that participants donate is available for research by anyone without further consent by the subject. In other words, now that I have consented to participate in the PGP, anyone can download my genome sequence, look at my phenotypic traits (my physical characteristics and medical history), or even order some of my blood from a cell line that has been established at the Coriell biobank, and they do not need to gain specific consent from me to do so. By contrast, in most research studies, data and samples can only be collected for one specific study, and for no other purpose. This is all in an effort to protect the privacy of the participants, which was famously violated in the establishment of the HeLa cell line.

The other big difference is that in most studies, the participants rarely receive any information back from the researchers. For example, if the researcher does a brain MRI to gather data about the structure of a part of your brain, and sees a huge tumor, they are under no obligation to inform you about it, or even to give you a copy of the scan. This is because researchers are not certified as clinical laboratories, and thus are not authorized to report medical findings. This makes sense, to a certain extent, with traditional medical tests, as the research version may not be calibrated to detect the same things, and the researcher is not qualified to interpret the results for medical purposes.

Read more…

Four short links: 6 September 2012

Human Genome Doxed, Programmed by Movies, CritterDrones, and Responsive Websites

  1. ENCODE Project — International project (headed by Ewan Birney of BioPerl fame) doxes the human genome, bigtime. See the Nature piece, and Ed Yong’s explanation of the awesome for more. Not only did they release the data, but also the software, including a custom VM.
  2. 5 Ways You Don’t Realize Movies Are Controlling Your Brain — this! is! awesome!
  3. RC Grasshoppers — not a band name, an Israeli research project funded by the US Army, to remotely control insects in flight. Instead of building a tiny plane whose dimensions would be measured in centimeters, the researchers are taking advantage of 300 million years of evolution.
  4. enquire.js — small JavaScript library for building responsive websites. (via Darren Wood)