Google Extends Its Cloud-based Genome Data Set Business


This summer, Google formed a partnership with Autism Speaks, a US-based autism research foundation, that resulted in Google hosting 10,000 fully sequenced genomes on its cloud infrastructure. The partnership was formed to reduce the lag time researchers were experiencing when they tried to share newly sequenced genetic data. With a single fully sequenced genome filling a 100-gigabyte disk, researchers who need to share hundreds of genetic samples tend to rely on conventional mail rather than trying to transfer the information over the web. The financial arrangements behind this deal were never disclosed.

Now, Google is announcing a similar partnership with the Institute for Systems Biology, a Seattle-based cancer research foundation. ISB has been pioneering genetic research for the Cancer Genome Atlas for the past five years. The Cancer Genome Atlas is a collaborative project that aims to explore the link between genetic abnormalities and various types of cancers. This data set also holds 10,000 samples, which researchers have no easy way of sending to new collaborators other than mailing them.

As with Autism Speaks, Google has been contracted to take the massive data set and move it onto the cloud, where researchers can simultaneously access and add to it. Unlike the Autism Speaks deal, the financial details of this partnership have been disclosed. The Institute for Systems Biology, in conjunction with Google and a supporting IT company called SRA International, will receive $6.5 million through an NIH grant to move the data set to the cloud and host it for two years. The project is being called the Cancer Genomics Cloud, or CGC. Beyond hosting the genetic data on the cloud, Google will also provide the computing power required to analyze the massive data sets.

“Cancer researchers will be able to analyze and explore entire cohorts of rich genomic data, without needing access to a large local compute cluster. The CGC will also facilitate collaborative research by allowing scientists to work on common datasets and projects in a cloud environment.” – Dr. Ilya Shmulevich, professor at ISB and CGC prime investigator.

As genetic research makes its way into specialized medicine, researchers struggling to collaborate on extremely large data sets seem to be moving to the cloud to resolve some of the logistical issues that come up with big data research. Google has been the dominant vendor of choice for hosting thus far. For a company whose own founder has publicly lambasted the healthcare industry as overregulated and a “painful business to be in,” there certainly seems to be increasing momentum in Google’s healthcare initiatives.
