Doorway to cancer data

Precision medicine is opening the doorway to cancer data and offering hope to cancer patients. The power of genomics and the masses of data it creates are transforming cancer research and allowing personalised treatments with more proven effects.

Like hundreds of other cancer researchers, Mark Ragan and his team at The University of Queensland’s Institute for Molecular Bioscience (IMB) need to design experiments based on data from human and cancer genetics. Using data chips and next-generation sequencing, they must assemble their genetic data, interpret it by comparison with other samples to understand which genes the data refer to, and then classify patients’ cancers into subtypes. If they can’t match a cancer to an existing subtype, they identify a new one. Ragan says this intensive work requires access to as much genetic data as possible.

Luckily, there are portals with this type of data. One of the first to start collecting cancer genome data was The Cancer Genome Atlas (TCGA). The initials TCGA also make up the four-letter code of nucleotide bases thymine, cytosine, guanine and adenine that DNA uses to ‘write’ genetic information.

Doorway to cancer data
Photo by Richard Ricciardi.

TCGA was started by the US National Institutes of Health (specifically the National Cancer Institute and the National Human Genome Research Institute) in 2006. Ragan says its initial goal was to generate data from researchers across research institutions on two cancer types. Early success expanded the initial goal to collect and profile more than 10,000 samples from over 20 tumour types. While the sample collection phase ended in 2013, data reuse ensures the data generated from those samples are still being analysed. Over 2700 papers have been published using TCGA data so far, including by Australian researchers.

The data portal for the TCGA is “amazing”, says Ragan. “It’s a really powerful portal that lets you ask questions and interrogate gigantic amounts of cancer genome data, including sequences, survival rates and subtype classifications.”

“Just about everything in it is open access, and the raw data, which isn’t open access, is made available by applying through research institutions’ ethics committees.”

A newer initiative inspired by the success of TCGA, the International Cancer Genome Consortium (ICGC), is an international project in which Ragan’s colleagues play a part. ICGC is built on the TCGA project, which provides about 60% of the patient data in ICGC’s Data Coordination Center. ICGC aims to cover 50 tumour types and currently funds 78 international cancer genome projects like the Australian project at IMB.

“Our research into breast cancer subtypes and survival would literally be impossible without the data reuse that TCGA and other genome research programs offer. We can tell if we’ve discovered a new cancer subtype or not, or even whether the existing data need reinterpreting,” says Ragan.


New treatments

Knowing a patient’s cancer subtype allows more tailored, evidence-based treatment, potentially increasing survival rates and quality of life by allowing clinicians to more confidently focus on prescribing the drugs most likely to succeed for a particular patient.

One of the exciting things Ragan and other researchers are finding from the data is that some quite different cancer types have a similar genetic basis. This means drugs to treat one type of cancer, such as breast cancer, could be used for another, such as ovarian cancer.

“Instead of waiting 10 years for a new drug to be developed, patients may be able to be treated straight away with a drug that’s already available for another cancer,” says Ragan.

That’s good news for patients, and it also makes drug development, which can cost hundreds of millions of dollars per drug, more cost-effective. This potentially creates a larger market for a given drug, and makes some drugs financially viable that otherwise wouldn’t get to market.

Story provided by Refraction Media.

Originally published in Share, the newsletter magazine of the Australian National Data Service (ANDS).

One small step for open data…

NASA has a plan. Not one, in this case, about spaceships and astronauts, but something far more ‘down to earth’: open data. The organisation’s Plan for Increasing Access to the Results of Scientific Research was first published in late 2014, laying out NASA’s commitment to open up its datasets for international reuse. Full implementation of the plan is set to be in place from October 2015.

The plan aims, in NASA’s words, to “ensure public access to publications and digital data sets arising from NASA research, development, and technology programs”.

Done properly, opening up complex data sets for public analysis and reuse can lead to new and exciting discoveries, sometimes by those with nothing more than a keen amateur interest in (or perhaps obsession with) the topic.

NASA is fully aware of this potential. It says it wants to support researchers to make new findings based on its data, not just in the US but around the globe. As if to prove the point, NASA’s Data Stories website highlights a number of case studies of people reusing its datasets in original applications, such as a ‘Solar System Simulator’ created by Canadian website developer Martin Vezina.

NASA also knows it needs to show commitment to scientific integrity and the accuracy of its research data and wants to encourage others to do the same. So by publishing its own datasets, NASA’s team are setting a benchmark for researchers hoping to grab a slice of the organisation’s annual research investment – a whopping US$3 billion. A condition of funding those research contracts, outlined in the 2014 document, is that researchers must develop their own data management plans describing how they will provide access to their scientific data in digital format. One small step for open data, one giant leap for new scientific discovery?


How public data is being reused: The Australian Survey of Social Attitudes

The Australian Survey of Social Attitudes (AuSSA) is the main source of data for the scientific study of the social attitudes, beliefs and opinions of the nation.

It measures how those attitudes change over time as well as how they compare with other societies, which helps researchers better understand how Australians think and feel about their lives. Similar surveys are run in other countries, meaning data from AuSSA also allows us to compare Australia with countries all over the world.

Access to the AuSSA data has allowed independent researchers to explore changes in social attitudes in Australia over time. For example, Andrew Norton (now at the Grattan Institute in Melbourne) has analysed AuSSA to examine changes in attitudes towards same-sex relationships between 1984 and 2009, noting the major shifts in favour of same-sex relationships during that period.

AuSSA is often used as a reference point for public policy debate. A number of media articles have been based on its findings, discussing topics as diverse as climate change, the welfare state and the kindness of Australians.

Similarly, Australian Policy Online includes 18 different papers making use of AuSSA, including papers on perceptions of democracy, population growth, cultural identity and tax policy.

AuSSA datasets can be accessed via its website.

With thanks to Steve McEachern, Director of the Australian Data Archive at Australian National University.


Story provided by Refraction Media.

Originally published in Share, the newsletter magazine of the Australian National Data Service (ANDS).

Featured image source (above): NASA.