IN THE HEALTH SECTOR, big data has been harnessed with remarkable success. One high-profile example is Google’s Flu Trends website, described in a 2009 paper in the journal Nature as accurately predicting the spread of flu epidemics from the frequency of disease-related search queries.
Associate Professor Trish Williams, who heads the eHealth Research Group at Edith Cowan University in Joondalup, WA, says that unlike a lot of health research, projects using big data don’t focus on ‘cause and effect’. Instead, they tap into the huge potential of predictive analytics.
That’s an area where collaborative research can come to the fore, she says. Williams adds that big data research is most effective when done by cross-disciplinary teams who can both interpret information and present the findings to a broad audience.
“In health, it is really important that the semantics of the data are well-understood before you start analysing things,” she says. “You’ve also got to work out how to use some very big datasets, perhaps in ways that they weren’t necessarily intended to be used.”
This conundrum is very familiar to Associate Professor Jane Burns, CEO of the Young and Well CRC. When her team compared the results of a national survey that used ‘traditional’ computer-assisted telephone interviews with those from a similar Facebook survey, they expected both datasets would reveal similar trends.
“We found that the results were not similar at all; the internet results showed far higher levels of psychological distress,” she says, adding that there’s no sure way to work out which survey style had less bias. “Possibly, people are far more honest over the internet than they are over a telephone interview.”
Research into suicide indicators in social media is in its early stages, with researchers from the Young and Well CRC working with key industry partners such as Facebook, Twitter and Google.
“We’re trying to understand from a suicide prevention perspective, how we might be able to use big data to understand trends in the way in which people respond to things, to see if we can look to algorithms to capture some of the risks,” says Burns.
With more than 500 million short messages going out through the Twitter network daily, Burns says that finding algorithms to uncover keywords for suicide risk is a huge challenge.
Included in the research is suicide contagion – where one suicidal act within a community increases the likelihood of more occurring. Burns says that, alongside identifying early warning signs, a key focus of the research on suicide contagion is initiating support networks.
Within the Young and Well CRC, Associate Professor Rafael Calvo of the University of Sydney is working to design tools that help moderators in online health-focused communities, such as youth mental health support service ReachOut.com, to provide appropriate feedback and support for their members.
Thousands of forum posts can be automatically processed, generating a report that prioritises more serious problems so moderators can respond immediately. The team has also developed suggested ‘intervention’ templates, which link to helpful resources.
“We have built the interface for the moderator, and we’re now working on improving the algorithms that detect what kind of problem the person has,” Calvo says.
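The kind of prioritised report described above can be sketched in a few lines of Python. This is an illustrative toy only, not the Young and Well CRC's system: the category names, keyword lists and severity weights are invented for the example, and a production tool would more plausibly rely on trained classifiers than on keyword matching.

```python
import re
from dataclasses import dataclass

# Hypothetical categories: (severity weight, trigger keywords).
# Both the weights and the keywords are made up for illustration.
CATEGORIES = {
    "crisis": (3, {"hopeless", "self-harm"}),
    "distress": (2, {"anxious", "panic", "lonely"}),
    "general": (1, {"exam", "sleep"}),
}


@dataclass
class Flagged:
    post_id: int
    category: str
    severity: int


def classify(text):
    """Return the most severe (category, severity) whose keywords appear in the text."""
    words = set(re.findall(r"[a-z-]+", text.lower()))
    best = ("none", 0)
    for name, (severity, keywords) in CATEGORIES.items():
        if words & keywords and severity > best[1]:
            best = (name, severity)
    return best


def build_report(posts):
    """Map {post_id: text} to a list of flagged posts, most severe first."""
    flagged = []
    for post_id, text in posts.items():
        category, severity = classify(text)
        if severity > 0:
            flagged.append(Flagged(post_id, category, severity))
    # Sort by descending severity; break ties by post id.
    return sorted(flagged, key=lambda f: (-f.severity, f.post_id))
```

A moderator-facing interface could then list the entries returned by `build_report` from top to bottom, so the posts judged most serious are seen first.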
Social media is just one source of big data in health. At the CRC for Mental Health, researchers are looking for biomarkers – measurable biological indicators that might enable early intervention for people at risk of Alzheimer’s disease, mood disorders, schizophrenia and Parkinson’s disease. Datasets include the Australian Imaging, Biomarker & Lifestyle Flagship Study of Ageing, which has genomic information for more than 1500 people – some with normal cognitive function, others with mild cognitive impairment and others who have been diagnosed with Alzheimer’s disease.
Dr Noel Faux, a bioinformatician at the Florey Department of Neuroscience and Mental Health, says that the vast amounts of information already available include blood measurements of thousands of hormones and proteins. Cognitive and clinical assessments are also being gathered.
His team is working with software developer Arcitecta to help researchers capture clinical data on-site and feed it into a data repository that can be used by multiple research institutions.
HealthTracks, a web-based tool built by the CRC for Spatial Information, has been used by researchers at Western Australia’s Department of Health to merge health data with spatially based datasets. The aim is to identify populations at risk of disease and gaps in the location of essential health services.
So far, hospital and regional health data has been combined with public datasets via the WA Landgate Shared Land Information Platform. When rolled out nationally, the tool will include modular enhancements for the analysis of mental health, child health and environmental health data.
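The general idea of combining health data with location data to find service gaps can be illustrated with a minimal Python sketch. It does not reflect how HealthTracks itself works: the function names, the input layout (region centroids with a disease rate, plus clinic coordinates) and the thresholds are all assumptions made for the example.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def service_gaps(regions, clinics, rate_threshold, max_km):
    """Return names of regions with a high disease rate and no clinic nearby.

    regions: {name: ((lat, lon) centroid, rate)}  (hypothetical layout)
    clinics: non-empty list of (lat, lon) service locations
    """
    gaps = []
    for name, (centroid, rate) in regions.items():
        nearest = min(haversine_km(centroid, c) for c in clinics)
        if rate >= rate_threshold and nearest > max_km:
            gaps.append(name)
    return sorted(gaps)
```

Here a region counts as a gap only when both conditions hold, which mirrors the two aims mentioned above: finding at-risk populations and finding places where services are missing.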