The following appeared a week or two ago.
July 08, 2011 | Molly Merrill, Associate Editor
BALTIMORE – Twitter can be used to track important health trends, according to computer scientists at Johns Hopkins University.
Mark Dredze and Michael J. Paul fed two billion public tweets posted between May 2009 and October 2010 into computers, then used software to filter out the 1.5 million messages that referred to health matters. Dredze, a researcher at the university’s Human Language Technology Center of Excellence and an assistant research professor of computer science, and Paul, a doctoral student, said identities of the tweeters were not collected.
“Our goal was to find out whether Twitter posts could be a useful source of public health information,” said Dredze. “We determined that indeed, they could. In some cases, we probably learned some things that even the tweeters’ doctors were not aware of, like which over-the-counter medicines the posters were using to treat their symptoms at home.”
By sorting these health-related tweets into electronic “piles,” Dredze and Paul uncovered patterns about allergies, flu cases, insomnia, cancer, obesity, depression, pain and other ailments. “There have been some narrow studies using Twitter posts, for example, to track the flu,” said Dredze. “But to our knowledge, no one has ever used tweets to look at as many health issues as we did.”
In addition to finding a range of health ailments in Twitter posts, the researchers were able to record many of the medications that ill tweeters consumed, thanks to posts such as: “Had to pop a Benadryl … allergies are the worst.”
Other tweets pointed to misuse of medicine. “We found that some people tweeted that they were taking antibiotics for the flu,” said Paul. “But antibiotics don’t work on the flu, which is a virus, and this practice could contribute to the growing antibiotic resistance problems. So these tweets showed us that some serious medical misperceptions exist out there.”
To find the health-related posts among the billions of messages in their original pool, the Johns Hopkins researchers applied a filtering and categorization system they devised. With this tool, computers can be taught to disregard phrases that do not really relate to one’s health, even though they contain a word commonly used in a health context.
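The article does not describe how the Johns Hopkins system works internally, so the following is only a toy sketch of the general idea: keep tweets that mention a health word, but first discard phrases that use such a word figuratively. The word lists and the function name `is_health_related` are hypothetical, chosen for illustration, and a real system would more likely learn these distinctions from labeled examples than rely on hand-written rules.

```python
import re

# Hypothetical vocabulary, chosen only for illustration.
HEALTH_TERMS = {"flu", "fever", "allergies", "headache", "sick"}

# Hypothetical phrases that contain a health word but are not about health.
NON_HEALTH_PHRASES = ("bieber fever", "sick of", "sick beat")


def is_health_related(tweet: str) -> bool:
    """Return True if the tweet mentions a health term in a literal sense."""
    text = tweet.lower()
    # Blank out known figurative phrases so their words cannot match below.
    for phrase in NON_HEALTH_PHRASES:
        text = text.replace(phrase, " ")
    words = set(re.findall(r"[a-z]+", text))
    return bool(words & HEALTH_TERMS)


tweets = [
    "Had to pop a Benadryl ... allergies are the worst",
    "I have Bieber fever",
    "So sick of Mondays",
    "Home with the flu and a fever",
]
health_tweets = [t for t in tweets if is_health_related(t)]
```

Here `health_tweets` keeps the first and last messages and drops the two figurative ones. Hand-written phrase lists like this do not scale to billions of messages, which is presumably why the researchers describe teaching computers to make the distinction.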
Once the unrelated tweets were removed, the remaining results provided some surprising findings. “When we started, I didn’t even know if people talked about allergies on Twitter,” said Paul. “But we found out that they do. And there was one thing I didn’t expect: The system found two different types of allergies – the type that causes sniffling and sneezing and the kind that causes skin rashes and hives.”
In about 200,000 of the health-related tweets, the researchers were able to draw on user-provided public information to identify the geographic state from which the message was sent. That allowed them to track some trends by time and place, such as when the allergy and flu seasons peaked in various parts of the country. “We were able to see from the tweets that the allergy season started earlier in the warmer states and later in the Midwest and the Northeast,” said Dredze.
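As a rough illustration of tracking trends by time and place, the sketch below counts tweets per state and month and reports the busiest month for each state. The records are invented placeholders, not data from the study, and the function name `peak_month_by_state` is hypothetical.

```python
from collections import Counter, defaultdict

# Invented placeholder records: (state, "YYYY-MM", ailment label).
records = [
    ("TX", "2010-03", "allergies"),
    ("TX", "2010-03", "allergies"),
    ("TX", "2010-04", "allergies"),
    ("OH", "2010-04", "allergies"),
    ("OH", "2010-05", "allergies"),
    ("OH", "2010-05", "allergies"),
]


def peak_month_by_state(records, ailment):
    """Map each state to the month with the most tweets about the ailment."""
    counts = defaultdict(Counter)
    for state, month, label in records:
        if label == ailment:
            counts[state][month] += 1
    # most_common(1) returns a list holding the single (month, count) pair.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


peaks = peak_month_by_state(records, "allergies")
```

With these placeholder records, `peaks` maps `"TX"` to `"2010-03"` and `"OH"` to `"2010-05"`, the kind of earlier-versus-later pattern Dredze describes.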
First it was Google tracking epidemics based on what people searched for, and now Twitter is offering at least some useful information in the quest to understand how symptoms and diseases spread.
Talk about an ‘unintended outcome’!