Wednesday, November 23, 2016

Data De-Identification Seems To Have Become A Little Controversial. We Need Some Expert Advice Made Available Based On Successful Experience.

These two articles popped up last week.
First we have:

Clear-cut definition of de-identified data critical in legislation: Pilgrim

Australia's Privacy Commissioner has said the de-identification of data is an area requiring regulation, and that agreed industry standards could be useful to fill the public with confidence.
By Asha McLean | November 16, 2016 -- 05:12 GMT (16:12 AEDT) | Topic: Security
A successful data-driven economy needs a strong foundation in privacy, and accordingly, good privacy management and great innovation go hand in hand, Australian Information and Privacy Commissioner Timothy Pilgrim has said.
Speaking at a data sharing and interoperability workshop during the GovInnovate summit in Canberra on Wednesday, Pilgrim said that by and large, people do want their personal information to work for them, provided that they know about it. He also noted that when there is transparency in how personal information is used, citizens should feel a sense of clarity, choice, and confidence that their privacy rights are being respected.
For Pilgrim, building trust with the public is key to the challenges big data presents for organisations, including government, and he highlighted that trust is further challenged by the nature of secondary uses of data.
"Part of the solution, potentially a significant part I suggest, lies in getting de-identification right," he said.
"This includes ensuring that government agencies, regulators, businesses, and technology professionals have a common understanding as to what 'getting it right' means.
"At the moment, that common clarity is not evident."
While Pilgrim said that de-identification can be a smart and contemporary response to the privacy challenges of big data, which he said aims to separate the "personal" from the "information" within data sets, the commissioner highlighted that there was no clear-cut definition of how far-removed personal identifiers needed to be before the dataset is considered de-identified.
"I stress as privacy commissioner that de-identification is not the only approach available to manage the privacy dimensions of big data, but we are keen to explore its potential when done fully and correctly," he said.
"That potential could include the ability to facilitate data sharing between agencies, and unlock policy and service gains of big data innovation, whilst protecting the fundamental human right to privacy.
"That is a great prospect, and one worth pursuing."
The Pilgrim-hosted discussion comes after Australian Attorney-General George Brandis introduced legislation into the Senate last month that criminalises the re-identification of de-identified datasets that are collected and published by the Commonwealth.
"De-identification may prove to be an effective way of protecting the personal information of individuals in large data sets," Pilgrim said. "In doing so, de-identification could support large data-gathering projects by building community confidence that personal information will be protected."
Pilgrim said a common understanding of de-identification standards is yet to be reached, a view shared by all seven of his panel colleagues. However, to Gemma Van Halderen, this is part and parcel of her day-to-day duties at the Australian Bureau of Statistics (ABS) as the GM of Strategy and Partnerships.
Van Halderen is working in an area she calls "official statistics", where de-identification means removing personal identifiers like names or addresses. She said, however, that removing names or addresses is not enough for her business.
"In the statistical land, we actually call that secrecy or confidentiality. In other sectors it's called anonymisation," she said. "In the case of the ABS, we actually not only uphold and respect the Privacy Act, but we also have our own legislation. We also have to protect secrecy ... we actually have this whole gamut of things that we have to do."
Lots more here:
Second we have this:

Is data de-identification a myth?

By Paris Cowan on Nov 16, 2016 9:30PM

Experts lock horns in Canberra.

A schism has opened up between Australian privacy advocates and the research community over what level of risk the public will stomach in pursuit of benefits hyped by open data champions.
The contest threatened to boil over on Wednesday morning in Canberra, where the Office of the Australian Information Commissioner hosted experts to wrestle over the issue of successful data de-identification.
In one corner, cryptologist and privacy champion Dr Vanessa Teague said she was "skeptical" that any method of de-identification exists that could guarantee the safety of sensitive health or welfare data sets.
Teague was the researcher who alerted the Department of Health earlier this year when she found clinician IDs could be extracted from a Medicare claims database she claimed was weakly de-identified.
"It is a myth that we have an algorithm that works," she told the delegation.
But her view was panned by Canadian de-identification expert, Dr Khaled El Emam, who countered that decades of statistical and computer science research have produced sophisticated anonymisation models and risk metrics.
"We have a lot of knowledge about what works and what doesn't work," he said.
The very concept of 'privacy' was thrust into the tug-of-war, as experts on all sides contested what level of risk could feasibly earn the label "safe".
Lots more here:
Reading these two articles, it seems to me that the way we should approach these issues is by no means settled, though it is also clear there is a fair bit of experience and expertise out there.
For me, we need to hasten slowly with these data releases and learn as we move forward, while extracting the benefits that can be obtained from the use of such data.
David.
