9:10 am today

Data anonymisation tool 'difficult' to reverse - IRD

Climate Change Minister Simon Watts outlines the five pillars to the government's climate strategy, 10 July 2024.

Revenue Minister Simon Watts has repeated IRD's claim that the hashing process was irreversible. Photo: RNZ / Nick Monro

Inland Revenue (IRD) rushed through an impact assessment of its advertising strategy to share taxpayers' details with Facebook and other social media platforms last month, within days of telling the Minister it "continuously" reviews its processes to ensure safety.

This was only its second assessment in eight years, with the previous one carried out in 2016.

Both assessments rated the practice as a "medium" risk - although that rating assumed the way IRD anonymised people's details was effective, even though international research has for years shown it is not.

RNZ revealed last month that Inland Revenue has been sharing hundreds of thousands of taxpayers' details - name, date of birth, city, postcode, country, phone number or email address - with Facebook, Instagram, Google and LinkedIn for years to target specific customers with its campaigns over the likes of student loan debt or GST being due.

This was done 30-50 times a month, sometimes covering up to half a million people at a time, it said.

"No financial or tax information is included," an IRD report said.

The department stated its anonymisation tool - called hashing - was fully effective.

Revenue Minister Simon Watts repeated this last month. "This process is irreversible," he told RNZ.

Yet OIA responses have since shown that the IRD's own impact assessment - done on 18 September - did not go this far.

"The hash or fingerprint is difficult to reverse back to the original data," it said.

Ted Linney of Wellington wrote to Watts, asking for a copy of the information the minister relied on.

The response showed that within hours of RNZ running the story, IRD told Watts that it had zero concerns and that hashing was effective.

"Inland Revenue continuously reviews our processes to ensure we're safe."

This was from a two-page briefing to three of Watts' staff, and to IRD's acting commissioner and deputy commissioner, the Linney OIA showed.

At least two people have since written basic programmes to counter the IRD's assertion. They told RNZ they had easily been able to reverse hashing. There are also multiple online tools that do this.
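One way such a lookup tool may work is with a precomputed table, sketched below. It assumes unsalted SHA-256 and a date of birth written as YYYYMMDD; neither detail is confirmed in the IRD documents, and the function names are made up for illustration. Because only a few tens of thousands of birth dates are plausible, every one of them can be hashed in advance and a shared hash simply looked up.

```python
import hashlib
from datetime import date, timedelta


def sha256_hex(value: str) -> str:
    """Return the SHA-256 fingerprint of a string as hex text."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_dob_table(first_year: int = 1900, last_year: int = 2024) -> dict[str, str]:
    """Hash every calendar date in the range and map hash -> original date.

    The YYYYMMDD format is an assumption made for this sketch.
    """
    table = {}
    day = date(first_year, 1, 1)
    end = date(last_year, 12, 31)
    while day <= end:
        text = day.strftime("%Y%m%d")
        table[sha256_hex(text)] = text
        day += timedelta(days=1)
    return table


table = build_dob_table()

# A hash as it might appear in an uploaded list (made-up birth date).
shared_hash = sha256_hex("19700102")

# Reversing it is a single dictionary lookup.
print(table.get(shared_hash))  # 19700102
```

The table only has to be built once, after which any number of hashed birth dates can be reversed immediately.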

Ross Boswell took seconds to de-hash phone numbers.

"These simple examples show that hashing is not securely protecting the personal information I am required by law to provide to Inland Revenue," he told RNZ.

More than 9000 people have used an online tool to find out if IRD shared their details with the social media platforms, since the Taxpayers' Union lobby group set it up last month in response to the RNZ story.

Other individual taxpayers have written to IRD asking about its use of their data, but IRD has replied that it could not tell them what ad campaigns it had included them in.

"Due to the large number of ad campaigns we do to ensure people are aware of their tax obligations and entitlements, it's not reasonably practicable for us to search to see which campaigns you may have been included in," it said.

"We don't hold this information in a way that enables it to be readily retrieved."

IRD had assurances from the big tech companies that they did not use the taxpayers' data in any other way, and was "seeking updated assurances... that the data is not added to user profiles or used in any other way", said a report.

David* told RNZ his next step would be to complain to the Privacy Commissioner and Ombudsman.

"I find it inconceivable that IRD is unable to identify whether my information has been provided to Meta," he said.

There was no option for a taxpayer to opt out of IRD sharing the data.

The tax department paused the practice in mid-September, and began a review.

"The Inland Revenue review is ongoing. No final decisions have been made," it said.

"The creation and uploading of customer lists is still paused."

The department told Watts it undertook continuous review to ensure safety, but it was not clear what that entailed.

The papers showed its officials did a "brief privacy analysis" in 2016, into sharing details with just Facebook.

It did not do another one when it expanded this to include Google, in 2019, according to the documents.

IRD only undertook a second privacy impact assessment on 18 September this year.

The "medium" risk rating in both the 2016 and 2024 assessments noted that some personal information was involved, but the risks could be dealt with satisfactorily. Only a "high" rating would have triggered a mandatory full privacy impact assessment.

The 2016 assessment said IRD was giving people's details to its advertising agency FCB, before the data went to the platforms.

"Currently we provide our advertising agency (FCB) with personal information in the form of a list of email addresses and mobile numbers. This data is then 'hashed'," it said.

The 2024 assessment said Facebook and Google had advised IRD that no data or flag was added to individual user accounts as a result of being in an advertising campaign, and LinkedIn said something similar.

Hashing has become steadily less effective amid huge technology advances that have also enabled the large social media platforms to do more with people's sensitive data.

There was no evidence of IRD doing further reviews to catch up in the interim.

Boswell said he wrote a "simple-minded programme" on his "not-very-powerful laptop" which decoded the hash of his phone number in 48 seconds, his wife's number in three seconds, and his birth date in "less than one-tenth of a second".
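Boswell's programme has not been published, so the following is only a minimal sketch of how a brute-force search of this kind could look. It assumes unsalted SHA-256 over a digits-only phone number; the prefix, the number and the function names are invented for illustration. The programme hashes each candidate number in turn until one matches the target hash.

```python
import hashlib
import time


def sha256_hex(value: str) -> str:
    """Return the SHA-256 fingerprint of a string as hex text."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_number(target_hash: str, prefix: str, suffix_digits: int):
    """Try every number that starts with `prefix` followed by
    `suffix_digits` digits; return the one whose hash matches, else None."""
    for n in range(10 ** suffix_digits):
        candidate = f"{prefix}{n:0{suffix_digits}d}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Made-up number: a hypothetical known prefix plus six unknown digits.
target = sha256_hex("64210" + "012345")

start = time.perf_counter()
found = recover_number(target, prefix="64210", suffix_digits=6)
elapsed = time.perf_counter() - start

print(found)  # 64210012345
print(f"searched in {elapsed:.2f} seconds")
```

Six unknown digits means at most one million candidates. A full mobile number has more unknown digits, so the search is longer, but the time also depends on how early in the search order the number happens to fall.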

"If Meta [Facebook's owner] wanted to know whether other individuals live with us, then all they have to do is check whether the hashes of their addresses match the hash of our address," he said.

"Establishing the fact of cohabitation between individuals may be information that those individuals regard as deeply private."
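The matching Boswell describes does not require reversing anything, because the same input always produces the same hash. A minimal sketch, assuming unsalted SHA-256 over a normalised address string; the records, the normalisation rule and the function names are hypothetical:

```python
import hashlib


def normalise(address: str) -> str:
    """Lower-case the text and collapse repeated whitespace (assumed rule)."""
    return " ".join(address.lower().split())


def hashed(address: str) -> str:
    """Return the SHA-256 fingerprint of the normalised address."""
    return hashlib.sha256(normalise(address).encode("utf-8")).hexdigest()


# Invented records: two people at one address, one person elsewhere.
records = {
    "person_a": "1 Example Street, Wellington",
    "person_b": "1  example street, wellington",
    "person_c": "2 Example Street, Wellington",
}

# Group people by hash; nobody needs to know the address itself.
by_hash = {}
for person, address in records.items():
    by_hash.setdefault(hashed(address), []).append(person)

shared = [people for people in by_hash.values() if len(people) > 1]
print(shared)  # [['person_a', 'person_b']]
```

The same comparison would work for any field two people share, such as a landline number.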

The department told complainants that hashed data that was not matched on a platform was deleted, and any hashed data that did find a match was "automatically deleted" right after that match.

"All matched and non-matched hashes are deleted from the social media platform's servers."

It has previously said it reviewed and approved each platform's internal privacy principles, while at the same time noting it did not upload customised audience lists to "other social media such as X (formally known as Twitter) or TikTok". The papers suggest it relied primarily on a 2013 review of Facebook's controls by consultants PWC.

During the "pause", the department has been working with the Office of the Privacy Commissioner (OPC).

IRD would not say if that had identified any issues with hashing.

"When the review is complete it will be made public, along with any decisions Inland Revenue makes as a result. We expect this to be in the next few weeks," the department said in a statement.

The OPC said it was waiting on the outcome of IRD's review into hashing. "We can't say much else at this point."

* RNZ agreed not to use David's surname.
