The Canadian Government is moving towards treating its data as “open by default”. As an exception to this default, data that is “personal information” must be removed or masked before being disclosed as open data to any third party. However, these steps are not always enough to protect privacy, and individuals can be reidentified from the data after it is released. This FAQ examines this conflict between privacy and open data.

Open Data

Open Data and Privacy

Privacy Risks of Open Data

Resources

Non-profits

“The Open Definition”. This site sets out what openness means for data and content.

“Canadian Open Data Experience”. This website provides some detail on the annual Canadian hackathon organized to encourage innovation through use of the government's open data sets.

WhatDoTheyKnow. This website enables members of the public to request information held by the government.

Patrick Meier, “How Crisis Mapping Saved Lives in Haiti” National Geographic Emerging Explorers (2 July 2012), online: National Geographic.

“What is College Abacus”, College Abacus. The nonprofit organization College Abacus gives prospective college students the ability to compare tuition and associated costs between different American colleges using open data.

Academics

Ira Rubinstein and Woodrow Hartzog, “Anonymization and Risk” (2015) (New York University Public Law and Legal Theory Working Papers, Paper 15-36). This paper outlines several variables that affect the risk level of a company’s data sets, including volume, sensitivity, class of recipient, and use of data.

Barbara Ubaldi, “Open Government Data: Towards Empirical Analysis of Open Government Data Initiatives” (2013) OECD Working Papers on Public Governance. This article discusses several applications for open government data, as well as privacy, anonymization and reidentification.

Arvind Narayanan, Joanna Huey, Edward W Felten, “A Precautionary Approach to Big Data Privacy” (19 March 2015) Computers, Privacy, and Data Protection, at 7.  Among other things, this article discusses what anonymized data is, reidentification, and risks of privacy breaches.

Arvind Narayanan and Vitaly Shmatikov, “Privacy and security: Myths and Fallacies of ‘Personally Identifiable Information’” (2010) 53(6) Communications of the ACM 24. The article helps clarify how PII is defined. Narayanan and Shmatikov look at the different ways various bodies define PII.

Paul Ohm, “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization” (2010) 57 UCLA Law Review 1701. Ohm looks at the failings of anonymization. In one passage, the article observes that “data can either be useful or perfectly anonymous, but never both.”

Teresa Scassa, “Privacy and Open Government”, (2014) 6 Future Internet 2. Scassa looks at the tensions between an open, transparent government, and the public’s interest in maintaining a level of privacy.

Michael Zimmer, “‘But the Data is Already Public’: On the Ethics of Research in Facebook” (2010) 12 Ethics Inf Technol 313. Zimmer describes an incident in which the identities behind Facebook profiles were compromised through reidentification.

Government Sources

House of Commons, Standing Committee on Government Operations and Estimates, Open Data: The Way of the Future, (June 2014), 41st Parliament, 2nd Sess.  This report discusses open data in the Canadian government and recommendations for protecting confidential information.

Office of the Privacy Commissioner of Canada, Interpretation Bulletins. The Office of the Privacy Commissioner of Canada publishes bulletins providing its non-binding interpretation of court decisions and findings related to PIPEDA.

Independent Sources

FOIMonkey, “Data Loss Incidents” Blog: WilmSlowFilmFanatics (7 June 2015), online: WilmSlowFilmFanatics < https://wilmslpwfilmfanatics.wordpress.com >. This blog post details the release of personal information by the UK government on the WhatDoTheyKnow (WDTK) website.

Tom Slee, “Data Anonymization and Re-Identification: Some Basics of Data Privacy” Whimsley (26 September 2011). < https://tomslee.net >. Among other topics, Slee discusses different techniques used to anonymize data.

Joel Gurin, “Secrets of Sentiment Analysis” Open Data Now (24 October 2013).  This article discusses the uses of sentiment analysis to derive value from publicly available tweets, reviews, etc.

Linda Tischler, “He struck gold on the Net (really)” Fast Company (31 May 2002). This article discusses Goldcorp’s plan to make its geological survey data publicly available in an effort to improve its gold extraction efficiency.

James Manyika, Michael Chui, Diana Farrell, Steve Van Kuiken, Peter Groves, and Elizabeth Almasi Doshi, “Open data: Unlocking innovation and performance with liquid information” (2013), McKinsey Global Institute Report, at 14, online: < http://www.mckinsey.com/ >. This article discusses how nonprofit groups combined open data from numerous sources to organize relief efforts following the earthquake in Haiti.