PII (personally identifiable information) is data that, when associated with other data, identifies a single person. Obvious examples include a full name, Social Security number, driver’s license number, birth date, and birthplace.
PII is nothing new. Before spam, companies would buy massive directories of phone numbers and mailing addresses and use them to solicit people to buy their products. Today, companies, individuals, and attackers can simply scour the Internet for data they can correlate with other data, which in turn leads to PII. The issue with PII is not so much that it can be associated with a single person as the ease with which it can be obtained and correlated with other data.
PII is difficult to define fully, because almost any dataset can lead to PII discovery. For example, data provided by health insurance companies to employers, even in aggregate, can disclose PII when joined with employee rosters. Suppose an employee decides to join the company’s wellness program. One of the criteria for joining is providing biometric data (e.g., blood pressure and blood glucose readings). On its own, the biometric data is not considered PII, because it is nothing more than numbers describing the health of anonymous employees. It is when the biometric readings are joined with other data (such as gender, age, and race) that they become PII. Knowing that a 43-year-old White male has a blood pressure reading of 165/85 could be enough for a company to know which employee may have high blood pressure, if there is only one 43-year-old White male in the entire company.
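The wellness-program scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names, and the `reidentify` helper are all hypothetical and not drawn from any real dataset or library. It shows how readings with no names attached become attributable once they are joined to a roster on gender, age, and race.

```python
from collections import defaultdict

# Hypothetical wellness-program readings with names removed.
readings = [
    {"age": 43, "gender": "M", "race": "White", "blood_pressure": "165/85"},
    {"age": 31, "gender": "F", "race": "Asian", "blood_pressure": "118/76"},
    {"age": 31, "gender": "F", "race": "Asian", "blood_pressure": "122/80"},
]

# Hypothetical employee roster held by the employer.
roster = [
    {"name": "Employee A", "age": 43, "gender": "M", "race": "White"},
    {"name": "Employee B", "age": 31, "gender": "F", "race": "Asian"},
    {"name": "Employee C", "age": 31, "gender": "F", "race": "Asian"},
    {"name": "Employee D", "age": 52, "gender": "M", "race": "Black"},
]

# Attributes shared by both datasets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("age", "gender", "race")


def key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(readings, roster):
    """Return {name: reading} for readings that match exactly one employee."""
    names_by_key = defaultdict(list)
    for employee in roster:
        names_by_key[key(employee)].append(employee["name"])

    matches = {}
    for reading in readings:
        candidates = names_by_key.get(key(reading), [])
        if len(candidates) == 1:  # a unique match ties the reading to a person
            matches[candidates[0]] = reading["blood_pressure"]
    return matches


exposed = reidentify(readings, roster)
print(exposed)  # {'Employee A': '165/85'}
```

The two 31-year-old employees are not exposed in this sketch because they share the same attributes, which is the same reason the 43-year-old is exposed: he is the only employee with his combination.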
The Department of Homeland Security (DHS) attempts to define PII as: “any information that permits the identity of an individual to be directly or indirectly inferred, including any information which is linked or linkable to that individual regardless of whether the individual is a U.S. citizen, lawful permanent resident, visitor to the U.S., or employee or contractor to the Department” (“Handbook for Safeguarding Sensitive Personally Identifiable Information”, p. 6). In other words, almost any data, when coupled with the right identifying attributes, can be tied to a specific person.
The Electronic Frontier Foundation (EFF) points out in a September 11, 2009 article that data, no matter how insignificant it may appear, can lead to PII disclosure. The example given in the article shows that a ZIP code and birth date were all that was needed to associate the data with a single person.
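One way to see why such ordinary fields matter is to count how many records are unique on a given combination of attributes. The sketch below is a minimal illustration, not the method used in the EFF article: the records, the field names, and the `unique_fraction` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records with names already removed.
records = [
    {"zip": "02139", "birth_date": "1966-07-14", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1981-03-02", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1981-03-02", "diagnosis": "diabetes"},
    {"zip": "60614", "birth_date": "1975-11-30", "diagnosis": "migraine"},
]


def unique_fraction(records, fields):
    """Fraction of records whose combination of `fields` appears exactly once."""
    if not records:
        return 0.0
    combos = [tuple(record[field] for field in fields) for record in records]
    counts = Counter(combos)
    unique = sum(1 for combo in combos if counts[combo] == 1)
    return unique / len(records)


zip_only = unique_fraction(records, ["zip"])
zip_and_birth = unique_fraction(records, ["zip", "birth_date"])

print(zip_only)       # 0.25
print(zip_and_birth)  # 0.5
```

Each added attribute can only split groups further, so the share of records that point to exactly one person never goes down as more fields are combined.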
Disclosure of PII to people or entities without the owner’s knowledge or consent may be illegal, depending on the circumstances. However, PII is also how one person is distinguished from another. For example, without some PII being disclosed, how can someone tell one John Doe from another John Doe on a social networking site? Without PII disclosure in some form, no one can.
Disclosure of PII must be a decision made jointly by the owner of the PII and the consumer of the PII. The owner has an absolute right to be told why the PII is being requested and how it will be used. It is the responsibility of the PII consumer (government, company, etc.) to disclose fully not only why the PII is being requested, but also how it is stored and protected from unauthorized disclosure. For PII such as biometric data, this is covered by laws governing the use, storage, and protection of PII (such as the Health Insurance Portability and Accountability Act, or HIPAA). For data shared on social media platforms (such as Facebook), the requesting entity’s disclosures may not be as evident. When in doubt, ask the requesting entity why the PII is being requested and how it will be stored and protected. If the answers are unsatisfactory, choose not to disclose the PII to the requester.