NOTE: While this is obvious to most people, I am restating this here for additional emphasis: this is my personal blog, and only represents my personal opinions. In this space, I am only writing for myself. END NOTE.
I am going to begin this post with a shocking, outrageous, hyperbolic statement: privacy policies are difficult to read.
Shocking. I know. Take a moment to pull yourself up from the fainting couch. Even Facebook doesn't read all the necessary terms. Policies are dense, difficult to parse, and in many cases appear to be overwhelming by design.
When evaluating a piece of technology, "regular" people want an answer to one simple question: how will this app or service impact my privacy?
It's a reasonable question, and this process is designed to make it easier to get an answer to that question. When we evaluate the potential privacy risks of a service, good practice can often be undone by a single bad practice, so the art of assessing risk is often the art of searching for the poison pill.
To highlight that this process is focused on surfacing risks rather than on completeness, I'm calling it Privacy Postcards, or Poison Pill Privacy. It is not designed to be comprehensive, at all. Instead, it is designed to highlight potential problem areas that impact privacy. It's also designed to be straightforward enough that anyone can do it. Various privacy concerns are broken down, and each includes keywords that can be used to find relevant text in the policies.
To see an example of what this looks like in action, check out this example. The rest of this post explains the rationale behind the process.
If anyone reading this works in K12 education and you want to use this with students as part of media literacy, please let me know. I'd love to support this process, or just hear how it went and how the process could be improved.
1. The Process
Collect some general information about the service under evaluation.
- Name of Service:
- Android App:
- Policy Effective Date:
Pull a screenshot of selected app permissions from the Google Play store. The iOS store from Apple does not support the transparency that is implemented in the Google Play store. If the service being evaluated does not have a mobile app, or only has an iOS version, skip this step.
The listing of app permissions is useful because it highlights some of the information that the service collects. The listing of app permissions is not a complete list of what the service collects, nor does it provide insight into how the information is used, shared, or sold. However, the breakdown of app permissions is a good tool to use to get a snapshot of how well or poorly the service limits data collection to just what is needed to deliver the service.
Accessing contacts from a phone or address book is one way that we can compromise our own privacy, and the privacy of our friends, family, and colleagues. This can be especially true for people who work in jobs where they have access to sensitive or privileged information. For example, if a therapist had contact information of patients stored in their phone and that information was harvested by an app, that could potentially compromise the privacy of the therapist's clients.
Keywords: contact, friend, list, access
Virtually every service in the US needs to comply with law enforcement requests, should they come in. However, the language that a service uses about how it complies with law enforcement requests can tell us a lot about that service's posture around protecting user privacy.
Additionally, if a service has no language in its terms about how it responds to law enforcement or other legal requests, that can be an indicator that the terms are incomplete and/or inadequate in other areas as well.
Keywords: legal, law enforcement, comply
Location information and Device IDs
As individual data elements, both a physical location and a device ID are sensitive pieces of information. It's also worth noting that there are multiple ways to get location information, and different ways of identifying an individual device. The easiest way to get precise location information is via the GPS functionality in mobile devices. However, IP addresses can also be mapped to specific locations, and a string of IP addresses (i.e., what someone would get if they connected to a wireless network at their house, a local coffee shop, and a library) can give a sense of someone's movement over time.
Device IDs are unique identifiers, and every phone or tablet has multiple IDs that are unique to the device. Additionally, browser fingerprinting can be used on its own or alongside other IDs to precisely identify an individual.
The combination of a device ID and location provides the grail for data brokers and other trackers, such as advertisers: the ability to tie online and offline behavior to a specific identity. Once a data broker knows that a person with a specific device goes to a set of specific locations, they can use that information to refine what they know about a person. In this way, data collectors build and maintain profiles over time.
Keywords: location, zip, postal, identifier, browser, device, ID, street, address
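To make the point about movement over time concrete, here is a minimal sketch of how a handful of log entries, grouped by device ID, turn into a timeline. Everything in it is hypothetical: the device IDs, the timestamps, and the place labels (which stand in for whatever an IP-to-location lookup might return) are made up for illustration, and real trackers work with far more data than this.

```python
from collections import defaultdict

# Hypothetical log entries: (device_id, timestamp, place).
# "place" stands in for the result of an IP-to-location lookup.
PINGS = [
    ("device-A", "2018-04-02T08:05", "home wifi"),
    ("device-B", "2018-04-02T08:30", "library"),
    ("device-A", "2018-04-02T09:10", "coffee shop"),
    ("device-A", "2018-04-02T13:45", "library"),
]


def build_timelines(pings):
    """Group pings by device ID and list each device's places in time order."""
    visits_by_device = defaultdict(list)
    for device_id, timestamp, place in pings:
        visits_by_device[device_id].append((timestamp, place))
    # ISO-style timestamps sort correctly as plain strings.
    return {
        device_id: [place for _, place in sorted(visits)]
        for device_id, visits in visits_by_device.items()
    }


TIMELINES = build_timelines(PINGS)
print(TIMELINES["device-A"])
```

Nothing here requires GPS access. A stable identifier and a rough location per connection are enough to sketch out someone's day.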
Data Combined from External Sources
As noted above, if a data broker can use a device ID and location information to tie a person to a location, they can then combine information from external sources to create a more thorough profile about a person, and that person's colleagues, friends, and families.
We can see examples of data recombination in how Experian sorts humans into classes: data recombination helps them identify and distinguish their "Picture Perfect Families" from the "Stock cars and State Parks" and the "Urban Survivors" and the "Small Towns Shallow Pockets".
And yes, the company combining this data and making these classifications is the same company that sold data to an identity thief and was responsible for a breach affecting 15 million people. Data recombination matters, and device identifiers within data sets allow companies to connect disparate data sources into a larger, more coherent profile.
Keywords: combine, enhance, augment, source
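As a rough illustration of why a shared identifier matters for recombination, the sketch below merges two unrelated data sets that happen to use the same device ID as a key. The data sets, field names, and values are all invented for this example. This is not how any particular data broker works, just the general shape of the join.

```python
# Hypothetical records from two unrelated sources, both keyed by device ID.
APP_DATA = {
    "device-A": {"frequent_places": ["home wifi", "library"]},
}
PURCHASED_DATA = {
    "device-A": {"segment": "example segment", "postal_code": "00000"},
    "device-C": {"segment": "another segment"},
}


def combine_profiles(*sources):
    """Merge any number of {device_id: fields} mappings into one profile per device."""
    profiles = {}
    for source in sources:
        for device_id, fields in source.items():
            # Each new source adds its fields to whatever is already known.
            profiles.setdefault(device_id, {}).update(fields)
    return profiles


PROFILES = combine_profiles(APP_DATA, PURCHASED_DATA)
print(PROFILES["device-A"])
```

Each additional source that carries the same identifier makes the profile a little more complete, which is why device identifiers inside a data set are worth paying attention to.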
Third Party Collection
If a service allows third parties to collect data from users of the service, that creates an opportunity for each of these third parties to get information about people in the ways that we have described above. Third parties can access a range of information (such as device IDs, browser fingerprints, and browsing histories) about users on a service, and frequently, there is no practical way for people using a service to know what third parties are collecting information, or how these third parties will use it.
Additionally, third parties can also combine data from multiple sources.
Keywords: third, third party, external, partner, affiliate
Social Sharing or Login
Social Sharing or Login, when viewed through a privacy lens, should be seen as a specialized form of third party data collection. With social login, however, information about a person can be exchanged between the two services, or taken from one service.
Social login and social sharing features (like the Facebook "like" button, a "Pin it" link, or a "Share on Twitter" link) can send tracking information back to the home sites, even if the share never happens. Solutions like this option from Heise highlight how this privacy issue can be addressed.
Keywords: login, external, social, share, sharing
This category only makes sense on services that are used in educational contexts. For services that are only used in a consumer context, this section might be superfluous.
As noted below, I'm including COPPA in the list of keywords here even though COPPA is a consumer law. Because COPPA (in the US) is focused on children under 13, there are times when COPPA connects with educational settings.
Keywords: parent, teacher, student, school, family, education, FERPA, child, COPPA
Because this list of concerns is incomplete, and there are other problematic areas, we need a place to highlight these concerns if and when they come up. When I use this structure, I will use this section to highlight interesting elements within the terms that don't fit into the other sections.
If, however, there are elements in the other sections that are especially problematic, I probably won't spend the time on this section.
Summary of Risk
This section is used to summarize the types of privacy risks associated with the service. As with this entire process, the goal here is not to be comprehensive. Rather, this section highlights potential risks, and whether those risks are in line with what a service does. For example, if a service collects location information, how is that information both protected from unwarranted use by third parties and used to benefit the user?
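For anyone who would rather not search by hand, the keyword lists above can be fed into a small script. What follows is a minimal sketch, not a finished tool: the category labels are my own shorthand, the sample policy text is invented, and the matching is deliberately loose (a keyword matches any word that starts with it, so "partner" also finds "partners"). It will produce false positives, and a person still needs to read what it finds.

```python
import re

# Keyword lists taken from the sections above; the category labels are shorthand.
KEYWORDS = {
    "contacts": ["contact", "friend", "list", "access"],
    "law enforcement": ["legal", "law enforcement", "comply"],
    "location and device IDs": ["location", "zip", "postal", "identifier",
                                "browser", "device", "ID", "street", "address"],
    "combined data": ["combine", "enhance", "augment", "source"],
    "third parties": ["third", "third party", "external", "partner", "affiliate"],
    "social sharing or login": ["login", "external", "social", "share", "sharing"],
    "education": ["parent", "teacher", "student", "school", "family",
                  "education", "FERPA", "child", "COPPA"],
}


def find_matches(policy_text, keywords=KEYWORDS):
    """Return {category: [sentences containing a word that starts with a keyword]}."""
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", policy_text.strip())
    results = {}
    for category, words in keywords.items():
        alternatives = "|".join(re.escape(word) for word in words)
        pattern = re.compile(r"\b(?:" + alternatives + r")", re.IGNORECASE)
        hits = [sentence for sentence in sentences if pattern.search(sentence)]
        if hits:
            results[category] = hits
    return results


# Invented sample text, standing in for a real policy.
SAMPLE_POLICY = (
    "We may combine your data with information from our partners. "
    "We comply with valid legal requests. "
    "We love cookies."
)

for category, sentences in find_matches(SAMPLE_POLICY).items():
    print(category)
    for sentence in sentences:
        print("  ", sentence)
```

The output is a starting point for reading, not a verdict. The script only points at the sentences worth a closer look.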
2. Closing Notes
At the risk of repeating myself unnecessarily, this process is not intended to be comprehensive.
The only goal here is to streamline the process of identifying and describing poison pills buried in privacy policies. This method of evaluation is not thorough. It will not capture every detail. It will even miss problems. But it will catch a lot of things as well. In a world where nothing is perfect, this process will hopefully prove useful.
The categories listed here all define different ways that data can be collected and used. One of the categories explicitly left out of the Privacy Postcard is data deletion. This is not an oversight; this is an intentional choice. Deletion is not well understood, and actual deletion is easier to do in theory than in practice. This is a longer conversation, but the main reason that I am leaving deletion out of the categories I include here is that data deletion generally doesn't touch any data collected by third party adtech allowed on a service. Because of this, assurances about data deletion can often create more confusion. The remedy to this, of course, is for a service to not use any third party adtech, and to have strict contractual requirements with any third party services (like analytics providers) that restrict data use. Many educational software providers already do this, and it would be great to see this adopted more broadly within the tech industry at large.
The ongoing voyage of MySpace data - sold to an adtech company in 2011, re-sold in 2016, and breached in 2016 - highlights that data that is collected and not deleted can have a long shelf life, completely outside the context in which it was originally collected.
For those who want to use this structure to create your own Privacy Postcards, I have created a skeleton structure on Github. Please, feel free to clone this, copy it, modify it, and make it your own.