This article is part of our series on the Quantified Self.
Quantifying ourselves means tracking the most sensitive kind of data: our behavior and our location.
The conversations we have on a day-to-day basis about body tracking and the Quantified Self clearly show that most people are acutely aware of just how sensitive this type of data is. In fact, privacy implications tend to be one of the first issues to come up.
And this most certainly isn’t just an exaggerated reaction, but rather the sensible thing to think about. But let’s take it step by step.
Our most sensitive data
In theory, tracking and “optimizing” ourselves leads to better life decisions – in other words, it helps us be more ourselves. However, disclosing exact data about our bodies and our whereabouts makes us vulnerable. The potential for abuse is immense.
Now there are several aspects to look at:
- What kind of data do we capture? This is largely determined by the types of services and devices in use.
- Where is the data captured? In most cases these days, our data sets are stored in the cloud, not locally. This makes them easier to handle and back up, but also more hackable and commercially exploitable. In most cases, the cloud is the right place to put this data, but I’d imagine there’s a business case to be made for allowing users to store data locally. Some people might even pay a premium.
- How do we share our data? The spectrum ranges from publishing our data sets in full, publicly and non-anonymously (this is roughly where Foursquare is), to highly anonymous aggregated data (medical data). More on that later.
And of course: Who is interested in our data?
Who wants our data?
There are quite a few players out there for whom our data is highly valuable – often in a straightforward financial way.
Marketing departments are obvious in this context, as behavioral data creates opportunities to target potential customers and build relationships. This could be used in “white hat” marketing, i.e. in non-critical ways that actually create value for consumers. It could also be done “black hat”, i.e. in abusive ways. Think data mining gone awry.
Researchers of all flavors are interested in the kinds of data sets created through self-tracking.
Governments might be tempted by location and mobility data, trying to match social graphs, location overlaps, and group behaviors.
Insurance companies might see behavioral data as a gold mine. Depending on your country’s legal and social security framework, health insurers might charge customers differently depending on their fitness regime, their smoking habits, the regularity of their heartbeats, the number of drinks per week, or even the number of bars visited. Maybe the types of meals eaten and calories consumed, or body weight. In this particular context, the possibilities for use and abuse are endless.
Which brings us to…
Trust
Where we deal with sensitive data, trust is key. We, the consumers and users of web services, have collectively suffered privacy missteps by internet companies and over-zealous startups over and over again. (I’m looking at you, Facebook!)
While many of us have gotten used to sharing some aspects of our social graphs online, behavior data might be a different beast altogether.
This isn’t just a matter of degree, either. Here we have such clear abuse scenarios that insisting on control over our data simply becomes common sense.
As researcher danah boyd states (highlights are mine):
“People should – and do – care deeply about privacy. But privacy is not simply the control of information. Rather, privacy is the ability to assert control over a social situation. This requires that people have agency in their environment and that they are able to understand any given social situation so as to adjust how they present themselves and determine what information they share. […Privacy is] protected when people are able to fully understand the social environment in which they are operating and have the protections necessary to maintain agency.”
Take Facebook, for example. Recently, the company introduced what they call “frictionless sharing”. What that means is that all kinds of apps and services share your activity on Facebook – which song you’re listening to, which articles you’re reading, what you comment on, etc. While the announcements drew quite a bit of criticism, we can only assume that the increased sharing activity will serve the company’s goals well: It will create more engagement data, at the cost of privacy and control. In other words, at the cost of agency.
What this means for companies operating in this field is this: It must be absolutely clear that they never, ever share your behavioral data with anyone without your clear consent. More bluntly: If you don’t actively share your data with anyone outside the company, they must not do it.
Here I’d even go so far as to suggest thinking about worst-case scenarios: Maybe it even makes sense for some companies not to store your data at all, but to save it on the client side instead, so that they could not be subpoenaed into giving up user data.
So, now that we have reduced the potential for abuse a bit, the next question is…
Who owns our body data?
Now here’s a question that’s both very simple and incredibly complex. As a guideline, the ideal we should always strive for is: We do! Nobody but ourselves.
However, it’s of course a bit more tricky. The service provider will need some of the data for their business case. Expect not to get anything for free. As the old internet proverb goes, you either pay or you’re being sold.
So we have data ownership and usage rights on one hand, and then we have data portability.
Let’s say we upload our running data into Runkeeper, track our meals with The Eatery, our sleep patterns with the FitBit or the Jawbone Up, and our social life through Foursquare. That’s already quite an array of services for even a basic tracking setup.
If history has taught us anything, it is that web services don’t live forever. So we need to be able to get our data back when we need it. Better still, we should be able to move our data sets from one service to another, combine and mash them up, and allow different services to access our data in ways we can easily control. Easy is key here as we move towards mainstream adoption.
A simple data dump won’t necessarily do – the data has to be structured, maybe even standardized. Only then can we use it in new, interesting ways.
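As an illustration, a structured, self-describing export might look something like this. This is a minimal sketch in Python; the `RunRecord` fields and the `run/v1` schema label are hypothetical, not any real service’s format:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical record format for a single tracked run -- the field
# names are illustrative, not any real service's export schema.
@dataclass
class RunRecord:
    timestamp: str      # ISO 8601, so any service can parse it
    distance_m: float   # metres -- explicit units avoid ambiguity
    duration_s: int
    avg_heart_rate: int

def export_runs(runs):
    """Serialize records into a structured, self-describing JSON dump."""
    return json.dumps(
        {"schema": "run/v1", "records": [asdict(r) for r in runs]},
        indent=2,
    )

def import_runs(dump):
    """Round-trip: another service can rebuild the exact same records."""
    data = json.loads(dump)
    assert data["schema"] == "run/v1"
    return [RunRecord(**r) for r in data["records"]]

runs = [RunRecord("2011-11-05T07:30:00Z", 5000.0, 1620, 152)]
assert import_runs(export_runs(runs)) == runs
```

Because the dump carries its own schema label and unit-explicit field names, a receiving service can rebuild the exact records instead of guessing at an unstructured dump – which is the difference between portability and a pile of bytes.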
A hypothesis: Collecting behavior data is good. Sharing behavior data is better.
Bigger data sets allow us to derive more meaning, potentially even to create more data from what we have. The kind of data we talk about becomes immensely interesting once we start thinking in terms of scalability. Think two people comparing their fitness data is cool? A billion people comparing fitness data is cool!
To protect the individual, aggregated and anonymized data is the way to go here. Aggregated data sets still allow for interesting correlations while providing some level of protection. Although even aggregated data sets can be tricky: A study found that 87 percent of people in the United States were uniquely identifiable from just three pieces of information: gender, date of birth, and ZIP code.
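To make the aggregation idea concrete, here is a minimal sketch. The record fields, the bucket keys, and the threshold of five contributors are all illustrative assumptions; the point is that buckets built from coarsened quasi-identifiers are only released when enough people fall into them:

```python
from collections import defaultdict

K = 5  # minimum group size before a bucket may be released (assumed)

def aggregate_weekly_steps(records, k=K):
    """Average per-user step counts into (age_band, city) buckets,
    suppressing any bucket with fewer than k contributors."""
    buckets = defaultdict(list)
    for rec in records:
        # Coarsen exact age into a 10-year band: date of birth is one
        # of the quasi-identifiers that make people re-identifiable.
        age_band = (rec["age"] // 10) * 10
        buckets[(age_band, rec["city"])].append(rec["steps"])
    released = {}
    for key, steps in buckets.items():
        if len(steps) >= k:  # small groups identify people: suppress
            released[key] = sum(steps) / len(steps)
    return released

records = (
    [{"age": 34, "city": "Berlin", "steps": 8000 + i} for i in range(5)]
    + [{"age": 61, "city": "Hamburg", "steps": 4000}]
)
released = aggregate_weekly_steps(records)
assert (30, "Berlin") in released       # five contributors: released
assert (60, "Hamburg") not in released  # a bucket of one: suppressed
```

The suppression step is what the 87-percent finding demands: without it, a “bucket” containing a single person is just that person’s data with a different label.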
Sharing isn’t a simple process, either. As Christopher Poole pointed out in a fantastic talk, most current models of online sharing assume that you have one identity and that users just need to be able to determine which bits of information to share. In reality, though, it’s much more complex. Our identity online should be like in the physical world – multi-faceted and context-dependent. It is not, in Christopher’s words, who we share to, but who we share as. This is hard to put into code, but it’s important that we think about it, and think hard.
Context is key in sharing. I might not be willing to publicly share my brain activity, heart rate, and genetic information. However, I might be very willing to share parts of any of these with my doctor while in treatment – as long as I can be sure that the doc won’t pass it on to the insurer, who might charge me extra for a higher-than-average genetic risk of certain kinds of cancer or Alzheimer’s. Today, most doctors and even larger clinics aren’t able to make use of the type of genetic snapshot that commercial services like 23andme provide, although this might change over time.
A duty to share?
In a radio interview recently we discussed privacy implications of the Quantified Self in general, and of DNA analysis in particular. What is safe to share, what is reasonable to share?
I’d like to flip the question around: What is ok not to share? Maybe we even have a duty to share?
Think of the medical research that could be done, and the treatments that could be found, if more of our behavioral data was openly available. If even just one major disease could be treated more effectively by discoveries made through body tracking and our shared data, would that not be worth it?
It’s a question we can’t answer, but I urge you to think about it. Maybe it’ll make you want to track and share some more.
Until then we encourage you all – both users and producers of Quantified Self services – to pay privacy implications the attention they deserve. So that at some point we can stop worrying and start building stuff that helps us be more ourselves.