On Thursday the White House study group issued their Big Report on Big Data, and it’s a Big Deal.
It’s Non-Cynical Saturday and Some People said that means I’m not allowed to be grumpy. Of course, Some People could take a break from the hot tub faculty lounge squirrel bath to do their jobs before they go down to the wine cellar library to spend the weekend drinking thinking on our motto of Magis vinum, magis verum (“More wine, more truth”).
But no. Some People wanted a week off, and they didn’t even tell me. They just asked me to fill in Wednesday and then Thursday and now it’s Saturday and I haven’t done any work on my thesis in 21st Century Political Nuttitude all week and I can’t even be grumpy about it because here at BPI Campus all Saturdays are Non-Cynical. So I’m not grumpy. Dammit.
In case you’re curious what Some People did with their week off, the answer is they spent most of it relaxing in and around the hot tub faculty lounge squirrel bath. I know this because of Big Data. The twins, Nancy and Michelle, are taking biology and the teacher told them to observe specimens of another species for a week. The hot tub faculty lounge squirrel bath is right next to Árbol Squirrel, so there ya go. A whole week’s worth of careful notes on the activities of Some People.
Of course, Some People will probably get upset about that, even though Nancy and Michelle carefully anonymized their Big Data. For example, Wednesday afternoon’s entry does not say:
Professor Plum tossed the Frisbee into Árbol Squirrel. It got stuck. Professor Plum started throwing flip flops at it. Ms. Scarlet giggled until one flip flop almost hit our drey. Then she told him to stop. Also, she told him to wash his flip flops.
It just says:
Frisbee tossed into Árbol Squirrel. Thrown flip flops missed, to much amusement. Frisbee remains. Also, flip flops stink.
Even so, Some People may think that’s Too Much Information, especially if they start getting helpful hints in their email. Or ads from a soap company. It probably wouldn’t be quite as upsetting as your teenage daughter getting pregnancy product ads from Target, but still.
Stories like that are why Time’s Janet Vertesi decided to see if she could keep her pregnancy out of Big Data:
It all started with a personal experiment to see if I could keep a secret from the bots, trackers, cookies and other data sniffers online that feed the databases that companies use for targeted advertising. As a sociologist of technology I was launching a study of how people keep their personal information on the Internet, which led me to wonder: could I go the entire nine months of my pregnancy without letting these companies know that I was expecting?
Vertesi found that “opting out” of Big Data was a Big Challenge:
Avoiding this layer of data detectors isn’t a question of checking a box. Last year, many people were shocked by the story of the teenager in Minnesota whose local Target store knew she was expecting before her father did. Based on her in-store purchasing patterns tracked with credit cards and loyalty programs, Target started sending her ads for diapers and baby supplies, effectively outing her to her family. Like the girl in the Target store, I knew that similar systems would infer my status based on my actions. So keeping my secret required new habits, both online and off.
She didn’t mention her pregnancy on Twitter or Facebook and forbade her friends and family to mention it. She also used Tor, an anonymizing network and browser, so she could shop for baby stuff without leaving identifying tracks. She bought baby stuff with cash or gift cards, and even that created a problem:
For months I had joked to my family that I was probably on a watch list for my excessive use of Tor and cash withdrawals. But then my husband headed to our local corner store to buy enough gift cards to afford a stroller listed on Amazon. There, a warning sign behind the cashier informed him that the store “reserves the right to limit the daily amount of prepaid card purchases and has an obligation to report excessive transactions to the authorities.”
When opting out of Big Data is deemed suspicious, that’s a real problem:
No one should have to act like a criminal just to have some privacy from marketers and tech giants. But the data-driven path we are currently on, paved with heartwarming rhetoric of openness, sharing and connectivity, actually undermines civic values, and circumvents checks and balances. The President’s report can’t come soon enough. When it comes to our personal data, we need better choices than either “leave if you don’t like it” or no choice at all. It’s time for a frank public discussion about how to make personal information privacy not just a series of check boxes but a basic human right, both online and off.
The White House report takes up some of those concerns:
But big data raises serious questions, too, about how we protect our privacy and other values in a world where data collection is increasingly ubiquitous and where analysis is conducted at speeds approaching real time. In particular, our review raised the question of whether the “notice and consent” framework, in which a user grants permission for a service to collect and use information about them, still allows us to meaningfully control our privacy as data about us is increasingly used and reused in ways that could not have been anticipated when it was collected.
Big data raises other concerns, as well. One significant finding of our review was the potential for big data analytics to lead to discriminatory outcomes and to circumvent longstanding civil rights protections in housing, employment, credit, and the consumer marketplace.
The report also highlighted several benefits:
Big data is saving lives. Infections are dangerous – even deadly – for many babies born prematurely. By collecting and analyzing millions of data points from a NICU, one study was able to identify factors, like slight increases in body temperature and heart rate, that serve as early warning signs an infection may be taking root—subtle changes that even the most experienced doctors wouldn’t have noticed on their own.
Big data is making the economy work better. Jet engines and delivery trucks now come outfitted with sensors that continuously monitor hundreds of data points and send automatic alerts when maintenance is needed. Utility companies are starting to use big data to predict periods of peak electric demand, adjusting the grid to be more efficient and potentially averting brown-outs.
Big data is making government work better and saving taxpayer dollars. The Centers for Medicare and Medicaid Services have begun using predictive analytics – a big data technique – to flag likely instances of reimbursement fraud before claims are paid. The Fraud Prevention System helps identify the highest-risk health care providers for waste, fraud, and abuse in real time and has already stopped, prevented, or identified $115 million in fraudulent payments.
And Big Data may have saved lots of lives during this week’s tornado outbreak:
Drawing on open government data sources, including Census demographics and NOAA weather data, along with their own demographic databases, Esri, a geospatial technology company, has created a real-time map showing where the twisters have been spotted and how the storm systems are moving. They have also used these data to show how many people live in the affected area, and summarize potential impacts from the storms. It’s a powerful tool for emergency services and communities. And it’s driven by big data technology.
You can read the entire White House Big Data Privacy Report here. Among their recommendations is to ask for more feedback on the Consumer Privacy Bill of Rights, a first step toward what Alex Pentland calls a New Deal on Data. In that 2009 paper for the World Economic Forum, Dr. Pentland proposed three fundamental principles:
- You have a right to possess your data. Companies should adopt the role of a Swiss bank account for your data. You open an account (anonymously, if possible), and you can remove your data whenever you’d like.
- You, the data owner, must have full control over the use of your data. If you’re not happy with the way a company uses your data, you can remove it. All of it. Everything must be opt-in, and not only clearly explained in plain language, but with regular reminders that you have the option to opt out.
- You have a right to dispose of or distribute your data. If you want to destroy it or remove it and redeploy it elsewhere, it is your call.
Dr. Pentland says your data should be held in a Personal Data Store like openPDS:
Many of the initial and critical steps towards individuals data ownership are technological. Given the huge number of data sources that a user interacts with on a daily basis, interoperability is not enough. Rather, the user needs to actually own a secured space, a Personal Data Store (PDS) acting as a centralized location where his data live. Owning a PDS would allow the user to view and reason about the data collected. The user can then truly control the flow of data and manage fine-grained authorizations for accessing his data.
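For readers who like to see the nuts and bolts, here is a rough Python sketch of what "fine-grained authorizations" over a Personal Data Store might look like. It is only an illustration: the `PersonalDataStore` class and its method names are made up for this post and are not the openPDS API. It assumes a simple model in which the owner keeps the data, nobody can read a category until the owner opts in, and the owner can revoke a grant or delete everything at any time.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Toy personal data store (hypothetical; not the openPDS API)."""
    owner: str
    # category -> list of stored values
    _records: dict = field(default_factory=dict, init=False)
    # requester -> set of categories the owner has opted in to sharing
    _grants: dict = field(default_factory=dict, init=False)

    def add(self, category, value):
        """Store a value under a category chosen by the owner."""
        self._records.setdefault(category, []).append(value)

    def grant(self, requester, category):
        """Opt in: allow one requester to read one category."""
        self._grants.setdefault(requester, set()).add(category)

    def revoke(self, requester, category):
        """Opt out: withdraw a previously granted permission."""
        self._grants.get(requester, set()).discard(category)

    def read(self, requester, category):
        """Return a copy of the data, but only if the owner granted access."""
        if category not in self._grants.get(requester, set()):
            raise PermissionError(f"{requester} has no grant for {category!r}")
        return list(self._records.get(category, []))

    def delete_all(self):
        """Remove all data and all outstanding grants."""
        self._records.clear()
        self._grants.clear()


pds = PersonalDataStore(owner="Ms. Scarlet")
pds.add("purchases", "flip flop soap")
pds.grant("soap_company", "purchases")
print(pds.read("soap_company", "purchases"))  # ['flip flop soap']
pds.revoke("soap_company", "purchases")       # the soap company is locked out again
```

Checking the grant on every read, instead of handing over a copy once, is what lets the owner change their mind later. A real store would also need authentication, encryption, and an audit trail, none of which this sketch attempts.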
The proposed Consumer Privacy Bill of Rights doesn’t go quite that far. It would still allow companies to gather and store the data, subject to guidelines on how it can be used and with whom it can be shared. I guess that’s hardly surprising, as some of the companies that gave input on the Consumer Privacy Bill of Rights derive most of their income from gathering, sifting, and selling consumer data.
The White House Big Data Study Group specifically asked for ideas on how to strengthen that bill. I’m going to recommend they adopt Dr. Pentland’s New Deal on Data, and I hope you will too.
But I’m not going to push Some People’s Frisbee out of Árbol Squirrel. It makes a nice sun deck for our drey. If Some People want it back, they can relearn how to climb trees. Meanwhile, I’m going to go sit on my deck and relax with a few macadamias. Because it’s Non-Cynical Saturday and I’m not grumpy.
Good day and good nuts