Google has figured out that I shop for a lot of children’s clothing online, as my two children grow like weeds. Every time I launch a search, my banner ads link to brands that I have bought previously or similar brands that other consumers may have purchased. That is Big Data at work: it is being used to identify other brands that I might be interested in purchasing, based on shoppers with consumer profiles similar to mine. But let’s say that the next banner ad I receive isn’t for children’s clothing, but is instead for an all-inclusive Caribbean vacation. I have never searched for Caribbean vacations, so why would this be turning up? Again, this is Big Data at work, because patterns in human behavior have informed Google that people with small children are likely good targets for a quick getaway vacation. This is an example of the value of Big Data in predicting individual consumer behavior based on the behavior of many.
“Big Data” is the somewhat uncreative but accurate term for the process of collecting, culling, and categorizing data from diverse sources on a massive scale. Through the application of algorithms, companies are analyzing Big Data in order to see patterns in human behavior, and (most commonly) using it to develop targeted, individualized marketing. The primary goal of Big Data is to learn, from a large body of information, things that we could not comprehend when we used only smaller amounts. Recent trends point to an increase in the use of Big Data, but there are several cautionary points to consider from a legal and privacy perspective.
What are the uses of Big Data, and who uses it? The potential benefits are wide-ranging, but can be categorized as follows:
- Identifying Consumer Habits: Companies use Big Data to understand customer preferences, anticipate future behavior, and develop individualized marketing campaigns.
- Identifying Patterns in Human Behavior: Big Data is being used to provide insight into human behavior (outside of just shopping). For example, Big Data has been used to identify infections in premature infants before symptoms appear by monitoring 16 different vital signs and finding the correlation between minor and major problems.
- Increasing Efficiency: Companies and government entities are using Big Data to improve internal operations and reduce costs. For example, NYC is using Big Data to determine which buildings are most at risk of fire, so that overburdened city inspectors can focus on them.
How is this different from the statistical analysis that companies have been engaged in since long before the advent of the Internet? Plenty of organizations have been handling and sifting through massive amounts of data for years. Why is the use of Big Data on the rise with no sign of slowing?
- Lower Costs and Increased Accessibility: It is becoming easier to analyze, store, and access data through cheaper computer memory, powerful processors, and smart algorithms.
- Rise of the Smartphone and Social Media: The increased use of geolocation services on mobile devices and new ways of communicating and interacting are driving up the volume of data available for analysis.
- Massive Increase in Digitization of Information: As of the year 2000, one-quarter of all the world’s stored data was digital. Today, less than 2% of stored data is non-digital.
However, with the rise of Big Data, privacy and legal concerns have risen as well. Julie Brill, Commissioner of the Federal Trade Commission, has voiced a number of concerns about consumer privacy in the context of Big Data:
- “De-Identified” Information Can Be “Re-Identified”: Data collectors claim that the aggregated information has been “de-identified.” However, it is possible to re-associate “anonymous” data with specific individuals, especially since so much information is linked with smartphones.
- Possible Deduction of Personally Identifiable Information: Non-personal data could be used to make predictions of a sensitive nature, such as sexual orientation or financial status. The FTC believes that collecting and using sensitive information requires more robust notice to the individual than is required for non-personal information, and such notice may not have been given as part of the initial consent.
- Risk of Data Breach Is Increased: The higher the concentration of data, the more appealing a target it is for hackers, and the greater the impact of a breach. The requirements for notifying individuals in the event of a breach vary from state to state, and the cost of compliance can very quickly become substantial for an enterprise. As a result of this potential cost exposure, companies may need to invest in increased security and insurance to protect their data assets.
- “Creepy” Factor: Consumers are often unnerved when they feel that companies know more about them than they are willing to volunteer. There is a sliding scale between tangible benefits that consumers appreciate (e.g., loyalty programs, rewards cards) and the feeling that a company has stepped beyond personal boundaries. The anecdote of Target sending baby-related coupons to a teenage girl before she had even told her immediate family members about her new bundle of joy still stands as the benchmark horror story of invasive marketing.
- Big Brother or Big Data: Municipalities are using Big Data for predictive policing and for tracking potential terrorist activities. Concerns have been raised that such uses could become a slippery slope to using Big Data in a manner that infringes on individual rights, or that Big Data could be used in lieu of credit reports to deny consumers important benefits (such as housing or employment).
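The re-identification concern in the list above can be made concrete with a small sketch. It assumes a hypothetical "de-identified" purchase dataset that still carries a few ordinary attributes (ZIP code, birth year, gender) and a hypothetical public record set that lists the same attributes alongside names. Every record, field name, and function here is invented for illustration; it is not a description of how any particular data broker operates.

```python
from collections import defaultdict

# Hypothetical "de-identified" records: names removed, other attributes kept.
deidentified = [
    {"zip": "10001", "birth_year": 1980, "gender": "F", "purchase": "prenatal vitamins"},
    {"zip": "10001", "birth_year": 1975, "gender": "M", "purchase": "golf clubs"},
    {"zip": "94105", "birth_year": 1990, "gender": "F", "purchase": "running shoes"},
]

# Hypothetical public records that pair the same attributes with names.
public_records = [
    {"name": "Alice Example", "zip": "10001", "birth_year": 1980, "gender": "F"},
    {"name": "Bob Example", "zip": "10001", "birth_year": 1975, "gender": "M"},
    {"name": "Carol Example", "zip": "94105", "birth_year": 1990, "gender": "F"},
    {"name": "Dana Example", "zip": "94105", "birth_year": 1990, "gender": "F"},
]

# Attributes that are not names but can still single out a person in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def quasi_key(record):
    """Build a lookup key from the quasi-identifying attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymous, named):
    """Return (name, purchase) pairs where exactly one named person matches."""
    names_by_key = defaultdict(list)
    for person in named:
        names_by_key[quasi_key(person)].append(person["name"])

    matches = []
    for record in anonymous:
        candidates = names_by_key.get(quasi_key(record), [])
        if len(candidates) == 1:  # a unique match re-associates the record
            matches.append((candidates[0], record["purchase"]))
    return matches


matches = reidentify(deidentified, public_records)
for name, purchase in matches:
    print(f"{name} likely bought {purchase}")
```

In this toy data, the first two records are tied back to a single named person each, while the third stays ambiguous because two people share the same attributes. The more attributes a dataset retains, the fewer people share any given combination, which is why removing names alone may not be enough.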
The general legal concerns about Big Data are just as complex as the privacy concerns. Naturally, determining which issues are of greatest concern to you or your clients depends on your role in the relationship: are you the data miner, analyzer, or licensee? As the laws and best practices are still evolving, here are a few key issues to analyze and address when you or your clients are considering the use of Big Data:
- What are your intellectual property rights in the data? Data analytics requires copying the data, so you will need to ensure that your ownership or license rights are sufficiently broad to cover the intended use, and that ownership of the data and of any derivative work created from the data is clearly defined.
- Who bears responsibility for inaccurate data? If a party relies on a pattern developed as a result of analyzing inaccurate Big Data, which party bears responsibility for the results? Since Big Data’s very nature relies on a massive volume, there is almost always going to be some degree of inaccurate information included.
- Have you obtained the appropriate level of consent from the individual? Make sure that any consent that you have obtained from the individual to use data covers your intended purpose, including licensing that information to another party. As a best practice, advocate for full disclosure to the individual about your use of their data.
The legal risks engendered by using Big Data are also complicated by the myriad of state and Federal laws that are staking out regulatory territory with regard to privacy issues. While Congress mulls over a standard Federal law to address data breach notifications, there are a number of privacy-related Federal laws that address the use of certain types of data and end users, such as HIPAA and the Children’s Online Privacy Protection Act. As noted above, the FTC has been vocal about its concerns with Big Data use, and has provided its own guidelines on data collection, including calling upon data brokers to provide consumers with more transparency on the use of their data. States are also weighing in with their own privacy laws (e.g., the California Online Privacy Protection Act). Finally, there are multi-country issues, as data privacy laws vary tremendously from country to country, with the EU imposing more onerous restrictions than the U.S. and higher burdens on companies in the event of a data breach.
Big Data can tell us many things, one of which is that perhaps we are not the madcap free spirits we might think ourselves to be. Our behavior in the aggregate is predictable. The benefits of deriving behavior patterns from Big Data are many, and there is the potential for even more as data analytics becomes more commercially available and commonplace. When considering the use of Big Data at your enterprise, advocate to 1) define clear ownership in the data with data collectors and individuals, 2) establish transparency to the individual with regard to the purpose and use of data, 3) tap into resources to monitor for State and Federal regulatory changes, and 4) avoid “creeping out” your customers.