Summary: Every organization that processes data about any person in the EU must comply with the GDPR. Newly published GDPR Guidelines clarify that whenever an organization makes a decision using machine learning and personal data that has any kind of impact, a human must be able to independently review, explain, and possibly replace that decision using their own independent judgment. Organizations relying on machine learning models in the EU should immediately start planning how they are going to deliver a level of machine model interpretability sufficient for GDPR compliance. They should also examine how to identify whether any groups of people could be unfairly impacted by their machine models, and consider how to proactively avoid such impacts.
In October 2017, new Guidelines were published to clarify the EU’s GDPR (General Data Protection Regulation) with respect to “automated individual decision making.” These Guidelines apply to many machine learning models making decisions affecting EU citizens and member states. (A version of these Guidelines can be downloaded here—for reference, I provide page numbers from that document in this post.)
The purpose of this post is to call attention to how the GDPR, and these Guidelines in particular, may change how organizations choose to develop and deploy machine learning solutions that impact their customers.
Every organization that processes data about any person in the EU, including companies based outside the EU, must comply with the GDPR or risk an enforcement action (possibly including fines) by EU regulators. This is true whenever personal information is processed, even when no money changes hands, and even if EU citizens aren’t the intended targets of the data processing services. In addition, vendors subcontracting certain data processing services must provide their services in a compliant way in order for any organizations that use them to be compliant.
On a foundational level, the GDPR holds human beings accountable for machine decisions involving personal data. What is more, the regs give people who believe they have been harmed by machine learning decisions the right to an appeal. A human who works for the organization must be able to provide a “meaningful explanation about the logic involved” for any machine model decision (Guidelines, p. 9), then confirm or modify/reverse the decision.
Three elements within the new Guidelines jumped out at me because of their potential impact on how organizations develop and deploy machine learning solutions.
First element: Interpretability. The new Guidelines don’t provide a clear definition of what constitutes a “meaningful explanation”. But ultimately the Guidelines require a fairly demanding form of interpretability (also known as “understandability” and “explainability” in machine learning circles) of the machine learning models being used.
What’s Important About “Interpretability”? With machine learning models the “why” of a machine decision (as distinguished from the “how”, which is the math and architecture) usually can’t be explained in a way that enables a human to decide whether the machine’s own decision was justified according to some established standard.
This inability to “understand” machine decisions wouldn’t be important if a machine learning model could be trusted completely (imagine a scenario where machine decisions are almost always faster and more accurate than human decisions, perhaps for smart traffic light timing), or if nothing would be lost when a “bad” decision was made (imagine an art installation showing nothing but abstract geometric patterns).
But, much like humans, machine models make mistakes, some of which (e.g. wrongly turning someone down for credit) can have a negative impact on the person affected.
Initially, the Guidelines say that organizations using machine model decisions “should find simple ways to tell the data subject about the rationale behind, or the criteria relied on in reaching the decision without necessarily always attempting a complex explanation of the algorithms used or disclosure of the full algorithm,” and that this should be done by “provid[ing] details of the main characteristics considered in reaching the decision, the source of this information and the relevance” (Guidelines, p. 14). An example is given of a car insurance company explaining why it makes sense for them to use an applicant’s driving record to make policy decisions (because unsafe driving is correlated to accidents), without delving into how precisely that concept was applied in a specific instance.
In addition, if called upon to do so, the organization would have to provide an expert to explain the math (Guidelines, p. 29), if possible without “revealing trade secrets or intellectual property” (Guidelines, p. 24).
With this wording, black box models (models where the “why” of the decision logic is indecipherable to humans) may be OK as long as a human can describe the features (types of data) and math (if an expert is called in) that produced the decision. You know the old saying about making sausage? As long as you know there was a pig on one end of the process and a sausage on the other, you don’t want to know how the one became the other. This standard seems consistent with the current state of the art. Thus, it should (at least temporarily) lead to sighs of relief for organizations invested in highly efficient machine learning models that don’t lend themselves to human interpretation.
This reassuringly flexible guidance strikes me as a bit of a sleeper, however, because the regs also require a human to be able to independently review and possibly overturn any machine model decision (“Any review must be carried out by someone who has the appropriate authority and capability to change the decision” – Guidelines, p. 15). How can any person reliably assess the merits of a specific machine learning decision—on the one hand, complying with the Guidelines by providing more than a perfunctory review, on the other hand, advancing the organization’s business objectives—without understanding the chain of logic being used by the machine model well enough to duplicate it? For example, how can an employee of a lending institution review and potentially reverse the decision of a complex machine learning model which decided to deny a loan application without understanding why the loan was denied?
Certain methods have been developed which enable people to retroactively speculate about what a machine model “was thinking” when it made a particular decision (these include LIME and Stripe’s explanation model for random forests as explained by Sam Ritchie). Although such methods have been characterized as merely providing seeds for post-facto justification, not the exact logic of a particular decision (see Zachary Lipton’s excellent paper that touches on this topic), such methods may still be adequate to help customer service reps explain and review (and potentially reverse) decisions made by machine models. However, these interpretability tools are seldom used at present, and may not suit certain machine models. Which is to say, organizations relying on machine models in the EU should immediately start planning how they are going to deliver a level of interpretability sufficient for GDPR compliance.
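To make the idea of a post-facto explanation concrete, here is a minimal sketch of one of the simplest techniques in this family. It is not LIME and not Stripe's method; it is a plain feature-ablation check, which swaps each input for a baseline value and records how much the model's score moves. The scoring function, feature names, and baseline values are all made up for illustration.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def ablation_attributions(
    score: Callable[[Features], float],
    applicant: Features,
    baseline: Features,
) -> Dict[str, float]:
    """Estimate how much each feature moved this one decision's score.

    For each feature, swap the applicant's value for a baseline value
    (for example a population average) and record the change in score.
    A negative number means the applicant's value lowered the score.
    """
    original = score(applicant)
    attributions = {}
    for name in applicant:
        perturbed = dict(applicant)
        perturbed[name] = baseline[name]
        attributions[name] = original - score(perturbed)
    return attributions


# Hypothetical stand-in for an opaque credit model. A real model would be
# called the same way, through its prediction function.
def toy_credit_score(f: Features) -> float:
    return 0.5 + 0.3 * f["on_time_payment_rate"] - 0.4 * f["debt_to_income"]


# Made-up applicant and baseline values, for illustration only.
applicant = {"on_time_payment_rate": 0.6, "debt_to_income": 0.9}
baseline = {"on_time_payment_rate": 0.9, "debt_to_income": 0.4}

attributions = ablation_attributions(toy_credit_score, applicant, baseline)

# Most negative first: the features that hurt this applicant the most.
ranked = sorted(attributions.items(), key=lambda kv: kv[1])
```

A ranked list like this gives a reviewer a starting point ("debt-to-income counted against this applicant more than payment history"), but it shares the limitation noted above: it is an approximation of the model's behavior near one input, not the model's actual chain of logic.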
It’s interesting to anticipate how this will play out over time after the regs go into effect. As a small number of adversely impacted people begin to “appeal” machine learning decisions, customer support teams will begin interpreting those decisions (providing educated guesses) and, in some cases, substituting their own judgment for the decisions of their organizations’ machine models. As the number of individual appeals goes up, the cost to the organization goes up, both in terms of human labor and in terms of sub-optimized decisions.
The worst-case scenario could mean an enforcement action if the EU finds, based on customer complaints, that an organization has failed to comply with the letter or spirit of the GDPR. It could also mean high costs in terms of human labor, customer dissatisfaction, and ad hoc business decisions made by poorly informed customer service staff. Starting now, the question for organizations using machine learning models is where and how much to invest in preparation for anticipated interpretability demands down the road.
Second element: Protection for Vulnerable Groups. It’s significant that someone doesn’t have to be a member of what US law considers a protected class (referring to protection against discrimination by virtue of one’s race, gender, age, etc.) in order to seek protection under GDPR. It’s enough to demonstrate that they are part of a “vulnerable” group of people. For example, the Guidelines describe people struggling to pay bills who are being targeted with casino ads (Guidelines, p. 11).
I’m not an expert in EU law, so I can’t say how this might play out in the EU, but in the US this would have implications in areas like healthcare and lending, where certain groups tend to be singled out for differential treatment (eligibility, pricing) because of their vulnerabilities (such as the traditional handling of pre-existing medical conditions in healthcare, or the very high interest rates of payday loans). It will be interesting to see what kinds of claims, if any, are made against US-based businesses in the EU under this provision.
And while I’m posing this from a risk management point of view—the steps an organization must take to limit its exposure to enforcement actions—the flip side is also worth considering. Namely, this is an opportunity for corporate social responsibility. Just as many organizations make commitments to social justice in their employment practices, and in the communities where they do business, in order to put their corporate values into practice they may adopt policies of fairness in automated decision-making that go above and beyond legal minimums. Indeed, before making a commitment to continuously monitoring machine learning models for unfair impact on protected (or unprotected) groups, organizations must first decide and declare what they wish to protect against. Organizations can begin to deliberately “bake-in” their ethical values (to paraphrase Sam Ritchie) within their machine learning models.
Third Element: Algorithmic Audits. The Guidelines also recommend (but stop short of requiring) that organizations processing personal information proactively monitor whether their machine learning models have a discriminatory impact (Guidelines, p. 30). Specifically, the Guidelines reference “algorithmic audits” and “testing the algorithms used and developed by machine learning systems to prove that they are actually performing as intended, and not producing discriminatory, erroneous or unjustified results”. In due time this recommendation should contribute to an increase in training, hiring, and consulting for people preparing and delivering algorithmic audits and related services (Cathy O’Neil, of Weapons of Math Destruction fame, has founded such a consultancy).
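One simple check that an algorithmic audit might include is comparing how often each group receives a favorable decision. The sketch below is only an illustration: the group labels and sample data are made up, and the 0.8 review threshold is an assumption borrowed from a US rule of thumb, not something the Guidelines prescribe.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def approval_rates(decisions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the share of favorable decisions for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, favorable in decisions:
        totals[group] += 1
        if favorable:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def impact_ratios(rates: Dict[str, float]) -> Dict[str, float]:
    """Compare each group's approval rate with the best-treated group's rate."""
    best = max(rates.values())
    if best == 0:
        # Nobody was approved, so there is no disparity to measure.
        return {g: 1.0 for g in rates}
    return {g: r / best for g, r in rates.items()}


# Made-up audit sample: (group label, was the decision favorable?)
sample = (
    [("group_a", True)] * 80 + [("group_a", False)] * 20
    + [("group_b", True)] * 50 + [("group_b", False)] * 50
)

rates = approval_rates(sample)
ratios = impact_ratios(rates)

# Assumption: 0.8 is a convention from US practice, not a GDPR requirement.
REVIEW_THRESHOLD = 0.8
flagged = [g for g, ratio in ratios.items() if ratio < REVIEW_THRESHOLD]
```

A flagged group is a prompt for human investigation, not proof of discrimination. A lower approval rate may have a legitimate explanation, and a real audit would also look for erroneous and unjustified results, as the Guidelines put it.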
When Can Decisions Be Automated? Under GDPR, decisions are considered “automated” if no human is involved in arriving at the decision, or if a human customarily rubber stamps a machine model’s decisions without taking other factors into consideration. GDPR prohibits a fully automated decision unless:
• The decision has no significant effect (example: pure research that individual consumers are unaware of), OR
• It’s a business necessity (example: the volume of data is too high for human decision makers), OR
• An EU government has specifically approved it (example: monitoring for tax evasion), OR
• The individual consumer explicitly consents (the process for obtaining this consent is complex enough to be the subject of its own Guidelines), AND
• Anyone affected by an automated decision can appeal the decision to a human who is able to provide an explanation and has the power to overturn the machine decision.
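The conditions above mix OR and AND, so it can help to write them out as explicit logic. The sketch below encodes one possible reading, which is my assumption rather than anything stated in the Guidelines: a decision with no significant effect may always be automated, and otherwise at least one of the three exceptions must apply and a human appeal must be available. The class and field names are hypothetical, and none of this is legal advice.

```python
from dataclasses import dataclass


@dataclass
class AutomatedDecision:
    significant_effect: bool      # does the decision meaningfully affect the person?
    business_necessity: bool      # e.g. volume too high for human decision makers
    authorized_by_law: bool       # e.g. government-approved tax evasion monitoring
    explicit_consent: bool        # the individual has explicitly consented
    human_appeal_available: bool  # a human can explain and overturn the decision


def may_be_fully_automated(d: AutomatedDecision) -> bool:
    """Apply one assumed reading of the conditions listed above."""
    if not d.significant_effect:
        return True
    has_exception = (
        d.business_necessity or d.authorized_by_law or d.explicit_consent
    )
    return has_exception and d.human_appeal_available


# Consent was given, but nobody can review the decision: not permitted.
loan_without_appeal = AutomatedDecision(
    significant_effect=True,
    business_necessity=False,
    authorized_by_law=False,
    explicit_consent=True,
    human_appeal_available=False,
)

# Same decision once a human appeal path exists: permitted.
loan_with_appeal = AutomatedDecision(
    significant_effect=True,
    business_necessity=False,
    authorized_by_law=False,
    explicit_consent=True,
    human_appeal_available=True,
)
```

Writing the rule this way makes the key point visible: under this reading, none of the exceptions is enough on its own for a decision with significant effect. The appeal path is what every exception depends on.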
Drawing from these three elements, here are my preliminary thoughts about what organizations using machine learning to process personal data should do to prepare for GDPR.
1. Take an inventory of all automated decisions that customers will have direct or indirect exposure to. This includes:
• decisions that determine what types of personalized content (web landing pages, advertisements, promotional offers, direct mail pieces) each customer will be exposed to;
• all personalized determinations (eligibility for services, variable pricing) that will impact them;
• all automated decisions of either of the first two types performed by vendors (for example digital advertising vendors, or vendors providing credit ratings or other financial background information).
2. For each of these decisions:
• define how information about every individual decision (for example, a decision to deny credit to a particular individual) can be retroactively routed to customer support at the request of any customer, at a level of detail sufficient for customer support representatives to make their own independent decisions;
• define how an audit can be conducted (see above), including an examination of which impacted groups of people might be classified as “vulnerable” under EU law;
• decide whether
(a) the current level of insight is adequate, or,
(b) it will be necessary to switch from the current machine model to a more clearly interpretable model (for example, Bonsai’s machine teaching and decomposability methodology is designed for interpretability), or,
(c) an interpretability system (such as LIME) can be spliced into the business process, piggy-backing on the existing machine model, to provide the customer service team with insights necessary to review machine learning decisions.
3. If an organization doesn’t intend to comply with GDPR, it must take steps to prevent its digital presence from reaching citizens of the EU (via IP blocking, etc.) in order to avoid unintentional impact.
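Steps 1 and 2 above can be sketched as a pair of record types: one for the inventory of automated decisions, and one for the per-decision detail that customer support would need in order to review an appeal. Everything here (class names, fields, IDs, the in-memory log) is hypothetical and only meant to show the shape of the data. A real system would store this durably and apply its own access controls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InventoryEntry:
    """Step 1: one automated decision that customers are exposed to."""
    name: str
    kind: str                     # "content" or "determination"
    vendor: Optional[str] = None  # set when a vendor makes the decision


@dataclass
class DecisionRecord:
    """Step 2: what customer support needs to review one decision."""
    decision_id: str
    customer_id: str
    inventory_name: str           # links back to the InventoryEntry
    model_version: str
    outcome: str
    inputs: Dict[str, float]
    top_factors: List[str]        # e.g. from an interpretability tool
    reviewer: Optional[str] = None
    final_outcome: Optional[str] = None


class DecisionLog:
    """In-memory stand-in for wherever decisions are actually stored."""

    def __init__(self) -> None:
        self._records: Dict[str, DecisionRecord] = {}

    def record(self, rec: DecisionRecord) -> None:
        self._records[rec.decision_id] = rec

    def for_customer(self, customer_id: str) -> List[DecisionRecord]:
        """Everything to route to support when this customer appeals."""
        return [r for r in self._records.values() if r.customer_id == customer_id]

    def review(self, decision_id: str, reviewer: str, final_outcome: str) -> DecisionRecord:
        """Store the human reviewer's confirmation or reversal."""
        rec = self._records[decision_id]
        rec.reviewer = reviewer
        rec.final_outcome = final_outcome
        return rec


# Made-up inventory covering the three bullet types in step 1.
inventory = [
    InventoryEntry("landing_page_offer", kind="content"),
    InventoryEntry("credit_limit", kind="determination"),
    InventoryEntry("credit_rating", kind="determination", vendor="Example Bureau"),
]
vendor_run = [e.name for e in inventory if e.vendor is not None]

log = DecisionLog()
log.record(DecisionRecord(
    decision_id="d-1001",
    customer_id="cust-42",
    inventory_name="credit_limit",
    model_version="2017-10",
    outcome="denied",
    inputs={"debt_to_income": 0.9, "on_time_payment_rate": 0.6},
    top_factors=["debt_to_income"],
))

appeal_packet = log.for_customer("cust-42")
reviewed = log.review("d-1001", reviewer="support-7", final_outcome="approved")
```

Keeping the model version and inputs with each record matters because an appeal may arrive long after the model has been retrained. Without them, the reviewer would be second-guessing a decision that can no longer be reproduced.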
Additional Notes: The Guidelines aren’t yet final, and are open for comment until November 28, 2017.
Obviously there are many other issues concerning GDPR compliance, such as obtaining consent to use personal data, that are not covered by this post.
The recommendations I make above will overlap with an organization’s DPIA [Data Protection Impact Assessment] obligations under the GDPR.