The recent news that Amazon inadvertently created gender-biased software for screening job applicants is a significant wake-up call for all organizations using AI. The software, which used machine learning to rank incoming resumes by comparison to resumes from people Amazon had already hired, could have discouraged recruiters from hiring women solely on the basis of their gender. Amazon, of all entities, should have known better. It should have expected and avoided this. If this can happen to Amazon, the question we really need to ask is: how many others are making the same mistake?
If you work for an organization that uses data and artificial intelligence (AI), or if you are a consumer using data and AI-powered services, what do you need to know about data ethics?
Quite a bit, it turns out. The way things are going, it seems like every few days new ethics controversies, followed by new commitments to privacy and fairness, arise from the ways that businesses and government use data. A few examples:
• Voice assistants like Amazon’s Alexa, Siri, and “Hey Google” are everywhere, on smartphones, computers, and smart speakers. Voice commands satisfy more and more of our needs without resorting to keyboards, touch screens, or call centers. But recently one such assistant, while listening in on a family’s private conversations, recorded a conversation without the family’s knowledge and emailed that recording to a family member’s employee.
Part II of The Completely Non-Technical Guide to Machine Learning and AI
My previous post raised the question “what are machine learning and artificial intelligence (AI)?” and answered with a functional definition: computer systems that combine measurements and math to make decisions so complicated that until recently only humans could make them.
Part I: What is Machine Learning? Combining Measurements and Math to Make Predictions
The labels “machine learning” and “artificial intelligence” can be used interchangeably to describe computer systems that make decisions so complicated that until recently only humans could make them. With the right information, machine learning can do things like…
• look at a loan application, and recommend whether a bank should lend the money
• look at movies you’ve watched, and recommend new movies you might enjoy
• look at photos of human cells, and recommend a cancer diagnosis
Machine learning can be applied to just about anything that can be counted/measured, including numbers, words, and pixels in digital photos.
Takeaways from Sam Charrington’s May, 2017 interview with Jennifer Prendki, senior data science manager and principal data scientist for Walmart.com
I am very grateful to Sam Charrington for his TWiML&AI podcast series. So far I have consumed about 70 episodes (~50 hours). Every podcast is reliably fascinating: so many amazing people accomplishing incredible things. It’s energizing! The September 5, 2017 podcast, recorded in May, 2017 at Sam’s Future of Data Summit event, featured his interview with Jennifer Prendki, who at the time was senior data science manager and principal data scientist for Walmart’s online business (she’s since become head of data science at Atlassian). Jennifer provides an instructive window into agile methodology in machine learning, a topic that will become more and more important as machine learning becomes mainstream and production-centric (or “industrialized”, as Sam dubs it). I’ve taken the liberty of capturing key takeaways from her interview in this blog post. (To be clear, I had no part in creating the podcast itself.) If this topic matters to you, please listen to the original podcast – available via iTunes, Google Play, Soundcloud, Stitcher, and YouTube – it’s worth a listen.
Jennifer Prendki was a member of an internal Walmart data science team supporting two other internal teams, the Perceive team and the Guide team, delivering essential components of Walmart.com’s search experience. The Perceive team is responsible for providing autocomplete and spell check to help improve customers’ search queries. The Guide team is responsible for ranking the search results, helping customers find what they are looking for as easily as possible.
Summary: Every organization that processes data about any person in the EU must comply with the GDPR. Newly published GDPR Guidelines clarify that whenever an organization makes a decision using machine learning and personal data that has any kind of impact, a human must be able to independently review, explain, and possibly replace that decision using their own independent judgment. Organizations relying on machine learning models in the EU should immediately start planning how they are going to deliver a level of model interpretability sufficient for GDPR compliance. They should also examine how to identify whether any groups of people could be unfairly impacted by their machine learning models, and consider how to proactively avoid such impacts.
In October 2017, new Guidelines were published to clarify the EU’s GDPR (General Data Protection Regulation) with respect to “automated individual decision making.” These Guidelines apply to many machine learning models making decisions affecting EU citizens and member states. (A version of these Guidelines can be downloaded here—for reference, I provide page numbers from that document in this post.)
The purpose of this post is to call attention to how the GDPR, and these Guidelines in particular, may change how organizations choose to develop and deploy machine learning solutions that impact their customers.
Be it the core of their product, or just a component of the apps they use, every organization is adopting machine learning and AI at some level. Most organizations are adopting it in an ad hoc fashion, but a number of considerations, each with significant potential consequences for cost, timing, risk, and reward, really should be weighed together.
That’s why I developed the following framework for organizations planning to adopt machine learning or wanting to take their existing machine learning commitment to the next level.