Let’s suppose your organization (or some part thereof) has decided to take a more principled approach towards the data and/or algorithms it uses by establishing ethics-based ground rules for their use. Maybe this stems from concerns expressed by leadership, legal counsel, shareholders, customers, or employees about potential harms arising from a technology you’re already using or about to use. Maybe it’s because you know companies like Google, Microsoft, and Salesforce have already taken significant steps to incorporate data and AI ethics requirements into their business processes.
Regardless of the immediate focus, the good news is that you probably don’t need to launch the world’s best program on day one (or year one). The bad news is that there is no plug-and-play, one-size-fits-all solution awaiting you. You and your colleagues will need to begin by understanding where you are now, visualizing where you are headed, and incrementally building a roadway that takes you in the right direction. In fact, it makes sense to start small, as you would when prototyping a new product or line of business, learning and building support systems as you go. Over time, your data and AI ethics program will generate long-term benefits, as data and AI ethics become increasingly important for every organization’s reputation, growth in value, and risk management.
In the following four-part series about initiating a functional data and AI ethics program, we will cover the basic steps you and your team will need to undertake, including:
Part I: What is Machine Learning? Combining Measurements and Math to Make Predictions
The labels “machine learning” and “artificial intelligence” can be used interchangeably to describe computer systems that make decisions so complicated that until recently only humans could make them. With the right information, machine learning can do things like…
• look at a loan application, and recommend whether a bank should lend the money
• look at movies you’ve watched, and recommend new movies you might enjoy
• look at photos of human cells, and recommend a cancer diagnosis
Machine learning can be applied to just about anything that can be counted or measured, including numbers, words, and pixels in digital photos.
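To make "combining measurements and math" concrete, here is a minimal sketch of the loan example in Python. It is not how any real bank works: the past applications are invented for illustration, the `recommend` function is hypothetical, and the technique shown (look up the most similar past case, often called nearest-neighbor matching) is only one of many ways a system could turn measurements into a recommendation.

```python
import math

# Invented past loan applications, for illustration only:
# (annual income in $1000s, existing debt in $1000s) -> what happened to the loan.
PAST_APPLICATIONS = [
    ((85.0, 10.0), "repaid"),
    ((60.0, 5.0), "repaid"),
    ((30.0, 25.0), "defaulted"),
    ((40.0, 35.0), "defaulted"),
]


def recommend(applicant):
    """Recommend 'lend' or 'decline' by copying the outcome of the most similar past case."""
    _, nearest_outcome = min(
        PAST_APPLICATIONS,
        # Straight-line distance between the two sets of measurements.
        key=lambda record: math.dist(record[0], applicant),
    )
    return "lend" if nearest_outcome == "repaid" else "decline"


print(recommend((70.0, 8.0)))   # lend
print(recommend((33.0, 28.0)))  # decline
```

The "learning" here is nothing more than remembering measured examples, and the "math" is a distance calculation. Real systems use far more data and more sophisticated math, but the basic shape is the same: measurements in, recommendation out.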
Takeaways from Sam Charrington’s May 2017 interview with Jennifer Prendki, senior data science manager and principal data scientist for Walmart.com
I am very grateful to Sam Charrington for his TWiML&AI podcast series. So far I have consumed about 70 episodes (~50 hours). Every podcast is reliably fascinating: so many amazing people accomplishing incredible things. It’s energizing! The September 5, 2017 podcast, recorded in May 2017 at Sam’s Future of Data Summit event, featured his interview with Jennifer Prendki, who at the time was senior data science manager and principal data scientist for Walmart’s online business (she’s since become head of data science at Atlassian). Jennifer provides an instructive window into agile methodology in machine learning, a topic that will become more and more important as machine learning becomes mainstream and production-centric (or “industrialized”, as Sam dubs it). I’ve taken the liberty of capturing key takeaways from her interview in this blog post. (To be clear, I had no part in creating the podcast itself.) If this topic matters to you, please listen to the original podcast – available via iTunes, Google Play, SoundCloud, Stitcher, and YouTube – it’s worth a listen.
Jennifer Prendki was a member of an internal Walmart data science team supporting two other internal teams, the Perceive team and the Guide team, delivering essential components of Walmart.com’s search experience. The Perceive team is responsible for providing autocomplete and spell check to help improve customers’ search queries. The Guide team is responsible for ranking the search results, helping customers find what they are looking for as easily as possible.
Summary: Every organization that processes data about any person in the EU must comply with the GDPR. Newly published GDPR Guidelines clarify that whenever an organization makes a decision using machine learning and personal data that has any kind of impact, a human must be able to independently review, explain, and possibly replace that decision using their own independent judgment. Organizations relying on machine learning models in the EU should immediately start planning how they are going to deliver a level of machine model interpretability sufficient for GDPR compliance. They should also examine how to identify whether any groups of people could be unfairly impacted by their machine models, and consider how to proactively avoid such impacts.
In October 2017, new Guidelines were published to clarify the EU’s GDPR (General Data Protection Regulation) with respect to “automated individual decision making.” These Guidelines apply to many machine learning models making decisions affecting EU citizens and member states. (A version of these Guidelines can be downloaded here—for reference, I provide page numbers from that document in this post.)
The purpose of this post is to call attention to how the GDPR, and these Guidelines in particular, may change how organizations choose to develop and deploy machine learning solutions that impact their customers.
Be it the core of their product, or just a component of the apps they use, every organization is adopting machine learning and AI at some level. Most organizations are adopting it in an ad hoc fashion, but there are a number of considerations, with significant potential consequences for cost, timing, risk, and reward, that they really should consider together.
That’s why I developed the following framework for organizations planning to adopt machine learning or wanting to take their existing machine learning commitment to the next level.
I discovered an interesting video recently while helping a client demonstrate how users of a SharePoint document management system can share information about the documents they are managing. The video is by Michael Gannotti, a technology specialist at Microsoft, and it apparently shows how Microsoft uses SharePoint 2010’s social media features in-house. The video covers other SharePoint 2010 features as well, but I found two segments particularly relevant.
Social Media features in SharePoint (from timestamp 6 minutes 49 seconds to 15 minutes 50 seconds):
people search — users can find people who are experts on the subjects they’re researching;
publishing — via wikis, FAQs, and blogs;
user home pages — users can fill out their own profiles, add types of content, see their friend and group feeds;
viewing other users’ pages — users can find out more about co-workers and their work;
adding meta-information — tagging, liking, and adding notes or ratings to alert others about the relevance of content to oneself, to a project, or to a topic; and,
publishing (blogging) options — users can post to SharePoint either via a rich web-based text authoring environment or direct from a Word document.
Using OneNote for Sharing (from timestamp 17 minutes 34 seconds to 18 minutes 34 seconds):
New tools are putting the collaboration into “collaboration software” by creating a social media-inspired user experience for Enterprise knowledge management. But it’s taken a long time to get here.
Old school collaboration: floppy-net and shared drives
Until around ten years ago, when people talked about using software for “collaboration” in an Enterprise setting they usually meant transferring files point-to-point by email or handing off a diskette, aka “floppy-net” (or worse, by passing paper that would require re-typing). Advanced collaboration involved establishing shared “network drives” where documents could be stored in folders accessible to everyone on the local network. But under this “system” for collaboration, even when people devoted a significant amount of time to maintaining document repositories it could be difficult for others to find useful documents, or even know whether useful documents existed in the first place. Labeling was limited; document sets might be incomplete or out of date; and authors, owners, or other contextual information might be unclear. Much like the internet before Google-quality search, folks could spend a lot of time browsing without getting any payoff.
Such collaboration systems are still quite common even though they aren’t very efficient, because they rely on limited personal connections, memories, and attention spans. In such a system the best strategy when hunting for a document is to ask around to try to figure out who might know where to find useful documents. People asked for help – if they have time – try to remember what documents are available, then either hunt through the repository themselves or point towards likely places to look. This system obviously doesn’t scale very well because there is a linear relationship between the number of documents being managed and the time and expertise required to manage them. Emails sent by people seeking help finding information can become a significant burden, particularly in the inboxes of the most knowledgeable or best connected. And because managing documents in this system is relatively time consuming and unrewarding, most people have little incentive to use or contribute to document management. Countless document repositories under this model have suffered from neglect or abandonment simply because they were so impractical. And unless a critical mass of use and contribution is achieved, the appearance that a repository is abandoned or neglected in turn reduces the incentive of new or returning community members to participate. Instead, people rationally choose to “reinvent the wheel”, recreating documents or processes from scratch simply because the barriers to finding out whether what they need already exists are too high.
SharePoint and other web-like Information Management solutions
The rise of the internet has helped propel Enterprise collaboration forward, thanks in part to a new generation of internet-inspired collaboration software exemplified by Microsoft’s SharePoint. SharePoint offers features such as alerts, discussion boards, document libraries, categorization, shared workspaces, forms and surveys, personal pages and profiles, and the ability to pull in and display information from data sources outside of SharePoint itself, including the internet (“web parts”). Access controls have also evolved, enabling people to have access to the files and directories that pertain to them, while limiting access to others. Meanwhile data storage capacity has exploded, costs have plummeted, and access speed has rocketed. Naturally, for most organizations the volume of documents being managed has ballooned exponentially. But we still need to ask: have knowledge management and collaboration scaled in proportion to the volume of information that is available and could be useful if more people could get their hands on it?
Notwithstanding features like Enterprise search, notifications, and improved metadata, many information management hubs are, in effect, still data silos where information is safe and organized but inconvenient to explore and share. In truth, despite powerful automated solutions now available, effective collaboration is still largely dependent on the quality of user participation.
Adoption and Engagement
Two key elements of effective collaboration are adoption, which corresponds to the percentage of team members who are able to use the system, and engagement, which corresponds to how many of them use the system regularly.
For a collaboration system to be effective it must maintain a critical mass of active users or risk becoming ignored and thus irrelevant. There’s a chicken-and-egg relationship here. A collaboration system must achieve and maintain a critical mass of adoption and engagement to be self-sustaining. Few people are going to adopt and engage if nothing of value is happening on the system because not enough other people have adopted and engaged. To attract this level of participation the experience should be easy (low frustration), useful (practical results are usually obtained), and emotionally rewarding (users experience satisfaction or even enjoy using it). Otherwise a collaboration system risks turning into a quiet information cul-de-sac no matter how impressive its technology.
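For teams that want to track these two measures, here is a minimal sketch in Python of how adoption and engagement, as defined above, might be calculated. Everything in it is an assumption made for illustration: the `Member` record, its field names, the made-up team, and especially the threshold for "regular" use (four sessions in the last 30 days), which each organization would need to set for itself. Engagement is expressed here as a share of adopters rather than a raw count.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    can_use_system: bool         # has access to, and is able to use, the system
    sessions_last_30_days: int   # hypothetical usage measure


def adoption_rate(team):
    """Share of team members who are able to use the system."""
    if not team:
        return 0.0
    return sum(m.can_use_system for m in team) / len(team)


def engagement_rate(team, min_sessions=4):
    """Share of adopters who use the system regularly.

    The min_sessions threshold for "regular" use is an assumption.
    """
    adopters = [m for m in team if m.can_use_system]
    if not adopters:
        return 0.0
    regular = sum(m.sessions_last_30_days >= min_sessions for m in adopters)
    return regular / len(adopters)


# Made-up team, for illustration only.
team = [
    Member("Ana", True, 12),
    Member("Ben", True, 1),
    Member("Cy", True, 0),
    Member("Dee", False, 0),
]

print(f"adoption: {adoption_rate(team):.0%}")
print(f"engagement: {engagement_rate(team):.0%}")
```

Tracking both numbers over time is one way to see whether a system is approaching critical mass: high adoption with low engagement suggests people can use the system but are not finding it easy, useful, or rewarding enough to return.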
Enter social media
Lessons learned from the social media phenomenon – examining the virtual footprints of the hundreds of millions of people using Facebook – are radically enhancing Enterprise knowledge management by promoting ease of use, practical results, and emotional gratification within collaboration systems. To get more information about this development I recently met with J. B. Holston, CEO of NewsGator, whose Social Sites solution adds Facebook-like features to SharePoint. Available for only 3-1/2 years, Social Sites’ committed customers already include Accenture, Novartis, Biogen, Edelman, and Deloitte, among others.
The basic idea behind Social Sites (my take, not necessarily J. B.’s) is that SharePoint users experience less frustration, find better quality material, and receive more emotional gratification when their SharePoint experience is more like Facebook. And because a social media approach to collaboration is both useful and gratifying, more people use the collaboration system – adoption increases – and they use it more often for more purposes – engagement increases. Teams get more done while having more fun. An additional benefit of a social media overlay on top of a standard SharePoint install is that it draws attention to and promotes increased use of available resources, and encourages users to find out about and experiment with collaboration options they weren’t using before, which may convert them into more valuable collaborators themselves.
Social Sites extends the functionality of SharePoint in a number of respects. The first generation of Social Sites added features including:
marking and tagging items;
providing users with custom streams of their “friends’” activity updates (imagine keeping up with important developments with key people down the hall, or in other regions or departments, as they happen);
making it easier to move content in and out of SharePoint; and
making it easy for people to connect with the people who posted specific items with a single click.
The latest generation of Social Sites offers even more features (70 web parts in all are available), such as:
idea development (“ideation”);
the ability to follow people and events;
automatic updates when specific things of interest happen;
the ability to ask questions;
the ability to make requests; and
the ability to pass word along about things that are happening.
An open API makes it possible to customize activity streams for groups of users, and those streams are also accessible from mobile devices. Social Sites also lends itself to community management and governance.
As icing on the cake, NewsGator also offers iPhone and iPad applications for Social Sites to enable anywhere, anytime mobile interaction with SharePoint (including Social Sites social media features), completing the Facebook-like user experience.
For companies already using SharePoint, Social Sites allows them to upgrade their team’s collaborative performance without fundamentally reengineering their current knowledge management systems. For example, the way information is stored and structured, and integrations like workflows, can be preserved. They can also avoid the costs of migration, retraining employees on new systems, or hiring specialists to manage the new systems. On the flip side, to the extent that Social Sites upgrades SharePoint to make it competitive with, or superior to, other collaboration options, the combination improves SharePoint’s attractiveness to companies considering switching over from competing knowledge management solutions. Finally, customers who seek to make this level of interaction widely available within their organizations may buy even more SharePoint licenses and invest in more customization.
Special thanks to J.B. Holston @jholston and Jim Benson @ourfounder for many of the ideas and information that found their way into this post.