Perspectives | Business At The AI Ethical Crossroads
The Biggest AI Ethical Issues Businesses Need To Address Now—And How
Recent headlines have drawn attention to problems caused by disproportionate
representation, or an outright lack of it, in the data sets selected and used
to train machine learning models. Such bias can lead to unfair outcomes, such
as a hiring algorithm that favors male over female applicants. It can arise
when underlying data either do not represent the world well or reflect
existing unfair patterns.
Bias in corporate decision-making tools can lead companies to miss out on
hiring the best talent and on bringing satisfied customers back for more. It
can result not only in lost revenue from misdirected ads, scorched-earth
headlines and loss of customer trust, but also in entire submarkets missed.
For example, voice recognition software trained on English or majority
dialects may be suboptimal for minority-dialect speakers, including when it
comes to ascertaining whether a customer’s tone of voice is pleased or
displeased. Data curated online inherently give preference to active internet
users over those still coming into the digital world. User patterns, such as
what constitutes a meaningful interaction with a video recommendation tool or
spikes in certain search queries during elections, are more likely to reflect
the behavior and priorities of internet users over nonusers and the literate
over the nonliterate. Given that two-thirds of countries have more men than
women online and that a significant literacy gap still exists, this potentially
leaves out well over a billion prospective users.
Addressing bias early avoids having it built into key strategic decisions,
business plans and road maps, with an impact that could reverberate for years.
While bias is a challenging problem to solve, much good work is underway.
Companies can draw upon that work as they integrate machine learning tools into
executive decision-making.
The following highlights five promising practices:
Use An Analyzer Or Third Parties To Audit For Bias
Bias analyzers are increasingly coming onto the market. IBM Watson has built
an analyzer and mitigation toolkit that makes a machine
learning model’s decision-making process transparent and detects potentially
unfair outputs. For example, if a model recommends denying a loan to a
woman-owned business, the analyzer might show the decision was based in part on
the model's sensitivity to gender, that a disproportionate number of loans were denied to
women or that the training set lacked data on women-owned businesses. The
analyzer might query whether the factors used to generate the outcome are the
appropriate criteria for denying the loan, or make suggestions for additional
data to be incorporated.
Other analyzers include Facebook’s Fairness Flow, an internal tool to determine whether a
machine learning algorithm is systematically providing poorer results to
certain protected classes. Accenture has released a tool that helps its
customers assess artificial intelligence models for discrimination, suggests
adjustments and weighs the trade-offs in accuracy.
In some cases, the recommended course of correction might be to get an
entirely new data set.
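As a rough illustration of the kind of check these analyzers automate, the following sketch (not IBM's or Facebook's actual tooling; the column names and the four-fifths threshold are assumptions made here for illustration) compares approval rates across groups in a model's outputs:

```python
# Minimal sketch of a group-disparity check on a model's outputs.
# The column names ("gender", "approved") and the 0.8 threshold (the common
# "four-fifths rule") are illustrative assumptions, not any vendor's API.
import pandas as pd

def approval_ratios(decisions: pd.DataFrame,
                    group_col: str = "gender",
                    outcome_col: str = "approved") -> pd.Series:
    """Return each group's approval rate relative to the most-favored group."""
    rates = decisions.groupby(group_col)[outcome_col].mean()
    return rates / rates.max()

decisions = pd.DataFrame({
    "gender":   ["F", "F", "F", "F", "M", "M", "M", "M"],
    "approved": [0,   0,   1,   0,   1,   1,   0,   1],
})

ratios = approval_ratios(decisions)
flagged = ratios[ratios < 0.8]  # groups falling below the four-fifths rule
print(ratios)                   # F = 0.33, M = 1.00 for this toy data
print("Potentially unfair outcomes for:", list(flagged.index))
```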
However, tools alone won’t get the job done. As noted by Anna Bethke, head
of Intel Corp.’s AI for Social Good initiative, “A common misconception is the
belief that bias can be solved by algorithms, when bias is also a cultural
issue that requires cultural responses like dialogue and debate.”
Inhi Cho Suh, general manager of IBM Watson Customer Engagement, describes a
use case that aptly illustrates the need for diversity: A model built to handle
inbound risks around a supplier might be trained using the company’s
established playbooks and focused on optimizing cost and location. Other
corporate concerns, such as sourcing from a country with unsavory labor
practices, could inadvertently be overlooked.
For this reason, it’s critical that companies involve diverse human teams in
their process: internal or external auditors whose incentives differ from those
of the users of the algorithms and who can pose counterfactuals that challenge
underlying assumptions and results. Ensuring human judgment remains integral
can help developers both responsibly manage their AI tools and harness their
benefits.
Train Models With High-Quality Data Sets
Many concerns regarding unfairly biased algorithmic outcomes can be traced
back to the data set and the lack of inclusion and diversity in it. Researchers
in the AI community are calling for governments to provide access to
high-quality data sets.
In addition, the private sector is coming up with its own solutions. Mighty
AI, an Intel Capital portfolio company, is one of a handful of startups
labeling and cleaning customer data sets for training and validation of
computer vision models. Earlier this year, IBM released a data set of over 1 million facial images to counterbalance
the lack of diversity in those that are currently available.
In applying a machine learning algorithm to a particular data set,
businesses need to ask themselves: What demographic is intended to be captured
by the data set? What characteristics may be over- or underrepresented, and is
this model appropriate for this data set? If data are lacking in
representation, what additional data are needed to balance the data set? If
such data do not exist, consider alternatives, such as incorporating
synthetically created data or addressing the original problem in another
manner.
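To make those questions concrete, a simple sketch, assuming group labels are available for each training sample and using invented group names and target shares, can compare the training set's mix against the population the model is meant to serve:

```python
# Sketch: compare a training set's demographic mix against the population the
# model is meant to serve. Group names and target shares are placeholders.
from collections import Counter

def representation_gaps(group_labels, target_shares, tolerance=0.05):
    """Flag groups whose share of the data falls short of the target share."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    gaps = {}
    for group, target in target_shares.items():
        actual = counts.get(group, 0) / total
        if actual + tolerance < target:
            gaps[group] = {"actual": round(actual, 3), "target": target}
    return gaps

# Toy example: a speech data set dominated by majority-dialect speakers.
training_dialects = ["majority"] * 900 + ["minority_a"] * 80 + ["minority_b"] * 20
target = {"majority": 0.70, "minority_a": 0.20, "minority_b": 0.10}

print(representation_gaps(training_dialects, target))
# Both minority dialects come back underrepresented: collect more data,
# consider synthetic data, or solve the original problem another way.
```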
Another approach is to select data sets that limit biased inputs. A Palo
Alto-based company, still in stealth mode, is building deep learning models to
identify malignant tumors. It chose to train its models with images of actual
human tissue samples, rather than with doctor notes culled from electronic
health records, which may be inherently biased given that doctors write them
knowing their patients will read them.
Finally, selecting and scoping the right data set from a project’s outset
also helps mitigate the risk of biased outcomes. Instead of opportunistically
processing available data through deep learning models, executives should
identify the specific business problem to be solved. Only then can their teams
look for and scope the appropriate data set and, if one is not available,
consider alternatives.
Use Model Cards To Help Standardize Developer And User Decision-Making
Google researchers have proposed a model card
to accompany every machine learning model. Their model card, like a nutrition
information label, discloses a standardized set of information to enable
machine learning developers and users to make informed decisions about the
appropriateness of a particular model for a use case, as well as to evaluate
and implement its outcomes.
Such information includes how a model was built, the assumptions made, the
primary intended use case and end users—such as labeling data for entertainment
versus enterprise solutions—and how the model might perform across different
cultural, demographic or phenotypic groups. Information should not only include
single categories such as “men,” “women” and “nonbinary” gender groups, but
also consider intersectionality, simultaneously looking at two or more
cultural, demographic or phenotypic characteristics.
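As a loose sketch of how such a card might travel with a model in code, with field names that are illustrative assumptions rather than the researchers' published schema:

```python
# Illustrative sketch of a model card captured as a plain data structure.
# Field names are assumptions for illustration, not a published schema.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    version: str
    date: str
    model_type: str
    intended_use: str
    training_data: str
    # Per-group metrics keyed by group labels; keys may be intersectional,
    # e.g. ("women", "over_65") as well as single categories.
    group_performance: dict = field(default_factory=dict)
    ethical_considerations: list = field(default_factory=list)

card = ModelCard(
    name="smile-detector",
    version="1.2",
    date="2019-06-01",
    model_type="convolutional neural network",
    intended_use="labeling data for entertainment, not enterprise solutions",
    training_data="publicly available facial images (hypothetical)",
    group_performance={
        ("men",): {"accuracy": 0.94},
        ("women",): {"accuracy": 0.91},
        ("women", "over_65"): {"accuracy": 0.82},  # intersectional slice
    },
    ethical_considerations=["not evaluated for medical or hiring use"],
)
print(card.group_performance[("women", "over_65")])
```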
Additional information could include model date, version, type, training
data and a matrix of error classifications: false positive rate, false
negative rate, false discovery rate and false omission rate,
and their relative importance for particular data sets. A model trained to
identify smiling older men, for example, would be more likely to produce false
omissions for a data set of mixed ages and genders, and it would likely not be
the appropriate model for the latter.
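The four error rates themselves are straightforward to compute from confusion-matrix counts; the sketch below uses invented numbers purely to illustrate the formulas and the smiling-older-men example:

```python
# Sketch: the four error rates a model card might report per group, computed
# from confusion-matrix counts. The counts below are invented to illustrate
# the formulas, not measurements from any real model.
def error_rates(tp, fp, tn, fn):
    return {
        "false_positive_rate": fp / (fp + tn),   # non-smiles labeled as smiles
        "false_negative_rate": fn / (fn + tp),   # smiles the model missed
        "false_discovery_rate": fp / (fp + tp),  # share of "smile" labels that are wrong
        "false_omission_rate":  fn / (fn + tn),  # share of "no smile" labels that are wrong
    }

# Hypothetical counts for a smile detector trained mostly on older men.
print("older men:         ", error_rates(tp=90, fp=5, tn=95, fn=10))
print("mixed ages/genders:", error_rates(tp=60, fp=8, tn=92, fn=40))
# The much higher false negative and false omission rates on the mixed group
# signal that the model is not appropriate for that data set.
```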
The researchers also propose including a “toxicity” score that rates the
model’s performance across sensitive groups, as well as ethical considerations,
challenges and recommendations. Preservation of data privacy should also be considered.
Transparency can help companies avoid making critical decisions based on the
biased results of a machine learning model. However, transparency itself is not
the end goal. As Dr. Amir Khosrowshahi, vice president and AI chief technology
officer for Intel Corp., has noted, “Humans are not transparent, and
transparency can sometimes be harmful.” Transparency potentially allows a human
operator to game a model’s recommendations, undercutting the very safeguards it
was intended to provide. Other guardrails should accompany model cards and
increased transparency, such as diverse auditing teams that can evaluate what
aspect of bias is being solved for and the risk of introducing new biases.
Add Randomness To Recommendations
Companies whose business models revolve around making good
recommendations—books, movies, services—are constantly considering the balance
between serving the user more of the same and providing expansive, diverse or
novel choices. These recommendations are typically made based on collective
user behavior and personal characteristics. If “Avengers: Infinity War” is the
movie of choice for an enthusiastic online demographic, then this film may be
more likely to be served up widely to all, regardless of actual interest.
Patients from a poorer state might be shown more cost-effective treatments,
rather than expensive alternatives offered to patients from wealthier states.
Recommendation engines are one of the most heavily researched areas in
machine learning, with the Netflix Prize open competition taking place nearly a decade
ago. Neural network models developed by 20th Century Fox combine historical
customer data with temporal sequencing in a film—such as long versus short
shots—that conveys information about movie type, plot and characters.
Other encouraging developments in this space are underway. One rule of
thumb, for ease of application, is for companies to incorporate a specific
level of serendipity, say 10%, into their recommendations. Then, even if a
user’s prior clicks or demographics lead the person into a bubble of “Star
Wars” and action film recommendations—or more problematically, conspiracy
theory videos—the engine can also throw in some historical fiction and
documentaries, or in the case of patients, state-of-the-art treatments. To the
extent the user selects from the random offerings, his or her choice would
further improve the model’s overall effectiveness.
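A minimal sketch of that 10% rule of thumb, assuming a ranked list of personalized recommendations and a pool of diverse alternatives to draw from, with names invented for illustration:

```python
# Sketch of the "10% serendipity" rule of thumb: serve the personalized
# ranking most of the time, but occasionally substitute a diverse pick.
# Function and variable names are illustrative, not any vendor's API.
import random

def recommend(personalized, diverse_pool, k=10, serendipity=0.10):
    """Return k items, replacing roughly `serendipity` of them with diverse picks."""
    slate = list(personalized[:k])
    for i in range(len(slate)):
        if random.random() < serendipity:
            slate[i] = random.choice(diverse_pool)
    return slate

action_films = [f"action_{i}" for i in range(10)]
other_genres = ["historical_fiction_1", "documentary_1", "documentary_2"]

print(recommend(action_films, other_genres))
# Clicks on the serendipitous slots feed back as new training signal,
# which is how those selections improve the model's overall effectiveness.
```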
Weigh Human Bias Against Machine Bias
A machine learning algorithm that makes end-of-life predictions might not
fit a patient from a demographic that differs from the patients in the
underlying training data, which could be as broad as all patients in the
electronic health records. Certain patients from minority populations could be
underidentified for end-of-life care and palliative services. At first glance,
hospitals may hesitate to deploy such tools until this potential for biased
outcomes has been fully corrected—not an easy task.
However, as noted by Dr. Stephanie Harman, founding medical director of
Palliative Care Services and co-chair of the Health Care Ethics Committee at
Stanford School of Medicine, doctors are constantly making similar judgments
based on their professional experience—essentially, a doctor’s personal data
set of patients and case studies encountered in practice. Studies have shown
that physicians tend to under-refer patients for palliative care, for reasons
such as overoptimism, time constraints and inertia.
Before throwing the baby out with the bathwater, perhaps the more relevant
consideration for hospitals is not whether the machine’s output is biased, but
whether the machine augmenting the human decision reduces overall bias. Such a
tool, though still limited, can nonetheless provide important safeguards
against human biases.
Looking Forward
As companies increasingly rely on machine learning solutions to inform key
corporate decisions, AI ethical frameworks and risk management guidelines are
being developed at board levels. This is a long-term conversation with broader
implications.
Ultimately, bias in machine learning algorithms comes from humans.
Technology is uncovering latent biases deeply rooted in our history as a human
race, bringing them into the light as never before. Cognitive
biases, such as information bias, blind spots, confirmation bias and other
deeply ingrained biases, and their impact on machine learning models, have only begun
to be addressed.
As we seek to mitigate these to better serve the next billion customers, we
also have a unique opportunity to reflect on our own humanity and challenge our
human biases and assumptions—to create not only better AI, but a better world
as well.
About the Author
The views expressed in this article are those of the author and do not
necessarily reflect the views or policy positions of her employer.
Abigail Hing Wen serves as counsel to the Office of the AI CTO, Intel
Corp., focused on emerging AI technologies and the ecosystem. She also partners
closely with investors for Intel Capital’s AI investments and has worked with
over 100 Silicon Valley startups, from incorporation to acquisition or IPO. Her
debut novel is forthcoming in 2020. Twitter: @abigailhingwen.