UX researchers’ role in shaping AI products and services

Celine Lenoble
Published in UX Collective
Jul 25, 2022 · 8 min read


Today, Artificial Intelligence (AI) and, more specifically, Machine Learning (ML) are pervasive in our daily lives. From Facebook ads to YouTube recommendations, from Siri to Google Assistant, and from automated translation of device notices to marketing personalization tools, AI now deeply permeates both our work and personal lives.

This article is a compilation of three short pieces that advocate for renewed UX research efforts in ML apps.

With ML reaching so many users, there is a strong case for approaching the conception and design of ML-powered applications from a UX research perspective.

This rests on three main reasons:

1. Users’ mental models haven’t caught up with how ML and AI truly work.
  • UXR can uncover existing mental models and help design new ones that are better suited to this new tech.

2. ML and AI can have an insidious and deep impact on all users’ lives.
  • UXR reveals the myriad of intended and unintended effects of apps on people’s lives, and helps build more ethical AI.

3. ML and AI can have disparate impacts on individuals based on their ethnicity, religion, gender, or sexual orientation.
  • UXR can also help address some of the sources of bias.

How can UXR help build trust in AI systems and increase users’ engagement?

ML and Real Users

Users’ attitudes towards ML-powered apps are complex. Algorithm aversion has been well studied and documented:

In a wide variety of forecasting domains, experts and laypeople remain resistant to using algorithms, often opting to use forecasts made by an inferior human rather than forecasts made by a superior algorithm. Indeed, research shows that people often prefer humans’ forecasts to algorithms’ forecasts (Diab, Pui, Yankelevich, & Highhouse, 2011; Eastwood, Snook, & Luther, 2012), more strongly weigh human input than algorithmic input (Önkal, Goodwin, Thomson, Gönül, & Pollock, 2009; Promberger & Baron, 2006), and more harshly judge professionals who seek out advice from an algorithm rather than from a human (Shaffer, Probst, Merkle, Arkes, & Medow, 2013).

Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. https://doi.org/10.1037/xge0000033

However, their research shows that this algorithm aversion appears only once humans witness, or are made aware of, forecasting errors. In 2019, Logg, Minson, and Moore demonstrated the other side of the coin: before any errors are observed, humans show an initial appreciation for algorithmic advice compared to advice from fellow humans:

Our participants relied more on identical advice when they thought it came from an algorithm than from other people. They displayed this algorithm appreciation when making visual estimates and when predicting: geopolitical and business events, the popularity of songs, and romantic attraction. Additionally, they chose algorithmic judgment over human judgment when given the choice. They even showed a willingness to choose algorithmic advice over their own judgment.

Logg, J. M., Minson, J. A., & Moore, D. A. (2019). Algorithm appreciation: People prefer algorithmic to human judgment. Organizational Behavior and Human Decision Processes, 151, 90–103. https://doi.org/10.1016/j.obhdp.2018.12.005

ML and the Theory of Machine

One possible explanation, still being investigated, is the “Theory of Machine” that people operate with (the equivalent of the “Theory of Mind” we hold about other humans). The Theory of Machine, or, more simply, the mental model, as designers call it, is the set of assumptions humans make about how an application works internally.

One such assumption is the idea of a fixed mindset. Having a fixed mindset, in psychology, means believing that people have a certain amount of intelligence or skills and can’t do anything to increase that amount. Applied to a Theory of Machine, it means that people believe a computer program’s output is fully determined by the initial input, and that the program is not capable of learning or evolving.

The fixed mindset applied to traditional software was appropriate for a long time. Your typical software, a word processor or a spreadsheet, was not capable of improving on its own or learning from its mistakes. The user might expect changes following an update, but otherwise expects the program to behave consistently over time.

When confronted with ML-powered applications, users continue to apply the classic fixed mindset mental model. So, once they experience what they perceive as the app making a mistake, they completely lose trust in the system’s ability to give accurate results. This is possibly what triggers the shift to algorithm aversion, after an initial appreciation.

Numerous ML apps present themselves as assistants. They draw on the mental model of a relationship with a person, hoping to change the assumptions users make about how the program works.

This choice of mental model presents several challenges:

  • AI is not (yet) powerful enough to pass for a human: users’ expectations are shaped by how they would expect a human to respond, and they typically end up extremely disappointed, if not infuriated, by the AI’s behavior.
  • Even for their fellow humans, people tend to apply a fixed mindset and rarely allow for the possibility of growth and change in capabilities, at least not in any short time frame.
  • If users do have a growth mindset in relation to humans, meaning that they believe humans can improve provided they are given the opportunities to learn or they are taught what to do, this mindset doesn’t transfer well to AI assistants, because the learning modalities of humans and AI are so different.

Mental Model and User Engagement with ML Apps

What mental model should you use then? There is no one-size-fits-all answer to this question. This is where User Experience Research is required:

  • to uncover the existing mental models associated with specific tasks,
  • to experiment with multiple UI metaphors beyond the assistant, and
  • to help users adjust their existing mental models and expectations to the reality of ML-powered apps.

How to assess the impact of ML apps on users and meet AI ethical standards?

Most ML algorithms are supposed to assist humans in their decision-making process, not make the decision themselves. More and more, however, AI systems do not content themselves with making recommendations: they make decisions. This is the case in applications ranging from sifting through resumes to selecting which neighborhoods to patrol during the next police shift.

Given the scale at which AI systems operate, the potential impact on specific individuals, groups, or even society as a whole is deep and wide. While harmful human practices have always existed, they have evolved alongside social and legal guidelines that mitigate them. The same cannot yet be said for AI-driven systems.

Ethics in AI

Research on ethical AI has been conducted by academics, industry, and governments, and it has produced a series of guidelines.

Microsoft lists six principles for responsible AI: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.

And the High-Level Expert Group on Artificial Intelligence published the Ethics Guidelines for Trustworthy AI for the European Union in 2019, which state:

Develop, deploy and use AI systems in a way that adheres to the ethical principles of: respect for human autonomy, prevention of harm, fairness and explicability. Acknowledge and address the potential tensions between these principles.

While research on these high-level principles is well underway, there is a gap between policies and guidelines and their implementation. When should data scientists concern themselves with these considerations? How can product managers integrate them into their roadmaps? What practical steps can they take to ensure their app will be responsible and trustworthy?

The Partnership on AI, a non-profit that includes industry leaders, universities, and civil society groups, acknowledges that all AI stakeholders need to be actively involved to prevent potential harm resulting from AI research:

Through our work, it has become clear that effectively anticipating and mitigating downstream consequences of AI research requires community-wide effort; it cannot be the responsibility of any one group alone. The AI research ecosystem includes both industry and academia, and comprises researchers, engineers, reviewers, conferences, journals, grantmakers, team leads, product managers, administrators, communicators, institutional leaders, data scientists, social scientists, policymakers, and others.


How can UXR help avoid discrimination bias in ML models?

ML and Bias

Discrimination in ML occurs when a model systematically treats individuals differently based on their ethnicity, religion, gender, or sexual orientation.

This can take the form of disparate treatment, when data directly reflecting these characteristics is present in the data set, or of disparate impact, when such data is absent but proxies are used (ZIP code, for example).
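To make the distinction concrete, here is a minimal sketch, in Python with hypothetical data and column names, of one common way to quantify disparate impact in a model’s outputs: comparing selection rates across groups, the ratio sometimes checked against the “four-fifths rule”.

```python
import pandas as pd

# Hypothetical loan-approval predictions, with a protected-attribute column.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1, 1, 0, 1, 0, 1, 0, 0],
})

# Selection rate per group: the share of favorable outcomes.
rates = df.groupby("group")["approved"].mean()

# Disparate impact ratio: least-favored group vs. most-favored group.
# A common rule of thumb (the "four-fifths rule") flags ratios below 0.8.
ratio = rates.min() / rates.max()
print(rates.to_dict(), f"ratio = {ratio:.2f}")
```

In practice, a team would run this kind of check on real predictions and on the proxies it worries about (ZIP code, for instance), not on a toy table; the point is simply that the measurement is straightforward once the relevant group information is available.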

Sources of Bias in ML

Often, the most obvious source of bias is the data itself. Different data collection methods can result in a heavily skewed data set, with some parts of the population vastly under- or over-represented. The exploratory phase of data analysis should reveal such issues, and there are known tools and methods to identify and remedy a biased dataset.
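As an illustration of the kind of check that exploratory analysis can include, here is a small sketch (again in Python, with hypothetical group labels and made-up reference shares) comparing how groups are represented in a training sample against a reference population.

```python
import pandas as pd

# Hypothetical training sample and census-style reference shares.
sample = pd.Series(["A"] * 820 + ["B"] * 130 + ["C"] * 50, name="group")
reference = pd.Series({"A": 0.60, "B": 0.25, "C": 0.15}, name="population_share")

representation = pd.concat(
    [sample.value_counts(normalize=True).rename("sample_share"), reference],
    axis=1,
)
# Groups whose sample share deviates sharply from the reference are
# candidates for re-sampling, re-weighting, or targeted data collection.
representation["ratio"] = representation["sample_share"] / representation["population_share"]
print(representation)
```

Dedicated fairness toolkits go further, but even a simple table like this surfaces under-represented groups early, before any model is trained.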

But this is not the only source of bias:

Although the heart of a classification system is the training step, it is generally well understood by data scientists that training the actual algorithm comprises the minority of a data scientist’s time. The majority of time spent is generally focused on making somewhat subjective decisions, such as what events to predict, where and how to sample data, how to clean said data, how to evaluate the model, and how to create a decision policy from the algorithm’s output. Our taxonomy will show that discrimination can creep in at any one of these stages, so persistent vigilance and awareness is advised throughout the process.

Brian d’Alessandro, Cathy O’Neil, and Tom LaGatta. Conscientious Classification: A Data Scientist’s Guide to Discrimination-Aware Classification. Big Data, Vol. 5, No. 2, June 2017, pp. 120–134. http://doi.org/10.1089/big.2016.0048

Even with good data collection and modeling processes, data reflects social contexts in which discriminatory practices are heavily entrenched. Tools exist to detect such systematic disparate treatment. How to fix it, though, becomes a political decision rather than a technical one, as it often implies a trade-off between accuracy and fairness. In this case, domain knowledge and awareness of the individual and social context are essential.
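As a sketch of what “detecting” can mean in practice (hypothetical labels and predictions, in Python), comparing error rates across groups, for instance false positive rates, is one simple way to surface a systematic disparity that overall accuracy alone would hide.

```python
import pandas as pd

# Hypothetical ground-truth labels and model predictions per group.
df = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 6,
    "label": [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    "pred":  [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0],
})

# False positive rate per group: how often true negatives are wrongly flagged.
for name, g in df.groupby("group"):
    negatives = g[g["label"] == 0]
    fpr = (negatives["pred"] == 1).mean()
    print(f"group {name}: false positive rate = {fpr:.2f}")
```

In this toy example, group B’s false positive rate is twice group A’s even though both groups see the same overall accuracy; closing that gap usually means accepting some loss of accuracy or changing the decision policy, which is exactly the political trade-off described above.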

Experts from the World Economic Forum put it this way:

Achieving AI fairness is not just a technical problem; it also requires governance structures to identify, implement and adopt appropriate tools to detect and mitigate bias in data collection and processing on the one hand, and frameworks to define the necessary and appropriate oversight for each specific use case on the other.


What then?

Data scientists are often far away from users. Domain experts might understand them better, but they often have a distorted view, too. UXR can help data scientists and experts increase their awareness of who their users are, including users’ behaviors, attitudes, and expectations, as well as the overall social context in which they live, thanks to personas, empathy maps, and journey maps. Our research-based insights help decision-makers reach the best possible outcomes from both a business and a user perspective.


UX research and design can also help mitigate effects at the user level through specific UI elements and interactions. When the app itself contextualizes its recommendations, users’ awareness of potential bias increases, and they can monitor the impact, play with counterfactuals, and give targeted, valuable feedback.

I’d love to hear your thoughts and connect with anyone interested in these topics.
