Is ChatGPT eliminating the need for visual interfaces (and UX design)?

ChatGPT introduces an effective conversational UX that could surpass visual interfaces for specific tasks.

Jacopo Cargnel
UX Collective

--

Conceptual image showing an astronaut surrounded by interfaces
Image generated by Prompthunt. AI interpretation of the concept “conversational interfaces vs. visual Interfaces”. Liked the “Blade Runner” vibes here.

ChatGPT and other tools based on large language models are now “the big thing” and promise to revolutionise most of the activities we carry out digitally every day. As UX professionals, though, our focus is on users, and we believe that any innovation should provide value and solve problems for the intended users. This concept of “value” is essential in determining the usefulness and innovation of a product or technology.

In recent UX industry news, there has been a trend towards utilising ChatGPT for tasks such as generating content for design mock-ups, conducting user research (!), and creating compelling copy. While these advancements can enhance designers’ experiences and productivity, our primary concern remains the users themselves. We want to ensure that any implementation of ChatGPT or similar technologies truly improves the user experience and brings them closer to the value they seek.

In the book “Innovare con il Design: Il Caso del Settore dell’Illuminazione in Italia” (F. Zurlo et al., Paper, 2002), “innovation” (in a product, artifact, interface, …) is defined as a positive difference compared to existing similar products, a delta based primarily on solving user problems. We could say that this attribute of usefulness is a fundamental condition for calling a product, a feature, or a technology innovative. Obviously, there are many innovations in the field of pure research that do not have an immediate impact on users, but every piece of research, at the end of its path, is supposed to solve a problem for someone. Otherwise, what would be its purpose?

Therefore, the overarching question concerning technologies like ChatGPT is, for me:

How (and where) can ChatGPT, or similar technologies, improve the user experience and introduce a positive difference that brings users closer to the value they seek?

ChatGPT and interfaces.

Looking specifically from the perspective of user interface, which is of utmost interest to UX professionals, the question becomes: What benefits can ChatGPT bring compared to existing interface types? In what sense can we define it as innovative?

ChatGPT represents an example of a conversational interface. Conversational interfaces “allow people to interact with software, apps, and bots like how they interact with real people. Using natural language in typing or speaking, they can accomplish certain tasks with ease.” (from “What is a conversational interface?” by Jesse Martin, Zendesk Blog).

Woman interacting with a laptop computer through a microphone.
Image generated by Prompthunt. “Human interacting with computer with voice”. Makes sense.

These interfaces have existed in two main forms since the early 2000s:

  • Chatbots: These utilize predefined answers and pathways for common questions, often employed in troubleshooting and customer support scenarios.
  • Voice assistants: They allow users to interact with interfaces through natural speech, enabling the execution of common predefined tasks. Even before the advent of popular voice assistants like “Alexa,” early cell phones featured voice commands for initiating calls. Today, this interaction possibility is integrated into modern smartwatches, car infotainment systems, and smart TVs.

Although previous applications of conversational interfaces have been somewhat limited and often disappointing for users, technologies like ChatGPT now promise to elevate this type of interaction to the next level. Unlike those limited, scripted approaches, ChatGPT can generate responses based on domain knowledge, much as a real person would. With this significant advancement in conversational interaction, do we still require visual interfaces?

In his recent article “AI Is First New UI Paradigm in 60 Years,” Jakob Nielsen himself defines the interaction introduced by ChatGPT and similar AI technologies as “intent-based outcome specification”, which represents the latest and emerging interaction model. In this model, the user communicates the desired result to the computer without specifying the steps to achieve it. This approach can be incredibly powerful and applicable across the entire spectrum of digital applications, including email clients, online banking, informative websites, specialised cloud-based software, social network platforms, internet browsers, and search engines. On the other hand, visual interfaces are categorised as one of the types belonging to the second paradigm, the “command-based interaction paradigm.”

So, will visual interfaces still be necessary to operate these systems, or will everything become “conversational”?

User interfaces serve as a medium between users and machines, and the more immersive an interface is, the more user-friendly it tends to be. Immersive interfaces mimic human behaviors, gestures, interaction patterns, and mental models, aligning with how humans naturally act and requiring little to no learning. They offer immediate, user-centered, and intuitive experiences. (They are called immersive precisely because users perceive no relevant interruption in their relationship with the machine: it is like being part of the same environment, and users feel immersed in the system. The best example of immersion is, by definition, Virtual Reality, but the concept describes any interface that feels natural and reduces the “distance” users perceive from the system.)

Interfaces and tasks.

If we think about it, visual interfaces have been designed as imperfect ways to mimic human interaction. We can imagine every visual interface as an ideal dialogue between the user and the system. The system requires specific inputs from the user and, at the same time, provides guidance and explanations on what these inputs are, how to format them properly for the system to understand, where they can be found, what they will be used for, and so on. The system then provides users with the (hopefully) expected outputs.

Now, let’s consider three tasks that users commonly perform digitally:

1. Opening a bank account:

This task is complex and requires users to understand how the bank website is structured and to find the content needed to initiate the process. The system requires simple inputs like name and surname, but also more complex information, such as data from an ID card or other documents to be uploaded. It may even require extremely complex inputs, like answers to FATCA or KYC questionnaires, that demand attention, knowledge, specific documents, and sometimes detailed research.

This is not much different from a conversation in a physical bank branch, where a client or prospect wants to open a bank account and interacts with a teller who knows the procedures and what information needs to be provided. The valuable output (the bank account) is eventually issued.

Nowadays, UX professionals often use a pattern called a “wizard” or “guided path” to break complex tasks down into smaller ones and guide users step by step through the required inputs. Wizards are essentially the digital version of the branch teller asking for client information and documents in the right sequence.

However, imagine if we could leverage ChatGPT or another AI-based conversational interface for this task. We could ask in plain words for what we need and let the system guide us.

Would wizards still be the most effective interface for this type of complex task? While an AI-based conversational interface could sometimes perform better, a well-designed wizard (a visual interface) still has its benefits.
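To make the comparison concrete, below is a minimal sketch (in TypeScript) of what a wizard actually encodes: a fixed sequence of steps, each declaring the inputs it needs and, optionally, how to validate them. Every step id, field, and rule here is invented for illustration only; it is not a real banking flow.

```typescript
// Minimal sketch of a wizard ("guided path") definition for account opening.
// All step ids, labels, fields and validation rules are hypothetical examples.

type Field = { name: string; label: string; required: boolean };

type WizardStep = {
  id: string;
  title: string;                                   // what this step asks the user for
  fields: Field[];
  validate?: (input: Record<string, string>) => string | null; // error message, or null if valid
};

const accountOpeningWizard: WizardStep[] = [
  {
    id: "personal-data",
    title: "Tell us about yourself",
    fields: [
      { name: "firstName", label: "Name", required: true },
      { name: "lastName", label: "Surname", required: true },
    ],
  },
  {
    id: "identity",
    title: "Upload your ID document",
    fields: [{ name: "idScan", label: "ID card or passport", required: true }],
  },
  {
    id: "kyc",
    title: "Regulatory questionnaire (KYC / FATCA)",
    fields: [{ name: "taxResidency", label: "Country of tax residency", required: true }],
    validate: (input) =>
      input.taxResidency ? null : "Please indicate your country of tax residency",
  },
];

// The wizard shows one step at a time, in a fixed order: the digital
// equivalent of the branch teller asking for information in sequence.
function nextStep(currentIndex: number): WizardStep | undefined {
  return accountOpeningWizard[currentIndex + 1];
}
```

The rigidity of this structure is both its strength (the path is predictable and scannable) and its limitation: a conversational interface would instead derive the next question dynamically from the dialogue, which is exactly where it can outperform the wizard on the messier inputs.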

2. Monitoring the transactions of the newly opened account:

In this second scenario, we want to track the balance trend of our bank account: how much we have saved, the days on which we spent the most and why, and the general spending trend.

Will ChatGPT provide a better interface for this task? Probably not. Tasks like these, which require comparison and understanding of a big picture, are better resolved with visual interfaces, graphs, and powerful representations that allow for multiple interpretations.

3. Catching a bus: obtaining fundamental information:

Let’s imagine we are getting ready for work and want to know the time of the next bus at our stop so that we can catch it. Currently, we have several ways to obtain this information: checking the timetable on the public transportation website, using apps like Google Maps, or using the transportation provider’s own app.

Image showing a bus stop on a foggy day in the countryside, no people around.
Image from Pexels. Potential effect of a failure in the experience of retrieving your bus’s arrival time.

Would ChatGPT perform better than the interfaces we currently use?

In this case, ChatGPT might deliver much more value than a regular interface. It can give a straight answer to a specific question, perform the task of retrieving the information, and return a precise output.

Conversational vs. visual: when to use what, and when to use both.

In these three scenarios, we saw how different types of user tasks may or may not benefit from a conversational interaction like the one offered by ChatGPT and other AI systems. Trying to generalise, we could say that:

  • Tasks like setting up, onboarding, and configuring, and in general any task where the system requires complex, structured inputs that are challenging to understand and digest, benefit from good conversational interfaces.
  • However, tasks that require interpreting complex outputs (such as observing graphs to spot trends and make decisions, checking emails, monitoring a social network feed) still need visual interfaces with good UX design principles.
  • Finally, tasks that require a precise and limited output, such as knowing the arrival time of a bus, checking if it will rain tomorrow, or searching for specific information through a search engine, work much better when the experience is handled by ChatGPT, which can mimic a real person and make the user’s path to the desired information much shorter.

If we consider the quantity of inputs and outputs involved in solving a task, we can summarise this as follows:

  • High quantity of inputs required: conversational interfaces work better.
  • Low quantity of inputs required (or very clear sequence): visual interfaces work better.
  • High quantity of outputs generated: visual interfaces work better.
  • Low quantity of outputs generated: conversational interfaces work better.
Diagram showing which interaction model is more effective for each combination of quantity and complexity of required inputs and provided outputs.
Original Image created with Miro. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).
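As a rough illustration (and nothing more than that), the four points above can be read as two independent recommendations, one driven by the input load and one by the output load; when they disagree, a hybrid design is usually the answer. The “low”/“high” labels in the sketch below are my own deliberately coarse simplification.

```typescript
// Rough sketch of the input/output heuristic summarised above.
// "low" / "high" are deliberately coarse labels; the mapping follows the four
// bullet points, and conflicting recommendations point towards a hybrid design.

type Load = "low" | "high";

function recommendInterface(inputLoad: Load, outputLoad: Load) {
  return {
    // Many or complex inputs -> a conversation that guides and clarifies.
    forInputs: inputLoad === "high" ? "conversational" : "visual",
    // Many or complex outputs -> a visual layout that supports comparison.
    forOutputs: outputLoad === "high" ? "visual" : "conversational",
  };
}

// The three scenarios discussed above:
console.log(recommendInterface("high", "low")); // 1. opening a bank account
console.log(recommendInterface("low", "high")); // 2. monitoring transactions
console.log(recommendInterface("low", "low"));  // 3. next bus arrival
```

When the two sides disagree, as in the bus example (trivial inputs, tiny output), the deciding factor is usually which option gives the shortest path to the value: here, a direct conversational answer.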

It is also worth mentioning that conversational and visual interfaces can (and should) coexist in order to enhance the user experience by leveraging their respective strengths.

Let’s consider again the context of opening a bank account:

Rather than searching for the account opening option on the bank’s website, directly stating our intention to open an account through a conversational interface can be more immersive:

Wireframe of a banking site interface: homepage with a text box saying “How can SUPERBank help you today?”.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).
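As a minimal sketch of how such a hybrid entry point could work behind the scenes, the free text typed into that box can be classified into an intent and then routed either to a guided visual flow or kept in the conversation. The intent labels, destinations, and the classifyIntent helper below are all hypothetical; in a real product the classification would more likely be done by an LLM constrained to return one of the known intents, rather than by regular expressions.

```typescript
// Sketch of routing free text from the "How can SUPERBank help you today?"
// box to the most suitable kind of interface. Intents and helpers are hypothetical.

type Intent = "open-account" | "check-spending" | "ask-question";

// Placeholder classifier: a real implementation would likely ask an LLM to
// pick one of the known intent labels for the user's sentence.
async function classifyIntent(utterance: string): Promise<Intent> {
  if (/open .*account/i.test(utterance)) return "open-account";
  if (/spend|balance|trend/i.test(utterance)) return "check-spending";
  return "ask-question";
}

// Each intent leads to the interface that handles it best.
const destinations: Record<Intent, string> = {
  "open-account": "start the account-opening wizard (visual, guided path)",
  "check-spending": "open the spending dashboard (visual, graphs)",
  "ask-question": "answer directly in the conversational panel",
};

async function route(utterance: string): Promise<string> {
  return destinations[await classifyIntent(utterance)];
}

route("I'd like to open an account").then(console.log);
```

The point is not the routing logic itself but the division of labour: the conversation captures the intent in the user’s own words, and the visual interface takes over where structure, comparison, or guided input works better.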

When it comes to selecting the account that best suits our needs and budget, a visual interface outperforms a conversational one, because it enables comparison and analysis:

Wireframe of a banking site interface: accounts selection, three options.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).

Once the process begins, a wizard interface proves to be the most efficient, particularly for collecting simple inputs:

Wireframe of a banking site interface: account configuration, personal data collection.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).

However, when things become more complex, providing “humanised” assistance can greatly enhance the ease of use:

Wireframe of a banking site interface: account configuration, documents collection.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).

When it’s time to monitor our finances, a visual graph presented through a visual interface provides the perfect representation, again allowing fast comparison and analysis:

Wireframe of a banking site interface: account page showing spending trends.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).

But when we have specific questions, nothing beats an LLM-based AI presented as a conversational interface:

Wireframe of a banking site interface: account page showing spending trends and a conversational interface on the left side.
Original Wireframe created with the software Figma and the library Wireframe Kit by André Moura. It can be utilised without authorisation, provided proper attribution is given to the source (this article and / or my name and link to my Linkedin profile).

Conclusions

It is still early to fully understand the potential of ChatGPT and LLMs, and it is crucial to experiment with and explore these technologies. Hopefully, this brief essay can be useful for UX and product professionals in determining when and where to effectively utilise these technologies for user interface design, in order to provide users with real value. It should help them understand how to combine these technologies with traditional visual interfaces and determine which approach works best, based on the type of task and the required inputs and generated outputs.

To quote J. Nielsen once again, “Future AI systems will likely have a hybrid user interface that combines elements of both intent-based and command-based interfaces while still retaining many GUI elements.”

The greatest UX challenge now lies in understanding how to properly design these hybrid experiences.
