Qualitative smoke testing — a theoretical product research method

How to use a QA testing method to enhance the generative phase of product research.

Meghan Skapyak
UX Collective

--


Sometimes digital experiences are so large and issue-ridden that you don’t even know where to start improving them. There are dozens of features and interactions you could possibly check, and you don’t want to go about usability testing without any semblance of organization.

I’ve been experimenting with combining a quality assurance testing method, smoke testing, with qualitative surveys to alleviate this issue and better orient your team before moving on to further stages of product research.

The qualitative smoke test is designed to thoroughly search an existing digital experience for pain points by methodically checking each of its functions, so you can determine where best to focus your research efforts on large projects.

What is smoke testing?

Smoke testing is a software acceptance testing process in which you test the most crucial aspects of a program for any “showstoppers” (i.e. issues so bad that an aspect of the program no longer functions and is unusable) in a rapid and organized format.

In my experience smoke tests are usually laid out in a spreadsheet, in which each major functionality is listed out and has a couple of cells laying out the steps needed to perform it. Testers will complete each step, reporting a “pass” or “fail” depending on whether that step causes an issue.

This organized format allows testers to fully verify the functionality of a build very quickly. The focus on two actions (completing the step, then writing the short and sweet result, with a comment only if the result requires it) breaks testing huge systems up into tiny bits that teams can divide and conquer effectively.
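As a rough sketch of that structure, a traditional smoke test row boils down to an action, an expected result, and a pass/fail outcome. In Python that might look like the following (all names here are illustrative, not from any real template):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SmokeStep:
    action: str                    # what the tester does
    expected: str                  # what should happen
    passed: Optional[bool] = None  # filled in during the test run

@dataclass
class TestCase:
    name: str
    steps: list = field(default_factory=list)

    def showstoppers(self):
        # A traditional smoke test only cares about hard failures.
        return [s for s in self.steps if s.passed is False]

login = TestCase("Log in", [
    SmokeStep("Press 'Log in'", "Login form opens", passed=True),
    SmokeStep("Submit valid credentials", "Dashboard loads", passed=False),
])
print([s.action for s in login.showstoppers()])  # ['Submit valid credentials']
```

Each row is either fine or broken; there is no middle ground, which is exactly what the qualitative variant below changes.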

For more detailed explanations on smoke testing and why it’s important to do in the software development process, check out this article by TribalScale Inc.

I should also note that this theoretical method lies somewhere between smoke testing and sanity testing. Sanity testing is a form of regression testing done on a software build after bug fixes have occurred to verify those fixes, and it's structured pretty similarly to a smoke test. Here's a more in-depth look at sanity tests by ImpactQA.

This theoretical research method isn't meant to regress and verify previously uncovered usability issues, though; it's meant to identify pain points as part of the design acceptance process. Because of this, I'm sticking with the smoke testing name.

How can this apply to usability testing?

You want to make the most of the time you spend usability testing with your research participants, focusing your efforts on the interactions that might be hurting your system the most. As stated earlier, though, determining this in a huge system with tons of pain points (some of which might still be hidden from your team) is difficult.

I was inspired to experiment with combining smoke tests with my product research when I first started on a redesign of my college’s student engagement mobile app and my team was in the generative stage of research. We didn’t know what we didn’t know, so we certainly had no idea what we should have been asking our research participants to do during usability testing.

It was then that I remembered my days working in quality assurance testing, and just how quick and effective smoke testing was at finding the major issues in the games I worked on. I figured smoke testing had great potential in product research if it were made qualitative, so I made my first qualitative smoke test spreadsheet.

We completed the smoke test within an hour, and it helped us determine where the most pain points were clustered in the functionality of the application. We found some bugs, but mostly our focus was on how each function and task associated with it felt to us.

This really helped align our team on which functions and tasks we should focus on during usability testing, streamlining what could have been a chaotic, disorganized process of guessing at our major pain points into something much more methodical.

The qualitative smoke test method is meant for these cases, in which the sheer scale of a digital product and the issues within it are a barrier in themselves. I want to make sure that teams are able to effectively align themselves before interacting with research participants, and this is one way to go about it.

Method inspiration

[Image: two example survey questions side by side. Ordinal Likert scale: “How much do you agree or disagree with the following statement? ‘I found the website easy to navigate.’” with the options Strongly agree, Somewhat agree, Neither agree nor disagree, Somewhat disagree, and Strongly disagree. Non-Likert ordinal scale: “How would you rate your experience navigating the website?” with the options Excellent, Good, Neutral, Not that good, and Terrible.]
Examples of an ordinal Likert scale and a non-Likert ordinal scale (source).

Apart from the level of detail and rapid testing process associated with smoke tests, I took inspiration from user surveys in order to translate this binary testing method into something qualitative.

I decided that a survey rating scale would be the best qualitative indicator that would be easy to judge and write out throughout testing, which gave me the choice between ordinal Likert scales and non-Likert ordinal scales. Here’s a review on Likert scales by Lauren Heartsill Dowdle in case you need it, as well as an article that covers ordinal data by Raghunath D.

I went with a non-Likert ordinal scale with the values of Excellent, Good, Neutral, Not that good, and Terrible since they best fit the ease of judgement and communication requirements stated above.

Using these values instead of a “pass” or “fail” for each stage of an interaction still lets testers indicate where issues are, while making the process qualitative and introducing a sense of priority.
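One way to see how the ordinal scale introduces priority is to map the five values onto scores so ratings can be aggregated and ranked. This is a sketch under my own assumptions; the numeric values below are illustrative, not part of the original method:

```python
# Hypothetical mapping from the non-Likert ordinal scale to scores.
# Negative totals surface the functions causing the most pain.
RATING_SCORES = {
    "Excellent": 2,
    "Good": 1,
    "Neutral": 0,
    "Not that good": -1,
    "Terrible": -2,
}

def pain_score(ratings):
    """Sum the scores for a function's steps; lower means more painful."""
    return sum(RATING_SCORES[r] for r in ratings)

print(pain_score(["Good", "Terrible", "Not that good"]))  # -2
print(pain_score(["Excellent", "Good", "Good"]))          # 4
```

A binary pass/fail scale can only count failures; an ordinal scale like this lets two "failing" functions be ranked against each other.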

How to use this method — setup stage

[Image: the Google Sheets qualitative smoke test spreadsheet, empty except for the column titles and a few placeholders.]
The blank version of my qualitative smoke test template (source).

For ease-of-use, I’ve included a link to my qualitative smoke test template here as well as at the bottom of this article. Feel free to copy, edit, and use it for your projects!

The setup phase involves going through your product, identifying its major pages (setting up a tab for each), identifying the major functionalities therein, and listing the steps needed to complete them.

The functionalities go into the “Test Case” column of the spreadsheet, the steps and descriptions go into the columns after that, and you can fill in a short description of what is supposed to happen when you complete that step in the “Expected Result” column.

For example, if a homepage major functionality is contacting a business owner using a button in their navigation bar, steps for that might be “Press the ‘Contact Me’ button”, “Enter information in appropriate fields”, and “Press send” with their expected results being “Contact Me page opens”, “User can enter their information in the text boxes”, and “Contact Me page closes/message sends to business owner” respectively.
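The “Contact Me” example above could be written out as spreadsheet rows like this (a minimal sketch; the column names mirror the ones described above, and the CSV output simply stands in for the Google Sheet):

```python
import csv, io

# The 'Contact Me' functionality broken into steps, one row per step.
rows = [
    {"Test Case": "Contact business owner",
     "Step": "Press the 'Contact Me' button",
     "Expected Result": "Contact Me page opens"},
    {"Test Case": "Contact business owner",
     "Step": "Enter information in appropriate fields",
     "Expected Result": "User can enter their information in the text boxes"},
    {"Test Case": "Contact business owner",
     "Step": "Press send",
     "Expected Result": "Contact Me page closes/message sends to business owner"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["Test Case", "Step", "Expected Result"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The point of the setup stage is exactly this decomposition: every major functionality becomes a short, concrete sequence of steps that anyone on the team can execute.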

How to use this method — implementation

[Image: the qualitative smoke test template filled out with data about an application page’s basic functionality, joining-groups function, tab navigation, and group page functionality.]
A filled-out example of my qualitative smoke test template (source).

Once you have the spreadsheet set up, divide pages and functionalities however you deem appropriate within your team. Have your team test through each functionality and step as written.

For each step, have them rate the actual result of completion in the “Rating” column using the values Excellent, Good, Neutral, Not that good, and Terrible, depending on their experience with it.

In my template, conditional formatting is set up so that rating cells will turn dark green for Excellent, light green for Good, yellow for Neutral, light red for Not that good, and dark red for Terrible for ease of reference and analysis.

Testers can also record what the actual result was (this matters most when a step gets a Not that good or Terrible rating; my team just used a left angle bracket when the result matched the expected one), and leave any comments they feel are necessary for that step.
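If you wanted to mirror the template's rating step in code, it boils down to validating the entry against the five-value scale and applying the color scheme. A small sketch (the hex values are my own stand-ins, not pulled from the actual sheet's conditional formatting):

```python
# Assumed colors matching the template's conditional formatting rules.
RATING_COLORS = {
    "Excellent": "#38761d",      # dark green
    "Good": "#93c47d",           # light green
    "Neutral": "#ffd966",        # yellow
    "Not that good": "#e06666",  # light red
    "Terrible": "#990000",       # dark red
}

def cell_color(rating):
    # Reject anything outside the five-value scale before coloring the cell,
    # keeping every tester on the same vocabulary.
    if rating not in RATING_COLORS:
        raise ValueError(f"Unknown rating: {rating!r}")
    return RATING_COLORS[rating]

print(cell_color("Neutral"))  # #ffd966
```

Restricting entries to a fixed vocabulary is what makes the sheet scannable later: every cell is one of five colors, nothing in between.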

Analysis and interpretation

[Image: a cropped view of the template showing the “Test Case” through “Rating” columns, with most text blurred out. The function with the most negative ratings is boxed and labeled “Pain point”; another function with all “Good” ratings is boxed and labeled “Works okay”.]
A quick reference on how to identify pain points by looking at the “Rating” column (source).

Once everything has been tested and rated, you can go through your pages to determine which functionalities are causing the most pain points. Functions with multiple steps rated Not that good or Terrible have clearly caused some issues in your testers’ experience, while ones with all Good and Excellent ratings are (thus far) non-issues.

Note down each function that stands out as a source of pain points and issues for later use. For these functions, the “Actual Results” and “Comments” columns give you interactions to pay attention to during usability testing, serve as an easy-to-reference directory of what the root cause of some pain points might be, and can be great inspiration for brainstorming solutions.
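The analysis step is essentially a group-and-count over the “Rating” column. A minimal sketch, assuming the ratings have been pulled out of the sheet as (function, rating) pairs and that “multiple” means two or more negative steps (both assumptions are mine, and the data below is made up for illustration):

```python
from collections import defaultdict

NEGATIVE = {"Not that good", "Terrible"}

# Hypothetical (function, rating) pairs read from the spreadsheet.
step_ratings = [
    ("Join a group", "Terrible"),
    ("Join a group", "Not that good"),
    ("Join a group", "Neutral"),
    ("Tab navigation", "Good"),
    ("Tab navigation", "Excellent"),
]

negatives = defaultdict(int)
for function, rating in step_ratings:
    if rating in NEGATIVE:
        negatives[function] += 1

# Functions with multiple negatively rated steps are flagged as pain points.
pain_points = [f for f, n in negatives.items() if n >= 2]
print(pain_points)  # ['Join a group']
```

The flagged functions are the ones worth carrying forward into usability testing; everything else can wait.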

Looking through your qualitative smoke test to determine how each of your project’s functions have been affecting your team’s interactions with it gives you a good jumping off point for determining where you want to focus your usability testing and other user research efforts, aligning your team before you enter this stage of design.

Strengths, limitations, and further steps

The qualitative smoke test is a fast, comprehensive, and organized way to make sense of a large digital product with the potential for multiple pain points in its use. It works well in the generative and alignment stage of product research, helping your team discover your internal goals and how to proceed with further research.

Its strengths lie in just how comprehensive it is: checking everything minimizes the effect of any bias going in, you might find issues you wouldn’t have without this level of detail and organization, and you get data involving each step of major functionalities to more precisely determine where pain points are occurring.

While I believe the qualitative smoke test method is a great way to align your team before moving on to testing and research with users, it isn’t perfect. Due to its rapid testing nature, there isn’t much time to include tester feedback on each step unless it’s super important. Because of this, it’s not well suited to use with actual product users, whose time is better spent elaborating on their opinions during usability testing and interviews. There is also the risk of some subjectivity from testers within your research group, but that risk can be minimized by creating some overlap in which functions each tester goes through.

My further plans for improvement on this research method include creating a more comprehensive guide for its use, experimenting further with its use in product design teams, and determining additional areas for improvement. I also want to find a way to make this method more suitable for use with people outside of the product design team, which will include testing this method with users and asking for their feedback on improving the process.

Template Link


Here’s a link to my qualitative smoke testing template if you’d like to try it out! Let me know what you think, what some of the use cases you can think of for this method are, and how you think I could improve this method for further use!

👋 If you have any thoughts on this subject or think I missed anything, let me know by responding to this post or sending me a message on LinkedIn! I’m more than happy to discuss anything related to UX, user research, and child-computer interaction.

--

I’m an interaction design student, user research enthusiast, and former functional QA video game tester. — https://www.meghancskapyak.com/