Usability Testing

Usability testing involves observing users while they perform tasks with a hardware or software system.

The product may be a paper sketch, a wireframe, a storyboard, a display mock-up, a product in development, a working prototype, or a completed product. Usability testing can also be conducted on competitive products to understand their strengths and weaknesses.

A usability test can be a formative evaluation, conducted early in the design process to find problems and improve the product, or a summative evaluation, conducted to validate the design against specific goals.

Testing involves recruiting targeted users as test participants and asking those users to complete a set of tasks. A test facilitator conducts the sessions following a test protocol, and the sessions are typically recorded by a video operator, an automated testing tool, or both.

Usability testing should be conducted with participants who are representative of the real or potential users of the system. For some tests, users must have certain domain, product and application-specific knowledge and experience.

Usability testing consists of five primary phases:

  • Planning
  • Pretest or pilot
  • Test sessions
  • Post-test or debrief
  • Analysis, interpretation, and presentation of the results.

These phases are described in the How To section below.

Related Links

Formal Publications

  • Dumas, J. & Redish, J. (1999). A practical guide to usability testing (Revised edition). Exeter, UK: Intellect. This book provides excellent practical advice for groups who want to initiate usability testing. Both new and experienced usability professionals will find this book useful. Topics include planning and preparing for usability tests, analyzing data, and communicating the results.
  • Nielsen, J. (1994). Usability engineering. San Francisco, CA: Morgan Kaufmann. Nielsen's Usability Engineering is highly recommended as a solid introduction to the design of usable products. The book details how usability issues must be considered throughout the development process and provides techniques for gathering usability data. There is excellent information on low-cost usability testing techniques.
  • Rubin, J., & Chisnell, D. (2008). Handbook of usability testing (Second Edition): How to plan, design, and conduct effective tests. New York: Wiley. This handbook is a step-by-step guide to effective usability testing. Rubin and Chisnell provide many tips that will benefit both the new and the experienced usability practitioner.

Nielsen, Why You Only Need to Test With 5 Users

Perfetti & Landesman, Eight is Not Enough

Clustering for Usability Participant Selection (Journal of Usability Studies, Volume 3, Issue 1, November 2007, pp. 41-53). Conducting user studies can be expensive when several experiment participants must be selected. Collecting participant demographics and other information as input to a clustering algorithm provides a mechanism for selecting ideal participants.

Published Studies

Unexpected Complexity in a Traditional Usability Study (Journal of Usability Studies, Volume 3, Issue 4, August 2008, pp. 189-205).

The study recommends that usability specialists expand the definition of traditional usability measures to include external assessment, by content experts, of the completeness and correctness of users' performance. The study also found that it is strategically important for new clients to understand the upper end of complexity in their products, because doing so creates new space for product innovation.

Other materials available from UPA

Coyne, Kara Pernice. Testing More Than ALT Text: Techniques for Testing Usability and Accessibility. UPA 2002 Conference.

Faulkner, Laura Lynn. Reducing Variability: Research into Structured Approaches to Usability Testing and Evaluation. UPA 2002 Conference.

Gorlenko, Lada. Long-term (longitudinal) research and usability. UPA 2005 Conference.

Harris. Effects of Social Facilitation & Electronic Monitoring on Usability Testing. UPA 2005 Conference.

Jarrett, Caroline; Quesenbery, Whitney. How to look at a form - in a hurry. UPA 2006 Conference.

Laberge, Jason; Hajdukiewicz, John; Ming Kow, Yong. An integrated approach to competitive usability assessments. UPA 2005 Conference.

Lee, Dong-Seok; Woods, David D.; Kidwell, Daniel. Escaping the design traps of creeping featurism: Introducing a fitness management strategy. UPA 2006 Conference.

Mitchell, Peter P. An Inventory and Critique of Online Usability Testing Packages. UPA 2002 Conference.

O'Brien, Eileen; Roessler, Peter. A Tale of Two Customers. UPA 2009 Conference.

Robillard-Bastien, Alain. Reporting usability studies: are we using optimal reporting formats? UPA 2002 Conference.

Sova, Deborah Hinderer; Kantner, Laurie E. Technology and Techniques for Conducting Instant-Messaging Studies. UPA 2005 Conference.

Sundy, Mary G.; Wayman, Lauren M.; Schwendeman, Charlotte. Two Birds With One Document: Delivering Results With Maximum Impact. UPA 2002 Conference.

Tedesco, Donna; McNulty, Michelle; Tullis, Tom. Usability Testing with Older Adults. UPA 2005 Conference.

Tullis, Tom; Fleischman, Stan; McNulty, Michelle; Cianchette, Carrie; Bergel, Marguerite. An Empirical Comparison of Lab and Remote Usability Testing of Web Sites. UPA 2002 Conference.

Turner, Carl W.; Nielsen, Jakob; Lewis, James R. Current Issues in the Determination of Usability Test Sample Size: How Many Users is Enough? UPA 2002 Conference.

Wei, Carolyn; Barrick, Jennifer; Cuddihy, Elisabeth; Spyridakis, Jan. Conducting Usability Research through the Internet: Testing Users via the WWW. UPA 2005 Conference.

Cost-Effectiveness

Bias, R. G., & Mayhew, D. J. (2005). Cost-justifying usability (Second edition): An update for the Internet age. San Francisco, CA: Morgan Kaufmann.

How to do it

Carlson, Jennifer Lee; Braun, Kelly; Kantner, Laurie. When Product Teams Observe Field Research Sessions: Benefits and Lessons Learned. UPA 2009 Conference.

Gellner, Michael; Oertel, Karina. Efficiency of Recording Methods for Usability Testing. UPA 2001 Conference.

Hawley, Michael. Evaluating and Improving Your Research Interview Technique. UPA 2009 Conference.

Kaniasty, Eva. Web Analytics and Usability Testing: Triangulate your way to better design recommendations. UPA 2009 Conference.

Patel, Mona; Paulsen, Christine Andrews. Strategies for Recruiting Children for Usability Tests. UPA 2002 Conference.

Schall, Andrew. Eyeing Successful User Experience: Measuring Success & Failure Through the Use of Eye-Tracking. UPA 2009 Conference.

Summers, Michael; Johnston, Gavin; Capria, Frank. Don't Tell Me, Show Me: How Video Transforms the Way We Analyze and Report Data From Field Studies. UPA 2001 Conference.

Travis, Kristin; Yepez, Cindy. One on One Usability Studies vs. Group Usability Studies: Is There a Difference? UPA 2002 Conference.

Wehner, Mark; Wailes, Tom. Field Day: A Simple Method for Connecting Teams to Their Users. UPA 2009 Conference.

Variations

Travis, Kristin; Yepez, Cindy. One on One Usability Studies vs. Group Usability Studies: Is There a Difference? UPA 2002 Conference.

Labs

Barlow-Busch, Robert; Day-Hamilton, Tobi. To Build or Not To Build: Considerations For Constructing a Usability Lab.

Bevis, Korin; Henke, Kristine; O'Keefe, Timothy. Thinking "Out of the Lab": Using a Customer Advisory Council to get Real-World Feedback. UPA 2009 Conference.

Lafleur, Andy. Camera-less Video: Usability Test Recording and Presentation in Direct Digital Form. UPA 2001 Conference.

Participants

Bzostek, Julie. Testing a Web Site with Elementary School Children. UPA 2001 Conference.

Conklin, Sara; Doutney, Joan. Hiding a Paradigm Shift: Maintaining task simplicity in a self-service application. UPA 2008 Conference.

Klee, Matthew. A Match Made in Heaven: Recruiting the Right Users for Usability Testing. UPA 2001 Conference.

Problems

Conklin, Sara; Doutney, Joan. Hiding a Paradigm Shift: Maintaining task simplicity in a self-service application. UPA 2008 Conference.

Karn, Keith S. Subjective ratings of usability: Reliable or ridiculous? UPA 2008 Conference.

Gregory, Keith. Seven-Layer Model of Usability Attributes. UPA 2001 Conference.

Detailed description

Benefits, Advantages and Disadvantages

Advantages

  • You can get feedback that reveals possible design flaws and other issues.
  • You can get reliable measures of usability (see summative usability testing).
  • Experienced test facilitators can elicit feedback from users to help understand why they had problems.
  • Low- and medium-fidelity prototypes are cost-effective to test.
  • It is easy to have the project manager and developers attend as observers.
  • You can produce video clips from test sessions to show problems.

Disadvantages

  • Not all problems will be found with small samples of users.
  • You may not have access to users that match the user profile.
  • Not all tasks may be "right" for all users.
  • Lab testing takes users away from their natural work environment.
  • Technical setup may be complex and require domain experts and additional time for setup and debugging.
  • An inexperienced facilitator can influence the results by using too many hints, asking biased questions, or providing nonverbal cues about the tasks.

Appropriate Uses

  • Major usability problems can be identified that may not be revealed by less formal testing, including problems related to the specific skills and expectations of the users.
  • Measures can be obtained for the users' effectiveness, efficiency, and satisfaction.

How To

Planning

  1. Write a usability test plan to define the goals, users, tasks, procedures, test setup, data collection and reporting requirements.
  2. Select the most important tasks and user groups to be tested. Tasks can be chosen based on what features are available for testing, frequency of use, criticality, and other factors.
  3. Recruit users who are representative of each user group. The number of users will depend on your goals (finding problems versus comparing performance to benchmarks), the impact of the product on users, and other factors. For formative testing, there is much debate about how many participants a usability test needs, with numbers ranging from 5 to more than 50.
  4. Produce task scenarios and input data, and write instructions for the user (tell the user what to achieve, not how to do it). You can also create basic tasks that you customize for each participant, or allow participants to select their own tasks.
  5. Plan the test sessions, allowing time for giving instructions, running the test, answering a questionnaire, and a post-test interview. Test sessions longer than about 2 hours will require dedicated participants.
  6. Invite members of the product team to observe the sessions if possible. An alternative is to videotape the sessions, and show edited clips of the usability problems to team members who could not attend the sessions.
  7. For formative evaluation, the facilitator will normally be with the user to prompt and question when necessary. For summative evaluation, the facilitator is generally observing from another room so as not to interfere with the participant's work.
  8. Prepare additional written test materials including informed consent (for participation, for recording), pre- and post-test questionnaires, and any observer data recording sheets.

Pretest or Pilot

  1. Conduct pilot tests with internal users to debug instructions and tasks, verify that the hardware and software are working, and determine if there is adequate time for the session.
  2. Resolve any technical or logistical problems with the test plan and setup. Fix any problems with written test materials.
  3. Finalize the schedule and send it to all the observers.

Running Sessions

  1. Welcome the user, have them sign the informed consent form(s) and any nondisclosure agreements (NDAs), and ask them to fill out a pretest questionnaire, which can be used to verify screening information and gather additional background information.
  2. Let the user know about any test observers.
  3. Give the task instructions and let the user know how their questions will be handled.
  4. Observe the user working through the tasks. Do not give hints or assistance unless necessary.
  5. Time each task, if this is part of the session protocol.
  6. At the end of the session, ask the user to complete a satisfaction questionnaire.
  7. Interview the user to confirm they are representative of the intended user group, to gain general opinions, and to ask about specific problems encountered, if this is part of the session protocol.
  8. Assess the results of the tasks for accuracy and completeness, if this is part of the session protocol.

Post-test or Debrief

  • Ask the user if they would like to meet the observers and ask questions.
  • After the user leaves, the test team, including observers, discusses what was observed.

Variations

Summative usability testing is used to obtain measures to establish a usability benchmark or to compare results against usability requirements.

Group Usability Testing (Journal of Usability Studies, Volume 2, Issue 3, May 2007, pp. 133-144). Several to many participants perform tasks individually but simultaneously, with one to several testers observing and interacting with them.

Remote evaluation may be set up with a portable test lab. This setup enables more users to be tested in their natural work environment. It also means testing can be done at user conferences and customer sites, as well as part of beta test programs.

Variations may also include:

  • Varying the order tasks are presented to users.
  • Testing only one user to base design decisions on (RITE method).
  • Allowing users to self-report.

Participants and Other Stakeholders

The people primarily involved in usability testing include:

  • Test facilitator, who conducts the pilot and test sessions.
  • Test participants.
  • Test observers.
  • Test monitor, who operates recording equipment and may also take notes.

Materials Needed

For some testing there will not be any technical requirements, just written test materials. In general, the materials needed to run a usability test include:

  • The system (paper sketch, model, display mockup, software, website)
  • Physical or portable test lab (camera setup, observation room)
  • Written test materials (informed consent, questionnaires, task scenarios, observation data sheets)
  • Technical setup (servers, "live" or simulated test data)
  • Connections for remote observers

Who Can Facilitate

An experienced test facilitator is someone who is:

  • Knowledgeable about the system and the tasks being tested.
  • Able to avoid giving unnecessary hints.
  • Able to develop rapport with all kinds of people.
  • Flexible and organized.
  • An active listener.

Common Problems

  • Test participants do not truly match the user profile in the test plan.
  • Insufficient number of participants to draw conclusions from.
  • Incidence of Hawthorne Effect in test participants (see below).
  • Unsure how to handle "outliers" or problems noted by only one user.
  • Hints given that interfere with users completing tasks on their own.
  • Glitches in test setup (e.g. server goes down, missing simulated data).
  • Problems with recording equipment.

Opinions vary on the number of participants that should be recruited for a usability test, from as few as 1 to as many as 15. It is better to perform multiple usability tests with fewer users each time than a single test late in the development lifecycle.
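Much of this debate traces back to the problem-discovery model behind the Nielsen article linked above: if each participant independently encounters a given problem with average probability p, then n participants are expected to uncover a 1 - (1 - p)^n share of the problems. A minimal Python sketch, assuming Nielsen's often-quoted average of p = 0.31:

    # Problem-discovery curve: expected proportion of usability problems
    # found by n participants, assuming each problem is encountered by any
    # one participant with average probability p.
    def proportion_found(n: int, p: float = 0.31) -> float:
        return 1.0 - (1.0 - p) ** n

    # p = 0.31 is Nielsen's often-quoted average; real values vary by product.
    for n in (1, 3, 5, 10, 15):
        print(f"{n:2d} participants -> {proportion_found(n):.0%} of problems")

Under these assumptions, five participants uncover about 85% of the problems, which is the basis of the "five users" guideline; products with a lower per-participant discovery rate push the required sample size up.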

The Hawthorne effect refers to a danger that participants in any human-centered study may exhibit atypically high levels of performance simply because they are aware that they are being studied.

Usability Studies and the Hawthorne Effect (Journal of Usability Studies, Volume 2, Issue 3, May 2007, pp. 145-154).

The Hawthorne effect can be (mis)used as a basis on which to criticize the validity of human-centered studies, including usability studies. It is therefore important that practitioners be able to defend themselves against such criticism. A wide variety of defenses are possible, depending on which interpretation of the Hawthorne effect is adopted. To make an informed decision about which interpretation to adopt, practitioners should be aware of the whole story regarding this effect.

Data Analysis and Reporting

  • Produce a list of usability problems, categorized by importance, and an overview of the types of problems encountered.
  • Arrange a meeting with the project manager and developer to discuss whether and how each problem can be fixed.
  • If measures have been taken, summarize the results of the satisfaction questionnaire, task time and effectiveness (accuracy and completeness) measures.

The types of objective and subjective data collected during testing may include the following (a summary sketch follows the list):

  • Ability and time to complete a task.
  • Sequence and number of steps to complete a task.
  • Types and numbers of errors.
  • Number of repeated errors.
  • Number of design issues.
  • Ratings of ease of performing a task.
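As an illustration of how such measures can be summarized per task, here is a minimal Python sketch; the record fields, task name, and summary statistics are assumptions for illustration, not a prescribed format:

    # Minimal sketch: tabulating per-task usability measures.
    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class TaskResult:
        participant: str
        task: str
        completed: bool   # effectiveness: did the participant finish?
        seconds: float    # efficiency: time on task
        errors: int       # number of errors observed

    def summarize(results: list, task: str) -> dict:
        rows = [r for r in results if r.task == task]
        done = [r for r in rows if r.completed]
        return {
            "task": task,
            "completion_rate": len(done) / len(rows),
            "mean_time_s": mean(r.seconds for r in done) if done else None,
            "total_errors": sum(r.errors for r in rows),
        }

    results = [
        TaskResult("P1", "checkout", True, 95.0, 1),
        TaskResult("P2", "checkout", False, 240.0, 4),
        TaskResult("P3", "checkout", True, 120.0, 0),
    ]
    # Completion rate 2/3, mean successful time 107.5 s, 5 errors in total.
    print(summarize(results, "checkout"))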

Categories of usability problems include:

  • Preventing users from performing tasks ("dead ends", lack of functionality).
  • Slowing users down (lack of feedback, items not in expected places, terminology not understood).
  • Increasing the user's workload (recall required from multiple screens, typing rather than selecting).
  • Inconsistencies (use of color, layout of information).
  • Insufficient error handling (hard to correct errors, missing undo function, cryptic error messages).

Severity rankings can also be assigned to each problem. These rankings can be determined by how frequently a problem occurred, the impact of the problem, and its persistence.
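One simple way to operationalize such rankings is to rate each factor on a small ordinal scale and combine the ratings. The 1-4 scales and the averaging rule in this sketch are assumptions for illustration; teams typically define their own scheme:

    # Severity ranking sketch: frequency, impact, and persistence are each
    # rated 1 (low) to 4 (high); the rounded average picks a severity label.
    def severity(frequency: int, impact: int, persistence: int) -> str:
        for factor in (frequency, impact, persistence):
            if not 1 <= factor <= 4:
                raise ValueError("each factor must be rated 1-4")
        score = round((frequency + impact + persistence) / 3)
        return {1: "cosmetic", 2: "minor", 3: "major", 4: "critical"}[score]

    # A problem most users hit (4), with high impact (4), that they learn
    # to work around on repeat encounters (persistence 2):
    print(severity(frequency=4, impact=4, persistence=2))  # -> major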

If a full report is required, the Common Industry Format provides a good structure. There is a detailed example of a usability report using the Common Industry Format.

Next Steps

  • Collect feedback from users after release to inform any redesign.
  • Determine need to test with more users.
  • Determine what design issues cut across related product lines.

Special Considerations

Costs and Scalability

People and Equipment

The costs for usability tests vary depending upon what type of prototype is being tested. For traditional lab testing, the costs include the following (a rough cost sketch follows the list):

  • The recruiting cost per participant.
  • The cost of payment or incentives for test participation.
  • Travel costs if conducting testing in multiple sites.
  • Equipment costs for a portable lab (a one-time cost).
  • The cost of any full transcription of test sessions.
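As a rough illustration of how these costs scale with the number of participants, consider the Python sketch below; every dollar figure is a hypothetical placeholder, not a real estimate:

    # Sketch of how traditional lab-testing cost scales with participants.
    def study_cost(participants: int,
                   recruit: float = 150.0,        # recruiting, per participant
                   incentive: float = 100.0,      # payment, per participant
                   transcription: float = 200.0,  # full transcript, per session
                   travel: float = 0.0,           # total travel, if multi-site
                   lab_equipment: float = 5000.0  # one-time portable-lab cost
                   ) -> float:
        per_session = recruit + incentive + transcription
        return lab_equipment + travel + participants * per_session

    # The one-time equipment cost dominates small studies and amortizes
    # across larger ones.
    for n in (5, 10, 20):
        print(n, "participants:", f"${study_cost(n):,.0f}")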

Time

  • The time involved to meet with the project team, develop the test plan, and run pilot tests varies.
  • The time to run the test sessions varies depending upon nature and scope of the test plan.

Ethical and Legal Considerations

Informed consent forms are needed for participation and recording.

Facts

Lifecycle: Evaluation
Sources and contributors: 
Cathy Herzon, Eva Kaniasty, Karen Shor, Nigel Bevan (incorporating material from UsabilityNet)
Released: 2011-05
© 2010 Usability Professionals Association