Discover more from Depth by Drill Bit Labs
Selecting the right tasks for UX benchmarking
Practical tips for choosing top tasks that align with business goals over the long term
You can go a long way with the classic “discount” UX methods like in-depth user interviews, usability testing, and well-designed surveys. These perennial approaches are popular for good reason—they deliver valuable insights without requiring a lot of resources.
But when it comes to showing the long-term impact of UX efforts, teams are often asked to provide more concrete evidence. While some argue that other departments don’t face the same pressure to prove their value, we often say our contributions are important for business outcomes—and it’s only fair to back those claims up.
This is where UX benchmarking comes in. As a quantitative method, benchmarking uses larger sample sizes to provide more precise estimates of key metrics. It also follows a more tightly defined protocol, which makes findings more reliable to compare from one round to the next than those generated by qualitative methods alone. And when done properly, benchmarking can help show the return on investment (ROI) of your team’s UX efforts over time.
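To make the link between sample size and precision concrete, here is a minimal sketch using the Wilson score interval, one common way to put a confidence interval around a task success rate. The article does not prescribe a particular interval; the function name and the sample figures (8 of 10 versus 120 of 150 participants succeeding) are illustrative assumptions.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Return the (low, high) Wilson score interval for a success rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Same observed success rate (80%), two hypothetical sample sizes.
small_low, small_high = wilson_interval(8, 10)
large_low, large_high = wilson_interval(120, 150)

print(f"n=10:  {small_low:.2f} to {small_high:.2f}")
print(f"n=150: {large_low:.2f} to {large_high:.2f}")
```

Both intervals sit around the same 80% success rate, but the one from the larger sample is much narrower, which is what makes round-over-round comparisons meaningful.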
But challenges can arise in the first few steps—take, for example, selecting the right tasks. What should you measure? How many can you include? In this article, we’ll guide you through the considerations that will help you craft effective UX benchmarks. We’ll cover how to identify tasks that align with your business objectives, measure up against the competition, ensure feasibility and repeatability, and create a seamless participant experience.
List the tasks that matter to your users
A task is one action or set of actions that a user must take to achieve a specific outcome. For example, for a fitness app, a key task could be logging a workout or setting a fitness goal. For an e-commerce site, it might be applying a discount code at checkout.
Before you can identify your top tasks, you need a comprehensive list of what users are doing, or could be doing, with your product. Established UX teams often already have an understanding of this. But if you’re starting from scratch or just beginning to chart a poorly defined problem space, there are several ways to begin gathering this information:
Explore your app, website, or sitemap to identify key actions users can take.
Investigate competitor products to see what users might accomplish there.
Review customer support tickets, app store reviews, and feedback from social media.
Analyze web analytics data and common search queries.
You’ll likely compile a long list of potential tasks, so narrowing it down is key. Unlike traditional usability testing, where participants often dedicate an hour, online UX benchmarking requires brevity. Best practices suggest you can expect around 10 to 20 minutes of engagement from remote participants, many of whom are fitting your study into busy schedules. This translates into roughly 4 to 8 tasks, if each takes about 2 or 3 minutes to complete.
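The arithmetic behind that estimate can be sketched as a small helper. This is only an illustration: the function name and the optional overhead_minutes parameter (time spent on things like instructions or closing questions) are assumptions, not something the article specifies.

```python
import math


def task_budget(session_minutes: float,
                minutes_per_task: float,
                overhead_minutes: float = 0.0) -> int:
    """Return how many whole tasks fit in a session.

    overhead_minutes is a hypothetical allowance for non-task time
    (instructions, screeners, post-study questions).
    """
    if minutes_per_task <= 0:
        raise ValueError("minutes_per_task must be positive")
    usable = max(session_minutes - overhead_minutes, 0.0)
    return math.floor(usable / minutes_per_task)


# With tasks averaging 2.5 minutes, a 10-minute session holds 4 tasks
# and a 20-minute session holds 8.
for session in (10, 20):
    print(session, "minutes ->", task_budget(session, 2.5), "tasks")
```

Reserving even a few minutes of overhead shrinks the budget quickly, which is one more reason to keep the shortlist tight.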
To narrow down your task list, focus on two main factors: frequency and importance. What do users do most often, and what matters most to them? Keep in mind that what’s important to your users may not always align with your business priorities—that’s a separate consideration we’ll address in the next step.
If you have existing product analytics, use that data to identify frequently performed tasks. If not, consider collecting user data through methods like top-tasks surveys, stack ranking, or pairwise comparisons. These techniques allow users to weigh in on the relative importance and frequency of the actions they take with your product, giving you a clear picture of where to focus.
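As one way to turn pairwise comparison responses into a ranking, the sketch below scores each task by the share of comparisons it won. This is a deliberately simple scoring rule chosen for illustration, not a method the article prescribes, and the response data (including the "Share progress" task) is made up.

```python
from collections import Counter


def rank_by_pairwise_wins(comparisons):
    """Rank tasks by the share of pairwise comparisons they won.

    comparisons: iterable of (winner, loser) task names, one entry per
    participant choice. Ties in share are broken alphabetically.
    """
    wins = Counter()
    appearances = Counter()
    for winner, loser in comparisons:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    share = {task: wins[task] / appearances[task] for task in appearances}
    return sorted(share.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical responses for a fitness app.
responses = [
    ("Log a workout", "Set a fitness goal"),
    ("Log a workout", "Share progress"),
    ("Set a fitness goal", "Share progress"),
    ("Log a workout", "Set a fitness goal"),
    ("Share progress", "Set a fitness goal"),
]

ranking = rank_by_pairwise_wins(responses)
for task, win_share in ranking:
    print(f"{task}: won {win_share:.0%} of its comparisons")
```

With real data you would want every pair shown a similar number of times, so that no task's share rests on a handful of matchups.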
Select tasks that matter to the business
As UX teams, we represent the voice of our users, particularly in the ways their needs and goals overlap with our organizational goals. So it’s not enough to choose tasks that users perform frequently—the tasks should also align with what drives business success.
After narrowing your list of frequent and important tasks, the next step is to assess their value to the business. In broad terms, these are the tasks that ultimately result in either generated revenue or reduced costs. A few examples that lead to a sale might include:
Locate pricing comparison page
Complete a purchase
Reach out for a quote
And here are a few others that reduce burden on customer support channels:
Find FAQs and common problems
Locate community support forum
Start a conversation with a chatbot
To identify which tasks are most closely tied to business objectives, start by listing the company’s key goals and key performance indicators (KPIs). These are often shared during all-hands meetings and goal-setting sessions, or published in investor relations materials such as annual reports.
Next, for each of the tasks in your shortlist, determine which KPI it relates to, if any, and how strongly it influences that KPI. This simple method will help you prioritize the tasks that matter most to both users and the company.
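The mapping described above can be kept in a spreadsheet, but a short sketch shows the idea. Everything here is an assumption made for illustration: the 1 to 5 user priority scale, the 0 to 3 KPI influence scale, the multiplication used to combine them, and the "Update profile photo" task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    user_priority: int    # 1 (low) to 5 (high), from the user-side ranking
    kpi: Optional[str]    # KPI the task relates to, or None if there is none
    kpi_influence: int    # 0 = none, 1 = weak, 2 = moderate, 3 = strong


def prioritize(tasks):
    """Sort tasks so that those mattering to both users and the business come first."""
    return sorted(
        tasks,
        key=lambda t: (t.user_priority * t.kpi_influence, t.user_priority),
        reverse=True,
    )


shortlist = [
    Task("Find FAQs and common problems", 4, "Support ticket volume", 2),
    Task("Update profile photo", 3, None, 0),
    Task("Complete a purchase", 5, "Revenue", 3),
]

for task in prioritize(shortlist):
    print(task.name, "->", task.kpi or "no KPI")
```

Multiplying the two scores pushes tasks with no KPI link to the bottom regardless of how often users perform them. If that feels too harsh for your product, a weighted sum is a gentler alternative.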
Consider competitive tasks
Your product isn’t the only one your users will interact with. Competitive benchmarking places your data in the broader context of your industry, helping you see how well your experience holds up against both direct and indirect competitors.
Your competitors likely cover much of the same core functionality as your product, but they may also offer services or features you haven’t considered. Every competitor has unique value propositions, whether competing on price, features, or specific customer segments.
Most teams will begin with a baseline benchmark focused solely on their own product. Competitive benchmarking adds complexity, but it’s worth at least thinking about how to build flexibility into your study design for future iterations. Even if you’re not ready to implement competitive comparisons right away, having a plan will prepare you to answer the inevitable question from stakeholders: “How do we compare to the rest of the industry?”
While industry benchmarks are sometimes available from third-party services, they often lack the specificity needed to inform your product decisions. The most valuable data comes from directly benchmarking your competitors, using tasks and metrics tailored to your product’s unique goals. This approach will produce far more actionable findings than generic industry data.
Competitive benchmarking reveals where your competitors may excel and where you might have an advantage. More importantly, it reveals opportunities to improve the user experience that might otherwise be missed. For example, if a competitor incentivizes automatic payments with a monthly discount, that could inspire a quick solution to boost adoption of your own payment services.
By collecting this data, you empower your team to make informed, strategic decisions that improve both your product and your competitive edge.
Present tasks clearly to participants
Once you’ve prioritized the tasks, the next step is structuring their presentation in the study. This includes drafting clear task scenarios, determining start and end points, and defining success criteria.
In unmoderated sessions, participants are on their own: there’s no facilitator on hand to clarify instructions or troubleshoot technical issues. For this reason, stress-testing your tasks is essential. Every word matters, and instructions should be vetted by pilot testers who are unfamiliar with the study’s goals (colleagues will do, if necessary). This helps ensure the tasks are clear for all participants, even those with no prior knowledge of the product or task.
For comparison’s sake, it’s important that all participants begin and end in similar places. For example, you might direct participants to the homepage of your website (the task start) and ask them to find specific information that resides on only one page (the task end). Participants would then need to navigate the site to the task end to locate the required information.
You can then verify success in a few ways. One option is to use your testing platform to confirm that participants reached the correct URL. Alternatively, if your platform doesn’t support tracking (or the task isn’t tied to a unique URL), you can ask participants to identify the information they found in a multiple-choice question with distractors. This allows them to recognize the correct answer rather than having to write it down or rely on memory.
Whether through URL tracking or knowledge-based questions, validating success ensures an apples-to-apples comparison across participants, regardless of how they accomplish the task.
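Both validation routes can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, example.com is a placeholder domain, and a real testing platform will usually handle URL matching through its own settings rather than custom code.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> tuple:
    """Reduce a URL to (host, path), ignoring scheme, query string, and trailing slash."""
    parts = urlsplit(url.strip())
    return parts.netloc.lower(), parts.path.rstrip("/") or "/"


def reached_target(final_url: str, target_url: str) -> bool:
    """URL-based check: did the participant end on the task's end page?"""
    return normalize(final_url) == normalize(target_url)


def answered_correctly(choice: str, correct_choice: str) -> bool:
    """Knowledge-based check: did the participant pick the correct option?"""
    return choice.strip().casefold() == correct_choice.strip().casefold()


def success_rate(outcomes) -> float:
    """Share of participants who succeeded; 0.0 when there are no outcomes."""
    outcomes = list(outcomes)
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Hypothetical final URLs from three participants.
target = "https://example.com/pricing"
final_urls = [
    "https://example.com/pricing/?utm_source=study",
    "https://example.com/",
    "https://Example.com/pricing",
]
outcomes = [reached_target(url, target) for url in final_urls]
print(outcomes, success_rate(outcomes))
```

Whichever check you use, apply the same rule to every participant so that the resulting success rate stays comparable across rounds.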
Make sure the tasks are feasible
At this point in the process, confirm that your tasks and study design are actually possible within the constraints of your chosen study platform or tool.
For example, not all platforms allow webcam or microphone recordings for thinking aloud during the task. Even among those that do, some impose participant limits or additional fees. Being aware of these limitations early will save time and prevent headaches later on in the process.
Complex experimental designs—such as assigning participants to different conditions or segmenting users based on specific criteria—are often important for the integrity of the study. While some platforms support these designs seamlessly, others may require clunky workarounds, such as creating multiple studies. You’ll need to carefully plan for any platform limitations to avoid introducing sample bias or reducing the accuracy of your results.
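If your platform cannot assign conditions for you, one workaround is to assign participants yourself before sending out study links. The sketch below shows a balanced random assignment; the function name, participant IDs, and condition labels are hypothetical.

```python
import random
from collections import Counter


def assign_conditions(participant_ids, conditions, seed=None):
    """Randomly assign participants to conditions in near-equal group sizes.

    Shuffling first and then dealing conditions out in turn keeps the
    groups balanced while leaving the assignment itself random.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ids)}


participants = [f"P{n:02d}" for n in range(1, 11)]
assignment = assign_conditions(participants, ["design_a", "design_b"], seed=42)
print(Counter(assignment.values()))
```

Fixing the seed makes the assignment reproducible, which helps when you need to document how groups were formed.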
Once your study is built, thoroughly test it to ensure that all logic, conditions, and task flows are working properly. For example, one common use of study logic is to ask an open-ended follow-up question only when a participant rates a task’s difficulty below a certain threshold (such as less than 6 on a 7-point scale), a step intended to improve data quality. If you’ve built in conditions such as these, you’ll want to try each possibility and confirm the platform is recording data as intended.
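Writing the rule down explicitly makes it easier to list every branch you need to click through on the platform. The sketch below mirrors the example threshold above; the function and constant names are assumptions, and your platform will express the same rule in its own logic builder.

```python
FOLLOW_UP_THRESHOLD = 6  # ratings below this trigger the follow-up question
SCALE_MAX = 7            # 7-point rating scale


def needs_follow_up(rating: int,
                    threshold: int = FOLLOW_UP_THRESHOLD,
                    scale_max: int = SCALE_MAX) -> bool:
    """Return True when the open-ended follow-up should be shown."""
    if not 1 <= rating <= scale_max:
        raise ValueError(f"rating must be between 1 and {scale_max}")
    return rating < threshold


# Enumerate every possible rating so that no branch goes untested.
branches = {rating: needs_follow_up(rating) for rating in range(1, SCALE_MAX + 1)}
for rating, shown in branches.items():
    print(rating, "-> follow-up" if shown else "-> skip")
```

Pay particular attention to the boundary: a rating of 5 should show the follow-up and a rating of 6 should not.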
By focusing on feasibility from the outset, you create a smoother, more reliable benchmarking study.
Select today’s tasks with tomorrow’s in mind
When designing your first benchmark, think beyond the immediate study. Consider the second and third benchmarks—which could take place several years from now, depending on your chosen cadence. Selecting tasks with repeatability in mind ensures that your future benchmarks will remain relevant and offer comparable data points.
Incorporating tasks for features that are still in development or slated for redesigns is another way to future-proof your benchmarks. These tasks often support broader strategic initiatives for the product’s long-term success. Examples of forward-thinking tasks might include:
A wireless provider expanding into home internet services might test whether customers can find a waitlist to sign up for the offering. This task could become subscription-focused in future efforts.
A financial services firm adding comparison tools for retail investors could benchmark how users currently research securities on (or even outside of) their platform. This task could examine the new research tools in later versions.
With an eye to the future from day one, you’ll make sure your benchmark results have a long shelf life and remain valuable points of comparison in future iterations.
The bottom line
Benchmarking is a powerful tool to demonstrate the long-term impact of your UX efforts, but it requires careful planning and thoughtful execution, beginning with task selection. Here are some of the key considerations to keep in mind:
Identify top tasks: Start by understanding what your users are trying to accomplish and narrow your list based on frequency and importance.
Align with business objectives: Prioritize tasks that directly contribute to revenue generation or cost reduction.
Consider the competitive context: Keep in mind that your product is judged against competing alternatives, shaping user expectations and preferences.
Focus on the participant experience: Ensure tasks are clear, easy to understand, and allow for consistent comparisons in an unmoderated environment.
Stay focused on feasibility: Make sure your tasks and study design are compatible with the testing tools you’re using.
Build for repeatability: Create tasks that can be repeated across future benchmarks to track long-term progress.
Benchmarking not only helps show the value of your UX work but also gives immediate insights for improving your product and staying competitive in the market. By aligning tasks with both user needs and business objectives, and ensuring repeatability, you can see how your team’s contributions improve the experience over time.
Drill deeper
These are just a few key factors when designing your first UX benchmark—but as always, your specific context might have additional challenges. If you’re looking for a partner to help overcome those obstacles or ensure you’re measuring the right things, Drill Bit Labs offers the expertise you need.
Whether you need guidance on specific tasks or broader strategic support, we’re here to help you achieve your goals. How we help: user research projects to inform confident design decisions and optimize digital experiences; live training courses that teach teams user research skills, including a deep dive on UX benchmarking; and advisory services to improve UX processes and strategy.
Connect with us to discuss your upcoming projects or ongoing UX needs.