How many rating scale points should I use in UX studies?
Conflicting opinions circulate in discussions about rating scales in UX research. In particular, there are spirited takes on how many response options a rating scale should offer. That can leave practitioners confused and uncertain about which best practices to follow.
This article examines what the science says about rating scales. By the end, you’ll understand how the number of rating scale points used in a survey impacts its participant experience and the data it produces.
The debate
The central argument in this debate is that longer rating scales (i.e., those with more response options) confuse participants. What if participants can’t conceptualize the difference between an 8 and a 9 on an 11-point scale? That confusion, the argument goes, might undermine the validity of survey responses and lead to a poor participant experience. The endpoint of this line of thinking is that we should use fewer response options in our scales and keep them somewhat categorical (e.g., 3-point scales labeled yes/maybe/no or positive/neutral/negative, or even 2-point scales like yes/no or agree/disagree).
But, what do the data say about this argument? Do more scale points confuse our participants? How do shorter (i.e., 3-point) or longer (i.e., 5-point or more) scales impact the respondent experience?
How rating scale points impact the participant experience
So, what do the data say about the participant experience? Research shows that participants rate 3-point scales as quicker to answer than longer (e.g., 5-, 7-, or 11-point) scales. There are also (inconsistent) findings that participants rate 3-point scales as the easiest to use (though some studies find 5-point scales easiest).
Participants may perceive 3-point scales to be faster, but the actual difference in time is tiny. In one study, Sauro demonstrated that despite participants rating 3-point scales as faster than 11-point scales, the actual difference was just 0.3 seconds, or “about the same amount of time it takes to blink!”
Ease and speed of response are two meaningful measures of the participant experience, but what about the ability to accurately express one’s attitudes? Here, shorter rating scales stifle respondents: participants find them limiting. It turns out extra scale points aren’t confusing; longer scales are rated as easier for adequately expressing one’s feelings. Participants not only understand but prefer longer scales because the additional response options help them express where their sentiments fall on a continuum (e.g., one person might wholeheartedly agree with a statement that another only marginally agrees with).
In short, while participants might perceive shorter scales as quicker to respond to, the practical difference in timing is negligible. Further, the data show that longer scales are preferred because they allow participants to express themselves adequately.
How rating scale points impact data quality
Participants aren’t confused by longer rating scales; they find them easier for expressing their thoughts. Further, more scale points are better for data quality. A literature review of relevant research on scale points shows that increasing from 3 scale points to between 5 and 11 has many benefits, including:
Increased reliability: Larger scales perform better than scales with few response options for producing consistent results within the same test and across retests.
Increased validity: Larger scales produce results that better quantify participants’ subjective feelings than those with few response options. The larger scales help to detect the intensity of responses, which increases predictive validity.
Knowing this, you might be tempted to use 101-point scales in all your research going forward, but there are bounds to these effects. There are clear and drastic gains in reliability and validity when you move from 3-point to 5-point scales, but diminishing returns set in after you increase beyond 11 points (though some studies show benefits of expanding up to 20 points). Together, these data suggest that 5-, 7-, or 11-point scales are the sweet spot.
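To make the intensity argument concrete, here’s a toy simulation (an illustrative sketch only, not the method used in the studies cited above): it snaps simulated continuous attitudes onto scales of different lengths and checks how closely each scale’s ratings track the underlying attitude. The `quantize` and `pearson` helpers are names invented for this sketch.

```python
import random

random.seed(0)

def quantize(value, points):
    """Snap a continuous attitude in [0, 1] to the nearest of `points` scale options."""
    return round(value * (points - 1)) / (points - 1)

def pearson(xs, ys):
    """Plain Pearson correlation, no external libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Simulate participants whose "true" attitudes vary continuously.
attitudes = [random.random() for _ in range(10_000)]

# Correlation between each participant's true attitude and their rating on
# each scale length: longer scales track the underlying attitude more
# closely, with shrinking gains as the number of points grows.
results = {p: pearson(attitudes, [quantize(a, p) for a in attitudes])
           for p in (2, 3, 5, 7, 11)}
for points, r in results.items():
    print(f"{points:>2}-point scale: r = {r:.3f}")
```

Running this shows the correlation climbing as points are added, with the jump from 3 to 5 points far larger than the jump from 7 to 11, mirroring the diminishing returns described above.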
Not only are rating scales with more options better for participants, they’re also better for the research; it’s a win-win. Practitioners should therefore use at least a 5-point rating scale in their research.
The bottom line…
While some debate about scale points circulates in the UX community, the data paint a pretty convincing picture. Rating scales, and the numbers they use to convey a range of attitudes, aren’t confusing for participants. While participants feel that 3-point scales are faster, the practical difference in speed is negligible. More importantly, participants prefer longer scales because they can express their feelings adequately. UX research practitioners should take advantage of the added validity and reliability that 5-, 7-, and 11-point scales contribute to their data without worrying about any detriment to the participant experience.
There are many ways a survey can go wrong (poorly worded questions, ill-considered response options, biases in design & sampling, etc.), but using a 5-to-11-point rating scale is not one of them.
Drill Deeper
Depth is produced by Drill Bit Labs, a consulting firm on a mission to advance the field of user research.
We partner with leaders in UX and digital product development to improve their team’s processes, user experiences, and business outcomes. How we help:
Research projects to inform confident design decisions and optimize digital experiences.
Training courses that teach team members advanced user research skills.
Advisory services to improve UX team processes & strategy; particularly around UX maturity, research process efficiency, structuring your people, hiring & interviewing, measuring UX, and demonstrating the business outcomes & ROI of UX initiatives.
We’d be happy to open a conversation to discuss your specific needs & goals for this year. These paid services help support our free articles and industry reports.
Bonus: Quizzes as an interview
My colleague and I have done quite a bit of research on the UX job market, hiring practices, & interviews [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]. At some point, we were asked an interesting question about interview techniques: can quizzes be a valuable assessment tool in the interview process? In the broader literature on interview practices, these are called “job knowledge tests.” We didn’t observe any instances of them being used to assess UX job candidates, but in other industries like finance, accounting, and law, they are common and typically formalized into a licensing exam administered by a third party, not the hiring entity.
Generally, these job-knowledge tests can have high predictive validity (in certain settings), but they have heavy tradeoffs:
The pros:
Can have high predictive validity of job performance
Useful when specific technical expertise or knowledge is required on day one of the job
The cons:
Not widely used in the UX field
May be perceived negatively by job seekers
Developing and administering a valid test is expensive
Not as valuable if you can train, or plan to train, the applicant on the skills post-hire
In short, well-administered job knowledge tests are valid but logistically challenging. Additionally, they are uncommon in our field, and applicants may view them negatively. If you’re considering using job knowledge tests in your interview process, decide whether their added utility is worthwhile: are there specific skills candidates must have on day one that can’t be trained during onboarding and can’t be assessed through your existing interview process? I’d caution against using them in most cases.