{"id":16125,"date":"2023-04-30T10:37:12","date_gmt":"2023-04-30T08:37:12","guid":{"rendered":"https:\/\/www.centigrade.de\/?post_type=blog&#038;p=16125"},"modified":"2023-05-16T15:31:29","modified_gmt":"2023-05-16T13:31:29","slug":"how-to-get-started-with-ux-metrics","status":"publish","type":"blog","link":"https:\/\/www.centigrade.de\/en\/blog\/how-to-get-started-with-ux-metrics\/","title":{"rendered":"How to get started with UX Metrics"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16142\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/ux-metrics-illustration-small.jpg\" alt=\"ux metrics illustration\" width=\"1456\" height=\"816\" srcset=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/ux-metrics-illustration-small.jpg 1456w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/ux-metrics-illustration-small-300x168.jpg 300w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/ux-metrics-illustration-small-768x430.jpg 768w\" sizes=\"auto, (max-width: 1456px) 100vw, 1456px\" \/><\/p>\n<p>Welcome to my first blog post ever. How did we get here?<\/p>\n<p>I\u2019ve recently joined Centigrade and (at the risk of this sounding like an ad) I can truly say that it has been a great experience. Employees are not just treated like a \u201chuman resource\u201d but like real people. I\u2019ve been given autonomy, responsibility and above all trust. I have to say that really feels great! Instead of being overloaded with a million projects as a new employee, I\u2019ve been onboarded to just one, and I\u2019ve had the opportunity to research a topic of my choice and write this blog post about it. UX metrics have always been of interest to me, particularly how to integrate them at low cost and effort in order to get more stakeholder buy-in, so it was an easy choice for me. 
A quick note for the spelling sharks among us before I get started: I always capitalize User for obvious reasons, so don\u2019t get hung up on that. Here we go!<!--more--><\/p>\n<h2>Why UX metrics?<\/h2>\n<p>First, let\u2019s start with a definition. I like this one:<\/p>\n<p><strong>\u201cUX metrics are a set of quantitative data that are used to measure, <\/strong><strong>compare and improve the User experience over time.\u201d [1]<\/strong><\/p>\n<p>For me, what stands out here is the \u201cover time\u201d part. It means that in order to employ UX metrics, you need to convince your stakeholders to invest in <strong><u>continuous manual and automated research<\/u><\/strong> in order to obtain valuable data. So that would probably be the first important step to take. Let\u2019s look further at why I think UX metrics are an invaluable tool in the UX expert\u2019s toolkit.<\/p>\n<h3>1. Qualitative vs Quantitative<\/h3>\n<p>Qualitative research is great to better understand our Users, generate ideas, navigate the problem space and come up with solutions. In short, qualitative data answers the <strong><em>Why?<\/em><\/strong> But how do we measure, using some kind of scale, how bad a usability problem we have discovered really is? Have we managed to improve our digital product over time, and how do we select the best of different solutions? That\u2019s where quantitative data comes in: it answers the \u201c<strong><em>How many?\u201d<\/em><\/strong>,<strong><em> \u201cHow often?\u201d<\/em><\/strong> and \u201c<strong><em>How much?\u201d<\/em><\/strong> questions.<\/p>\n<h3><strong>2. 
Buy-in<\/strong><\/h3>\n<p>So, you\u2019ve studied User research or you work with dedicated User researchers, who have collected some years of experience, are aware of biases and how to avoid them (<a href=\"https:\/\/www.centigrade.de\/en\/blog\/cognitive-bias-in-ux-research-a-survival-guide\/\">see my colleague\u2019s article<\/a>), have run tight scientific studies and then together created well-backed findings and solutions, summarized them in an appealing way and honed your storytelling skills so that you present your insights interactively and convincingly to your business stakeholders. Yet you\u2019re still not getting the buy-in from them? I promise, you\u2019re not alone. Getting the buy-in from stakeholders solely based on qualitative results can be tricky. But once we can quantify the problems that Users are facing and answer the <strong><em>How many?<\/em><\/strong> question, we speak the stakeholders\u2019 language. Metrics are the translation of User research into tangible, comparable numbers and visually appealing graphs. And who could withstand the persuasiveness of a well-made graph? As User experience professionals, we always consider our audience. So why not in the case of our stakeholders?<\/p>\n<h3>3. Pearson\u2019s Law<\/h3>\n<p>Karl Pearson was an academic in statistics who was world-renowned for his insights. Pearson\u2019s law, which is commonly attributed to him, states:<\/p>\n<p><strong>\u201cWhen performance is measured, performance improves. When performance is measured and reported back, the rate of improvement accelerates.\u201d<\/strong><\/p>\n<p>In other words, measuring User experience will not just get you stakeholder buy-in but also actually improve the User experience. And if you include business stakeholders in your metrics and report back to them about the metrics\u2019 trends over time, User experience will improve even faster. 
Since improving the User experience is the ultimate goal of every User experience professional, I can find no better reason to start using UX metrics as soon as possible.<\/p>\n<p>So, let\u2019s explore which metrics to start with and how best to employ them in projects.<\/p>\n<h2>What UX metrics are there?<\/h2>\n<p>There are plenty of different metrics, and I will not go into detail on all of them in this blog post. What I will do instead is identify the metrics that are, in my opinion, the simplest to measure and get started with: metrics that don\u2019t employ very complicated formulas or require a lot of coding or specialized equipment. If you already conduct User research, especially continuously, the metrics I\u2019ve identified shouldn\u2019t be too difficult to integrate into your workflows.<\/p>\n<p>Metrics in general can be grouped by their different measuring goals. Different frameworks use different naming conventions, and the groupings sometimes differ slightly, but I like this summary of UX measuring goals: <strong>Performance, Preference, Perception [2].<\/strong><\/p>\n<ul>\n<li>Performance metrics are employed to measure how well the User can achieve their goals.<\/li>\n<li>Preference metrics measure what the User prefers or likes.<\/li>\n<li>Perception metrics measure what the User thinks.<\/li>\n<\/ul>\n<p>In order to get a better and more accurate understanding of the User experience of your product, it is always a good idea to choose a mix of metrics from these three categories.<\/p>\n<p>As always in User research, we must keep biases in mind and differentiate between metrics that measure <em>what the User says<\/em> (attitudinal) and <em>what the User actually does<\/em> (behavioral). It is always best to use a mix of attitudinal and behavioral metrics in order to get more accurate results.<\/p>\n<p>Here we go with a list of the easier metrics to get started with. 
I\u2019ve grouped them by their measuring goals and divided them into behavioral and attitudinal metrics so that you can choose them more easily and build your own metric framework that measures the User experience of your project as accurately as possible.<\/p>\n<h3>Performance<\/h3>\n<p><strong>Behavioral metrics: Task success rate, task completion time, task error rate<\/strong><\/p>\n<p>These measurements are simply calculated averages of the success rate, completion time or error rate across all Users. If you don\u2019t do continuous User research or don\u2019t have the buy-in for User testing (yet), it is often easier to measure the error rate, since most products throw error messages, and simple code hooks can be employed to count these. Remember to keep your margin of error in mind (it can be calculated using a simple online calculator).<\/p>\n<p><strong>Attitudinal metrics:<\/strong> SEQ or SUS (expectation\/pre task and experience\/post task)<\/p>\n<p>SUS stands for System Usability Scale and is a common way of measuring the User\u2019s perception of usability. The SEQ (Single Ease Question) is a much simpler, one-question alternative to the SUS, so it might be the easier one to get started with. <a href=\"https:\/\/www.centigrade.de\/de\/blog\/deepsight-ein-blick-in-die-zukunft-von-augmented-reality\/\">Here<\/a> is an example of how we applied it to a metaverse research project. The trick to turning this perception measure into a performance measure is to ask it twice. If we ask the User to fill in the survey before completing a task and again afterwards, it can be used as a measurement of performance. If we run this survey for more than one task and plot the data for the different tasks in a graph, it can even give a visual representation of which issues to prioritize. 
You can find an example below.<\/p>\n<div id=\"attachment_16129\" style=\"width: 577px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16129\" class=\"wp-image-16129 size-full\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild5-3.png\" alt=\"Average Expectation vs Average Rating Diagram\" width=\"567\" height=\"383\" srcset=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild5-3.png 567w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild5-3-300x203.png 300w\" sizes=\"auto, (max-width: 567px) 100vw, 567px\" \/><p id=\"caption-attachment-16129\" class=\"wp-caption-text\">Albert, William &amp; Dixon, E. (2003). Is this what you expected? The use of expectation measures in usability testing.<\/p><\/div>\n<h3>Preference<\/h3>\n<p><strong>Behavioral &amp; Attitudinal: Prototype, A\/B Testing or multivariate testing<\/strong><\/p>\n<p>When testing the User\u2019s preference, we need to think of the return on investment (ROI). At the beginning of your product journey, there are many open questions, and we can\u2019t just start off with an A\/B test of two button placement variants. That would assume that the journey to get to that button is what the User prefers. Hence it is most common to start with a prototype test and then work our way to the A\/B test for detailed improvements. 
The following graph is a great visualization of which kind of User test is appropriate in which phase of your project in order to reduce the risks of assumptions and uncertainty.<\/p>\n<div id=\"attachment_16131\" style=\"width: 615px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16131\" class=\"wp-image-16131 size-full\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild6-4.png\" alt=\"Progress vs Uncertainty Diagram\" width=\"605\" height=\"340\" srcset=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild6-4.png 605w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild6-4-300x169.png 300w\" sizes=\"auto, (max-width: 605px) 100vw, 605px\" \/><p id=\"caption-attachment-16131\" class=\"wp-caption-text\">The Fountain Institute: Getting started in User testing and experimentation as Designers Guide (<a href=\"https:\/\/www.thefountaininstitute.com\/blog\/getting-started-in-testing-and-experimentation-a-designers-guide\">thefountaininstitute.com\/blog\/getting-started-in-testing-and-experimentation-a-designers-guide<\/a>)<\/p><\/div>\n<p>Even after presenting the above graph to your stakeholders, it\u2019s possible that, due to a gap in their understanding of UX, you still have no buy-in for User testing. In that case, maybe this will help to convince them:<\/p>\n<p>Statistically, you can catch 85% of usability issues with only five Users [3] (or, depending on the complexity of your project, five Users per persona; read more <a href=\"https:\/\/www.centigrade.de\/en\/blog\/sample-size-in-usability-tests-and-user-interviews-less-is-more\/\">here<\/a>), so a low-fi prototype test doesn\u2019t take as much effort or time as your stakeholders might think. 
Always remind them: testing early is cheaper than developing a product based on assumptions and then finding out that it doesn\u2019t solve the User\u2019s problems.<\/p>\n<h3>Perception<\/h3>\n<p><strong>Behavioral: Tapping while completing task<\/strong><\/p>\n<p>There are many tools today that make the observation of User behavior easy, even without complicated equipment. I still find this very simple technique for measuring the cognitive load that a User is experiencing while completing a task very effective: have your Users tap their finger repetitively and quite fast while they\u2019re completing a task. When their cognitive load increases, their tapping will slow down or even stop momentarily while they focus entirely on the task.<\/p>\n<p><strong>Attitudinal: SEQ or SUS (w\/competitors)<\/strong><\/p>\n<p>I already introduced the SEQ and SUS surveys earlier. These are well-established tools that come with general guidelines on how to interpret participants\u2019 scores:<\/p>\n<div id=\"attachment_16133\" style=\"width: 310px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16133\" class=\"wp-image-16133\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild7-4.png\" alt=\"SUS Score Rating\" width=\"300\" height=\"227\" \/><p id=\"caption-attachment-16133\" class=\"wp-caption-text\">Hadi Althas 2018: How to Measure Product Usability with the System Usability Scale (SUS) Score (<a href=\"https:\/\/uxplanet.org\/how-to-measure-product-usability-with-the-system-usability-scale-sus-score-69f3875b858\">uxplanet.org\/how-to-measure-product-usability-with-the-system-usability-scale-sus-score<\/a>)<\/p><\/div>\n<p>These categories are very easy for stakeholders to understand, and that\u2019s why we can create a very convincing graph with them.<\/p>\n<p>And that\u2019s it. 
Yes, these are really the first metrics you (and I) could get started with, and they are not all that tricky to implement. If you\u2019re a pragmatist, like me, then you\u2019re probably asking yourself now: Okay, but what does that <em>really <\/em>look like? What\u2019s my first step here? Which one do I start with?<\/p>\n<h2>How to really get started?<\/h2>\n<p>I wish I could now introduce you to one framework, like the Google HEART framework, the USER framework or any other framework, and tell you: just implement these metrics one by one, and you\u2019re good to go! However, no such framework works for all projects. For example, the Google HEART framework uses different metrics to measure <strong>H<\/strong>appiness, <strong>E<\/strong>ngagement, <strong>A<\/strong>doption, <strong>R<\/strong>etention and <strong>T<\/strong>ask success, and while that might work great for Google or other B2C products, adoption and retention measurements often don\u2019t apply to B2B products.<\/p>\n<p>In Jared Spool\u2019s words:<\/p>\n<p><strong>\u201cUnfortunately, the theory that a grand unified metric can tell us how well our products and services are doing is just a myth. <\/strong><strong>No such metric actually exists. <\/strong><strong>There are many ways to measure success.\u201d<\/strong><\/p>\n<p>And in my words: you need to come up with your own metrics to measure (and define) success for each project \u2013 or even for each usage context.<\/p>\n<p>Measuring your goals is possible on different levels, as pictured in the graph below. If we measure a goal that is too broad, it will be more difficult to identify what we can do in order to improve the metric. 
The more precise we are, or the further down in the levels we can go, the more accurately we can measure, evaluate and improve incrementally.<\/p>\n<div id=\"attachment_16135\" style=\"width: 510px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16135\" class=\"wp-image-16135\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild8-4.png\" alt=\"System Level\" width=\"500\" height=\"451\" srcset=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild8-4.png 735w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild8-4-300x271.png 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><p id=\"caption-attachment-16135\" class=\"wp-caption-text\">Centigrade (Thomas Immich, Britta Karn) 2020: Usage data analysis<\/p><\/div>\n<p>This is where Google\u2019s Goals, Signals, Metrics process, combined with our measuring goals of Performance, Perception and Preference, can help us identify possible metrics.<\/p>\n<h3>Goals Signals Metrics x Performance Perception Preference<\/h3>\n<p>Imagine you\u2019re starting a new project, have conducted generative User research and have identified one or more goals for your persona(s). Then you\u2019ve already taken your first step towards defining the metrics for your project. Yay! Now, by using the Goals, Signals, Metrics process and choosing at least two out of the Performance, Preference and Perception metrics, while paying attention to using a mix of behavioral and attitudinal measures, you can easily create your very own first set of metrics to get started with.<\/p>\n<p>That sounds a bit complicated, but let\u2019s have a look at how Goals Signals Metrics works and combine it with the Performance, Preference and Perception metrics.<\/p>\n<p>Goals Signals Metrics is very simple: First you define the Goals you want to measure, then identify the Signals that show you that this goal was reached. 
Some call goals \u201cobjectives\u201d, but the essence is the same. Finally, you choose the Metrics that quantify each Signal. There can be several Signals for the same goal. For example, in the image below, I\u2019ve used the very general goal of the User being able to complete their task. There can be multiple Signals for this: it could be a low error rate, but it could also be a low time on task. So, we really need to be sure which Signals to choose and when. We should always ask ourselves which Signals in fact signify the goal that we\u2019re trying to measure and whether they apply in the context of the User journey. (Sometimes a long time on task is preferable as long as the task is completed in the end, for example if the goal is for the User to explore their options.) This also means that in the long run, and to measure more accurately, it is actually preferable to measure one goal by different Signals (and metrics) in order to make sure that they are moving in the same direction and to correlate our data. In the short run, or in order to get started with metrics, it is acceptable to measure our goal with one Signal as long as we can be really sure that this Signal in fact signifies whether our goal is being reached.<\/p>\n<p>On the board below, I\u2019ve visualized what the metrics could look like for some very high-level and general goals. Obviously, you don\u2019t ever want to use general goals like these for your actual product; be as specific as possible instead. 
Remember that choosing your goals and what to measure is just as important as what you measure them by.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-16137\" src=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild9-3.png\" alt=\"Performance vs Preference vs Perception Metrics\" width=\"907\" height=\"652\" srcset=\"https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild9-3.png 907w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild9-3-300x216.png 300w, https:\/\/www.centigrade.de\/wordpress\/wp-content\/uploads\/Bild9-3-768x552.png 768w\" sizes=\"auto, (max-width: 907px) 100vw, 907px\" \/><\/p>\n<h3>Buy-in starts with the implementation of a metric<\/h3>\n<p>Business stakeholders should be included as early as the decision on which metrics to measure, so that it is not just the UX team that owns the metric. Everyone should be aware of it and agree that this is a useful metric. This is where using metrics to improve stakeholder buy-in starts. It is a bit similar to acceptance criteria or a \u201cdefinition of done\u201d in an agile development context.<\/p>\n<h3>Goodhart\u2019s law<\/h3>\n<p>Of course, it wouldn\u2019t be User research if there weren\u2019t biases or laws that we need to keep in mind. Specifically: Goodhart\u2019s law.<\/p>\n<p>Goodhart\u2019s law is commonly summarized as: when a measure becomes a target, it ceases to be a good measure. Applied to UX metrics, this means that once we introduce a metric to measure a certain behavior, we can get so carried away with trying to improve the numbers that we forget why we chose it in the first place and whether it still measures what we want to measure. 
Hence, it is important to continuously look at metrics from different angles, employ several different metrics in order to measure one goal and question the metrics we\u2019ve chosen, especially over time.<\/p>\n<p>So that\u2019s almost it from me. I hope you enjoyed my blog post and that it will help you as much as it will help me when employing UX metrics in your projects. I don\u2019t want to finish without giving credit where credit is due. My article is simply a summary of research many other UX experts have done, so thank you for that! You\u2019ll find a detailed list of my sources below the post.<\/p>\n<p>And be sure to check back, as I\u2019m looking forward to implementing my own suggestions in my current and future projects and reporting back on how it went in a UX Metrics 2.0 blog post!<\/p>\n<p>&nbsp;<\/p>\n<p>&#8212;<\/p>\n<p><a href=\"#_ednref1\" name=\"_edn1\">[1]<\/a> Ratkliff &amp; Kelakar 2020: <a href=\"https:\/\/www.userzoom.com\/ux-blog\/what-ux-metrics-and-kpis-do-the-experts-use-to-measure-experience\/\">https:\/\/www.userzoom.com\/ux-blog\/what-ux-metrics-and-kpis-do-the-experts-use-to-measure-experience\/<\/a><\/p>\n<p><a href=\"#_ednref2\" name=\"_edn2\">[2]<\/a> Jeff Humble 2022: <a href=\"https:\/\/www.thefountaininstitute.com\/free-masterclass-ux-metrics?utm_source=webinar&amp;utm_medium=talk+slide&amp;utm_campaign=DPE+Fall+2022\">https:\/\/www.thefountaininstitute.com\/free-masterclass-ux-metrics?utm_source=webinar&amp;utm_medium=talk+slide&amp;utm_campaign=DPE+Fall+2022<\/a><\/p>\n<p><a href=\"#_ednref3\" name=\"_edn3\">[3]<\/a> Jakob Nielsen 2000: <a href=\"https:\/\/www.nngroup.com\/articles\/why-you-only-need-to-test-with-5-users\/\">https:\/\/www.nngroup.com\/articles\/why-you-only-need-to-test-with-5-users\/<\/a><\/p>\n<p>The Fountain Institute: Choosing the Right Metrics <a href=\"https:\/\/www.youtube.com\/watch?v=wBxnuk4sIns\">https:\/\/www.youtube.com\/watch?v=wBxnuk4sIns<\/a><\/p>\n<p>Bill Albert, Tom Tullis 2008: Measuring 
the User Experience: Collecting, Analyzing, and Presenting UX Metrics<\/p>\n<p>Ben Davison 2019: UX Metrics <a href=\"https:\/\/www.youtube.com\/watch?v=PU5i-Y1m1l4\">https:\/\/www.youtube.com\/watch?v=PU5i-Y1m1l4<\/a><\/p>\n<p>Jared Spool 2017: Is Design Metrically Opposed? <a href=\"https:\/\/www.youtube.com\/watch?v=aMqgTAlpVVc&amp;t=2646s\">https:\/\/www.youtube.com\/watch?v=aMqgTAlpVVc&amp;t=2646s<\/a><\/p>\n","protected":false},"author":73,"featured_media":0,"template":"","tags":[787,773,11,616],"class_list":["post-16125","blog","type-blog","status-publish","hentry","tag-big-data-2","tag-usage-data-analysis","tag-user-research","tag-ux-research"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/blog\/16125","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/users\/73"}],"version-history":[{"count":8,"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/blog\/16125\/revisions"}],"predecessor-version":[{"id":16127,"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/blog\/16125\/revisions\/16127"}],"wp:attachment":[{"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/media?parent=16125"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.centigrade.de\/en\/wp-json\/wp\/v2\/tags?post=16125"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}