
What Does Your Whoop Actually Measure?

From green light to recovery score — and why the gap matters

I have been using a Whoop for a while now, and at some point I started wondering what it was actually measuring. Not because something felt wrong, but because the more I thought about it, the less obvious the answer was. The recovery score shows up every morning and it is easy to just accept it as a fact about your body. But once I started looking into how it works, the question became genuinely interesting.

It turns out that the number is not a measurement in the way that a blood test is a measurement. It is an estimate, the output of a chain of transformations that begins with a small green light pressed against your skin and ends, several steps later, in a percentage that feels authoritative. At each step, assumptions are made. The interface projects a lot of confidence: clean percentages, colour-coded scores, decimal-point precision. But that confidence is partly a product of design, not of measurement accuracy. The sensor knows less than the app implies, and that is worth understanding.

The green light on your wrist

Most wearables track your heart by shining a small green LED into your skin and measuring how much light bounces back, a technique called photoplethysmography (PPG). Blood absorbs green light, and the volume of blood in the capillaries under your skin changes with every heartbeat. So the reflected light pulses in a pattern that can be used to infer the timing of your heartbeats.

Notice the word infer. The sensor does not directly measure your heart. It measures light. That light then passes through several processing steps — filtering out movement interference, detecting the peaks in the signal, converting those peaks into a heart rate estimate — before anything resembling a cardiac measurement emerges. The device on your wrist is doing considerably more computation than the polished interface suggests.
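To make the peak-detection step concrete, here is a minimal sketch in Python. It is not Whoop's algorithm, which is not public. The signal is a synthetic, noise-free sine wave standing in for the reflected light, and the sample rate, threshold, and function names are all assumptions chosen for illustration. A real pipeline would first have to filter out motion and other interference, which this sketch skips entirely.

```python
import math

def detect_peaks(signal, threshold):
    """Return indices of local maxima that rise above `threshold`."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if signal[i] > threshold and is_local_max:
            peaks.append(i)
    return peaks

def heart_rate_bpm(peaks, sample_rate_hz):
    """Estimate average heart rate from the spacing between peaks."""
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to estimate a rate")
    intervals_s = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    mean_interval_s = sum(intervals_s) / len(intervals_s)
    return 60.0 / mean_interval_s

# Assumed values: 50 samples per second, 10 seconds of data,
# and a clean 1.2 Hz pulse (72 beats per minute).
SAMPLE_RATE_HZ = 50
signal = [
    math.sin(2 * math.pi * 1.2 * n / SAMPLE_RATE_HZ)
    for n in range(10 * SAMPLE_RATE_HZ)
]

peaks = detect_peaks(signal, threshold=0.5)
bpm = heart_rate_bpm(peaks, SAMPLE_RATE_HZ)
print(f"Estimated heart rate: {bpm:.1f} bpm")
```

Even on this perfectly clean signal the estimate lands close to 72 rather than exactly on it, because each peak can only be located to the nearest sample. On a real wrist, with movement and a shifting strap, the inference is harder still.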

From that processed signal, a further metric is calculated: Heart Rate Variability, or HRV. HRV does not measure how fast your heart beats. It measures the variation in the time between consecutive beats. A heart beating at exactly 60 beats per minute with robotic precision has an HRV of zero. A real human heart at 60 beats per minute will have slight irregularities between beats, and the size of those irregularities turns out to be a surprisingly useful indicator of how recovered your body is. Higher variation generally means your nervous system is in a rest and recovery state. Lower variation often signals stress, fatigue, or illness.
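As an illustration, here is a small Python sketch of one common HRV statistic, RMSSD (the root mean square of successive differences between beat intervals). The text above does not say which statistic the app uses, so treat RMSSD as an assumed stand-in. The interval values are made up for the example.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between beat intervals."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Both series average 1000 ms between beats, i.e. 60 beats per minute.
metronome = [1000.0] * 6                                 # robotic precision
human = [1000.0, 1040.0, 980.0, 1020.0, 960.0, 1000.0]   # made-up intervals

print(rmssd(metronome))         # 0.0
print(round(rmssd(human), 1))   # 49.0
```

Both series describe the same heart rate. Only the second has any variability, which is exactly the distinction HRV is meant to capture.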

But even HRV is not what the app shows you. Whoop takes your HRV, compares it to your own rolling baseline over the past few months, weights it alongside sleep duration, sleep quality, and recent training load, and produces a recovery score. The chain from green light to percentage is several steps long, and none of those steps is visible to the user.
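The baseline comparison can be sketched in a few lines of Python. This is only an assumed illustration of the idea: it expresses today's HRV as a number of standard deviations from a personal history. The actual weighting of HRV against sleep and training load is not public, so the sketch leaves it out, and the readings are invented.

```python
from statistics import mean, stdev

def baseline_z_score(history, today):
    """How far today's value sits from the personal baseline,
    measured in standard deviations of that person's own history."""
    if len(history) < 2:
        raise ValueError("need at least two past readings")
    spread = stdev(history)
    if spread == 0:
        return 0.0
    return (today - mean(history)) / spread

# Invented nightly HRV readings in milliseconds; the mean is 60.
history = [62, 58, 65, 60, 55, 63, 57]

z_low = baseline_z_score(history, today=50)
z_typical = baseline_z_score(history, today=60)
print(f"50 ms vs baseline: {z_low:.2f}")
print(f"60 ms vs baseline: {z_typical:.2f}")
```

A reading of 50 ms falls well below this person's usual range, while 60 ms sits right on the baseline. The same 50 ms would mean something quite different against another person's history.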

What can affect the reading

The sensor is picking up a signal in conditions that are rarely perfectly controlled. Things like ambient temperature, how well hydrated you are, and how the device sits on your wrist during sleep can all affect the quality of the reading. Movement is a known source of interference for this type of sensor, which is why wearables generally measure HRV during sleep when you are relatively still rather than during the day.

That does not mean the device is unreliable. It means that any individual reading can be noisier than it looks. The percentage on your screen conveys more certainty than the underlying measurement can always support.

What the recovery score is actually doing

The recovery score is more sophisticated than a simple daily reading. It already incorporates your personal baseline, your recent sleep, and your training history. In that sense it is doing a lot of the heavy lifting for you.

But it is worth keeping in mind that all of that context gets compressed into a single number. A green 83% and a red 42% feel like clear verdicts. What they actually represent is a model's best estimate, built on several layers of data that each carry their own uncertainty. A single day's score can shift because the strap sat slightly differently overnight or because sleep was more disrupted than usual.

That does not make the recovery score useless. It makes looking at how it moves over several days more informative than reacting to any single morning. A persistent drop over a week tells you something real. A single low score after a night where you barely slept is probably telling you exactly what you already know.
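One way to read the score across days rather than morning by morning is to smooth it. The Python sketch below uses a rolling median over three days, which is my own choice for illustration and not something the app is known to do. The two weeks of scores are invented.

```python
from statistics import median

def rolling_median(values, window):
    """Median of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        median(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]

# Invented recovery scores for two different weeks.
one_bad_night = [70, 72, 68, 71, 35, 69, 73]
persistent_drop = [70, 72, 68, 55, 52, 50, 48]

print(rolling_median(one_bad_night, 3))    # [70, 71, 68, 69, 69]
print(rolling_median(persistent_drop, 3))  # [70, 68, 55, 52, 50]
```

The single 35 disappears from the smoothed series, while the week-long slide shows up clearly. A rolling mean would also work, but it lets one bad night drag down several days of output.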

You are not a single data point. You are a trend.

HRV is extraordinarily individual. A reading that is completely normal for one person can be low for another, depending on age, fitness level, genetics, and dozens of other factors. Comparing your numbers to someone else's makes about as much sense as comparing your height to a friend's to determine whether you had a good night's sleep. The number is real. The comparison is not.

This also matters for how sports teams use wearable data. A physiotherapy team might notice that across the squad, lower average HRV tends to precede higher injury rates the following week. That is a useful pattern at the group level. It does not automatically follow that any individual player with a lower reading is at elevated risk: the variation between people is so large that a predictor that works across a squad can say little about any one player in it.

The prediction problem

The commercial pitch for wearables goes beyond monitoring. It is about prediction — the idea that your morning score can tell you whether to train hard or rest. This is a more demanding claim.

There is genuine research supporting the idea that HRV trends can help guide training decisions. The relationships are real, if modest. But there is a gap between finding a pattern in a study and reliably predicting outcomes for a specific person on a specific day. Most of the supporting research is based on relatively small groups measured under controlled conditions, which is not the same as someone checking their phone in the morning and deciding whether to go to the gym.

There is also the question of what the score is not measuring. An athlete who performs brilliantly despite a red recovery score has not disproved anything. They have illustrated something important: physiological readiness is one input among many. Motivation, how well you slept three nights ago, what you ate, and pure chance are also inputs. Even a perfect readiness measurement would explain only part of what actually happens on a given day.

A useful tool, used well

Consumer wearables have put genuinely interesting physiological information into more hands than ever before. That is a real development. For a lot of people, seeing their sleep data or recovery trends is what motivates them to take rest more seriously, build more consistent routines, or notice when something is off before it becomes a bigger problem. That has real value.

The point of understanding how the score is built is not to distrust it. It is to use it better. The recovery score already does a lot of work to give you context. What you can add is the habit of reading it across time rather than as a daily verdict, and the awareness that a single unusual reading is worth noting but not necessarily worth reacting to.

I still check mine every morning. I just no longer treat a green number as permission and a red one as a sentence. I look at where things have been moving over the past week or two, and I pay attention when something shifts and stays shifted. And on the days when the score says one thing and my body says another, I have learned to trust the one that has been running longer — usually the one that shows up as a cramp halfway through warmup.