How we test reliability.
Every mileage tracker claims to be accurate. Almost none of them tell you how they checked. Here’s how we check ours — and what our first real drive actually measured.
The claim that means nothing
“99% accurate.” You’ve seen it on every tracker’s marketing page. It’s a number with no test behind it — no drive, no reference, no method. It’s the kind of number a marketing team writes, not a number an engineer measured.
The problem with mileage accuracy specifically is that it’s easy to look accurate and hard to be accurate. A tracker that smooths its GPS trace aggressively will produce a clean-looking line that quietly under-counts your distance on every bend. A tracker that doesn’t bridge signal gaps will silently lose the tunnel you drove through. You’d never know — the app shows you a tidy map and a number, and you trust it, because what else are you going to do, drive the route again with a measuring wheel?
So before we put a number on anything, we built a test we could actually fail.
The capture-acceptance gate
Internally we call it the §4 gate. It’s a set of pass/fail checks a real drive has to clear before we’ll let the app anywhere near a launch. The important ones:
- Distance vs odometer. The app’s tracked distance is compared against the car’s actual odometer reading for the same drive. This is the headline honesty check — the odometer is the closest thing to ground truth a normal person has.
- Zero phantom trips. The app must not invent a second trip, split one drive into two, or log a journey that didn’t happen. One drive in, one trip out.
- Gap handling. When GPS drops (a tunnel, an underground car park, a street hemmed in by tall buildings), the app has to bridge the gap sensibly and flag it, not silently lose the distance or teleport the line across the city.
- Background survival. The whole point is that you don’t have to open the app. The drive has to be captured with the phone in a pocket, the screen off, the app closed.
A drive that fails any of the hard checks doesn’t get rationalised away. It gets root-caused and fixed, and then we drive again.
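To make the pass/fail idea concrete, here is a minimal sketch of how checks like the ones above could be expressed in code. It is not our actual gate: the field names, the `run_gate` and `passes` helpers, and the 2% distance tolerance are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DriveResult:
    """Summary of one test drive. All field names are hypothetical."""
    tracked_miles: float          # distance the app recorded
    odometer_miles: float         # odometer at finish minus odometer at start
    trips_logged: int             # trips the app created for this single drive
    gaps_total: int               # GPS gaps found in the trace
    gaps_flagged: int             # gaps the app bridged and flagged
    captured_in_background: bool  # phone in pocket, screen off, app closed


def run_gate(drive: DriveResult, tolerance: float = 0.02) -> dict:
    """Return a pass/fail verdict per check. The 2% tolerance is a placeholder."""
    error = abs(drive.tracked_miles - drive.odometer_miles) / drive.odometer_miles
    return {
        "distance_vs_odometer": error <= tolerance,
        "zero_phantom_trips": drive.trips_logged == 1,  # one drive in, one trip out
        "gap_handling": drive.gaps_flagged == drive.gaps_total,
        "background_survival": drive.captured_in_background,
    }


def passes(drive: DriveResult) -> bool:
    """A drive passes only if every check passes; there is no partial credit."""
    return all(run_gate(drive).values())
```

Returning a verdict per check, rather than a single score, keeps a failure attributable to one cause, which is what root-causing needs.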
The first real drive
On a 38-mile mixed route (motorway and suburban, the kind of drive a real user actually does), we ran the gate for the first time, with the app tracking in the background and the odometer noted at start and finish.
38.4 miles tracked vs 38 on the clock
First real-world capture-acceptance drive. Within about 1%. 2,373 GPS samples, the full route captured in the background. The two big GPS gaps both turned out to be the car sitting still (a parking warm-up and a traffic light), not lost signal.
I want to be careful about how much weight that carries, because being careful is the whole point of this essay. It is one drive. The car’s odometer only reads to the nearest mile, so each reading carries up to half a mile of slop, and the “truth” I’m comparing against (finish minus start) could be off by close to a mile. I could compute a tidy-looking percentage to two decimal places, and internally we did, but publishing it as a precise headline would be exactly the kind of marketing-number dishonesty this essay is complaining about.
What I’ll say instead is the honest version: on the first real drive, the app came within about a percent of the odometer, captured the whole route in the background, and didn’t invent or lose anything. That’s a genuinely good first result. It is not yet a reliability claim, and I’m not going to call it one.
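For anyone who wants to see why the odometer's resolution matters, here is a rough sketch of the arithmetic. It assumes each whole-mile reading can be off by up to half a mile, so the start-to-finish difference can be off by up to a full mile in either direction. The function name `error_range` and that error model are assumptions for illustration, not part of the gate.

```python
def error_range(tracked: float, odo_diff: float, step: float = 1.0):
    """Return (nominal, lowest, highest) percentage error of the tracked distance.

    Assumes the true distance lies within one display step either side of the
    odometer difference, because the start and finish readings each carry
    their own rounding.
    """
    if odo_diff - step <= 0:
        raise ValueError("drive is too short for this odometer resolution")

    def pct(true_miles: float) -> float:
        return (tracked - true_miles) / true_miles * 100.0

    nominal = pct(odo_diff)
    lowest = pct(odo_diff + step)   # true distance was longer than displayed
    highest = pct(odo_diff - step)  # true distance was shorter than displayed
    return nominal, lowest, highest


nominal, lowest, highest = error_range(tracked=38.4, odo_diff=38)
print(f"nominal {nominal:+.1f}%, plausible range {lowest:+.1f}% to {highest:+.1f}%")
# prints: nominal +1.1%, plausible range -1.5% to +3.8%
```

Under that assumption the plausible range is several times wider than the nominal figure, which is why one drive against a whole-mile odometer supports "about a percent" and nothing finer.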
What it would take to actually claim a number
A real accuracy claim — the kind we’d be willing to print on the product page — needs more than one good drive. It needs:
- Many drives, many conditions. City, motorway, rural, stop-start traffic, bad weather, tunnels, multi-storey car parks. Accuracy that holds only on a clear-sky motorway run isn’t accuracy, it’s luck.
- Multiple phones. A Pixel, a Galaxy, an iPhone. Each manufacturer handles background apps differently — Samsung especially is aggressive about killing them — and a tracker that only works on one phone isn’t reliable, it’s a demo.
- Battery honesty. Background GPS costs power. We want to measure the cost properly and tell you what it actually is, not bury it.
- The worst case, not the average. We care more about the drive where it nearly went wrong than the one where everything was perfect. The number that matters is the floor, not the mean.
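The difference between the floor and the mean is easy to show in a few lines. This is a minimal sketch, assuming each drive has already been reduced to a signed percentage error against its reference distance. The `summarise` helper is hypothetical, and the example numbers are invented, not measurements.

```python
def summarise(errors_pct: list) -> dict:
    """Summarise per-drive percentage errors, reporting the worst case and the mean."""
    if not errors_pct:
        raise ValueError("need at least one drive")
    abs_errors = [abs(e) for e in errors_pct]
    return {
        "drives": len(abs_errors),
        "mean_abs_error": sum(abs_errors) / len(abs_errors),
        "worst_abs_error": max(abs_errors),  # the floor: the drive that nearly went wrong
    }


# Invented numbers for illustration only; these are not measured results.
example = [0.5, -0.8, 1.1, -6.0]
summary = summarise(example)
```

With data shaped like this, a single bad drive barely moves the mean but sets the worst case entirely, which is the number a user on that drive would actually experience.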
Why we’re telling you this before we’ve finished
Because the alternative is to say nothing until launch and then announce a confident percentage, and you’d have no way to know whether it came from a lab or a copywriter.
This is the deal we’re trying to make with the kind of person Keepwright is for: we’ll show you the method while it’s still in progress, we’ll tell you what we’ve measured and what we haven’t, and we won’t put a number on the box until the number is real. When we do claim an accuracy figure, this is the test it will have come from — and we’ll tell you how many drives, on what phones, in what conditions.
Until then: one good drive, measured honestly, openly hedged. That’s where we are.