Accuracy Tests, Per-App Lab Reports
Published May 8, 2026 · Updated May 23, 2026
What this is. Five supporting lab reports accompanying the 2026 Calorie Counter App Accuracy Benchmark. Each report covers a single app: its complete 40-meal results, pooled and per-category MAPE, identified failure modes with suggested causes, logging speed, and the specific categories where that app excels. This is not a ranking but an in-depth examination, so that readers can understand how their tracker performs before relying on its daily figures.
How to read these reports
The five reports below use the same base dataset: 40 weighed reference meals (10 single items, 10 packaged goods, 10 from restaurant chains, and 10 mixed home recipes), with reference values taken from USDA FoodData Central and the chains' published nutritional information. Each app logged the 40 meals independently using its own workflow. The pooled mean absolute percentage error (MAPE) indicates how far, on average, the app's calorie estimate deviates from the weighed reference, expressed as a percentage of the reference. The per-category analysis shows where those discrepancies come from.
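The pooled and per-category figures can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard MAPE definition (the mean of each meal's absolute percentage error); the v1.0 protocol is the authority on the exact calculation. The field names `category`, `reference_kcal`, and `estimate_kcal` are hypothetical, and the four meals are made-up numbers, not values from the dataset.

```python
from collections import defaultdict


def ape(reference_kcal: float, estimate_kcal: float) -> float:
    """Absolute percentage error of one meal, in percent of the reference."""
    return abs(estimate_kcal - reference_kcal) / reference_kcal * 100.0


def pooled_mape(meals) -> float:
    """Mean of the per-meal absolute percentage errors."""
    errors = [ape(m["reference_kcal"], m["estimate_kcal"]) for m in meals]
    return sum(errors) / len(errors)


def mape_by_category(meals) -> dict:
    """Same calculation, restricted to the meals of each category."""
    groups = defaultdict(list)
    for m in meals:
        groups[m["category"]].append(m)
    return {category: pooled_mape(rows) for category, rows in groups.items()}


# Made-up meals for illustration only; not taken from the dataset.
example = [
    {"category": "single", "reference_kcal": 100.0, "estimate_kcal": 102.0},
    {"category": "single", "reference_kcal": 200.0, "estimate_kcal": 196.0},
    {"category": "restaurant", "reference_kcal": 500.0, "estimate_kcal": 550.0},
    {"category": "restaurant", "reference_kcal": 400.0, "estimate_kcal": 380.0},
]

overall = pooled_mape(example)
per_category = mape_by_category(example)
```

Because the error is taken as an absolute value, an overestimate and an underestimate of the same size count equally, and they do not cancel out in the average.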
Lab reports are not evaluations. We do not provide an overall score, we do not prioritize features, and we do not assess user experience. For a comprehensive scored review of any app, refer to Reviews. For information regarding the methodology of our testing protocol, see the v1.0 protocol document. For the raw CSV that supports every statistic on every page, visit the dataset page.
Pooled MAPE ranking (40 meals, Q2 2026)
Lower is better. The values below are pooled across all 40 meals for each app and match the figures in the v1.2 dataset. Each row links to the detailed lab report for that app.
| Rank | App | Pooled MAPE | In one sentence |
|---|---|---|---|
| 1 | Nutrola | ±0.7% | Volumetric depth-based portion estimation; recorded the lowest pooled error in the testing group; trails Cronometer on micronutrient breadth. |
| 2 | Cronometer | ±2.8% | Uses a manual-entry, USDA-anchored database with the most extensive micronutrient panel examined (84+ tracked nutrients). |
| 3 | MacroFactor | ±2.9% | Features an adaptive TDEE algorithm and structured manual entry; designed for periodised cuts rather than casual logging. |
| 4 | Lose It! | ±7.7% | Provides a gentle introduction for new users; the most affordable paid tier in the group at $39.99/year. |
| 5 | MyFitnessPal | ±9.7% | Contains the largest crowdsourced food database in the group (over 18M entries) and the widest coverage of US chain restaurants. |
Note: MAPE values are sourced from the v1.2 dataset. Per-meal figures are deterministic: recalculating from the raw CSV yields the same number to one decimal place. We release the dataset under CC BY 4.0 specifically to enable verification.
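A recalculation of this kind might look like the sketch below. The column names (`app`, `meal_id`, `reference_kcal`, `estimate_kcal`) are assumptions, not the dataset's documented schema, and the inline CSV is a made-up stand-in with placeholder app names so the example runs on its own.

```python
import csv
import io


def pooled_mape_by_app(csv_file) -> dict:
    """Return {app: pooled MAPE rounded to one decimal} from a per-meal CSV."""
    errors = {}
    for row in csv.DictReader(csv_file):
        ref = float(row["reference_kcal"])
        est = float(row["estimate_kcal"])
        errors.setdefault(row["app"], []).append(abs(est - ref) / ref * 100.0)
    return {app: round(sum(v) / len(v), 1) for app, v in errors.items()}


# Made-up rows with hypothetical column names; replace with the real CSV.
sample_csv = io.StringIO(
    "app,meal_id,reference_kcal,estimate_kcal\n"
    "App A,1,250,255\n"
    "App A,2,400,388\n"
    "App B,1,250,275\n"
    "App B,2,400,360\n"
)

recomputed = pooled_mape_by_app(sample_csv)
```

To check the published table, pass an open file handle for the downloaded CSV instead of `sample_csv`, adjust the column names to the real header row, and compare each app's value with the Pooled MAPE column.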
What each lab report contains
- TL;DR card: app version tested, date of the test, pooled MAPE, a key finding summarized in one sentence, and a link back to the dataset.
- Test snapshot table: app version, operating system, locale, tester's name, date range, and meal count.
- Per-meal results table: all 40 meals for that app, with reference kcal, the app's estimate, and absolute percentage error.
- Pooled accuracy breakdown: overall MAPE along with per-category MAPE (single / packaged / restaurant / mixed).
- Failure modes: two to four meals where the app had its largest errors, each with a hypothesized cause.
- Logging speed sidebar: median time taken per meal using the app's native workflow.
- Where this app wins: the categories where the app outperforms the others. No single app is claimed to be superior in all aspects.
- Compared to: a brief comparison paragraph against the other four apps in the group.
- Re-test schedule: when the app will be retested next.
- Limitations: what the test does not measure (long-term adherence, behavior change, coaching).
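Two of the sections above, failure modes and logging speed, reduce to simple operations on the per-meal table. The sketch below shows one way to derive them. The keys `meal`, `reference_kcal`, `estimate_kcal`, and `seconds_to_log` are hypothetical, the rows are made-up, and the reports' actual selection of failure-mode meals may involve editorial judgment beyond sorting by error.

```python
from statistics import median


def ape(row) -> float:
    """Absolute percentage error of one logged meal, in percent."""
    return abs(row["estimate_kcal"] - row["reference_kcal"]) / row["reference_kcal"] * 100.0


def largest_errors(rows, k=4):
    """The k meals with the highest absolute percentage error, worst first."""
    return sorted(rows, key=ape, reverse=True)[:k]


def median_logging_seconds(rows) -> float:
    """Median time per meal, which is less sensitive to one slow entry than the mean."""
    return median(r["seconds_to_log"] for r in rows)


# Made-up rows for illustration only; not taken from any lab report.
rows = [
    {"meal": "A", "reference_kcal": 300, "estimate_kcal": 303, "seconds_to_log": 20},
    {"meal": "B", "reference_kcal": 600, "estimate_kcal": 690, "seconds_to_log": 35},
    {"meal": "C", "reference_kcal": 450, "estimate_kcal": 441, "seconds_to_log": 25},
    {"meal": "D", "reference_kcal": 800, "estimate_kcal": 720, "seconds_to_log": 50},
]

worst_two = [r["meal"] for r in largest_errors(rows, k=2)]
typical_seconds = median_logging_seconds(rows)
```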
What this cluster does not do
This cluster does not assess long-term adherence. It does not evaluate behavior change. It does not determine coaching quality, social features, depth of recipe management, or the difficulties associated with switching between trackers after years of data accumulation. It does not analyze micronutrient accuracy (that will be covered in a separate future test). It does not evaluate macronutrient accuracy at the gram level. The focus is solely on calories, assessed across forty meals that represent typical eating habits of a US-based user.
If you want a complete feature evaluation, see the per-app Reviews, where we apply our 100-point rubric. For the methodology behind the lab reports, consult the v1.0 protocol. If you wish to replicate the figures yourself, the raw CSV is available on the dataset page.