Portion Estimation
Portion estimation refers to the AI task of determining the amount of food on a plate from an image. In calorie tracking applications, it is often the largest source of calorie inaccuracy: two plates that look similar can differ by more than 50% in actual weight because of dish density, concealed ingredients, and the angle of the photograph.
What is portion estimation?
Portion estimation is the AI task of estimating the number of grams (or ounces, or “servings”) of food present in a photograph. This is a different challenge from food classification, which simply determines what is depicted in the image. Portion estimation seeks to answer how much, which is a more complex question.
The challenge lies in geometry. A photograph is a 2D projection of a 3D object, and the density of that object varies by dish. For instance, a plate of pasta with the same surface area can weigh 200g or 350g depending on the height of the noodle pile. A glass of olive oil and a glass of water may look similar in a photo but differ by roughly 1,900 kcal per cup. Contemporary portion-estimation models try to address this using depth cues from the camera (some applications use the iPhone’s LiDAR), reference items (such as a plate of known dimensions, or a fork visible in the frame), or learned norms from large training datasets (“a typical plate of pasta weighs 250g”).
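The geometry can be made concrete with a small sketch. It is an illustration only, not how any particular app works: it assumes a reference object of known size fixes the image scale, that a depth cue supplies a mean pile height, and that a bulk density for the dish is known. The function names and every number here (plate size, pixel counts, the 0.5 g/cm³ density) are hypothetical.

```python
def cm_per_pixel(reference_size_cm: float, reference_size_px: float) -> float:
    """Image scale from a reference object of known size (e.g. a plate)."""
    return reference_size_cm / reference_size_px


def footprint_cm2(food_area_px: float, scale_cm_per_px: float) -> float:
    """Convert the food's pixel area into square centimetres."""
    return food_area_px * scale_cm_per_px ** 2


def estimate_mass_g(area_cm2: float, mean_height_cm: float,
                    bulk_density_g_per_cm3: float) -> float:
    """Mass = footprint x mean height x bulk density (a crude slab model)."""
    return area_cm2 * mean_height_cm * bulk_density_g_per_cm3


# Hypothetical scene: a 25 cm plate spans 1000 px; the pasta covers 160,000 px.
scale = cm_per_pixel(25.0, 1000.0)
area = footprint_cm2(160_000, scale)          # about 100 cm^2

# Same footprint, two pile heights, assumed bulk density of 0.5 g/cm^3.
low_pile_g = estimate_mass_g(area, 4.0, 0.5)   # about 200 g
high_pile_g = estimate_mass_g(area, 7.0, 0.5)  # about 350 g
```

Under these assumed values, the same footprint gives about 200 g or about 350 g depending only on pile height, which is the quantity a single top-down photo does not capture.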
How is it measured?
In our methodology, we evaluate portion estimation by calculating the mean absolute percentage error (MAPE) between the portion estimated by the app (in grams) and the portion weighed in a laboratory setting. Every plate in our 30-plate photo battery is weighed on a calibrated kitchen scale (with a precision of 0.1 g) before it is photographed, so a ground-truth weight exists for every plate. We calculate portion-MAPE across the entire battery, and each app’s portion-estimation score is based on this figure.
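As a minimal sketch of the calculation described above, MAPE is the average of |estimated − weighed| / weighed over all plates, expressed as a percentage. The function name and the three-plate data below are hypothetical and serve only to illustrate the arithmetic; they are not results from the battery.

```python
def portion_mape(estimated_g: list[float], weighed_g: list[float]) -> float:
    """Mean absolute percentage error between estimated and weighed portions."""
    if len(estimated_g) != len(weighed_g):
        raise ValueError("estimated and weighed lists must be the same length")
    if not weighed_g:
        raise ValueError("at least one plate is required")
    # Per-plate absolute error, relative to the weighed ground truth.
    errors = [abs(est - true) / true for est, true in zip(estimated_g, weighed_g)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical plates: weighed ground truth vs. an app's estimates, in grams.
weighed = [200.0, 350.0, 120.0]
estimated = [230.0, 280.0, 126.0]

mape = portion_mape(estimated, weighed)  # per-plate errors: 15%, 20%, 5%
```

Because each error is divided by the weighed portion, a 30 g miss counts for more on a small plate than on a large one.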
During our 2026 evaluations, the portion-estimation MAPE displayed considerable variation across different applications. Dishes with a single ingredient (like one chicken breast or one banana) generally yield portion-MAPE in the range of 10-15%. More complex plates (such as a bowl of stir-fry) can elevate portion-MAPE to 20-30%. Mixed dishes that contain hidden fats or oils (like lasagna, biryani, or fried rice) may have portion-MAPE exceeding 40%, even in the top-performing applications. For more details about the protocol, refer to our weighed reference meals entry.
Why it matters in calorie tracking apps
For users, portion estimation is typically the primary source of error in AI food logging. An application might correctly identify a dish as a “chicken-and-rice bowl” yet still produce a calorie estimate that is over 300 kcal off because it misjudged the quantity of rice, for instance estimating 1.5 cups instead of 2.5. The practical implication is that users aiming for precise calorie deficits or specific protein targets should verify portion estimates against manual entries for their most calorie-dense meals of the day.
Portion estimation is improving. Applications that use depth-sensing technology (where available) and incorporate reference objects within the frame report better portion accuracy than those that rely solely on visual estimates. We anticipate that portion-MAPE will tighten further in 2026-2027 as multimodal models with improved physical-world reasoning are introduced; whether this reduction will match the accuracy of manual logging remains to be seen.