Energy Expenditure Estimation using Visual and Inertial Sensors at Home

Deriving a person’s energy expenditure accurately forms the foundation for tracking physical activity levels across many health and lifestyle monitoring tasks. In this work, we present a method for estimating calorific expenditure from combined visual and accelerometer sensors by way of an RGB-Depth camera and a wearable inertial sensor. The proposed individual-independent framework fuses information from both modalities, which leads to improved estimates beyond the accuracy of single-modality and manual metabolic lookup table (MET) based methods. For evaluation, we introduce a new dataset called SPHERE_RGBD+Inertial_calorie, for which visual and inertial data are obtained simultaneously with indirect calorimetry ground truth measurements based on gas exchange. Experiments show that the fusion of visual and inertial data reduces the estimation error by 8% and 18% compared to the use of visual only and inertial sensor only, respectively, and by 33% compared to a MET-based approach. We conclude from our results that the proposed approach is suitable for home monitoring in a controlled environment.

The proposed method

We propose an activity-specific pipeline to estimate energy expenditure utilising both depth and motion features as input. Importantly, our setup as shown in Figure 1 is designed to reason about activities first, before estimating calorie expenditure via a set of models which are each separately trained for particular activities. 

Figure 1. Framework Overview. RGB-D videos are represented by a combination of flow and depth features. The proposed recurrent method AS (top) then selects activity-specific models which map to energy expenditure estimates. We compare this method to a direct mapping method DM and a manual estimate via lookup tables MET (bottom).
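
The activity-first design can be illustrated with a minimal sketch. It is not the implementation from the papers: the threshold classifier, the three activity names, the two-element feature vector and the linear weights are all hypothetical stand-ins for the trained recurrent models, chosen only to show how an estimate is routed through an activity-specific model.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]
CalorieModel = Callable[[Features], float]


def make_linear_model(weights: Sequence[float], bias: float) -> CalorieModel:
    """Return a toy linear regressor standing in for a trained per-activity model."""
    def predict(features: Features) -> float:
        return bias + sum(w * x for w, x in zip(weights, features))
    return predict


def classify_activity(features: Features) -> str:
    """Hypothetical classifier: thresholds the first feature (a motion measure)."""
    motion = features[0]
    if motion < 0.2:
        return "sit"
    if motion < 0.6:
        return "walk"
    return "exercise"


# One regressor per activity. Names and weights are illustrative assumptions,
# not values taken from the dataset or the papers.
ACTIVITY_MODELS: Dict[str, CalorieModel] = {
    "sit": make_linear_model([0.5, 0.1], 1.2),
    "walk": make_linear_model([3.0, 0.4], 2.5),
    "exercise": make_linear_model([6.0, 0.8], 4.0),
}


def estimate_calories(window: Features) -> float:
    """Activity-specific estimate: classify first, then apply that activity's model."""
    activity = classify_activity(window)
    return ACTIVITY_MODELS[activity](window)


def estimate_sequence(windows: List[Features]) -> List[float]:
    """Apply the activity-specific estimate to each feature window in turn."""
    return [estimate_calories(w) for w in windows]
```

A direct mapping baseline in the spirit of DM would instead apply a single model to every window, regardless of the recognised activity.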

SPHERE_RGBD+Inertial_calorie Dataset

We introduce the SPHERE_RGBD+Inertial_calorie dataset for calorific expenditure estimation, comprising RGB-Depth and inertial sensor data captured in a real living environment. The ground truth was captured by the COSMED K4b2 portable metabolic measurement system. The dataset was generated over 20 sessions by 10 subjects with varying anthropometric measurements, containing up to 11 activity categories per session and totalling around 10 hours of recording time.

Colour and depth images were acquired at a rate of 30Hz. The accelerometer data was captured at about 100Hz and downsampled to 30Hz. The calorimeter gives readings per breath, which occur approximately every 3 seconds. To better model transitions between activity levels, we consider 9 different combinations of the above three activity intensities in each session. Figure 2 shows a detailed example of calorimeter readings and associated sample RGB images from the dataset. The raw breath data is noisy (in red), and so we apply an average filter with a span of approximately 20 breaths (in blue).

Figure 2. Ground truth example sequence. Raw per-breath data (red), smoothed COSMED K4b2 calorimeter readings (blue), and sample colour images corresponding to the activities performed by the subject.
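
The smoothing of the per-breath readings can be sketched as follows. This is an illustration under stated assumptions rather than the processing code used for the dataset: the filter is taken to be a centred moving average whose window shrinks at the sequence ends, and the linear interpolation onto a uniform grid is one assumed way of aligning the irregular breath readings with the 30Hz frame rate. The function names are hypothetical.

```python
from typing import List


def moving_average(values: List[float], span: int = 20) -> List[float]:
    """Centred moving average over up to 2 * (span // 2) + 1 samples.

    The window shrinks near the start and end of the sequence, so the
    output has the same length as the input.
    """
    half = span // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def resample_linear(times: List[float], values: List[float],
                    rate_hz: float) -> List[float]:
    """Linearly interpolate irregular samples onto a uniform grid.

    `times` must be strictly increasing and contain at least two entries.
    """
    out = []
    j = 0
    n = int((times[-1] - times[0]) * rate_hz) + 1
    for k in range(n):
        t = times[0] + k / rate_hz
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j + 1 < len(times) - 1 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out
```

For example, `resample_linear(breath_times, moving_average(breath_values), 30.0)` would give one smoothed ground truth value per video frame, where `breath_times` and `breath_values` are placeholder names for the calorimeter timestamps and readings.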


Table 1 presents the detailed results for each sequence. Accuracy is calculated over the total calories expended in each recording session. We also measure the correlation between the ground truth and the predicted values. The proposed AS method achieves higher accuracy and correlation in more sequences than the DM and MET-based methods, and obtains better results on average. Figure 3 illustrates an example (corresponding to sequence 6 in Table 1) of a visual trace of calorie values.

Table 1. Ground truth and predicted total calorie values per sequence, with accuracy and correlation. The best results for each sequence are in bold.

Figure 3. Example of Calorie Uptake Prediction. In comparison to DM and MET, AS (the proposed method) shows its ability to better predict calories and model the transition between activities.
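
The two per-sequence measures can be sketched as below. The exact accuracy formula is not given here, so the definition used (one minus the relative error of the total) is an assumption, as is the choice of Pearson's coefficient for the correlation. The function names are hypothetical.

```python
import math
from typing import Sequence


def total_calorie_accuracy(predicted: Sequence[float],
                           ground_truth: Sequence[float]) -> float:
    """Assumed definition: 1 - |sum(pred) - sum(gt)| / sum(gt)."""
    total_pred = sum(predicted)
    total_gt = sum(ground_truth)
    return 1.0 - abs(total_pred - total_gt) / total_gt


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

The two measures capture different things: a prediction can follow the shape of the ground truth closely (high correlation) while still over- or under-estimating the session total (low accuracy), which is why both are reported.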

Real-time processing

The application for estimating physical activity intensity has been deployed in the real-time multi-camera video platform of the SPHERE sensor network system. More details can be found in our paper

Publication and Dataset

The dataset and the proposed method are presented in the following papers:
  • Lili Tao, Tilo Burghardt, Majid Mirmehdi, Dima Damen, Ashley Cooper, Massimo Camplani, Sion Hannuna, Adeline Paiement and Ian Craddock. Energy Expenditure Estimation using Visual and Inertial Sensors. IET Computer Vision, 2017 (Accepted)
  • Lili Tao, Tilo Burghardt, Majid Mirmehdi, Dima Damen, Ashley Cooper, Massimo Camplani, Sion Hannuna, Adeline Paiement and Ian Craddock. Calorie Counter: RGB-Depth Visual Estimation of Energy Expenditure at Home. 13th Asian Conference on Computer Vision (ACCV 2016) Workshop on Assistive Vision, 2016.
  • Lili Tao, Tilo Burghardt, Majid Mirmehdi, Dima Damen, Ashley Cooper, Massimo Camplani, Sion Hannuna, Adeline Paiement and Ian Craddock. Real-time Estimation of Physical Activity Intensity for Daily Living. 2nd IET International Conference on Technologies for Active and Assisted Living (TechAAL2016), 2016.
SPHERE_RGBD+Inertial_calorie dataset can be downloaded at