Scaling Wearable Foundation Models
Published on: November 20, 2024
Author(s): Daniel McDuff (Staff Research Scientist) and Xin Liu (Senior Research Scientist), Google Health
This paper reports on the application of self-supervised learning (SSL) to wearable sensor data, aiming to improve the efficiency and effectiveness of models used for consumer health tasks like exercise and activity recognition. Wearable devices generate vast amounts of physiological and behavioral data, which are often difficult for both consumers and experts to interpret. To address this, SSL models are used to extract meaningful representations from large volumes of unlabeled data, overcoming the limitations of traditional supervised models, which rely on small, labeled datasets.
The authors investigate the scalability of neural models for wearable data, building on the success of scaling laws observed in text and image domains. Using a dataset of over 40 million hours of sensor data from 165,000 users, they train a foundation model called the Large Sensor Model (LSM). Their results show that increasing compute, data, and model size significantly improves model performance. This is especially evident in tasks like imputation, temporal interpolation, and exercise/activity classification.
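The generative pre-training tasks mentioned here, imputation and temporal interpolation, amount to masking parts of a sensor window and training the model to reconstruct them. Below is a minimal sketch of that setup; the patch size, mask ratio, and the `random_mask`/`reconstruction_loss` helpers are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def random_mask(x, mask_ratio=0.8, patch=10, rng=None):
    """Mask contiguous patches of a (time, channels) sensor window.

    Returns the masked signal and a boolean mask (True = hidden)."""
    rng = rng or np.random.default_rng(0)
    t = x.shape[0]
    n_patches = t // patch
    n_hidden = int(round(mask_ratio * n_patches))
    hidden = rng.choice(n_patches, size=n_hidden, replace=False)
    mask = np.zeros(t, dtype=bool)
    for p in hidden:
        mask[p * patch:(p + 1) * patch] = True
    x_masked = x.copy()
    x_masked[mask] = 0.0  # zero out the hidden patches
    return x_masked, mask

def reconstruction_loss(pred, target, mask):
    """Mean squared error computed only over the masked positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

# Toy example: 100 time steps, 3 sensor channels.
x = np.random.default_rng(1).standard_normal((100, 3))
x_masked, mask = random_mask(x)
# Trivial "predict zeros" baseline; a real model would predict pred from x_masked.
loss = reconstruction_loss(x_masked, x, mask)
```

In practice the reconstruction target would come from the model's decoder rather than the masked input itself; the loss-over-masked-positions structure is the part that carries over.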
Among the key findings: performance scales sublinearly with compute, and larger models benefit more from additional data, particularly when training on millions of hours of sensor readings. The LSM also demonstrates strong label efficiency, meaning it performs well with only a small number of labeled examples, outperforming supervised baselines in few-shot learning tasks.
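Sublinear scaling of this kind is typically summarized by fitting a power law of the form L = a · C^(−b) in log-log space. The sketch below shows the fitting procedure on synthetic points; the compute values and the exponent are invented for illustration and do not reproduce the paper's fitted constants.

```python
import numpy as np

# Synthetic illustration: assume loss follows L = a * C^(-b)
# with hypothetical constants a = 2.0, b = 0.05.
compute = np.array([1e18, 1e19, 1e20, 1e21])
loss = 2.0 * compute ** -0.05

# Fit the power law in log-log space: log L = log a - b * log C.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
b = -slope  # recovered exponent
```

A small positive exponent b means each order of magnitude of extra compute buys a diminishing, but still predictable, reduction in loss.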
The study concludes that scaling wearable sensor models with large, diverse datasets and compute power enables substantial improvements in a variety of tasks, laying the groundwork for future advancements in personal health technologies.
Self-supervised Learning for Human Activity Recognition Using 700,000 Person-days of Wearable Data
Published in: 2024
Author(s): Hang Yuan, Shing Chan, Andrew P. Creagh, Catherine Tong, Aidan Acquah, David A. Clifton, Aiden Doherty
This paper investigates the use of self-supervised learning (SSL) techniques for human activity recognition (HAR) on the UK-Biobank (UKB) dataset, which contains over 700,000 person-days of unlabelled wearable sensor data. This dataset is the largest of its kind, and the authors explore how SSL can leverage the unlabelled data to improve the performance of HAR models. Their SSL-based model outperforms strong baselines across multiple benchmark datasets, with relative improvements of up to 100%; the gains are largest on the smaller datasets.
The paper details the use of three SSL tasks—Arrow of Time (AoT), permutation, and time warping (TW)—which prioritize the temporal dependencies of human motion. These tasks are shown to help create a model that generalizes well across various datasets, including those collected in different environments and using different devices. Unlike previous studies, which were limited by small datasets, the authors use a massive, real-world dataset (UKB), providing a more robust evaluation of SSL's effectiveness for HAR.
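The three pretext tasks above can be sketched as simple transformations of a sensor window, where the model learns to detect whether (and how) each transformation was applied. The sketch below operates on a one-dimensional signal for brevity; the segment count, the warping scheme, and the function names are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def arrow_of_time(x, rng):
    """Reverse the window with probability 0.5; label is 1 if reversed."""
    if rng.random() < 0.5:
        return x[::-1].copy(), 1
    return x, 0

def permute(x, n_segments=4, rng=None):
    """Split the window into segments and shuffle their order."""
    rng = rng or np.random.default_rng(0)
    segs = np.array_split(x, n_segments)
    order = rng.permutation(n_segments)
    return np.concatenate([segs[i] for i in order]), order

def time_warp(x, factor=1.5):
    """Linearly resample the window (stretch/compress), keeping its length."""
    t = len(x)
    src = np.linspace(0, t - 1, t) / factor  # warped time axis
    return np.interp(np.clip(src, 0, t - 1), np.arange(t), x)
```

Because each transformation preserves the raw sample values while disturbing their temporal structure, a classifier that can undo or detect them must have learned something about the temporal dependencies of human motion, which is exactly what these pretext tasks are designed to exploit.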
Key methods include pre-training a model using SSL on the unlabelled UKB dataset, followed by fine-tuning on smaller, labelled datasets. The authors also explore the impact of the amount of unlabelled data on performance, discovering that increasing the number of unlabelled subjects generally improves performance, particularly for smaller datasets.
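The pre-train-then-fine-tune recipe can be sketched as freezing the pretrained encoder and fitting a lightweight classifier on its embeddings of the few labelled examples. Here a nearest-centroid probe stands in for the paper's fine-tuning stage purely as an illustration, and `embed` is a placeholder for the pretrained encoder.

```python
import numpy as np

def linear_probe(embed, x_train, y_train, x_test, n_classes):
    """Few-shot evaluation sketch: freeze the encoder (`embed`) and
    classify test windows by their nearest class centroid in embedding
    space."""
    z_train = embed(x_train)
    z_test = embed(x_test)
    centroids = np.stack([z_train[y_train == c].mean(axis=0)
                          for c in range(n_classes)])
    dists = np.linalg.norm(z_test[:, None] - centroids[None], axis=-1)
    return dists.argmin(axis=1)
```

The appeal of this setup is that the labelled data only has to position a handful of centroids, not train the encoder, which is why a representation pretrained on 700,000 person-days can remain competitive with very few labelled subjects.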
In terms of evaluation, the study compares the SSL model against traditional approaches, including random forests and models trained from scratch, showing that SSL provides significant improvements in accuracy. The results are consistent even when fine-tuning is performed on datasets with fewer labelled subjects.
The paper emphasizes the benefits of SSL in real-world settings where obtaining labelled data is difficult and expensive. It also introduces an open-source model that can be used by researchers and developers to build customizable and high-performing activity classifiers. The authors provide a unified framework for evaluating these models, ensuring fair comparisons across different HAR models and datasets. This contribution offers a new benchmark for future research in the field.