Self-supervised Learning for Human Activity Recognition Using 700,000 Person-days of Wearable Data

This paper investigates self-supervised learning (SSL) for human activity recognition (HAR) using the UK Biobank (UKB) dataset, which contains over 700,000 person-days of unlabelled wearable sensor data. It is the largest dataset of its kind, and the authors explore how SSL can exploit this unlabelled data to improve HAR models. Their SSL-based model outperforms strong baselines across multiple benchmark datasets, with relative improvements in F1 score of up to 100% and the largest gains on the smaller datasets.

The paper details three SSL pretext tasks, Arrow of Time (AoT), permutation, and time warping (TW), each of which requires the model to reason about the temporal dependencies of human motion. Pre-training on these tasks is shown to produce a model that generalizes well across datasets, including those collected in different environments and with different devices. Whereas previous studies were limited by small datasets, the authors pre-train on a massive, real-world dataset (UKB), which allows a more robust evaluation of SSL’s effectiveness for HAR.
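To make the three pretext tasks described above more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the chunk count, the power-law warp controlled by `gamma`, and the choice to apply each transformation with probability 0.5 and predict a binary "was it applied?" label per task are all assumptions made for illustration. A window is represented as a list of (x, y, z) accelerometer samples.

```python
import random

def arrow_of_time(window):
    """Reverse the window along the time axis."""
    return window[::-1]

def permute(window, n_chunks=4, rng=random):
    """Split the window into contiguous chunks and shuffle their order."""
    size = len(window) // n_chunks
    chunks = [window[i * size:(i + 1) * size] for i in range(n_chunks - 1)]
    chunks.append(window[(n_chunks - 1) * size:])  # last chunk takes the remainder
    rng.shuffle(chunks)
    return [sample for chunk in chunks for sample in chunk]

def time_warp(window, gamma=1.5):
    """Stretch some parts of the signal and compress others.

    Assumed simplification: a smooth power-law remapping of time that keeps
    the first and last samples fixed, with linear interpolation in between.
    Requires at least two samples.
    """
    n = len(window)
    warped = []
    for i in range(n):
        pos = ((i / (n - 1)) ** gamma) * (n - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        warped.append(tuple(a + frac * (b - a)
                            for a, b in zip(window[lo], window[hi])))
    return warped

def make_pretext_example(window, rng=random):
    """Apply each transformation with probability 0.5 (an assumption) and
    return the transformed window plus one binary label per task."""
    transforms = (
        ("aot", arrow_of_time),
        ("permutation", lambda w: permute(w, rng=rng)),
        ("time_warp", time_warp),
    )
    labels = {}
    for name, transform in transforms:
        applied = rng.random() < 0.5
        if applied:
            window = transform(window)
        labels[name] = int(applied)
    return window, labels
```

Because the labels come from the transformations themselves, a model trained to predict them needs no human annotation, which is what lets the unlabelled UKB recordings be used for pre-training.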

Key methods include pre-training a model using SSL on the unlabelled UKB dataset, followed by fine-tuning on smaller, labelled datasets. The authors also explore the impact of the amount of unlabelled data on performance, discovering that increasing the number of unlabelled subjects generally improves performance, particularly for smaller datasets.
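The experiment on the amount of unlabelled data can be pictured as pre-training on progressively larger subsets of subjects. The sketch below shows one way such subsets might be drawn; the helper names, the dictionary layout keyed by subject ID, and the fixed seed are hypothetical and not taken from the paper.

```python
import random

def sample_subjects(windows_by_subject, n_subjects, seed=0):
    """Pick n_subjects at random and keep all of their windows.

    Sampling whole subjects (rather than individual windows) keeps the
    unit of comparison the same as in the text: number of subjects.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(windows_by_subject), n_subjects)
    return {subject: windows_by_subject[subject] for subject in chosen}

def pretraining_subsets(windows_by_subject, sizes, seed=0):
    """Build one pre-training subset per requested subject count."""
    return {n: sample_subjects(windows_by_subject, n, seed) for n in sizes}
```

Each subset would then be used for a separate pre-training run, followed by the same fine-tuning procedure on the labelled datasets, so that any difference in downstream performance can be attributed to the number of unlabelled subjects.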

For evaluation, the study compares the SSL pre-trained model against traditional approaches, including random forests and the same network trained from scratch, and finds that SSL pre-training provides significant improvements in classification performance. The gains hold even when fine-tuning is performed on datasets with fewer labelled subjects.

The paper emphasizes the benefits of SSL in real-world settings where obtaining labelled data is difficult and expensive. It also introduces an open-source model that can be used by researchers and developers to build customizable and high-performing activity classifiers. The authors provide a unified framework for evaluating these models, ensuring fair comparisons across different HAR models and datasets. This contribution offers a new benchmark for future research in the field.
