Redundancy Reduction via Self-Supervised Learning

We introduce RedMotion, a transformer model for motion prediction that incorporates two types of redundancy reduction for road environments:
  1. Redundancy reduction between token sets: The first type is induced by an internal transformer decoder and reduces a variable-size set of road environment tokens, such as road graphs with agent data, to a fixed-size embedding.
  2. Redundancy reduction between embeddings: The second type is a self-supervised learning objective that applies the Barlow Twins redundancy reduction principle to embeddings generated from augmented views of road environments. Specifically, our model learns augmentation-invariant features of road environments during self-supervised pre-training.
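The two mechanisms above can be sketched in a few lines of numpy. This is a minimal illustration, not the RedMotion implementation: `reduce_tokens` shows cross-attention pooling with learned queries (a common way to map a variable-size token set to a fixed-size embedding), and `barlow_twins_loss` follows the standard Barlow Twins formulation; all function and variable names here are hypothetical.

```python
import numpy as np

def reduce_tokens(tokens, queries):
    """Cross-attention pooling: reduce a variable-size token set (N, d)
    to a fixed-size embedding (M, d) using M learned query vectors."""
    d = queries.shape[1]
    scores = queries @ tokens.T / np.sqrt(d)           # (M, N) attention logits
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # row-wise softmax
    return weights @ tokens                            # (M, d) pooled embedding

def barlow_twins_loss(za, zb, lam=0.005):
    """Barlow Twins objective: cross-correlate embeddings of two augmented
    views, pull the diagonal toward 1 (invariance to augmentation) and the
    off-diagonal toward 0 (redundancy reduction between dimensions)."""
    n = za.shape[0]
    za = (za - za.mean(axis=0)) / za.std(axis=0)       # standardize per dimension
    zb = (zb - zb.mean(axis=0)) / zb.std(axis=0)
    c = za.T @ zb / n                                  # (D, D) cross-correlation
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + lam * off_diag
```

For example, a road environment with 57 tokens and one with 200 tokens would both be reduced to the same `(M, d)` shape by `reduce_tokens`, so the Barlow Twins loss can then be computed between embeddings of differently sized augmented views.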

Our experiments on the Waymo Open Motion Dataset show that our representation learning approach outperforms state-of-the-art methods such as PreTraM, Traj-MAE, and GraphDINO in a semi-supervised setting. Moreover, our RedMotion model achieves results competitive with those of Scene Transformer and MTR++.

Publications on this topic

Royden Wagner, Ömer Şahin Taş, Marvin Klemp, Carlos Fernandez Lopez. RedMotion: Motion Prediction via Redundancy Reduction. arXiv preprint arXiv:2310.17963, 2023.