Redundancy Reduction for Environment Representations

We use redundancy reduction techniques from self-supervised learning to learn effective representations of road environments. Our transformer-based RedMotion model for trajectory forecasting incorporates two types of redundancy reduction:
  1. Redundancy reduction between token sets: The first type reduces a variable-sized set of local road environment tokens to a fixed-size global embedding.
  2. Redundancy reduction between embeddings: The second type learns augmentation-invariant features between embeddings generated from augmented views of road environments (both mechanisms are sketched below).
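
To make the two mechanisms concrete, the sketch below shows (a) cross-attention between a set of learned query tokens and a variable-sized set of environment tokens, yielding a fixed-size global embedding, and (b) a Barlow Twins-style redundancy reduction loss that pushes the cross-correlation matrix of two augmented-view embeddings towards the identity. Module names, dimensions, and hyperparameters are illustrative assumptions, not the exact RedMotion implementation.

```python
import torch
import torch.nn as nn


class TokenSetReducer(nn.Module):
    """Reduce a variable-sized set of environment tokens to a fixed-size
    global embedding via cross-attention with learned query tokens.
    (Illustrative sketch; dimensions and names are assumptions.)"""

    def __init__(self, dim: int = 128, num_global_tokens: int = 16, num_heads: int = 4):
        super().__init__()
        self.global_queries = nn.Parameter(torch.randn(num_global_tokens, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, env_tokens: torch.Tensor) -> torch.Tensor:
        # env_tokens: (batch, num_local_tokens, dim); num_local_tokens may vary.
        batch = env_tokens.shape[0]
        queries = self.global_queries.unsqueeze(0).expand(batch, -1, -1)
        global_tokens, _ = self.cross_attn(queries, env_tokens, env_tokens)
        return global_tokens.flatten(1)  # fixed-size global embedding


def redundancy_reduction_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                              lam: float = 5e-3) -> torch.Tensor:
    """Barlow Twins-style loss between embeddings of two augmented views:
    the diagonal of the cross-correlation matrix is driven to 1 (invariance),
    the off-diagonal entries to 0 (redundancy reduction)."""
    n, _ = z_a.shape
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-6)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-6)
    c = (z_a.T @ z_b) / n                                   # (dim, dim) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()          # invariance term
    off_diag = (c - torch.diag_embed(torch.diagonal(c))).pow(2).sum()  # redundancy term
    return on_diag + lam * off_diag
```
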

We evaluate our method on both the Waymo Open Motion Dataset and the Argoverse 2 Motion Forecasting dataset. Our experiments show that, in a semi-supervised setting, our pre-training method outperforms state-of-the-art methods such as PreTraM, Traj-MAE, and GraphDINO, improving prediction accuracy on both datasets. We further compare RedMotion with other recent models for marginal motion prediction on the Waymo Motion Prediction Challenge, where it achieves results competitive with Scene Transformer, MTR++, Wayformer, and HPTR.

Similarity Maximization from Different Modalities

Joint motion forecasting aims to model the joint distribution of future motion sequences over multiple agents in a traffic scene, which requires scene-wide understanding. With JointMotion, we propose a self-supervised pre-training approach that combines scene-level and instance-level objectives. The scene-level objective involves non-contrastive similarity learning between past motion sequences and environmental context (e.g., lane and traffic light data), while the instance-level objective uses masked autoencoding to refine multimodal polyline representations.
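
The sketch below illustrates both objectives in simplified form: a non-contrastive similarity loss between projected motion and environment embeddings of the same scene, and masked autoencoding over polyline tokens. The projection heads, mask ratio, and function names are illustrative assumptions rather than the exact JointMotion implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def scene_similarity_loss(motion_emb: torch.Tensor, env_emb: torch.Tensor,
                          proj_motion: nn.Module, proj_env: nn.Module) -> torch.Tensor:
    """Scene-level objective (simplified): maximize the cosine similarity
    between projected motion and environment embeddings of the same scene,
    without negative pairs (non-contrastive)."""
    p_m = F.normalize(proj_motion(motion_emb), dim=-1)
    p_e = F.normalize(proj_env(env_emb), dim=-1)
    return -(p_m * p_e).sum(dim=-1).mean()


def masked_polyline_autoencoding(polyline_tokens: torch.Tensor, encoder: nn.Module,
                                 decoder: nn.Module, mask_ratio: float = 0.5) -> torch.Tensor:
    """Instance-level objective (simplified): mask a subset of polyline tokens,
    encode the corrupted input, and reconstruct the masked tokens with an MSE loss."""
    batch, num_tokens, _ = polyline_tokens.shape
    mask = torch.rand(batch, num_tokens, device=polyline_tokens.device) < mask_ratio
    masked_input = polyline_tokens.masked_fill(mask.unsqueeze(-1), 0.0)
    reconstruction = decoder(encoder(masked_input))
    return F.mse_loss(reconstruction[mask], polyline_tokens[mask])
```
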

We evaluate JointMotion by pre-training and fine-tuning several models, including Scene Transformer, Wayformer, and HPTR, on the Waymo Open Motion Dataset and the Argoverse 2 Motion Forecasting dataset. We show that our pre-training approach significantly improves the accuracy of these models and enhances their generalizability across datasets and environmental conditions.
Publications on this topic
Royden Wagner, Ömer Şahin Taş, Marvin Klemp, Carlos Fernandez Lopez, Christoph Stiller. RedMotion: Motion Prediction via Redundancy Reduction. Transactions on Machine Learning Research, 2024.
Royden Wagner, Ömer Şahin Taş, Marvin Klemp, Carlos Fernandez Lopez. JointMotion: Joint Self-supervision for Joint Motion Prediction. arXiv preprint arXiv:2403.05489, 2024.