IAVVC

Risk-Based Filtering of Valuable Driving Situations in the Waymo Open Motion Dataset

1Honda Research Institute EU, Germany
2Honda Research Institute Japan, Japan
3Honda Research & Development, Japan

Abstract

Improving automated vehicle software requires driving data rich in valuable road user interactions. In this paper, we propose a risk-based filtering approach that helps identify such valuable driving situations in large datasets. Specifically, we use a probabilistic risk model to detect high-risk situations. Our method stands out by considering a) first-order situations (where one vehicle directly influences another and induces risk) and b) second-order situations (where influence propagates through an intermediary vehicle). In experiments, we show that our approach effectively selects valuable driving situations in the Waymo Open Motion Dataset. Compared to the two baseline interaction metrics of Kalman difficulty and Tracks-To-Predict (TTP), our filtering approach identifies complex and complementary situations, improving the quality of automated vehicle testing.

We open-sourced the retrieved situations and the computed risk values. Examples can be found here:
https://hri-eu.github.io/RiskBasedFiltering.

Retrieving Valuable Driving Situations

Filtering Pipeline

The goal of this work is to detect valuable driving situations in large datasets. We define a situation as valuable when road users are likely to encounter high collision risks, which often lead to a change in behavior. The figure above shows our method for retrieving valuable driving situations. Starting from a data example containing the trajectories of road users (left part), we model the uncertainty of each vehicle's future motion (middle part). This enables us to estimate collision risks by computing the overlap of the trajectory uncertainties and integrating the risks over time (right part). The driving situation is then represented as a graph, with nodes representing road users and arrows representing the estimated risk values (lower right part). In the final step, we identify valuable driving situations by examining how risk propagates through the interaction graph.
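To make the pipeline steps more concrete, the following is a minimal sketch, not the risk model used in the paper. It assumes constant-velocity prediction, isotropic Gaussian position uncertainty that grows linearly with time, and a time-averaged overlap as the risk value. All names and parameters (`pairwise_risk`, `risk_graph`, `sigma0`, `growth`, `threshold`) are hypothetical and chosen for illustration only. The risk in this sketch is symmetric, so both arrows between two road users carry the same value, whereas the page describes directed risk arrows.

```python
import math


def predict(pos, vel, t):
    """Constant-velocity prediction of a 2D position at time t (assumption)."""
    return (pos[0] + vel[0] * t, pos[1] + vel[1] * t)


def overlap(mean_a, mean_b, sigma_a, sigma_b):
    """Unnormalized overlap of two isotropic 2D Gaussians, in (0, 1].

    Equals 1 when both means coincide and decays with squared distance.
    """
    var = sigma_a ** 2 + sigma_b ** 2
    d2 = (mean_a[0] - mean_b[0]) ** 2 + (mean_a[1] - mean_b[1]) ** 2
    return math.exp(-0.5 * d2 / var)


def pairwise_risk(a, b, horizon=5.0, dt=0.1, sigma0=0.5, growth=0.3):
    """Average the overlap over the prediction horizon (hypothetical risk proxy)."""
    steps = int(round(horizon / dt))
    total = 0.0
    for k in range(1, steps + 1):
        t = k * dt
        sigma = sigma0 + growth * t  # uncertainty grows with prediction time
        total += overlap(
            predict(a["pos"], a["vel"], t),
            predict(b["pos"], b["vel"], t),
            sigma,
            sigma,
        )
    return total / steps


def risk_graph(agents, threshold=0.01):
    """Build directed edges (source, target) -> risk for all risky pairs."""
    edges = {}
    for i, a in agents.items():
        for j, b in agents.items():
            if i != j:
                risk = pairwise_risk(a, b)
                if risk >= threshold:
                    edges[(i, j)] = risk
    return edges


# Hypothetical scene: "crossing" meets the ego path, "far" drives away.
agents = {
    "ego": {"pos": (0.0, 0.0), "vel": (10.0, 0.0)},
    "crossing": {"pos": (30.0, -30.0), "vel": (0.0, 10.0)},
    "far": {"pos": (0.0, 100.0), "vel": (-10.0, 0.0)},
}
print(risk_graph(agents))
```

In this toy scene, only the pair on a crossing course produces edges in the graph; the distant vehicle contributes no edge because its overlap with the others stays negligible over the whole horizon.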

First-Order and Second-Order Situations

Driving Situations

We define two types of driving situations: first-order and second-order situations. In first-order situations, other vehicles directly influence the ego vehicle, creating a risk of collision. This interaction is visualized as a graph, with arrows representing risk pointing directly towards the ego vehicle's node. Second-order situations are more indirect and are analyzed differently: other cars influence the ego vehicle through an intermediate vehicle. In the graph, this is depicted by arrows showing the risk flowing from the first red car to the intermediate vehicle and finally to the ego vehicle.
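The two situation types can be read off a risk graph with a simple lookup. The sketch below is illustrative only: it assumes the graph is stored as a dictionary mapping directed edges `(source, target)` to risk values and that a single hypothetical `threshold` decides whether an edge counts as risky. How the paper scores or combines the risks along a second-order path is not stated here, so the sketch only returns the paths.

```python
def first_order(edges, ego, threshold=0.1):
    """Road users with a risky edge pointing directly at the ego vehicle."""
    return sorted(
        src for (src, dst), risk in edges.items()
        if dst == ego and risk >= threshold
    )


def second_order(edges, ego, threshold=0.1):
    """Paths source -> intermediate -> ego where both edges are risky."""
    paths = []
    for mid in first_order(edges, ego, threshold):
        for (src, dst), risk in edges.items():
            # Exclude the ego itself as the source of the indirect influence.
            if dst == mid and src != ego and risk >= threshold:
                paths.append((src, mid))
    return sorted(paths)


# Hypothetical risk graph: car_b influences the ego only through car_a.
edges = {
    ("car_a", "ego"): 0.4,
    ("ego", "car_a"): 0.4,
    ("car_b", "car_a"): 0.3,
    ("car_c", "ego"): 0.02,  # below threshold, ignored
}
print(first_order(edges, "ego"))
print(second_order(edges, "ego"))
```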

Results

Confusion Matrix

We applied our approach to the full Waymo Open Motion Dataset and open-sourced the filtered valuable driving situations. In total, we provide 4.4 million valuable first-order situations and 3.3 million valuable second-order situations. Compared to the two baselines of Kalman difficulty and Tracks-To-Predict (TTP), our approach retrieves many different valuable driving situations, which helps to further improve automated vehicle testing or training. The off-diagonal entries of the confusion matrix show the share of situations in the dataset that the methods detect differently.
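As a rough illustration of how such a confusion matrix between two filters could be computed, the sketch below takes the set of all situation IDs and the subsets selected by two methods, and returns the share of the dataset in each cell. The function name and the toy IDs are hypothetical; the actual numbers are those shown in the confusion matrix figure.

```python
def confusion_shares(all_ids, selected_a, selected_b):
    """Share of situations per cell, keyed by (selected by A, selected by B)."""
    ids = list(all_ids)
    counts = {
        (True, True): 0,
        (True, False): 0,
        (False, True): 0,
        (False, False): 0,
    }
    for sid in ids:
        counts[(sid in selected_a, sid in selected_b)] += 1
    return {cell: count / len(ids) for cell, count in counts.items()}


# Toy example with ten situations and two hypothetical filters.
shares = confusion_shares(range(10), {0, 1, 2, 3}, {2, 3, 4})
# Off-diagonal cells: situations found by exactly one of the two filters.
print(shares[(True, False)], shares[(False, True)])
```

The off-diagonal cells `(True, False)` and `(False, True)` are the ones that indicate how complementary two filters are.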

Driving Situation Examples

Valuable Situations

The examples show valuable driving situations retrieved by our approach. These include, e.g., close car-car following, pedestrian-pedestrian passing, and crossing situations involving cars, pedestrians and bicycles. Valuable multi-vehicle situations include lane-cutting maneuvers involving two other cars, three pedestrians moving in a crowd, or cars turning in multi-lane traffic. Non-valuable driving situations are, for example, situations with one vehicle waiting and not interacting, situations in which the interaction has already happened, or group interactions in which one road user is far away. Please look at the data examples for more details.

BibTeX

@inproceedings{puphal2025,
      author = {Puphal, Tim and Ramtekkar, Vipul and Nishimiya, Kenji},
      title = {Risk-Based Filtering of Valuable Driving Situations in the Waymo Open Motion Dataset},
      booktitle = {IEEE International Automated Vehicle Validation Conference (IAVVC)},
      year = {2025}
}