Retrieving Valuable Driving Situations
Improving automated vehicle software requires driving data rich in valuable road user interactions. In this paper, we propose a risk-based filtering approach that helps identify such valuable driving situations from large datasets. Specifically, we use a probabilistic risk model to detect high-risk situations. Our method stands out by considering a) first-order situations (where one vehicle directly influences another and induces risk) and b) second-order situations (where influence propagates through an intermediary vehicle). In experiments, we show that our approach effectively selects valuable driving situations in the Waymo Open Motion Dataset. Compared to the two baseline interaction metrics of Kalman difficulty and Tracks-To-Predict (TTP), our filtering approach identifies complex and complementary situations, enriching the quality in automated vehicle testing.
We open-sourced the retrieved situations and the computed risk values; examples can be found here:
https://hri-eu.github.io/RiskBasedFiltering.
We define two types of driving situations: first-order and second-order. In first-order situations, other vehicles directly influence the ego vehicle, creating a risk of collision. This interaction can be visualized as a graph in which the arrows representing risk point directly at the ego vehicle's node. In second-order situations, the influence is indirect: another car affects the ego vehicle through an intermediate vehicle. In the graph, this is depicted by arrows showing the risk flowing from the first (red) car to the intermediate vehicle, and finally to the ego vehicle.
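The graph view described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the dictionary-based risk graph, the function names `first_order` and `second_order`, the agent labels, and the single risk threshold applied to every edge are all assumptions made for the example. The probabilistic risk model that would produce the edge values is not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical risk graph: risk[(source, target)] is the risk that
# `source` induces on `target`. The values are made up for illustration.
RiskGraph = Dict[Tuple[str, str], float]


def first_order(risk: RiskGraph, ego: str, threshold: float) -> List[str]:
    """Agents whose risk arrow points directly at the ego vehicle."""
    return sorted(
        src for (src, dst), value in risk.items()
        if dst == ego and value >= threshold
    )


def second_order(risk: RiskGraph, ego: str, threshold: float) -> List[Tuple[str, str]]:
    """(source, intermediate) pairs where risk flows source -> intermediate -> ego.

    Assumption: both edges of the chain must reach the same threshold.
    """
    chains = []
    for intermediate in first_order(risk, ego, threshold):
        for (src, dst), value in risk.items():
            if dst == intermediate and src != ego and value >= threshold:
                chains.append((src, intermediate))
    return sorted(chains)


if __name__ == "__main__":
    example: RiskGraph = {
        ("car_a", "ego"): 0.8,    # direct influence on the ego vehicle
        ("car_b", "car_a"): 0.6,  # influence that propagates via car_a
        ("car_c", "ego"): 0.1,    # below threshold, ignored
    }
    print(first_order(example, "ego", 0.5))
    print(second_order(example, "ego", 0.5))
```

In this toy graph, `car_a` forms a first-order situation with the ego vehicle, while `car_b` reaches the ego vehicle only through `car_a` and therefore forms a second-order situation.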
We applied our approach to the full Waymo Open Motion Dataset and open-sourced the filtered valuable driving situations. In total, we provide 4.4 million valuable first-order situations and 3.3 million valuable second-order situations. Compared to the two baselines of Kalman difficulty and Tracks-To-Predict (TTP), our approach retrieves many different valuable driving situations, which helps to further improve automated vehicle testing or training. The off-diagonal entries of the confusion matrix show the share of situations in the dataset that the two methods detect differently.
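To make the reading of the confusion matrix concrete, the following sketch computes a 2x2 matrix of shares from two boolean selections and sums its off-diagonal entries. It assumes each situation carries one flag per method (selected or not); the function names and the toy flags are hypothetical and do not come from the released data.

```python
from typing import List, Sequence


def confusion_shares(ours: Sequence[bool], baseline: Sequence[bool]) -> List[List[float]]:
    """2x2 matrix of shares; row index = our decision, column index = baseline decision."""
    if len(ours) != len(baseline):
        raise ValueError("both selections must cover the same situations")
    if not ours:
        raise ValueError("at least one situation is required")
    counts = [[0, 0], [0, 0]]
    for a, b in zip(ours, baseline):
        counts[int(a)][int(b)] += 1
    total = len(ours)
    return [[count / total for count in row] for row in counts]


def disagreement_share(matrix: List[List[float]]) -> float:
    """Sum of the off-diagonal entries: situations selected by exactly one method."""
    return matrix[0][1] + matrix[1][0]


if __name__ == "__main__":
    # Toy flags for four situations (illustrative only).
    ours = [True, True, False, False]
    baseline = [True, False, True, False]
    matrix = confusion_shares(ours, baseline)
    print(matrix)
    print(disagreement_share(matrix))
```

A large off-diagonal share indicates that the two filters select largely different situations, which is the sense in which the retrieved situations are complementary to the baselines.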
Valuable driving situations retrieved by our approach are shown. They include, e.g., close car-car following, pedestrian-pedestrian passing, and crossing situations involving cars, pedestrians and bicycles. Valuable multi-vehicle situations include lane-cutting maneuvers involving two other cars, three pedestrians moving in a crowd, and cars turning in multi-lane traffic. Non-valuable driving situations include, for example, one vehicle waiting without interacting, situations in which the interaction has already happened, and group interactions in which one road user is far away. Please look at the data examples for more details.
@inproceedings{puphal2025,
author = {Puphal, Tim and Ramtekkar, Vipul and Nishimiya, Kenji},
title = {Risk-Based Filtering of Valuable Driving Situations in the Waymo Open Motion Dataset},
booktitle = {IEEE International Automated Vehicle Validation Conference (IAVVC)},
year = {2025}
}