ICRA

Neuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions
for Skill Learning

1Intelligent Autonomous Systems, TU Darmstadt, Germany
2Honda Research Institute EU, Germany
3German Research Center for AI, Germany
4Hessian Centre for Artificial Intelligence, Germany


Abstract

Imitation learning is a popular method for teaching robots new behaviors. However, most existing methods focus on teaching short, isolated skills rather than long, multi-step tasks. To bridge this gap, imitation learning algorithms must not only learn individual skills but also an abstract understanding of how to sequence these skills to perform extended tasks effectively. This paper addresses this challenge by proposing a neuro-symbolic imitation learning framework. Using task demonstrations, the system first learns a symbolic representation that abstracts the low-level state-action space. The learned representation decomposes a task into easier subtasks and allows the system to leverage symbolic planning to generate abstract plans. Subsequently, the system utilizes this task decomposition to learn a set of neural skills capable of refining abstract plans into actionable robot commands. Experimental results in three simulated robotic environments demonstrate that, compared to baselines, our neuro-symbolic approach increases data efficiency, improves generalization capabilities, and facilitates interpretability.

Neuro-Symbolic Policies


In our framework, policies have both symbolic and neural components. The symbolic components consist of predicates that abstract the state space and operators that define a transition model in the abstract state space induced by the predicates. Together, predicates and operators define a planning problem in the Planning Domain Definition Language (PDDL) and can be used to generate abstract plans. The neural components consist of skills that together enable the execution of abstract plans in the environment. To execute the policy on a given task, we first abstract the low-level start and goal states using the predicates. Next, an abstract plan is computed using the operators and an off-the-shelf planning algorithm. Lastly, the corresponding skill sequence is executed.
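The execution loop described above can be sketched in a few lines of Python. This is a toy illustration with hand-coded names: in the paper, predicates, operators, and skills are learned, and planning is done with an off-the-shelf PDDL planner rather than the breadth-first search used here.

```python
# Toy sketch of neuro-symbolic policy execution (all names hypothetical).
from collections import deque

# Predicates abstract a low-level state into a set of true symbolic facts.
predicates = {
    "holding": lambda s: s["gripper"] == "block",
    "block_on_table": lambda s: s["block"] == "table",
    "block_in_box": lambda s: s["block"] == "box",
}

def abstract(state):
    return frozenset(name for name, p in predicates.items() if p(state))

# Operators define a PDDL-style transition model over abstract states.
operators = {
    "pick": {"pre": {"block_on_table"}, "add": {"holding"}, "del": {"block_on_table"}},
    "place_in_box": {"pre": {"holding"}, "add": {"block_in_box"}, "del": {"holding"}},
}

def plan(start, goal):
    # Breadth-first search stands in for an off-the-shelf symbolic planner.
    queue, seen = deque([(start, [])]), {start}
    while queue:
        facts, steps = queue.popleft()
        if goal <= facts:
            return steps
        for name, op in operators.items():
            if op["pre"] <= facts:
                nxt = frozenset((facts - op["del"]) | op["add"])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

# Neural skills would refine each operator into robot commands;
# here each skill just edits the toy low-level state.
skills = {
    "pick": lambda s: {**s, "gripper": "block", "block": "gripper"},
    "place_in_box": lambda s: {**s, "gripper": "empty", "block": "box"},
}

state = {"gripper": "empty", "block": "table"}
steps = plan(abstract(state), goal={"block_in_box"})
for step in steps:  # execute the skill sequence corresponding to the plan
    state = skills[step](state)
```

The key structural point is the two levels: planning happens entirely in the abstract space induced by the predicates, and only skill execution touches the low-level state.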

Learning from Demonstrations


Our approach to learning neuro-symbolic policies is divided into two phases. In the first phase, we learn the symbolic components of the policy. To learn predicates, we first generate a set of candidate predicates based on features observed in the demonstrations and then select among these candidates by optimizing a novel objective function. Concurrently, operators are learned from the symbolic transitions induced by the predicates. In the second phase, we use the identified symbolic abstraction to learn a distinct neural skill for each operator found in the first phase. To do so, we segment the demonstrations based on the learned symbolic representation and train the neural networks using behavior cloning.
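The segmentation step in the second phase can be illustrated with a small sketch. Assuming a demonstration is a list of (state, action) pairs and a learned abstraction function is available, we cut the trajectory wherever the abstract state changes; each abstract transition then labels the data for one operator's skill. The predicate and demonstration below are toy stand-ins, not the paper's learned components.

```python
# Hypothetical sketch of segmenting demonstrations at symbolic state changes.
def segment(demo, abstract):
    """Split (state, action) pairs into maximal runs with a constant
    abstract state. Returns a list of (abstract_state, data) segments."""
    segments = []
    for state, action in demo:
        sym = abstract(state)
        if not segments or segments[-1][0] != sym:
            segments.append((sym, []))
        segments[-1][1].append((state, action))
    return segments

# Toy 1-D reaching demo: the gripper position x moves toward an object at 1.0,
# and a single toy predicate fires when the gripper is near the object.
abstract = lambda s: frozenset({"near"}) if abs(s["x"] - 1.0) < 0.3 else frozenset()
demo = [({"x": 0.0}, 0.2), ({"x": 0.2}, 0.2), ({"x": 0.4}, 0.2),
        ({"x": 0.8}, 0.2), ({"x": 1.0}, 0.0)]

segs = segment(demo, abstract)

# Each abstract transition (pre, post) identifies an operator; the data
# collected while in `pre` becomes the behavior-cloning dataset for its skill.
datasets = {}
for (pre, data), (post, _) in zip(segs, segs[1:]):
    datasets[(pre, post)] = data
```

Behavior cloning then fits one network per dataset, so every operator discovered in phase one ends up with a skill trained only on the relevant portion of the demonstrations.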

Results


We compare our approach to two baselines and evaluate across three distinct generalization scenarios: Scenario I introduces initial object poses not seen during training. Scenario II additionally introduces unseen goals. Lastly, Scenario III introduces more objects than were present during training. With 300 training demonstrations, our method achieves a high success rate across all environments and generalization scenarios. Furthermore, it outperforms both baselines for every number of demonstrations, showcasing its data efficiency. These results highlight a major advantage of the neuro-symbolic approach: through the learned symbols, we can benefit from the generalization capabilities of symbolic planning.

Learned Predicates and Operators


We visualize learned predicates by overlaying images of states in which the predicate is true. These visualizations allow us to assign meaningful names to all predicates, making them easier to interpret. Once predicates are named, we can interpret the preconditions and effects of each operator and assign meaningful names to them as well. With all symbols named, the abstract plans generated by the policy become fully interpretable.
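The overlay idea can be sketched concretely: collect the states (images) in which a predicate holds and average them pixel-wise, so that regions consistent across those states stand out. The tiny grayscale "images" below are made up for illustration.

```python
# Hypothetical sketch: visualize a predicate by overlaying (averaging)
# all images of states in which the predicate is true.
def overlay(images, mask):
    """Pixel-wise mean over the images whose mask entry (predicate truth) is True."""
    selected = [img for img, m in zip(images, mask) if m]
    n = len(selected)
    return [[sum(img[r][c] for img in selected) / n
             for c in range(len(selected[0][0]))]
            for r in range(len(selected[0]))]

# Three toy 2x2 grayscale states; the predicate holds in the first two,
# which share a bright pixel at the bottom-right.
images = [[[0, 0], [0, 255]],
          [[0, 0], [0, 255]],
          [[255, 255], [255, 0]]]
mask = [True, True, False]

heatmap = overlay(images, mask)
```

Because the averaging washes out state features that vary while the predicate is true and preserves those that are constant, the resulting overlay makes it easier to guess what condition the predicate actually captures and to name it accordingly.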

BibTeX

@inproceedings{keller2025,
  author    = {Keller, Leon and Tanneberg, Daniel and Peters, Jan},
  title     = {Neuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions for Skill Learning},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2025}
}