STEP-RL: Specializing TEmporal Planning using Reinforcement Learning

PI Stories 2024
9 May 2024
Start time 
1:30 pm
Polo Ferrari 1 - Via Sommarive 5, Povo (Trento)
Aula Garda, piano +1
Doctoral School in Information Engineering and Computer Science (IECS)
Target audience: 
UniTrento alumni
UniTrento students
Free – Registration required
Registration deadline: 
7 May 2024, 23:59
Contact details:
Andrea Micheli, Fondazione Bruno Kessler


Planning - devising a strategy to achieve a desired objective - is one of the basic forms of intelligence. Temporal planning studies the automated synthesis of strategies when time and temporal constraints matter: it is one of the most strategic fields of Artificial Intelligence, with applications in autonomous robotics, logistics, flexible production, and many other areas. Historically, research on temporal planning has followed a general-purpose framework: a generic engine searches for the strategy by reasoning on the problem statement (i.e. the starting condition and the desired objective), as well as on a formal model of the domain (i.e. the possible actions). Despite substantial progress in recent years, domain-independent temporal planning still suffers from scalability issues and fails to deal with real-world problems. The alternative is to devise ad-hoc, domain-specific solutions that, although efficient, are costly to develop, hard to maintain, and often inapplicable in non-nominal situations.
In this talk, I will present the major research questions, the work organization and the acquisition phase of my ERC Starting Grant project titled STEP-RL. STEP-RL will study the foundations of a new approach to Temporal Planning that will be domain-independent and efficient at the same time. The idea is to adopt a framework based on Reinforcement Learning, where a domain-independent temporal planner is specialized with respect to the domain at hand. STEP-RL will continuously improve its ability to solve temporal planning problems by learning from experience, thus becoming increasingly efficient by means of self-adaptation. STEP-RL will advance the state of the art in temporal planning beyond the "efficiency vs flexibility" dilemma that I personally had to face in the many industrial projects I worked on.

About the speaker

Andrea Micheli is the head of the "Planning Scheduling and Optimization" research unit at Fondazione Bruno Kessler, Trento, Italy. His research focuses on the development and technology transfer of automated planning technologies. He obtained his PhD in Computer Science from the University of Trento in 2016. His PhD thesis, titled "Planning and Scheduling in Temporally Uncertain Domains", won several awards, including the EurAI Best Dissertation Award and an honorable mention for the ICAPS Best Dissertation Award. He currently works in the field of temporal planning and is the main developer of the TAMER planner. He is also the lead developer of pysmt, an open-source project aiming to provide a standard Python API for Satisfiability Modulo Theories (SMT) solvers. Andrea coordinated the AIPlan4EU project, which aims to remove the access barriers to automated planning technology and to bring such technology to the European AI On-Demand Platform. He has authored more than 30 papers in the Formal Methods and Artificial Intelligence fields. Andrea recently won an ERC Starting Grant for researching novel solutions in the combination of temporal planning and reinforcement learning.