Runtime Assurance and Enforcement

System safety must be ensured not only during design time, but also during runtime. Design-time verification may be too costly or make assumptions on the environment that later are not valid. This is where runtime verification comes into play.

Runtime Monitoring

Runtime monitoring in cyber-physical systems aims to identify potentially dangerous situations. Examples where runtime monitors are helpful include broken sensors or actuators, environment conditions outside the operational domain, and software errors. Typically, monitors are used in combination in scenarios where it the safety concern can be resolved by some predefined measure (e.g., stopping the car). The typical challenge is that we cannot observe the precise state of the system.

We consider model-based monitoring. In [1], the model combines partial observability, probabilistic and nondeterministic behavior. We show that the standard filtering approaches used for probabilistic or nondeterministic behavior do not scale when nondeterminism and probabilities are combined in a model, but that central decision problems in runtime monitoring remain tractable. In (missing reference), we consider continuous-time models and observations that have imprecise time stamps.

In [2], [3], we consider black-box monitoring in the context of autonomous systems.

Runtime Enforcement

Rather than just observing the system, runtime enforcement takes runtime assurance a step further. A typical setup for runtime enforcement is shielding via permissive schedulers: Given an observed trace through the system, what actions can we allow such that we can guarantee to satisfy some specification. Typically, we are interested in being minimally invasive. Thus, we want to allow as much as possible.

In [4], we rigorously applied such a setup for fully observable MDPs with quantitative formal risk specifications. We relaxed some assumptions to improve scalability in [5]. More recently, [6] considers qualitative (almost-sure) reachability specifications in a partially observable MDP. This is used to shield state-of-the-art RL algorithms in [7]

Safe Reinforcement Learning

A popular application of runtime enforcement is safe reinforcement learning. The papers above consider runtime enforcement in this context.

References

  1. Sebastian Junges, Hazem Torfah, and Sanjit A. Seshia, “Runtime Monitors for Markov Decision Processes,” in CAV, 2021.
    DOI: 10.1007/978-3-030-81688-9_26 supplemental material repo
  2. Hazem Torfah, Sebastian Junges, Daniel Fremont, and Sanjit A. Seshia, “Formal Analysis of AI-Based Autonomy: From Modeling to Runtime Assurance,” in RV, 2021.
    DOI: 10.1007/978-3-030-88494-9_19
  3. Hazem Torfah, Carol Xie, Sebastian Junges, Marcell Vazquez-Chanlatte, and Sanjit A. Seshia, “Learning Monitorable Operational Design Domains for Assured Autonomy,” in ATVA, 2022.
    DOI: 10.1007/978-3-031-19992-9_1
  4. Sebastian Junges, Nils Jansen, Christian Dehnert, Ufuk Topcu, and Joost-Pieter Katoen, “Safety-Constrained Reinforcement Learning for MDPs,” in TACAS, 2016.
  5. Nils Jansen, Bettina Könighofer, Sebastian Junges, Alex Serban, and Roderick Bloem, “Safe Reinforcement Learning Using Probabilistic Shields,” in CONCUR, 2020.
  6. Sebastian Junges, Nils Jansen, and Sanjit A. Seshia, “Enforcing Almost-Sure Reachability in POMDPs,” in CAV, 2021.
    DOI: 10.1007/978-3-030-81688-9_28 arXiv: 2007.00085 supplemental material
  7. Steven Carr, Nils Jansen, Sebastian Junges, and Ufuk Topcu, “Safe Reinforcement Learning via Shielding for POMDPs,” in AAAI, 2023.
    DOI: 10.1609/AAAI.V37I12.26723 arXiv: 2204.00755 supplemental material