Recovery strategies

In its bare essentials, deterministic MPC consists of two steps: (i) solving a finite-horizon optimal control problem with constraints on the states and the controlled inputs to obtain an optimal policy, and (ii) applying a controller derived from that policy in a rolling-horizon fashion. In view of the close relationship of stochastic MPC (SMPC) with applications, any satisfactory theory of SMPC must take its practical aspects into account. In this context, an examination of a standard linear system with constrained controlled inputs, affected by independent and identically distributed (i.i.d.) unbounded (e.g., Gaussian) disturbances, shows that no control policy can ensure, with probability one, that the state remains confined to a bounded safe set for all time: since the noise samples are unbounded and mutually independent, a disturbance large enough to overwhelm the bounded control authority eventually occurs. Although disturbances are unlikely to be unbounded in practice, assigning an a priori bound seems to demand considerable insight.
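
To make this obstruction concrete, here is a minimal Monte Carlo sketch of our own (not from the source; the scalar model, the saturated feedback, and all numbers are illustrative): a linear system with input constraint |u| <= 1 and standard Gaussian noise, for which the probability of leaving a bounded safe set grows toward one as the horizon lengthens.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar example: x+ = x + u + w with |u| <= 1, w ~ N(0, 1), safe set |x| <= 3.
# Even a saturated stabilizing feedback cannot confine the state forever:
# at every step the Gaussian tail can exceed the bounded control authority,
# so the exit probability over a horizon grows toward one.
n_runs = 2000
for horizon in (50, 200, 1000):
    exits = 0
    for _ in range(n_runs):
        x = 0.0
        for _ in range(horizon):
            u = -np.clip(x, -1.0, 1.0)           # saturated feedback
            x = x + u + rng.standard_normal()    # i.i.d. unbounded disturbance
            if abs(x) > 3.0:                     # safe set violated
                exits += 1
                break
    print(f"P(exit within {horizon:4d} steps) ~= {exits / n_runs:.2f}")
```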

If a bounded-noise model is adopted, existing worst-case techniques for controlling deterministic systems with bounded uncertainties may be applied. The central idea is to synthesize a controller, based on the noise bounds, that renders the target set invariant with respect to the closed-loop dynamics. However, since the optimal policy rests on a worst-case analysis, it usually leads to rather conservative controllers, or even to infeasibility. Moreover, the complexity of the optimization problem grows rapidly (typically exponentially) with the optimization horizon. An alternative is to replace the hard constraints with probabilistic (soft) ones: the idea is to find a policy guaranteeing that the state constraints are satisfied with high probability over a sufficiently long time horizon. While this approach may improve the feasibility of the problem, it does not address what actions should be taken once the state violates the constraints.
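
As an illustration of the soft-constraint route, the following sketch (our own, with illustrative parameters) shows the standard reformulation of a single scalar Gaussian chance constraint as a tightened deterministic one: P(a x + b u + w <= x_max) >= 1 - eps holds exactly when a x + b u <= x_max - sigma * Phi^{-1}(1 - eps).

```python
from scipy.stats import norm

# Chance constraint P(x_{t+1} <= x_max) >= 1 - eps for the scalar model
# x_{t+1} = a*x + b*u + w, w ~ N(0, sigma^2). Because the noise is Gaussian,
# the constraint is equivalent to the tightened deterministic inequality
#   a*x + b*u <= x_max - sigma * Phi^{-1}(1 - eps).
a, b, sigma = 1.0, 1.0, 0.5   # illustrative model parameters
x_max, eps = 3.0, 0.05        # state bound and admissible violation probability

backoff = sigma * norm.ppf(1 - eps)   # quantile-based tightening margin

def max_admissible_input(x):
    """Largest u keeping the chance constraint satisfied at state x."""
    return (x_max - backoff - a * x) / b

print(f"back-off margin: {backoff:.3f}")
print(f"at x = 1.0, require u <= {max_admissible_input(1.0):.3f}")
```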

In view of the above considerations, developing recovery strategies appears to be a necessary step for dealing with constraint violation in SMPC. Such a strategy is activated once the state violates the constraints and deactivated whenever the system returns to the safe set. In general, a recovery strategy must drive the system back to the safe set quickly while simultaneously meeting other performance objectives. In the context of MPC, two merits are immediate: (a) once the constraints are violated, appropriate actions can be taken to bring the state back to the safe set quickly and optimally, and (b) if the original problem is posed with hard constraints on the state, then in view of (a) they may be relaxed to probabilistic ones to improve feasibility.
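
The activation/deactivation logic can be pictured as a simple supervisor. The sketch below is schematic and ours alone (the two control laws and the safe set are stand-ins for the scalar example above, not the construction proposed in the text), but it shows the switching behavior: recovery is engaged the moment the state leaves the safe set and disengaged as soon as it returns.

```python
import numpy as np

X_SAFE = 3.0                  # illustrative safe set: |x| <= X_SAFE
rng = np.random.default_rng(1)

def in_safe_set(x):
    return abs(x) <= X_SAFE

def nominal_control(x):
    return -np.clip(0.5 * x, -1.0, 1.0)   # stand-in for the nominal SMPC law

def recovery_control(x):
    return -np.sign(x)        # stand-in: full admissible push toward the safe set

def supervisor(x):
    """Activate recovery on violation; deactivate once back in the safe set."""
    if in_safe_set(x):
        return nominal_control(x), "nominal"
    return recovery_control(x), "recovery"

x = 5.0                       # start outside the safe set
for t in range(8):
    u, mode = supervisor(x)
    print(f"t={t}  x={x:+.2f}  mode={mode}")
    x = x + u + 0.1 * rng.standard_normal()
```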

One possible recovery strategy may be formulated as an optimal control problem up to an entry time, variously known as the pursuit problem, transient programming, the first-passage problem, or the stochastic shortest-path problem. We formulate it as the minimization of an expected discounted cost accumulated until the state enters the safe set. An almost customary assumption in the literature on stochastic optimal control up to an exit time is that the target set is absorbing, that is, that there exists a control policy rendering the target set invariant with respect to the closed-loop stochastic dynamics. This is rather restrictive for control problems; it fails, for instance, in the very simple and canonical case of a linear controlled system with i.i.d. Gaussian noise inputs. We do not make this assumption, for, as mentioned above, our primary motivation for solving this problem is precisely to deal with the case in which the target set is not absorbing.
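
In symbols, the entry-time problem reads as follows (a sketch in our own notation, where K denotes the safe set, c the stage cost, and alpha a discount factor):

```latex
% First entry time of the safe set K, and the expected discounted cost
% accumulated until entry, minimized over control policies \pi:
\[
  \tau := \inf\{\, t \ge 0 : x_t \in K \,\}, \qquad
  V^{\pi}(x) := \mathbb{E}^{\pi}_{x}\!\left[ \sum_{t=0}^{\tau-1} \alpha^{t}\, c(x_t, u_t) \right],
  \qquad V^{*}(x) := \inf_{\pi} V^{\pi}(x), \quad \alpha \in (0,1).
\]
```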
