

Stochastic Optimal Control through Moments and Positive Polynomials


C. Stäheli

Master Thesis, FS15 (10486)

The objective of this master thesis is to develop a numerical scheme that approximates the value function and the optimal policy of a nonlinear discrete-time stochastic optimal control (SOC) problem. The solution of the SOC problem is formulated through a pair of infinite-dimensional linear programs. In the primal problem, the Bellman equation is relaxed to an inequality, which yields an equivalent infinite-dimensional linear program whose decision variable corresponds to the value function of the SOC problem. Using polynomial basis functions, and assuming polynomial problem data as well as knowledge of the moments of the uncertainty and of the initial state distribution, the sum-of-squares (SOS) programming technique yields the primal semidefinite program (SDP) that approximates the value function. A greedy policy can be obtained from the approximated value function. The dual problem is obtained by defining state-input occupation measures, which leads to an equivalent infinite-dimensional linear program. Approximating these measures by a finite sequence of non-central (raw) moments results in the dual SDP under the same assumptions as for the primal. A polynomial policy can be synthesized from the joint state-input moments. The proposed discrete-time approaches are implemented in Matlab for finite- and infinite-horizon SOC problems. The latter can be handled either with a discounted cost, which emphasizes the transient phase of the dynamics, or with an averaged cost, which focuses on the steady state. Furthermore, to extend the SOC tool to a wider field of applications, a transformation from trigonometric problem data with rational fractions to polynomial problem data is presented. The tool is validated on a number of numerical examples. Monte Carlo simulations based on the approximations of the optimal policy are performed to evaluate the performance and limitations of the approach.
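
For orientation, the relaxation of the Bellman equation described in the abstract can be sketched for the discounted infinite-horizon case. The notation is an assumption made here and is not taken from the thesis: state x, input u, disturbance \xi, dynamics x^+ = f(x,u,\xi), stage cost \ell, discount factor \gamma in (0,1), and a nonnegative weighting measure c on the state space X. The thesis may use a different but equivalent form.

```latex
% Illustrative sketch only; all symbols are assumed notation, not the thesis's.
% Requires amsmath and amssymb.
\begin{align*}
\text{Bellman equation:}\quad
  & V^\star(x) = \min_{u \in U} \Big\{ \ell(x,u)
      + \gamma\, \mathbb{E}_{\xi}\big[ V^\star\big(f(x,u,\xi)\big) \big] \Big\} \\
\text{Relaxed LP:}\quad
  & \max_{V} \; \int_X V(x)\, c(\mathrm{d}x) \\
  & \text{s.t.}\;\; V(x) \le \ell(x,u)
      + \gamma\, \mathbb{E}_{\xi}\big[ V\big(f(x,u,\xi)\big) \big]
      \quad \forall\, (x,u) \in X \times U \\
\text{Greedy policy:}\quad
  & \pi(x) \in \arg\min_{u \in U} \Big\{ \ell(x,u)
      + \gamma\, \mathbb{E}_{\xi}\big[ V\big(f(x,u,\xi)\big) \big] \Big\}
\end{align*}
```

When V, f, and \ell are polynomials, the expectation of V(f(x,u,\xi)) is itself a polynomial in (x,u) whose coefficients depend on \xi only through its moments. This is why knowledge of the moments suffices, and why the inequality constraint can be replaced by a sum-of-squares condition on a polynomial.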

Supervisors: Tyler Summers, Maryam Kamgarpour, John Lygeros


Type of Publication:

Diploma/Master Thesis
