Learning controllers from data via kernel-based interpolation

We propose a data-driven control design method for nonlinear systems that builds on kernel-based interpolation. Under suitable assumptions on the system dynamics, a kernel-based model of the system, along with deterministic model error bounds, is determined from data. Then, we derive a controller design method that aims at stabilizing the closed-loop system by cancelling out the system nonlinearities. The proposed method can be implemented via semidefinite programming and returns positively invariant sets for the closed-loop system.

that the plant-model mismatch is at most parametric. Recent contributions in this context are [2], [3] for what concerns indirect methods and [4], [5], [6] for what concerns direct methods. Dealing with nonlinear systems is arguably much more difficult. One main reason is that it becomes harder to compute finite-sample uncertainty bounds, even when the uncertainty is purely parametric. Another main reason is that controller design for nonlinear systems is itself much more complex. Recent contributions that consider parametric uncertainty tackle bilinear systems [7], [8], polynomial (and rational) systems [9], [10], [11], [12], and LPV systems [13]. For general nonlinear systems, but still in the context of parametric uncertainty, we find linearly parametrized models with known basis functions [14], [15]. The result in [15], in particular, introduces a controller design technique that provides, under rather mild conditions, finite-sample stability guarantees along with an estimate of regions of attraction and positively invariant sets for the closed-loop system.
Assuming exact knowledge of the basis functions is reasonable in many practical cases, such as mechanical and electrical systems, in which some prior information about the dynamics is available but the exact system parameters may be unknown. In many other cases, however, even this structural prior information is unavailable. Methods that consider this scenario include methods based on Gaussian process models [16], [17], methods based on linear [4], [18], [19], [20], [21] and polynomial approximations [22], [23], and methods based on linearly parametrized models with partially known basis functions [15]. Despite the differences, the common idea is to describe the system via a quantity that is known up to parametric uncertainty and to treat unmodeled dynamics as an error term, i.e., a remainder. The challenge is thus twofold: (i) to derive finite-sample bounds for the remainder, and (ii) to design a control law that is robust to the uncertainty that this remainder introduces.
Contribution and outline of the paper. In this paper, we consider the last scenario discussed above, that is, the scenario where the system to control has general dynamics (e.g., not necessarily bilinear or polynomial) and there is no prior knowledge of the true basis functions. We propose a new method that combines ideas from kernel-based identification [24] with the controller design method introduced in [15]. Specifically, we consider an indirect method that consists of two steps: we first determine a kernel-based model of the system along with deterministic error bounds, in line with recent results on kernel learning [25]. Then, we consider a controller design method that explicitly accounts for the uncertainty around the nominal model. Since the nominal model is generally nonlinear and lacks a specific structure, we consider a method in which the control law is designed so as to render the closed-loop dynamics as close to linear as possible by cancelling the nonlinearities of the system. We show that the method returns positively invariant sets for the closed-loop system and can be implemented via semidefinite programming. Kernel-based methods have previously been considered mostly in connection with Gaussian processes [16], [17]. In a deterministic setting, contributions have been proposed in the realm of modeling and control [26], [27], [28], [29]. To the best of our knowledge, this is the first work on kernel learning that gives deterministic guarantees in the context of feedback controller design.
The rest of the paper is organized as follows: Preliminaries on kernels, RKHS and regularized interpolation are given in Section II. Section III provides the main result, in which we derive a controller design method based on kernel models. Section IV presents simulation results on a nonlinear system. Conclusions and future work are discussed in Section V.
Notation. Throughout the paper, R denotes the set of real numbers, and N_{>0} denotes the set of positive integers. S^{n×n} denotes the set of real-valued symmetric n×n matrices. Given a matrix M, M ≻ 0 (M ⪰ 0) means that M is positive definite (positive semidefinite), while M ≺ 0 (M ⪯ 0) means that M is negative definite (negative semidefinite). Finally, we denote by |x| the 2-norm of a vector x, and by ‖M‖ the induced 2-norm of a matrix M. Other, less standard, notions are introduced throughout the paper.

II. PRELIMINARIES

A. Kernels and their RKHS
Given a non-empty set Ω ⊆ R^n, a continuous function K : Ω × Ω → R is called a positive definite kernel on Ω if Σ_{i,j=1}^{N} α_i α_j K(x_i, x_j) > 0 for any N ∈ N_{>0}, any set of pairwise distinct points {x_1, . . ., x_N} ⊆ Ω, and any nonzero vector α ∈ R^N. It is called positive semidefinite if Σ_{i,j=1}^{N} α_i α_j K(x_i, x_j) ≥ 0 for any N ∈ N_{>0}, any set of pairwise distinct points {x_1, . . ., x_N} ⊆ Ω, and any vector α ∈ R^N. It is called symmetric if K(x, y) = K(y, x) for any x, y ∈ Ω.
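As a quick numerical illustration (a sketch, not part of the formal development: the Gaussian kernel and all values below are illustrative choices), the positive definiteness condition can be checked on a finite point set by verifying that the Gram matrix [K(x_i, x_j)]_{i,j} has only positive eigenvalues:

```python
# Sketch: numerically check the positive definiteness condition for a kernel
# on a set of pairwise distinct points. The Gaussian kernel is a standard
# example of a positive definite kernel; all values here are illustrative.
import numpy as np

def gaussian_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(10, 2))           # 10 pairwise distinct points in R^2
G = np.array([[gaussian_kernel(a, b) for b in X] for a in X])  # Gram matrix

# alpha^T G alpha > 0 for every nonzero alpha  <=>  min eigenvalue of G > 0
eigmin = np.linalg.eigvalsh(G).min()
print(eigmin > 0)   # True: the Gaussian kernel is positive definite
```

The same check with eigmin ≥ 0 corresponds to the positive semidefinite case.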
Definition 1 ([30, Def. 10.1]): Let H be a real Hilbert space of functions f : Ω → R. A function K : Ω × Ω → R is called a reproducing kernel of H if: 1) For every y ∈ Ω, the function K(·, y) belongs to H. 2) (Reproducing property) For every y ∈ Ω and every f ∈ H, it holds that f(y) = ⟨f, K(·, y)⟩_H, where ⟨·, ·⟩_H is the inner product in H.
Fact 1 ([31]): To every positive semidefinite and symmetric kernel K, there corresponds a unique Hilbert space admitting K as a reproducing kernel.
A Hilbert space that admits a reproducing kernel is called a reproducing kernel Hilbert space (RKHS). By Definition 1, the kernel centred at a point a ∈ Ω, i.e., K(·, a), belongs to H. For a function of the form f(·) = Σ_{i=1}^{N} α_i K(·, x_i), the reproducing property gives the norm
‖f‖_H² = Σ_{i,j=1}^{N} α_i α_j K(x_i, x_j). (1)
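To make the norm expression concrete, here is a minimal sketch evaluating ‖f‖_H² = αᵀK_Xα for a finite kernel expansion; the Gaussian kernel, the centers, and the coefficients are all illustrative:

```python
# Sketch: RKHS norm of a finite kernel expansion f = sum_i alpha_i K(., x_i).
# By the reproducing property, <K(.,x_i), K(.,x_j)>_H = K(x_i, x_j), hence
# ||f||_H^2 = alpha^T K_X alpha, with (K_X)_ij = K(x_i, x_j).
import numpy as np

def kernel(x, y):                              # illustrative Gaussian kernel
    return np.exp(-np.linalg.norm(x - y) ** 2)

X = np.array([[0.0], [0.5], [1.0]])            # centers x_i
alpha = np.array([1.0, -2.0, 1.0])             # expansion coefficients of f

K_X = np.array([[kernel(a, b) for b in X] for a in X])
norm_sq = alpha @ K_X @ alpha                  # ||f||_H^2 = alpha^T K_X alpha
print(norm_sq)
```

Since K_X is positive semidefinite, the computed norm is always nonnegative.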

B. Regularized interpolation and its error bound
Consider a positive semidefinite and symmetric reproducing kernel K : Ω × Ω → R and the associated RKHS H. Consider an unknown function f : Ω → R belonging to H, and let f generate the data points (y_i, x_i), i = 0, . . ., T − 1, where y_i = f(x_i). Our objective is to find a function s_f ∈ H that minimizes the cost function
Σ_{i=0}^{T−1} (s_f(x_i) − y_i)² + λ ‖s_f‖_H²,
where λ > 0 is the regularization parameter. By the representer theorem [32], the minimizer takes the form
s_f(x) = Σ_{i=0}^{T−1} α_i K(x, x_i) = αᵀ k(x), (5)
where α ∈ R^T and k(x) := [K(x, x_0) · · · K(x, x_{T−1})]ᵀ. The functions K(x, x_i) are called kernel-based basis functions; they are the kernels centered at the data points x_i, i = 0, . . ., T − 1. The number of kernel-based basis functions equals the number of data points, and, once the dataset is fixed, determining the model s_f is equivalent to computing the coefficients α. By [24, Th. 2], we have
α = (λ I_T + K_X)^{−1} y, where y := [y_0 · · · y_{T−1}]ᵀ and (K_X)_{ij} := K(x_i, x_j), i, j = 0, . . ., T − 1. (7)
The following result gives a deterministic finite-sample error bound associated with (5).
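The construction above can be sketched in a few lines. The Gaussian kernel, the target function sin, and all parameter values below are illustrative stand-ins, not the paper's choices:

```python
# Sketch of the regularized interpolant (5): s_f(x) = alpha^T k(x) with
# alpha = (lam*I + K_X)^{-1} y and k(x) = [K(x, x_0), ..., K(x, x_{T-1})]^T.
import numpy as np

def kernel(x, y):                                  # illustrative Gaussian kernel
    return np.exp(-(x - y) ** 2)

f = np.sin                                         # "unknown" function, for the demo only
X = np.linspace(-2, 2, 15)                         # sample points x_i
y = f(X)                                           # noise-free data y_i = f(x_i)
lam = 1e-6                                         # regularization parameter

K_X = np.array([[kernel(a, b) for b in X] for a in X])
alpha = np.linalg.solve(lam * np.eye(len(X)) + K_X, y)   # coefficients in (5)

def s_f(x):
    k_x = np.array([kernel(x, xi) for xi in X])    # kernel-based basis functions
    return alpha @ k_x

print(abs(s_f(0.3) - np.sin(0.3)))                 # small interpolation error
```

Note that the model size (the length of alpha) grows with the number of data points, as remarked above.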
Theorem 1: Consider a positive semidefinite symmetric reproducing kernel K with associated RKHS H, a function f ∈ H, and the interpolating function s_f in (5) built from the noise-free data (y_i, x_i), i = 0, . . ., T − 1. Then s_f(x) provides an estimate of f(x) for x ∈ Ω with interpolation error satisfying
|f(x) − s_f(x)| ≤ ‖f‖_H P(x), P(x)² := K(x, x) − 2 k(x)ᵀ u(x) + u(x)ᵀ K_X u(x), (8)
where u(x) := (λ I_T + K_X)^{−1} k(x). Proof: See Appendix A. Similar statements are given in [33], [26].
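The bound of Theorem 1 can be checked numerically on a toy problem. In the sketch below, f is built as a finite kernel expansion, so it lies in the RKHS by construction and ‖f‖_H is computable exactly via (1); the Gaussian kernel and all values are illustrative:

```python
# Sketch: verify the deterministic bound of Theorem 1 on a toy problem where
# ||f||_H is known exactly. All names and values are illustrative.
import numpy as np

def kernel(x, y):
    return np.exp(-(x - y) ** 2)

centers = np.array([-1.0, 0.0, 1.0])
beta = np.array([0.5, -1.0, 2.0])
f = lambda x: float(sum(b * kernel(x, c) for b, c in zip(beta, centers)))
K_c = np.array([[kernel(a, b) for b in centers] for a in centers])
f_norm = np.sqrt(beta @ K_c @ beta)             # exact RKHS norm of f, via (1)

X = np.linspace(-2, 2, 12)                      # data sites
y = np.array([f(x) for x in X])                 # noise-free samples
lam = 1e-6
K_X = np.array([[kernel(a, b) for b in X] for a in X])
G = np.linalg.inv(lam * np.eye(len(X)) + K_X)

def bound_holds(x):
    k_x = np.array([kernel(x, xi) for xi in X])
    u = G @ k_x                                 # u(x) = (lam I + K_X)^{-1} k(x)
    s = y @ u                                   # interpolant s_f(x)
    P2 = kernel(x, x) - 2 * k_x @ u + u @ K_X @ u   # squared power function in (8)
    return abs(f(x) - s) <= f_norm * np.sqrt(max(P2, 0.0)) + 1e-7  # small numerical slack

all_ok = all(bound_holds(x) for x in np.linspace(-2, 2, 50))
print(all_ok)
```

The small additive slack only absorbs floating-point roundoff; the inequality itself is exact, as the proof in Appendix A shows.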
Remark 1: The interpolation error bound (8) factors into two parts: the first term, ‖f‖_H, only depends on f, while the second term is independent of f and only depends on the kernel and the data. Deterministic bounds of this type have recently been proposed in the literature [25]. It is not our goal here to consider the problem of deriving optimal bounds; rather, we take (8) as an example of error bounds that can be used for control design purposes. We refer the interested reader to [25] for a broader discussion of the problem of establishing interpolation bounds.
Remark 2: The regularization helps to avoid overfitting. The bound in (8) continues to hold if λ = 0 [33, Sec. 5.1], but in this case the kernel K must be positive definite.

III. MAIN RESULTS
Consider a discrete-time affine-input nonlinear system
x⁺ = f(x) + B u, (9)
where x ∈ R^n is the state, u ∈ R^m is the control input, f is the drift vector field, and B is a constant matrix. Both f and B are considered unknown. We instead assume that (x_e, u_e) = (0, 0) is a known unstable equilibrium point of the system. The objective is to design a feedback controller that stabilizes the dynamics around the origin.
As anticipated in the Introduction, we will consider an indirect method that consists of two steps: we first construct a kernel-based model of the system along with deterministic error bounds (Theorem 1). Then, we will derive a controller design method that explicitly accounts for the uncertainty around the nominal model. This method is inspired by [15] but presents some differences that will be discussed later on in the paper.

A. Kernel-based functions and error bounds
To derive a model of the system, we proceed in two steps. As a first step, we set the control input u = 0 and collect from the system a dataset
D := {x(0), x(1), . . ., x(T)} (10)
of samples satisfying x(k + 1) = f(x(k)), k = 0, . . ., T − 1, with T > 0. We note that the samples can be collected from a single trajectory or from multiple trajectories of the system.
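This first experiment can be sketched as follows. The drift f below reproduces the dynamics used later in the numerical example of Section IV and serves only to simulate data (in practice f is unknown); the seed and ranges are illustrative:

```python
# Sketch of the first experiment: with u = 0, collect one-step samples
# x(k+1) = f(x(k)) from several initial states (multiple one-step trajectories).
import numpy as np

def f(x):                                      # unknown drift; used only to simulate
    return np.array([x[1] + x[0] ** 3, 0.5 * x[0] + 0.2 * x[1] ** 2])

rng = np.random.default_rng(1)
T = 10
X0 = rng.uniform(-2, 2, size=(2, T))           # columns x(k): sampled states, u = 0
X1 = np.column_stack([f(X0[:, k]) for k in range(T)])   # columns x(k+1) = f(x(k))
print(X0.shape, X1.shape)
```

Collecting the samples column-wise in matrices X0 and X1 matches how the data enter the model construction below.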
Let now K denote a kernel function chosen by the designer. Given K and the dataset D, let
k(x) := [K(x, x(0)) K(x, x(1)) · · · K(x, x(T − 1))]ᵀ.
The function k(x) represents the vector of basis functions that will generate the interpolation function s_f(x).
To use Theorem 1, we need the following assumption.
Assumption 1: All the n components of f in (9) belong to the RKHS H associated with K. Moreover, an upper bound Γ_i for ‖f_i‖_H, i = 1, . . ., n, is known.
Methods for estimating Γ_i are discussed in [25]. Here we just point out that the bound can be loose, although a loose bound may render the control design step more difficult. By solving (5), the interpolation function of f(x) takes the form
s_f(x) = A k(x),
where A := X_1 (λ I_T + K_{X_0})^{−1}, with X_0 := [x(0) · · · x(T − 1)], X_1 := [x(1) · · · x(T)], and with the matrix K_{X_0} as in (7) with X replaced by X_0. Let
d(x) := f(x) − A k(x)
denote the resulting model error. By (8), each component of the vector d thus satisfies |d_i(x)| ≤ Γ_i P(x), with P(x) as in Theorem 1 with X replaced by X_0. Hence, by letting
δ(x) := (Σ_{i=1}^{n} Γ_i²)^{1/2} P(x),
it follows from Assumption 1 that the interpolation error on the function f satisfies the deterministic bound |d(x)| ≤ δ(x).
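Putting the two previous steps together, the following sketch computes the model matrix A and the bound δ(x) from simulated data. The degree-3 polynomial kernel normalization (1 + xᵀy)³ and the values Γ_1 = 3, Γ_2 = 0.4 follow the example of Section IV, but should be read as illustrative assumptions:

```python
# Sketch of the kernel model: A := X1 (lam I + K_X0)^{-1}, s_f(x) = A k(x),
# and the pointwise error bound delta(x). Kernel normalization, Gamma_i and
# the drift f are illustrative stand-ins.
import numpy as np

def f(x):                                      # drift of Section IV's example (simulation only)
    return np.array([x[1] + x[0] ** 3, 0.5 * x[0] + 0.2 * x[1] ** 2])

def kernel(x, y):                              # assumed degree-3 polynomial kernel
    return (1.0 + x @ y) ** 3

rng = np.random.default_rng(2)
T, lam = 10, 1e-7
X0 = rng.uniform(-2, 2, size=(2, T))           # sampled states x(k)
X1 = np.column_stack([f(X0[:, k]) for k in range(T)])

K_X0 = np.array([[kernel(X0[:, i], X0[:, j]) for j in range(T)] for i in range(T)])
G = np.linalg.inv(lam * np.eye(T) + K_X0)
A = X1 @ G                                     # model matrix: s_f(x) = A k(x)

Gamma = np.array([3.0, 0.4])                   # assumed bounds Gamma_i >= ||f_i||_H

def k_vec(x):
    return np.array([kernel(x, X0[:, i]) for i in range(T)])

def delta(x):                                  # deterministic bound on |d(x)| = |f(x) - A k(x)|
    kx = k_vec(x)
    u = G @ kx
    P2 = max(kernel(x, x) - 2 * kx @ u + u @ K_X0 @ u, 0.0)
    return np.linalg.norm(Gamma) * np.sqrt(P2)

x = np.array([0.3, -0.2])
err = np.linalg.norm(f(x) - A @ k_vec(x))
print(err, delta(x))                           # model error and its bound
```

The bound δ(x) is computable from data alone, which is what makes it usable in the controller design step that follows.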

B. Controller design method based on approximate nonlinearity cancellation
As a second step, we derive a control design method that exploits the bound on the interpolation error. By the previous analysis, the dynamics (9) can be written as
x⁺ = A k(x) + B u + d(x), (19)
where A ∈ R^{n×T} is known and B ∈ R^{n×m} is still unknown.
To determine the feedback controller, we perform a second experiment on the system in which we apply a nonzero input sequence u and collect a new dataset of samples (u(k), x(k), x(k + 1)), k = 0, . . ., T − 1, with T > 0 (with a slight abuse of notation, we reuse the symbol T for the length of this second dataset). These data are grouped in the data matrices
U_0 := [u(0) · · · u(T − 1)], X_0 := [x(0) · · · x(T − 1)], X_1 := [x(1) · · · x(T)], K_0 := [k(x(0)) · · · k(x(T − 1))],
which satisfy the identity
X_1 = A K_0 + B U_0 + D_0,
where D_0 := [d(x(0)) · · · d(x(T − 1))] is the (unknown) data matrix of samples of d. We assume that this second experiment is carried out with an input such that the corresponding matrix U_0 has full row rank. This can be interpreted as an excitation condition on the experiment. We state this condition as an assumption, but it is in fact a design condition.
Assumption 2: U_0 has full row rank.
By letting X̃_1 := X_1 − A K_0, we have B U_0 = X̃_1 − D_0. Assumption 2 thus implies
B = (X̃_1 − D_0) U_0†,
with U_0† the right inverse of U_0, and the dynamics can be written as
x⁺ = A k(x) + (X̃_1 − D_0) U_0† u + d(x). (24)
At this point, note that the dynamics of k(x) depend on the selected kernel. We will consider the general case in which k(x) consists of both linear and nonlinear functions, so that A k(x) can be decomposed as A k(x) = Ā x + Â k̂(x), with k̂ : R^n → R^S containing only nonlinear functions.
The special case k(x) = x gives Â = 0_{n×S}. In contrast, Ā = 0_{n×n} when k(x) contains only nonlinear functions. Note that, for a fixed k(x), the choice of k̂(x) is not unique, and different choices of k̂(x) generate different matrices Â.
With this decomposition, (24) reads equivalently as
x⁺ = Ā x + Â k̂(x) + (X̃_1 − D_0) U_0† u + d(x). (25)
This decomposition suggests a control law of the form
u = K x + K̂ k̂(x), (26)
which gives the closed-loop dynamics
x⁺ = (Ā + (X̃_1 − D_0) U_0† K) x + (Â + (X̃_1 − D_0) U_0† K̂) k̂(x) + d(x). (27)
A natural way to design the control law is then to choose K so as to stabilize the linear part of the dynamics, and K̂ so as to try to cancel out the nonlinear terms. This approach was originally proposed in [15], and we refer the reader to it for a discussion of the connections between this approach and classic feedback linearization. By Lyapunov theory, a necessary and sufficient condition for the linear dynamics ξ⁺ = (Ā + (X̃_1 − D_0) U_0† K) ξ to be stable is that for any Q ≻ 0 there exists a matrix S ≻ 0 that solves the Lyapunov equation
(Ā + (X̃_1 − D_0) U_0† K)ᵀ S (Ā + (X̃_1 − D_0) U_0† K) − S + Q = 0. (28)
Letting P = S^{−1} and multiplying both sides by P, this turns out to be equivalent to
(Ā P + (X̃_1 − D_0) U_0† Y)ᵀ P^{−1} (Ā P + (X̃_1 − D_0) U_0† Y) − P ≺ 0, (29)
having set Y = KP. As we will see, this form is particularly convenient because it can be expressed as a linear matrix inequality (LMI) constraint. However, we cannot implement (29) directly because D_0 is unknown. The idea is thus to ensure that the constraint is satisfied for all the matrices D in a given set D to which D_0 is known to belong, i.e.,
(Ā P + (X̃_1 − D) U_0† Y)ᵀ P^{−1} (Ā P + (X̃_1 − D) U_0† Y) − P ≺ 0 for all D ∈ D. (30)
Let
∆ := (Σ_{k=0}^{T−1} δ(x(k))²)^{1/2} I_n. (31)
Since D_0 D_0ᵀ ⪯ ∆², we can therefore solve (30) with respect to the set
D := {D ∈ R^{n×T} : D Dᵀ ⪯ ∆²}. (32)
Condition (30) cannot be implemented directly because it involves infinitely many constraints. The next result provides a tractable (and convex) condition for (30).
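For intuition, the discrete-time Lyapunov test underlying the stability condition can be verified numerically in the nominal (uncertainty-free) case; the closed-loop matrix below is an illustrative Schur-stable example:

```python
# Sketch: the discrete-time Lyapunov test, in the nominal case where D_0 is
# ignored. For a Schur-stable A_cl, the equation  A_cl^T S A_cl - S + Q = 0
# has the solution S = sum_k (A_cl^T)^k Q A_cl^k > 0. Matrices are illustrative.
import numpy as np

A_cl = np.array([[0.5, 0.1], [0.0, 0.3]])     # eigenvalues 0.5 and 0.3: Schur stable
Q = np.eye(2)

S = np.zeros((2, 2))
M = np.eye(2)
for _ in range(200):                           # truncated series; 0.5^200 is negligible
    S += M.T @ Q @ M
    M = A_cl @ M

residual = A_cl.T @ S @ A_cl - S + Q
print(np.linalg.norm(residual))                # numerically zero
```

For an unstable A_cl the series diverges, reflecting the necessity part of the Lyapunov test.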
Lemma 1: Given Q ≻ 0 and ∆ defined in (31), if there exist P ∈ S^{n×n} with P ≻ 0, Y ∈ R^{m×n}, and a scalar ε > 0 satisfying the LMI (33), then (30) holds.
Proof: See Appendix B.
Condition (33) guarantees stability of the linear dynamics ξ⁺ = (Ā + (X̃_1 − D_0) U_0† K) ξ with K = Y P^{−1}. The remaining part of the controller, i.e., the matrix K̂, can be determined so as to minimize the effect of the nonlinearities in the closed loop. Including the design of K̂, a prototypical formulation is the following:
minimize over P, Y, K̂, ε the cost ‖Â + X̃_1 U_0† K̂‖ + α ‖P‖ subject to (33), (34)
where α ≥ 0 is a design parameter. As discussed, (33) ensures stability of the linear dynamics ξ⁺ = (Ā + (X̃_1 − D_0) U_0† K) ξ. Minimizing ‖Â + X̃_1 U_0† K̂‖, in turn, reduces as much as possible the effect of the nonlinearities in the closed loop. In this context, the term α ‖P‖ acts as a regularization term that permits to enlarge the estimate of the positively invariant set for the closed-loop dynamics, as detailed in the sequel. Before proceeding, we remark that (34) should be viewed as an example. An alternative is to explicitly account for D_0 in the nonlinear term as well, by minimizing the worst case of ‖Â + (X̃_1 − D) U_0† K̂‖ over D ∈ D. Also this problem can be cast as a semidefinite program. The rest of this section is devoted to showing that this method guarantees the existence of a positively invariant set for the closed loop if the modelling error is sufficiently small.
Definition 2: For the system x⁺ = f(x), if for every x(0) ∈ S it holds that x(t) ∈ S for all t > 0, then S is called a positively invariant (PI) set.
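To give a dependency-light sketch of the stabilization step, the snippet below computes a stabilizing gain for an illustrative nominal linear part via a discrete Riccati recursion (value iteration). This is a swapped-in technique: the paper's route is the LMI of Lemma 1 solved as a semidefinite program, but the Riccati solution certifies the same Lyapunov condition; the matrices Ā, B̂ (playing the role of X̃_1 U_0†), Q and R are assumptions:

```python
# Sketch: stabilize the nominal linear part x+ = A_bar x + B_hat u without an
# SDP solver, via the discrete Riccati recursion (value iteration). The LMI
# route of Lemma 1 is what the design method actually uses; matrices are
# illustrative.
import numpy as np

A_bar = np.array([[1.1, 0.4], [0.2, 0.9]])    # unstable: eigenvalues 1.3 and 0.7
B_hat = np.array([[0.0], [1.0]])              # plays the role of X1_tilde U_0^dagger
Q, R = np.eye(2), np.eye(1)

X = np.eye(2)
for _ in range(500):                           # Riccati recursion; converges here
    BXB = R + B_hat.T @ X @ B_hat
    X = (Q + A_bar.T @ X @ A_bar
         - A_bar.T @ X @ B_hat @ np.linalg.inv(BXB) @ B_hat.T @ X @ A_bar)

K = -np.linalg.inv(R + B_hat.T @ X @ B_hat) @ B_hat.T @ X @ A_bar
A_cl = A_bar + B_hat @ K
rho = max(abs(np.linalg.eigvals(A_cl)))
# X certifies the Lyapunov condition: A_cl^T X A_cl - X = -(Q + K^T R K) < 0
print(rho)
```

The resulting spectral radius is below one, so the candidate Lyapunov matrix X plays the role that P^{-1} plays in the main text.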
Let V(x) = xᵀ P^{−1} x, which acts as a Lyapunov function for the linear part of the dynamics, and define for brevity
Ψ := Ā + (X̃_1 − D_0) U_0† K, Ξ := Â + (X̃_1 − D_0) U_0† K̂,
so that the closed loop (27) reads x⁺ = Ψ x + Ξ k̂(x) + d(x). Then the Lyapunov function satisfies
V(x⁺) − V(x) ≤ l(x) + g(x, δ(x)), (36)
where, bearing in mind the expressions of Ψ and Ξ, the fact that D_0 D_0ᵀ ⪯ ∆², and |d(x)| ≤ δ(x), the functions l and g follow from simple (although tedious) calculations: l collects the negative terms due to the stabilized linear part, while g collects the terms due to the residual nonlinearity Ξ k̂(x) and the model error d(x). (These expressions show that penalizing the term ‖P‖ in (34) may increase the estimate of the PI set, since l(x) scales with ‖P‖^{−2} while the other terms scale with ‖P‖^{−1}.) Let
X := {x : l(x) + g(x, δ(x)) ≤ 0} (38)
and let X^c be its complement. Let R_γ := {x : V(x) ≤ γ}, where γ > 0 is arbitrary, and define Z := R_γ ∩ X^c, which characterizes all the points in R_γ for which the Lyapunov difference V(x⁺) − V(x) can be positive. Then the following main result holds.
Theorem 2: Consider a nonlinear system as in (19), and suppose that (34) is feasible with a given Q ≻ 0 and with ∆ defined in (31). Consider the closed-loop system with the controller (26) obtained from (34). If condition (39) holds, then R_γ is a PI set for the closed-loop system.
Proof: Suppose (39) holds and let x ∈ R_γ. The analysis can be divided into two sub-cases. If x ∉ Z, then x ∈ X and V(x⁺) ≤ V(x) ≤ γ by (36). If x ∈ Z, then x⁺ ∈ R_γ by (36) and (39). Hence, in both cases, x⁺ ∈ R_γ.
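Positive invariance of a sublevel set R_γ can also be probed numerically, by sampling its boundary and checking that the closed-loop map stays inside. The map, P, and γ below are illustrative; for this contractive example, interior points map even deeper into the set:

```python
# Sketch: numerically probe positive invariance of R_gamma = {x : x^T P^{-1} x <= gamma}
# by sampling its boundary and checking V(x+) <= gamma. Dynamics, P and gamma
# are illustrative stand-ins for the closed loop designed in this section.
import numpy as np

P = np.diag([1.335, 1.335])
Pinv = np.linalg.inv(P)
gamma = 1.0

def closed_loop(x):                            # illustrative contractive map with a
    A = np.array([[0.5, 0.1], [0.0, 0.4]])     # small residual nonlinearity
    return A @ x + 0.01 * np.array([x[0] ** 3, x[1] ** 2])

V = lambda x: x @ Pinv @ x
ok = True
for theta in np.linspace(0, 2 * np.pi, 200):
    d = np.array([np.cos(theta), np.sin(theta)])
    x = d * np.sqrt(gamma / (d @ Pinv @ d))    # point on the boundary of R_gamma
    ok &= V(closed_loop(x)) <= gamma
print(ok)
```

Such a sampling check is only a sanity test; the guarantee itself comes from Theorem 2.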
We close this section with a few remarks. The first remark regards the comparison with [15]. In [15, Th. 8], a similar result is given that takes unmodeled dynamics into account. In this respect, the results presented here give a systematic, principled method for bounding modelling errors. [15, Th. 7] also shows that asymptotic stability follows when the error bound δ(x) satisfies lim_{|x|→0} δ(x)/|x| = 0, e.g., when δ(x) acts as a remainder in a power series expansion of f about 0. The same result holds here as well, but we have to bear in mind that this condition may fail to hold depending on the choice of the kernel function. In any case, invariant sets provide a safe region where we can perform additional experiments to estimate regions of attraction.
The second remark concerns the experimental conditions. Here we have assumed noise-free data, but bounds similar to the one in (8) can also be given in the case of noisy data [25]. Such bounds can be combined with existing tools for robust controller design (cf. [15, Sec. VI]) to extend the results presented in this paper.
IV. NUMERICAL EXAMPLE

Consider the nonlinear system (40), whose drift has components f_1(x) = x_2 + x_1³ and f_2(x) = 0.5 x_1 + 0.2 x_2². We consider a polynomial kernel K(x, y) of degree 3. We set u = 0 and collect a dataset D containing T = 10 samples by performing multiple one-step experiments with initial states uniformly distributed in [−2, 2]. The resulting matrix X_0 is shown in (42a). With these data we construct the vector k(x) of basis functions. The kernel K(x, y) is symmetric and positive semidefinite, and by Fact 1 there exists a unique RKHS H that admits K(x, y) as a reproducing kernel. We just need to show that the nonlinear dynamics f_1 and f_2 in (40) are members of H. By Definition 1, all the components of k(x) belong to H. It is then sufficient to show that f_1(x) and f_2(x) are linear combinations of the components of k(x). Denote by M(x) the vector of all monomials up to degree 3, and write k(x) = M_k M(x) and f_i(x) = c_iᵀ M(x), i = 1, 2. Note that when the matrix M_k has full column rank, there exist α_i such that c_iᵀ = α_iᵀ M_k, i = 1, 2, which implies that f_1(x) and f_2(x) can be written as linear combinations of the components of k(x). Hence, the data collected in D should be such that the corresponding matrix M_k has full column rank, and this condition is indeed satisfied for the collected samples in (42a). Finally, in order to find upper bounds Γ_i on ‖f_i‖_H as in Assumption 1, we compute ‖f_i‖_H explicitly via (1). For controller design we select Γ_1 = 3 and Γ_2 = 0.4, which over-approximate the true values by more than 30%. Finally, we select λ = 10^{−7}. We note that large values of λ result in large bounds δ(x) (Theorem 1), and this may eventually render the controller design program infeasible.
Next, we collect a dataset containing T = 10 samples by performing again multiple one-step experiments, with inputs uniformly distributed in [−0.5, 0.5] and initial states within [−2, 2]. The resulting data matrices X_0, X_1 and U_0 are reported in (42), from which we compute the two matrices K_0 and U_0† as in (21d) and (23), respectively. Note that the first term of K(x, y), i.e., xᵀy, produces the linear part of A k(x), which gives Ā = A X_0ᵀ. In addition, we set k̂(x) := k(x) − X_0ᵀ x, and thus Â = A. We solve (34) with Q = I_2 and α = 1.
The resulting controller, along with the matrix Â + X̃_1 U_0† K̂ and the part of the closed-loop dynamics (27) not depending on D_0, is reported in (44) on the next page. We note that the program (34) correctly forces u to cancel out the nonlinearity in (40a).
For this controller, we numerically determine the set X = {ξ : l(ξ) + g(ξ, δ(ξ)) ≤ 0}. Any sublevel set R_γ of the Lyapunov function V(x) = xᵀ P^{−1} x contained in X ∪ {0} and satisfying (39) gives an estimate of a PI set for the closed-loop system. The set X and a sublevel set of V are shown in Figure 1. We can numerically verify that the PI set in Figure 1 is also a region of attraction (ROA); one possible reason is that both k(x) and δ(x) converge to 0 as x converges to 0, since we use a polynomial kernel K(x, y). Remarkably, the obtained estimate of the ROA is almost the same as the one obtained in [15] with knowledge of the true basis functions.

V. CONCLUSIONS
We have investigated the problem of designing feedback controllers for affine-input nonlinear systems from data using kernel learning techniques. We have considered a method in which a nominal model of the system is determined using kernel-based functions, along with an explicit upper bound on the modelling error. Then, a controller design method is proposed that involves the solution of a semidefinite program. We have shown that the method ensures, despite the presence of unmodeled dynamics, the existence of positively invariant sets for the closed-loop dynamics. An important avenue for future research is the problem of understanding which kernels are more suited for control goals.

APPENDIX

A. Proof of Theorem 1

Let u(x) := (λ I_T + K_X)^{−1} k(x); by (5), s_f(x) = Σ_{i=0}^{T−1} y_i u_i(x). By the reproducing property and by recalling that y_i = f(x_i), the modelling error satisfies
f(x) − s_f(x) = ⟨f, K(·, x) − Σ_{i=0}^{T−1} u_i(x) K(·, x_i)⟩_H. (45)
By the Cauchy–Schwarz inequality,
|f(x) − s_f(x)| ≤ ‖f‖_H ‖K(·, x) − Σ_{i=0}^{T−1} u_i(x) K(·, x_i)‖_H,
with the second term satisfying
‖K(·, x) − Σ_{i=0}^{T−1} u_i(x) K(·, x_i)‖_H² = K(x, x) − 2 k(x)ᵀ u(x) + u(x)ᵀ K_X u(x).
Then, (45) becomes
|f(x) − s_f(x)| ≤ ‖f‖_H (K(x, x) − 2 k(x)ᵀ u(x) + u(x)ᵀ K_X u(x))^{1/2} = ‖f‖_H P(x).
This gives the desired result.

B. Proof of Lemma 1
To prove Lemma 1, we need the following result. We now proceed with the proof of Lemma 1. Let (33) hold. By applying a Schur complement to (33), we obtain an equivalent matrix inequality. By Lemma 2, we further bound the uncertain terms. By applying a Schur complement to the resulting LMI, we get (30). This gives the desired result.

Fig. 1. The grey set represents the set X, while the blue set is the PI set R_γ; here, P = diag(1.3350, 1.3350) and γ = 11.5. We observe that the set Z is empty, and hence the PI set also provides an estimate of the ROA for the closed-loop system.