For my quadcopter control project, I am currently in the process of system identification, trying to find a model describing the dynamics of my motor-propeller-combination. Finding the theoretical model is quite simple, but then you’re left with finding the parameters — and not all of them are directly available from the motor vendors. So I have no other choice than to determine these parameters from measurements.

In my endeavour I came across this series of videos on data-driven control by machine learning by Professor Steve Brunton of the University of Washington, Seattle. There, basic principles of modern methods for system identification are explained very well. If you are new to control systems, I can also recommend his control boot camp.

In part of the series, Professor Brunton explains the Eigensystem Realisation Algorithm (ERA) by Juang and Pappa [1]. The ERA is a procedure by which a reduced-order model for a linear and time-invariant system (LTI) can be derived from data of an impulse-response measurement.

However, while the videos do a pretty solid job of explaining the concepts and give the final formulae, I wanted to know more about their derivation. For this, I went to the original paper where the approaches were developed, and tried to reproduce the steps that led to the final results.

I follow this approach quite a lot when acquiring new skills. It helps not only better understanding — and memorizing — the final solution, but I also tend to pick up neat tricks along the way that often prove helpful elsewhere — even completely outside the domain of the original paper.

Here’s some interesting ideas you may pick up by reading this article:

- When doing system identification, we mostly work with discrete-time systems, and in some aspects they are much easier to handle than continuous-time systems.
- The discrete-time impulse response, which uniquely characterises the behaviour of LTI systems, can be easily expressed in closed form using the system matrices.
- By reorganising our data, we can express seemingly non-linear problems in linear form.
- The singular value decomposition allows us to examine the composition of a matrix and simplify it.
- There are a few tricks for transforming matrix power expressions that may reduce the order of our problem.

## Our Example

Juang and Pappa worked on large, flexible space structure such as the Galileo spacecraft (which should not be mixed up with the European Galileo satellite navigation system), and aimed to identify the oscillation modes of such structures, so that these could be taken into account in control of the spacecraft.

We’ll look at something similar, although much more simple: A spring-loaded, dampened horizontal pendulum. This pendulum consists of a mass connected to a spring and a linear dampener. The mass can move horizontally, and we measure the position of its centre of gravity, with being the position when the system is at rest.

This system can be modelled as a Single-Input, Single-Output system: We have a single input and a single output .

Normally, we could easily provide at least the structure of a model for this system from first principles — namely, the conservation of momentum:

(1)

Introducing state variables and , we can define the dynamics of this system in state space notation:

(2)

Instead of actually measuring our system, we will be using the continuous-time dynamics to simulate the system and thereby perform “measurements”. We will add white noise to emulate measurement noise.

## Impulse-Response Measurement

Now, what happens if we whack this system with a hammer? What if we give it a short kick and then measure the position of the mass? Let’s find out!

The figure shows the position of the mass over time. We see that the system oscillates around the neutral position, but the amplitude of the oscillation falls exponentially with time, until measurement noise exceeds the movement of the system. In this example, we whacked the mass with a short force of 1 Newton, and then let go. So, our input would look like this:

This function is called the Kronecker Delta Function, and when used in analysis of signals and systems, it is referred to as the (discrete-time) unit impulse. It is one for exactly one tick of our (discrete) time-base and zero everywhere else. When given as an input to a discrete-time system, the output observed afterwards is called the discrete-time impulse response. Specifically, LTI systems are completely characterised by their impulse response.

As we are working on data that we would acquire by measurement, considering the continuous-time formulation is not useful, as we cannot get continuous-time measurement data. Instead, we’ll have some kind of quantisation in both time and value. This actually simplifies the maths of our problem a bit, as we will see in a moment.

We now want to have a look at how we can mathematically and generally describe the impulse response. This will be the basis of our model to which we will then fit our measurement data. In discrete time, we describe the dynamics of an LTI system using a recurrence relation in state space:

(3)

Let us assume that for our initial state holds. This is certainly true for dampened structures if we wait long enough for all oscillations to die down before we whack the structure. Now, if our input is for time step and 0 for all other time steps, we can see the progression of our state by repeatedly applying Equation 3:

Time Step | State |

0 | |

1 | |

2 | |

3 | |

… | … |

n |

So, in general we have for , and with the output relation

(4)

we get

(5)

Note that this impulse-response is completely independent of what our state variables represent. You can try it yourself: Substitute with some invertible square matrix and determine the impulse response of that system. However, this actually works in our favour, as it allows us to arbitrarily select at least part of the parameters later on.

In general, the terms — or, in their Multiple-Input, Multiple-Output form — are called the Markov Parameters of the system.

## Finding an Exact Model of Order n

What Juang and Pappa aim to do is to find parameter matrices , and so that the impulse response so described fits the measured data exactly. One way of doing so would be to explicitly write down the parametrised impulse-response — as we did in the last chapter — and try to solve this for the parameters directly.

However, that approach is rather ugly, and looking at Equation 5, this is a non-linear problem due to the powers of occurring. Instead, Juang and Pappa use quite a nice trick: They restructure the data in such a way that the problem becomes linear. This allows us to use our toolkit from linear algebra to find a solution.

In a first step, Juang and Pappa construct the vectors , which are the column vectors of observations starting at time step :

(6)

We will see later that is the order of the system we will build. For the time being, we shall assume that we have an arbitrary amount of measurements, so there are no limits on . From Equation 5, we can see that

(7)

with the observability matrix

(8)

Now, remember the function of the observability matrix: If the state has dimension and the observability matrix has at least rank , then we can use it to uniquely reconstruct the state from the outputs at time steps . Thus, if our system is observable, we can use the inverse of the observability matrix to find the state :

(9)

However, from that we can construct any state at any time due to:

(10)

Applying this to we get the recurrence

(11)

Now, let’s look at what we have done: We are able to describe the output at time steps from the output at time steps using a linear operator! How do we find ? Equation 11 does not sufficiently specify their value. To increase the rank of the equation system, Joung and Pappa proceed to build the Hankel-matrices

(12)

These matrices are constructed by listing measurements starting at time step in the first column, then listing measurements starting at time step in the second column, and so on. Using Equation 11, we can find the following recurrence for :

(13)

Thus, if we have any matrices and , with the former being a regular, invertible matrix, we can find

(14)

Thus, we can express using

(15)

where . This exactly describes our measured system response:

## The Problem of Noise

However, we also see that our estimated system exactly fits all the noise from our measurement. In our case we know from first-principles considerations that the exact model would only be of order , while the model we developed here is of much larger order. Besides being quite a misuse of resources, this may also lead to considerable estimation errors, as we can see when plotting the step responses:

Our estimated system clearly deviates from the system we measured. This is no wonder, as the estimated system incorporates all the noise we measured.

Further, our Hankel-matrices are pretty badly conditioned. This means that small rounding errors during calculations may become large errors in the result. We know that — in the absence of measurement error — the rank of the Hankel-matrices cannot be larger than the order of the underlying system. If they are indeed regular, they only are due to measurement noise, which should ideally be much smaller in magnitude than the actual data. Thus, simple inversion is prone to large numerical errors.

In their original paper, Juang and Pappa describe a method of using only a subset of the measurements to reduce the size of the Hankel-matrices, and thereby to improve the conditioning of the matrices. However, today, the method of ERA is almost universally presented (e.g. on Wikipedia) as being based on Singular Value Decomposition (SVD), using the whole set of measurement data.

## Singular Value Decomposition

Any matrix can be represented in the form of its Singular Value Decomposition (SVD):

(16)

The matrices all have special forms:

- The matrices and are square, orthogonal matrices, i.e. and . Note that the identity matrices may have different dimensions if is not square.
- The matrix is a diagonal matrix, with all diagonal elements being non-negative. As a common convention, the diagonal elements of — called the
*singular values*of — are listed in decreasing order.

Essentially, the SVD expresses the contents of the matrix as the sum of matrices with decreasing amount of impact:

(17)

As the matrices and are orthogonal, the impact of the individual element is represented by . If we look at the Pareto plot of the singular values, we see that there are some dominant elements in there:

The plot shows the individual singular values in orange, ordered in decreasing order from left to right, and the cumulative proportion of the singular values, added up from left to right. We can see that the first two singular values are the most prominent, making up for about 55% of the total data values, and the remaining singular values are pretty small in comparison to that — although they still add up to representing 45% of the data. We also see that our matrix is awfully close to being singular, which it would be if any of the singular values were zero.

This is somewhat consistent with our original considerations, where we found out that our system is actually of second order. Thus, we may assume that the data represented by the major two singular values is our actual data, while the remaining singular values represent noise.

Now, what do we do with that information? We can use it to remove what we consider to be noise from our data. In essence, the vectors of and for bases of vector spaces, and the singular values determine the magnitude by which the base vectors are represented in the data in the matrix. Thus, by removing some singular values and base vectors, we can project the data in the matrix onto the directions represented by the remaining base vectors, and thereby approximate the original matrix:

(18)

By appropriately selecting , we can approximate arbitrarily close. Usually, one would determine the size of from first-principles considerations, or by defining a cut-off value in relative error.

With that, we can determine the pseudo-inverse of :

(19)

I encourage you to personally verify that indeed holds, where is the identity matrix (ones on the diagonal, zeroes everywhere else).

## Simplified Model

With the simplified Hankel-Matrix

(20)

and its pseudo-inverse , we can build a simplified, cleaned-up model:

(21)

Determining the impulse response of that model, we get a pretty clean fit, and it seems that the noise is actually being ignored.

Also, our step response fits much better:

## Reduced-Order Model

But still, our matrices , and are of order , while they should be of second order for our system. To reduce that order, we’ll have to make use of some trickery. We’ll re-use Equation 21 and add an identity matrix in there:

(22)

Keep in mind that the values in are non-negative and all non-zero values are on the diagonal, so that we can take the square-root of both and its inverse. We’ll use that to regroup a bit:

(23)

Notice that in the first step, we have pushed from the left into the parenthesised, inner expression, and out the right side again. This does not change the value of the expression. We have then done the same with , and then simplified the remaining expression. Now, if we check the dimensions of , and , they indeed match the reduced order of our system.

In fact, nothing should have changed except for the order of the system, so looking at our impulse response, we should not see any difference:

Similarly, our step response should still look the same:

## Conclusion

Our expedition into the details of the ERA was quite fruitful: We have learned a few techniques for restructuring, analysis and reduction of measurement data.

Many of these will come handy when we examine the Observer/Kalman System Identification Algorithm (OKID). The OKID is used to extract the impulse response from measurement data that was acquired using arbitrary control input.

Further, some of these ideas also will allow us to identify non-linear systems — or at least approximate them. As the engine/propeller combination includes possibly non-linear elements due to aerodynamic drag, this is specifically relevant for the identification of multicopter propulsion systems.

[Bibtex]

```
@Article{Juang.Pappa1985,
author = {Juang, Jer-Nan and Pappa, Richard S.},
title = {{A}n {E}igensystem {R}ealization {A}lgorithm for {M}odal {P}arameter {I}dentification and {M}odel {R}eduction},
journal = {Journal of Guidance Control and Dynamics},
year = {1985},
volume = {8},
number = {5},
month = nov,
abstract = {A method called the eigensystem realization algorithm is developed for modal parameter identification and model reduction of dynamic systems from test data. A new approach is introduced in conjunction with the singular-value decomposition technique to derive the basic formulation of minimum order realization which is an extended version of the Ho-Kalman algorithm. The basic formulation is then transformed into modal space for modal parameter identification. Two accuracy indicators are developed to quantitatively identify the system and noise modes. For illustration of the algorithm, an example is shown using experimental data from the Galileo spacecraft.},
doi = {10.2514/3.20031},
file = {:Juang.Pappa1985 - An Eigensystem Realization Algorithm for Modal Parameter Identification and Model Reduction.pdf:PDF},
}
```