# Lecture notes (html)

# README

This document is intended for training and future reference. As a reference document, you may find it useful for Biomaths 1 (3BS, Fall semester), for linear algebra (3BIM Biomaths 2, Winter Semester) and for Advanced ODEs (3BIM Biomaths 3, Winter Semester).

When important concepts are encountered for the first time, they are
highlighted in **bold** next to their definition. We have tried
to provide examples that are as complete as possible. This makes them
long; you could probably solve them faster. Exercises are
important: they can introduce theory or techniques that will prove
useful. Solutions to the exercises can be found at the end of the
document. Examples and exercises (and their solutions!) will be added
regularly, so feel free to download the most recent version (here).
This is version V2023.09.07.

# Functions, maps

A **function** is a relation, often denoted \(f\), that associates to each element \(x\) of a **domain** \(I\) at most one element \(y\) of the **image** \(J\). The domain \(I\) and image \(J\) are sets. Usually \(I, J \subset \mathbb{R}\).

A **map** is a relation that associates *each*
element of its domain to exactly one element of its image. Maps and
functions are related but slightly different concepts. A function \(f\) is a map if it is defined for all
elements of its domain \(I\); a map is
always a function. The term *map* can also be used when the
domain or the image are not sets of numbers (Figure 1).

The **graph** of a function \(f\), denoted \(\mathcal{G}(f)\), is the set of all pairs
\((x, f(x))\) in the \(I \times J\) plane. For real-valued
functions, the graph is represented as a curve in the Cartesian
plane.

Functions are not numbers. Do not confuse

\(f\) the function

\(f(x)\) the evaluation of \(f\) at element \(x\); \(f(x)\) is an element of the image (usually a number)

\(\mathcal{G}(f)\) the graph of \(f\)

Consequently, do not write

\(f(x)\) is increasing... Instead write \(f\) is increasing...

\(f(x)\) is decreasing... Instead write \(f\) is decreasing...

\(f(x)\) is continuous... Instead write \(f\) is continuous...

## Some usual maps

\(f:\mathbb{R} \to \mathbb{R}\), with \(x \to k\), \(k \in \mathbb{R}\), **constant map**; \(x \to x\), **identity map**.

\(f:\mathbb{R}\backslash\{0\} \to \mathbb{R}\), with \(x \to \frac 1x\), **inverse**.

\(f:\mathbb{R} \to \mathbb{R}\), with \(x \to x^2\), **parabola**; \(x \to x^3\), **cubic map**.

\(f:\mathbb{R}^+ \to \mathbb{R}\), with \(x \to \sqrt{x} = x^{\frac 12}\), **square root**; more generally, \(x \to x^{\frac pq} = \sqrt[q]{x^p}\), **fractional power**.

\(f:\mathbb{R}\backslash\{-d/c\} \to \mathbb{R}\), with \(x \to \frac{ax + b}{cx + d}\).

\(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \exp(x)\), **exponential**.

\(f:\mathbb{R}^+\backslash\{0\} \to \mathbb{R}\), with \(x \to \ln(x)\), **natural logarithm**. On logarithms: for \(a, b > 0\) and \(n\) a positive integer, \(\ln(ab) = \ln(a) + \ln(b)\), \(\ln(a^n) = n \ln(a)\), \(\ln(a/b) = \ln(a) - \ln(b)\).

\(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \cos(x)\), **cosine**; \(x \to \sin(x)\), **sine**; \(x \to \tan(x)\), **tangent**. A bit more on trigonometric functions: the diagram below shows the relationship between the sine, cosine and tangent of an angle \(\theta \in [0,2\pi]\).

\(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \cosh(x) = \frac 12 \bigl( e^{x} + e^{-x} \bigr)\), **hyperbolic cosine**; \(x \to \sinh(x) = \frac 12 \bigl( e^{x} - e^{-x} \bigr)\), **hyperbolic sine**; \(x \to \tanh(x) = \frac{\sinh(x)}{\cosh(x)}\), **hyperbolic tangent**.
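As a quick sanity check, the hyperbolic maps can be evaluated numerically from their exponential definitions; the identity \(\cosh^2(x) - \sinh^2(x) = 1\) follows directly from them (the function names below are ours, written for this sketch):

```python
import math

# cosh and sinh defined from the exponential, as in the text
def cosh(x):
    return 0.5 * (math.exp(x) + math.exp(-x))

def sinh(x):
    return 0.5 * (math.exp(x) - math.exp(-x))

# the identity cosh^2 - sinh^2 = 1 holds for any x
for x in [-2.0, 0.0, 0.5, 3.0]:
    assert abs(cosh(x)**2 - sinh(x)**2 - 1.0) < 1e-9
```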

## Exercises on functions

More to come...

# Derivatives

We call the **derivative** of the function \(f:I \to J\) (\(I, J \subset \mathbb{R}\)), at point \(a \in I\), the limit, if it exists, \(\lim_{x \to a} \frac{f(x) - f(a)}{x -
a}.\) The derivative is denoted \(f'(a)\). An alternative representation
of the limit is obtained by setting \(h = x -
a\): \(f'(a) = \lim_{h \to 0}
\frac{f(a + h) - f(a)}{h}.\) If the derivative exists for all
elements \(a \in I\), we say that \(f\) is
**differentiable** on \(I\).

If \(f\) is differentiable on \(I\), and \(f'(x) > 0\) for all \(x \in I\), then \(f\) is strictly increasing on \(I\).

If \(f\) is differentiable on \(I\), and \(f'(x) < 0\) for all \(x \in I\), then \(f\) is strictly decreasing on \(I\).

However, if \(f\) is strictly increasing, it does not mean that \(f'(x) > 0\). For example the function \(f\) with \(f(x) = x^3\) is strictly increasing on \(\mathbb{R}\), but \(f'(0) = 0\). Where the derivative exists, we can define the derivative function \(f':I \to \mathbb{R}\) of \(f\).

The **second derivative** of a function \(f\), denoted \(f''\), is the derivative of \(f'\), where defined. If \(f''(x)\) exists and \(f''(x) > 0\) for all \(x \in I\), we say that \(f\) is **convex** (U-shaped).
If \(f'(x) = 0\) and \(f''(x) > 0\), the point \(x\) is a **minimum**. If \(f'(x) = 0\) and \(f''(x) < 0\), the point \(x\) is a **maximum**. Maxima
and minima are **extrema**. If \(f''(x) = 0\) and \(f''\) changes sign at \(x\), the point \(x\) is an **inflection point**
(Figure 2).
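The limit definition also gives a practical way to approximate derivatives numerically. A minimal sketch (the `derivative` helper is ours, not a library function), illustrating that \(x^3\) is strictly increasing yet has a zero derivative at 0:

```python
def derivative(f, a, h=1e-6):
    """Central finite-difference approximation of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

f = lambda x: x**3

# x^3 is strictly increasing on R, yet f'(0) = 0
assert abs(derivative(f, 0.0)) < 1e-9

# elsewhere f'(x) = 3x^2 > 0
assert abs(derivative(f, 2.0) - 12.0) < 1e-4
```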

## List of common derivatives

The derivative has linear properties. If \(f\) and \(g\) are differentiable on \(I\), and \(a \in \mathbb{R}\),

\((f + g)' = f' + g'\).

\((af)' = a (f')\).

\((af + g)' = a (f') + g'\).

Let \(g:I \to J\) and \(f:J \to K\) be two functions. The
**composition** of \(f\)
and \(g\), denoted \(f \circ g\), is the function \(x \to f(g(x))\), i.e. \(f \circ g (x) = f(g(x))\). If \(f\) and \(g\) are differentiable, the composition
\(f \circ g\) is also differentiable,
and its derivative follows the **rule of composed
functions**:

\(\bigl( f \circ g \bigr)'(x) = f'(g(x))g'(x).\)

**Example ** Let \(f:x \to
x^2\) and \(g:x \to 3x + 1\) be
two differentiable functions, with \(f'(x)
= 2x\) and \(g'(x) = 3\).
The derivative of the composed function \(f
\circ g\) at \(x\) is \(f'(g(x))g'(x) = f'(3x+1)g'(x) =
2(3x+1) \cdot 3 = 6(3x+1) = 18x + 6.\) The derivative could have
been obtained by first computing the composed function \(f(g(x)) = (3x+1)^2 =
9 x^2 + 6x + 1\), and then taking the derivative.
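The chain-rule computation above can be checked numerically with a finite-difference derivative (a sketch; the `derivative` helper is ours):

```python
def derivative(f, a, h=1e-6):
    """Central finite-difference approximation of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

f = lambda x: x**2       # f'(x) = 2x
g = lambda x: 3*x + 1    # g'(x) = 3
fg = lambda x: f(g(x))   # (f o g)(x) = (3x + 1)^2

# the chain rule predicts (f o g)'(x) = 18x + 6
for x in [-1.0, 0.0, 2.0]:
    assert abs(derivative(fg, x) - (18*x + 6)) < 1e-4
```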

**Example ** Compute the derivative of \(f: x \to \sin(1/x)\). The function \(f\) is the composition of a sine and an inverse
function. To compute the derivative, we decompose the function \(f\) as \(f(x) =
g(h(x))\) with \(g(x) =
\sin(x)\) and \(h(x) = 1/x\).
The derivatives are \(g'(x) =
\cos(x)\) and \(h'(x) =
-1/x^2\). Finally, the derivative of \(f\) is

\(f'(x) = g'(h(x))h'(x) = \cos(1/x) \Bigl( \frac{-1}{x^2} \Bigr) = - \frac{\cos(1/x)}{x^2}.\)

**Example ** A function \(f:I
\to I\) is bijective (**invertible**) on \(I\) if there exists a function, denoted
\(f^{-1}\) and called
**inverse** of \(f\), such
that the compositions are equal \(f \circ
f^{-1} = f^{-1} \circ f\), and are equal to the identity map.
That is, \(f^{-1} \circ f (x) = f \circ
f^{-1}(x) = x\) for all \(x \in
I\). If \(f\) is differentiable
and invertible, what is the derivative of \(f^{-1}\)?

We apply the derivative to \(f(f^{-1})\). Given that \(f(f^{-1}(x)) = x\) by definition, we have \(\bigl( f(f^{-1}) \bigr)' = 1,\) and \(\begin{aligned} \bigl( f(f^{-1}) \bigr)'(x) & = f'(f^{-1}(x))(f^{-1})'(x), \\ & = 1, \\ (f^{-1})'(x) & = \frac{1}{f'(f^{-1}(x))}. \end{aligned}\) Take for instance \(f(x) = x^2\) on \(x \in (0,1]\). The inverse of \(f\) is \(f^{-1}(x) = \sqrt{x}\). The derivative of \(f\) is \(f'(x) = 2x\) and the derivative \(\bigl(f^{-1}\bigr)'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{2(\sqrt{x})}.\)
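The inverse-derivative formula can be verified numerically for \(f(x) = x^2\) on \((0,1]\) (a sketch; the `derivative` helper is ours):

```python
import math

def derivative(f, a, h=1e-6):
    """Central finite-difference approximation of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

f = lambda x: x**2             # bijective on (0, 1]
finv = lambda x: math.sqrt(x)  # its inverse

# (f^-1)'(x) = 1 / f'(f^-1(x)) = 1 / (2 sqrt(x))
for x in [0.25, 0.5, 0.81]:
    assert abs(derivative(finv, x) - 1.0 / (2.0 * math.sqrt(x))) < 1e-4
```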

**Example ** Compute a derivative when the independent
variable is in the exponent. To compute the derivative of \(f: x \to 2^x\), we need to re-express the
function in terms of the natural base \(e\). To do that, we use two properties of
the natural logarithm:

For any positive expression \(y\), \(y = e^{ \ln (y) }\) (\(\ln\) is the inverse of the exponential function).

\(\ln (a^b) = b \ln (a).\)

Then \(2^x = e^{ \ln (2^x) } = e^{ x \ln (2) }\). The derivative is \(\ln (2) e^{ x \ln (2) }\). Re-writing in terms of base 2, we obtain \(f'(x) = \ln (2)\, 2^x\).

Function | Derivative | Note |
---|---|---|
\(x^a\) | \(ax^{a-1}\) | \(a \in \mathbb{R}\) |
\(\frac{1}{x}\) | \(\frac{-1}{x^2}\) | |
\(x^{\frac 12}\) | \(\frac{1}{2x^{\frac 12}}\) | |
\(\ln(x)\) | \(\frac 1x\) | |
\(e^x\) | \(e^x\) | |
\(\cosh(x)\) | \(\sinh(x)\) | |
\(\sinh(x)\) | \(\cosh(x)\) | |
\(\cos(x)\) | \(-\sin(x)\) | |
\(\sin(x)\) | \(\cos(x)\) | |
\(\tan(x)\) | \(1 + \tan^2(x) = \frac{1}{\cos^2(x)}\) | |
\(\dfrac{u(x)}{v(x)}\) | \(\dfrac{v(x)u'(x) - u(x)v'(x)}{v^2(x)}\) | |
\(u(x) v(x)\) | \(u'(x)v(x) + u(x)v'(x)\) | |

## Exercises on derivatives

**Exercise ** Compute the derivatives of the following
functions

\(f_1 : x \to \sqrt{\cos x}.\)

\(f_2 : x \to \sin(3x + 2).\)

\(f_3 : x \to e^{\cos x}.\)

\(f_4 : x \to \ln\bigl(\sqrt{x} \bigr).\)

\(f_5 : x \to 2^{\ln x}.\)

**Correction** \(f_1'(x) =
-\frac{\sin x}{2\sqrt{\cos x}}.\) \(f_2'(x) = 3 \cos(3x + 2).\) \(f_3'(x) = -\sin(x) e^{\cos x}.\) \(f_4'(x) = \frac{1}{2x}.\) \(f_5'(x) = \frac{\ln 2}{x} 2^{\ln
x}.\)

# Taylor series and truncated expansions

If a function \(f\) is infinitely differentiable (i.e. the \(k\)-th derivative function \(f^{(k)}\) exists for any integer \(k\geq 0\)), the Taylor series of \(f\) at point \(a\) is the series \(\begin{aligned} f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!} (x-a)^2 + \frac{f'''(a)}{3!} (x - a)^3 + ... \end{aligned}\)

A function is **analytic** on an open interval \(I\) if and only if its Taylor series
converges pointwise to the value of the function. Polynomials,
exponential and trigonometric functions are analytic over all real
points. The square root function is not analytic at \(0\).

The partial sums of the series (or truncated expansion) can be used to approximate a function, and to evaluate it numerically. The \(k\)th-order expansion of a function \(f\) is the polynomial \(\begin{aligned} f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!} (x-a)^2 + ... + \frac{f^{(k)}(a)}{k!} (x - a)^k. \end{aligned}\) Truncated expansions are used in implementations of common mathematical functions in computer programs.

**Example ** The Taylor series of the sine function at
point \(a = 0\) is \(\sum_{n=0}^\infty
\frac{1}{n!}\frac{d^n\sin}{dx^n}(0)\, x^n = x - \frac{x^3}{3!} +
\frac{x^5}{5!} - \frac{x^7}{7!} + ...\) The 3rd-order expansion
is the cubic polynomial \(x -
\frac{x^3}{3!}.\) How good is this cubic polynomial as an approximation to
the original sine function? The error is the remainder of the terms of
the Taylor series \(\begin{aligned}
\Bigl| \frac{x^5}{5!} - \frac{x^7}{7!} + ... \Bigr|.
\end{aligned}\) For \(|x| <
1\), the error is bounded by \(|x|^5/5!\). Given that \(5! = 120\), the error is less than \(1/120 \approx 0.008\). Truncated
expansions are never good approximations when \(|x|\) becomes large, because polynomials
are not bounded, but the approximation can be quite good over small
intervals around the point at which the Taylor series is computed.
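For the cubic truncation \(x - x^3/3!\), the alternating-series bound gives an error below \(|x|^5/5!\) on \(|x| < 1\); this is easy to check numerically (a sketch; the function name `sin3` is ours):

```python
import math

def sin3(x):
    """Cubic truncated expansion of sin at 0: x - x^3/3!."""
    return x - x**3 / 6.0

# on |x| < 1 the truncation error stays below 1/5! = 1/120
for i in range(-9, 10):
    x = i / 10.0
    assert abs(sin3(x) - math.sin(x)) < 1.0 / 120.0
```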

It is important to note that even if a function has a Taylor series, the series may not converge to the function.

**Example ** Consider the function \(f(x) = \begin{cases}
0 & x \leq 0 \\
e^{-\frac 1x} & x > 0.
\end{cases}\) At \(x = 0\), all the derivatives of \(f\) exist and
are equal to 0. The Taylor series of \(f\) at point \(x
= 0\) is therefore identically 0, but the function itself is different from the
identically zero function for every \(x > 0\).

## Expansion of a function of two variables

For functions of two variables \(f : \mathbb{R}^2 \to \mathbb{R}\), the Taylor expansion at point \((x_0,y_0)^t \in \mathbb{R}^2\) is \(\begin{aligned} f(x,y) = f(x_0,y_0) & + \frac{\partial f(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f(x_0,y_0)}{\partial y} (y - y_0) \\ & + \frac{1}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial x^2} (x - x_0)^2 \\ & + \frac{1}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial y^2} (y - y_0)^2 \\ & + \frac{2}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial x \partial y} (x - x_0) (y - y_0) + ... \end{aligned}\)

**Example ** The second-order truncated expansion of the
function \(f(x,y) = y e^{-x}\) at \((0,0)^t\) is \(y
- xy.\)

## Expansion of a function from \(\mathbb{R}^2 \to \mathbb{R}^2\)

Functions \(f: \mathbb{R}^2 \to \mathbb{R}^2\) have the form \(f(x,y) = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix}.\) To compute the Taylor series, we need to compute the Taylor series of each function \(f_1\) and \(f_2\). For a first-order expansion at \((x_0,y_0)^t\), we obtain the expansion \(\begin{aligned} \begin{pmatrix} f_1(x_0,y_0) \\ f_2(x_0,y_0) \end{pmatrix} + \begin{pmatrix} \frac{\partial f_1(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f_1(x_0,y_0)}{\partial y} (y - y_0) \\ \frac{\partial f_2(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f_2(x_0,y_0)}{\partial y} (y - y_0) \end{pmatrix} \end{aligned}\) Using vector notation \(\begin{aligned} \boldsymbol{f}(\boldsymbol{x}) & = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix}, \\ D\boldsymbol{f} & = \begin{pmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{pmatrix}, \end{aligned}\) we can write a first-order (or linear) approximation of a function \(\boldsymbol{f}\) more compactly with \(\boldsymbol{f}(\boldsymbol{x}) \approx \boldsymbol{f}(\boldsymbol{a}) + D\boldsymbol{f}(\boldsymbol{a}) (\boldsymbol{x} - \boldsymbol{a}).\) The approximation is valid only in a neighbourhood of the point \(\boldsymbol{a}\).

# Integrals and primitives

## Primitives

Let \(f:I \to \mathbb{R}\), (\(I \subset \mathbb{R}\)). A
**primitive** \(F\) of
\(f\) on \(I\) is a differentiable map such that \(F'(x) = f(x)\), \(x \in I\).

**Example ** If \(f(x) =
x^2\), we look for \(F(x)\) such
that \(F'(x) = x^2\). Take \(F(x) = x^3/3\), then \(F'(x) = 3x^2/3 = x^2.\) It also works
for \(F(x) = x^3/3 + C\), for any
constant \(C \in \mathbb{R}\).

We denote the primitive as \(F(x) = \int f(x) dx\). Primitives are unique up to a constant: if \(F_1\) and \(F_2\) are primitives of \(f\), then \(F_1'(x) = f(x)\) and \(F_2'(x) = f(x)\), which implies that \(F_1'(x) - F_2'(x) = 0\). Therefore the difference between \(F_1\) and \(F_2\), \(G : x \to F_1(x) - F_2(x)\) has a zero derivative: \(G'(x) = F_1'(x) - F_2'(x) = 0\). This means that \(G\) is a constant.

Primitives have linear properties. Let \(F\) be a primitive of \(f\), \(G\) be a primitive of \(g\) and \(a,b \in \mathbb{R}\). Then

\(F+G\) is a primitive of \(f+g\), or \(\int \bigl( f(x) + g(x) \bigr) dx = \int f(x)dx + \int g(x)dx.\)

\(aF\) is a primitive of \(af\), or \(\int af(x) dx = a \int f(x) dx.\)

Function | Primitive | Note |
---|---|---|
\(0\) | \(C\) | \(C \in \mathbb{R}\) |
\(a\) | \(ax + C\) | \(a \in \mathbb{R}\) |
\(x^a\) | \(\frac{x^{a+1}}{a+1} + C\) | \(a \neq -1\) |
\(x^{-1} = \frac{1}{x}\) | \(\ln \lvert x \rvert + C\) | (\(a = -1\)), \(x \neq 0\) |
\(e^x\) | \(e^x + C\) | |
\(\cos(x)\) | \(\sin(x) + C\) | |
\(\sin(x)\) | \(-\cos(x) + C\) | |
\(f'(g(x))g'(x)\) | \(f(g(x)) + C\) | |

**Exercise ** Compute \(\begin{aligned}
\int { \frac{1}{\sqrt{3x + 5}} } dx
\end{aligned}\)

## Integrals

Let \(f:[a,b] \to \mathbb{R}\), and
\(F\) a primitive of \(f\) on \([a,b]\). Then the **definite
integral** is the value \(F(b) -
F(a)\), and is denoted \(\begin{aligned}
\int_a^b { f(x) dx } = \bigl[ F(x) \bigr]_a^b .
\end{aligned}\)

The most important interpretation of the integral is the **area
under the curve** (AUC). If \(f(x) \geq
0\), \(x \in [a,b]\), then \(\int_a^b { f(x) dx }\) is the area
delimited by the curve of \(f\), the \(x\)-axis, and the vertical lines \(x = a\) and \(x =
b\). Negative values integrate as negative areas.

If \(f\) is a rate (unit in `something per time`), and \(x\) is `time`, then
the integral of \(f\) has unit `something`. This applies to speed (integral:
displacement), production or degradation (integral: concentration),
etc.

The integral has the following properties

Linearity. The integral of a sum is the sum of the integrals. \(\int_a^b { (f(x) + g(x))dx } = \int_a^b f(x) dx + \int_a^b g(x) dx.\)

Negative intervals. \(\int_a^b { f(x) dx } = - \int_b^a { f(x) dx }.\)

Additivity (Chasles' relation). \(\int_a^b { f(x)dx } = \int_a^c f(x)dx + \int_c^b f(x) dx.\) Notice that this works even if \(c\) is not between \(a\) and \(b\).

The expression \(\int_a^x { f(t) dt } = F(x) - F(a) = F_a(x)\) is a function of \(x\). Therefore, for any continuous function \(f\), we have \(\Bigl( \int_a^x f(t) dt \Bigr)' = f(x).\) This is the
**fundamental theorem of calculus**.

**Integration by parts.** \(\int_a^b f'(x)g(x)dx = \bigl. f(x)g(x) \bigr|_a^b - \int_a^b f(x)g'(x)dx.\) Integration by parts is extremely useful for computing integrals beyond simple functions.

**Exercise ** Compute the integral \(\int_0^1 x e^x dx.\)

**Correction** The integrand is the product of two
functions, so let us try integration by parts. Let \(g(x) = x\) and \(f'(x) = e^x\). Then \(g'(x) = 1\) and \(f(x) = e^x\). The integral becomes \(\begin{aligned}
\int_0^1 x e^x dx & = \Bigl. e^x x \Bigr|_0^1 - \int_0^1 e^x dx,
\\
& = e - \Bigl. e^x \Bigr|_0^1, \\
& = e - (e - 1), \\
& = 1.
\end{aligned}\)
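The result can be cross-checked with numerical quadrature (a sketch using a composite Simpson rule; the `simpson` helper is ours, not a library routine):

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += f(a + k * h) * (4 if k % 2 else 2)
    return s * h / 3.0

# integration by parts gives the exact value 1
approx = simpson(lambda x: x * math.exp(x), 0.0, 1.0)
assert abs(approx - 1.0) < 1e-8
```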

**Exercise ** Compute the integral \(\int_0^\pi \cos x e^x dx.\)

**Correction** The integrand is the product of two
functions, we try integration by parts. Let \(g(x) = \cos x\) and \(f'(x) = e^x\). Then \(g'(x) = -\sin x\) and \(f(x) = e^x\). The integral becomes \(\begin{aligned}
\int_0^\pi \cos x e^x dx & = \Bigl. e^x (\cos x) \Bigr|_0^\pi -
\int_0^\pi e^x (-\sin x)dx, \\
& = - e^\pi - e^0 - \int_0^\pi e^x (-\sin x)dx, \\
& = - e^\pi - 1 - \int_0^\pi e^x (-\sin x)dx, \\
\end{aligned}\) We still have an integral of a product to compute.
We apply integration by parts once more, \(\begin{aligned}
& = - e^\pi - 1 - \bigl( \Bigl. e^x ( - \sin x ) \Bigr|_0^\pi +
\int_0^\pi e^x (\cos x)dx \bigr). \\
& = - e^\pi - 1 - \int_0^\pi e^x (\cos x)dx. \\
\end{aligned}\) With this new integration, we have just come back to
our initial integral. We are looping, and further integrations by parts
will not help. To break the loop, we use the fact that the initial
integral appears on both sides of the equation and solve for it.
If \(\begin{aligned}
I = & \int_0^\pi \cos x e^x dx,
\end{aligned}\) then \(\begin{aligned}
I & = - e^\pi - 1 - I, \\
2 I & = - e^\pi - 1, \\
I & = \frac{- e^\pi - 1}{2}.
\end{aligned}\) Et voilà!
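The looping argument gives \(I = -(e^\pi + 1)/2 \approx -12.07\); a numerical quadrature cross-check (a sketch; the `simpson` helper is ours):

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += f(a + k * h) * (4 if k % 2 else 2)
    return s * h / 3.0

exact = -(math.exp(math.pi) + 1.0) / 2.0  # value found by solving for I
approx = simpson(lambda x: math.cos(x) * math.exp(x), 0.0, math.pi)
assert abs(approx - exact) < 1e-6
```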

**Exercise ** Compute the integral \(\int_0^1 { \sinh x e^x dx }.\)

# Differential equations in 1D

An **ordinary differential equation** (ODE) of
**order** \(n\) is a
relation (i.e. an equation) between a real variable \(t \in I \subset \mathbb{R}\), an unknown
function \(x: t \to x(t)\) and its
derivatives \(x',x'',x''', ...,
x^{(n)}\) at point \(t\) defined
by \(F\bigl(t,x,x',x'',...,x^{(n)}\bigr) =
0,\) where \(F\)
*depends* on \(x^{(n)}\). In
general, \(x(t)\) takes values in \(\mathbb{R}^N\), i.e. it is a vector. We say
that the equation is a **scalar differential equation** if
\(N = 1\). (The expression \(x^{(i)}\) stands for the \(i\)-th derivative, not the \(i\)-th power.)

The **normal form** of a differential equation of order
\(n\) is \(x^{(n)} =
f\bigl(t,x,x',x'',...,x^{(n-1)}\bigr).\)

A differential equation is **autonomous** if it does not
depend on \(t\), i.e. \(F\) has the form \(F\bigl(x,x',x'',...,x^{(n)}\bigr) =
0,\)

A **linear differential equation** is a differential
equation for which \(F\) is a linear
function in \(x,x',...x^{(n)}\). It
can be expressed as \(a_n(t) x^{(n)} +
a_{n-1}(t)x^{(n-1)} + ... a_1(t)x' + a_0(t)x = g(t),\) where
the coefficients \(a_j(t)\) may depend
on \(t\), and \(x\) and its derivatives \(x^{(i)}\) appear only as monomials of
degree 1 (that is, linearly). If all coefficients are constants,
including \(g\), the linear
differential equation is autonomous.

**Exercise ** For the following differential equations,
give the order \(n\), and determine
whether they are autonomous, linear, and whether they are expressed
under their normal form.

\(x - t + 4 t x' = 0.\)

\(\bigl( x'' \bigr)^2 - 2 x' t x = 0.\)

\(x^{(3)} + \sin(x') = -5 x.\)

\(x^{(4)} - x x' = 0 .\)

\(3x'' - 4 x' + 6 x = 2.\)

\(\ln x' + 3 t x = \sqrt{x}.\)

**Correction**

1. Order 1; linear, non-autonomous, not in normal form.
2. Order 2; non-linear, non-autonomous, not in normal form.
3. Order 3; non-linear, autonomous, not in normal form.
4. Order 4; non-linear, autonomous, not in normal form.
5. Order 2; linear, autonomous, not in normal form.
6. Order 1; non-linear, non-autonomous, not in normal form.

A **solution or integral of a differential equation** of
order \(n\) for \(t\) in an interval \(I \subset \mathbb{R}\) is a map \(x: I \to \mathbb{R}\) that is \(n\) times differentiable for all \(t \in I\) and satisfies the differential
equation.

An **integral curve (or chronic)** is the set of points
\((t,x(t))\) for \(t \in I\). If \(x(t) \in \mathbb{R}^N\), the integral curve
is in \(\mathbb{R}^{N+1}\). A
**trajectory or orbit** is the set of points \(x(t)\) for \(t
\in I\). This is a set in \(\mathbb{R}^N\). The space that contains the
trajectories is called the **phase space**. The set of all
trajectories is called the **phase portrait**.

It is always possible to write a differential equation of order \(n\) as a differential equation of order 1, by defining extra variables for the higher-order derivatives.

**Example ** We consider the differential equation \(a(t) x'' + b(t) x' + c(t) x =
d(t)\). This is an equation of order 2. To reduce it to order 1,
let \(z_1 = x\) and \(z_2 = x'\). Then \(x'' = z_2'\) and \(z_1' = z_2\). The second-order
differential equation can be re-expressed as two first order equations:
\(z_1' = z_2, \quad a(t)z_2' +
b(t)z_2 + c(t) z_1 = d(t).\) Often, the system of first order
equations can be re-expressed in normal form, by isolating the variables
\(z_1'\) and \(z_2'\): \(z_1' = z_2, \text{ and } z_2' =
\frac{d(t) - b(t)z_2 - c(t) z_1}{a(t)}.\) (We assume that \(a(t) \neq 0\).) In general, for the
differential equation of order \(n\)
\(F(t,x,x',...,x^{(n)}) = 0,\) with
\(x : I \to \mathbb{R}^m\), we make a
change of variables: \(z_1 = x, z_2 = x',
...,
z_i = x^{(i-1)}\) until \(z_n =
x^{(n-1)}\). Each variable \(z_i(t) \in
\mathbb{R}^m\), so the new vector \(z =
\bigl( z_1, z_2, ..., z_n \bigr)^t\) is in \(\mathbb{R}^{mn}\). With the change of
variables, the differential equation now reads \(\begin{aligned}
z_1' & = z_2, \\
z_2' & = z_3, \\
& ..., \\
z_i' & = z_{i+1}, \\
& ..., \\
F(t,&z_1,z_2,...,z_n,z_n') = 0. \\
\end{aligned}\)

**Tips on Ordinary differential equations**

The most frequently used differential equations are of order 1, and they are usually represented in their normal form: \(x' = f(x)\) for autonomous equations, and \(x' = f(t,x)\) for non-autonomous equations.

For a scalar, autonomous differential equation \(x' = f(x)\) with \(x(t) \in \mathbb{R}\), the trajectories are monotone: if \(x\) is a solution, then \(x\) is either increasing, decreasing, or constant.

## Finding solutions of differential equations

We consider a **first order scalar differential
equation**, \(a(t) x' + b(t) x =
d(t),\) \(t \in I\) and \(a(t) \neq 0\) on \(I\), \(a(t)\) and \(b(t)\) continuous on \(I\). If \(d(t) =
0\), we call the equation homogeneous, and \(a(t) x' + b(t) x = 0.\)

*First order scalar ODE. Homogeneous case, first method.*
Write the equation in normal form, \(x' =
- \frac{b(t)}{a(t)} x.\) The solution \(x\) is either the constant \(x = 0\), or \(x(t) \neq 0\) for all \(t\in I\). We know that because of
uniqueness of solutions, which implies that trajectories cannot cross.
If \(x = 0\) we are done, so we can
assume that \(x(t) \neq 0\). Dividing
the equation by \(x\) \(\frac{x'}{x} = -\frac{b(t)}{a(t)}.\)
The terms on both sides are functions of \(t\). We can compute their primitives \(\int {\frac{x'}{x} dt } = - \int {
\frac{b(t)}{a(t)} dt}.\) The integrand of the left-hand side is
\(x'/x\). This is a very common
form called **log-derivative** and admits the primitive
\(\ln |x|\). The right-hand side does
not necessarily have a closed form, and we leave it as it is. With the
integration constant, we obtain an implicit solution for \(x\): \(\ln |x| =
- \int { \frac{b(t)}{a(t)} dt } + K.\) We would like an explicit
solution \(x\), \(\begin{aligned}
|x| & = e^{- \int { \frac{b(t)}{a(t)} dt } + K}.
\end{aligned}\) Therefore, \(\begin{aligned}
x & = \pm e^K e^{- \int { \frac{b(t)}{a(t)} dt }}.
\end{aligned}\) Notice that \(t\) is a variable of integration, and does
not exist outside the integral, and can be replaced by any other
variable. To have \(x\) as a function
of \(t\), we must define the bounds of
the integral. When the domain of definition of \(t\) is the interval \(I = [t_0,t_1], t_0 < t_1,\) the solution
\(x(t)\) is obtained by integrating
from \(t_0\) to \(t\), \(x = \pm
e^K e^{- \int_{t_0}^t { \frac{b(u)}{a(u)} du }},\) where we have
replaced the variable of integration by \(u\). Then the definite integral is a
function of \(t\). The constant \(K\) is determined by the **initial
condition** on \(x\) at \(t_0\), \(x(t_0) =
x_0 \in \mathbb{R},\) \(x(t_0) = \pm
e^K e^{- \int_{t_0}^{t_0} { \frac{b(u)}{a(u)} du }} = \pm e^K =
x_0.\) Therefore \(e^K = |x_0|\), i.e. \(K = \ln |x_0|\), and the sign
\(\pm\) is the sign of \(x_0\). The fixed bound at \(t_0\) is arbitrary; we could have chosen
any other time of reference. However, it is very common to consider
differential equations for which we know the value of the solution at
the initial time \(t_0\). The problem
of solving a differential equation with a given initial condition is
called an **initial value problem** (IVP) or a
**Cauchy problem**.

*First order scalar ODE. Homogeneous case, second method.*
Assume that \(a(t) \neq 0\), and write
the equation as \(x' + \frac{b(t)}{a(t)}
x = 0.\) Multiply the equation by the term \(e^{\int { \frac{b(t)}{a(t)} dt }}\), \(x' e^{\int { \frac{b(t)}{a(t)} dt }} +
\frac{b(t)}{a(t)} x e^{\int { \frac{b(t)}{a(t)} dt }} = 0.\)
Notice that the first term has the form \(x' f\) and the second term the form
\(x f'\), with \(f = e^{\int { \frac{b(t)}{a(t)} dt }}.\)
The left-hand-side of the resulting equation, \(x' f + x f'\), is the derivative of
the product \(xf\), so the differential
equation can be integrated, \(\begin{aligned}
x e^{\int { \frac{b(t)}{a(t)} dt }} & = K, \\
x & = K e^{ - \int { \frac{b(t)}{a(t)} dt }}.
\end{aligned}\) The constant \(K\) is determined by a condition set on the
solution, as in the first method.

**Exercise ** Solve the equation \(2x' + 6x = 0, \quad x(0) = 1.\)

**Correction** The initial condition \(x(0) = 1\) tells us to start the
integration at \(t = 0\). Using the
second method with \(a(t) = 2\) and
\(b(t) = 6\), we have \(x(t) = K e^{- \int_0^t { \frac 62 du } } = K
e^{ - 3 t }.\) The constant \(K\) is determined by the initial condition,
\(x(0) = K e^0 = K = 1.\) The complete
solution is \(x(t) = e^{- 3 t}.\)
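The solution can be cross-checked by integrating the equation numerically, for instance with the explicit Euler scheme (a sketch; the step size is an arbitrary choice):

```python
import math

# explicit Euler integration of 2x' + 6x = 0, i.e. x' = -3x, x(0) = 1
x, t, dt = 1.0, 0.0, 1e-4
while t < 1.0:
    x += dt * (-3.0 * x)
    t += dt

# closed-form solution: x(t) = exp(-3t)
assert abs(x - math.exp(-3.0)) < 1e-2
```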

**Exercise ** Solve the equation \(x' + \frac 1t x = 0, \quad x(1) =
1.\)

**Correction** The initial condition \(x(1) = 1\) tells us to start the
integration at \(t = 1\). Using the
second method with \(a(t) = 1\) and
\(b(t) = 1/t\), we have \(x(t) = K e^{- \int_1^t { \frac 1u du } } = K
e^{ - \ln u |_1^t } = K e^{ - \ln t}.\) The term \(e^{ - \ln t} = e^{\ln t^{-1}} = t^{-1}.\)
The solution simplifies to \(x(t) = \frac
Kt.\) The constant \(K\) is
found with the initial condition \(x(1) = K =
1,\) for a complete solution \(x(t) =
\frac 1t.\)

*First order scalar ODE. Heterogeneous case.* We now consider
the more general differential equation \(a(t)
x' + b(t) x = d(t).\) Using the strategy from the second
method for the homogeneous case above, we divide the equation by \(a(t)\), again assuming that \(a(t) \neq 0\), and then multiply by \(e^{ \int { \frac{b(t)}{a(t)} dt} }.\) The
resulting equation now has a non-zero right-hand-side, \(x'e^{ \int { \frac{b(t)}{a(t)} dt} } +
\frac{b(t)}{a(t)} x e^{ \int { \frac{b(t)}{a(t)} dt} } =
\frac{d(t)}{a(t)} e^{ \int { \frac{b(t)}{a(t)} dt} }.\)
Nevertheless, the left-hand-side is still of the form \(x' f + x f'\), and the
right-hand-side only depends on \(t\).
By integrating both sides, we obtain \(x e^{
\int { \frac{b(t)}{a(t)} dt} } = \int { \frac{d(t)}{a(t)} e^{ \int {
\frac{b(t)}{a(t)} dt}} dt } + K.\) The solution for \(x\) is then \(x
= \Bigl( \int { \frac{d(t)}{a(t)} e^{ \int { \frac{b(t)}{a(t)} dt}} dt }
+ K \Bigr) e^{ - \int { \frac{b(t)}{a(t)} dt} }.\)

**Example ** We consider the differential equation \(\begin{aligned}
x' + 3 x = 1+\sin(t), \quad x(0) = x_0 > 0.
\end{aligned}\) The coefficients \(a(t)
= 1, b(t) = 3, d(t) = 1 + \sin(t).\) The term \(e^{ \int { \frac{b(t)}{a(t)} dt}} = e^{ \int { 3
dt } } = e^{3t}\). The general solution is \(\begin{aligned}
x(t) & = \Bigl( \int { (1+\sin(t)) e^{3t} dt } + K \Bigr)e^{ - 3t
} , \\
& = \frac 13 + \frac {1}{10} \bigl( 3 \sin(t) - \cos(t) \bigr)
+ Ke^{-3t}.
\end{aligned}\) At \(t = 0\),
the equation \(x(0) = \frac 13 + \frac{1}{10}
(0 - 1) + K = x_0\) is solved for \(K\): \(K = x_0
- \frac{7}{30}\). The solution \(x(t)\) is \(x(t)
= \frac 13 + \frac {1}{10} \bigl( 3 \sin(t) - \cos(t) \bigr) + \Bigl(
x_0 - \frac{7}{30} \Bigr) e^{-3t}.\)
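One way to validate such a formula is to substitute it back into the equation, here numerically with a finite-difference derivative (a sketch; \(x_0 = 1\) and the helper `derivative` are our choices):

```python
import math

x0 = 1.0  # arbitrary initial condition x(0) = x0 > 0

def x(t):
    """Claimed solution of x' + 3x = 1 + sin(t), x(0) = x0."""
    return (1/3 + (3*math.sin(t) - math.cos(t)) / 10
            + (x0 - 7/30) * math.exp(-3*t))

def derivative(f, a, h=1e-6):
    """Central finite-difference approximation of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

# the initial condition holds
assert abs(x(0.0) - x0) < 1e-9

# the residual x' + 3x - (1 + sin t) vanishes along the solution
for t in [0.0, 0.7, 2.0]:
    assert abs(derivative(x, t) + 3*x(t) - (1 + math.sin(t))) < 1e-4
```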

We now consider a **nonlinear differential equation** of
**Bernoulli** type. Bernoulli equations are of the form
\(\begin{aligned}
x' + P(t) x + Q(t) x^r = 0, \quad t \in I \subset \mathbb{R},
\end{aligned}\) with continuous functions \(P, Q\), and \(r
\in \mathbb{R}\). There is no general method to solve nonlinear
differential equations, but it can be done in particular cases. If \(r = 0\) or \(r =
1\), the equation is linear, and we already know how to solve it.
Suppose \(r\) is different from 0 or 1. We
will look for positive solutions \(x(t) >
0\) on \(t \in I\). Dividing by
\(x^r\), we get \(\begin{aligned}
x' x^{-r} + P(t) x x^{-r} + Q(t) x^r x^{-r} & = 0, \\
x' x^{-r} + P(t) x x^{-r} + Q(t) & = 0.
\end{aligned}\) We now set an auxiliary variable \(u = x^{1-r}\), (\(x = u^{1/(1-r)}\)). Then \(\begin{aligned}
u' = (1-r)x^{1-r-1}x' = (1-r)x^{-r}x',
\end{aligned}\) and, substituting in the differential equation,
\(\begin{aligned}
\frac{1}{1-r} u' + P(t) u + Q(t) & = 0, \\
u' + (1-r)P(t) u + (1-r) Q(t) & = 0.
\end{aligned}\) We know how to solve this equation; this is a
linear equation of the form \(\begin{aligned}
a(t) u' + b(t) u = d(t),
\end{aligned}\) with \(a(t) = 1, b(t) =
(1-r)P(t), d(t) = - (1-r)Q(t)\).

**Example ** Verhulst equation (logistic equation) \(\begin{aligned}
x' = \mu x \Bigl( 1 - \frac xK \Bigr).
\end{aligned}\) We first rewrite the equation in Bernoulli form,
\(\begin{aligned}
x' - \mu x + \mu \frac{x^2}{K} = 0,
\end{aligned}\) with \(P(t) = -\mu, Q(t)
= \mu/K, r = 2\). The auxiliary equation reads \(\begin{aligned}
u' + (1-2)(-\mu) u + (1-2) \frac{\mu}{K} & = 0, \\
u' + \mu u - \frac{\mu}{K} & = 0.
\end{aligned}\) This is a scalar linear equation with \(a(t) = 1, b(t) = \mu, d(t) = \mu/K.\) The
general solution is \(\begin{aligned}
u(t) & = \Bigl( \int { \frac{d(t)}{a(t)}e^{\int b(t)/a(t) dt}dt }
+ C \Bigr) e^{ - \int { b(t)/a(t) dt } }, \\
\int b(t)/a(t) dt & = \int \mu dt = \mu t, \\
u(t) & = \Bigl( \int { \frac{\mu}{K} e^{\mu t}dt } + C \Bigr)
e^{-\mu t}, \\
& = \Bigl( \frac{\mu}{K} \frac{e^{\mu t}}{\mu} + C
\Bigr) e^{- \mu t}, \\
& = \frac{1}{K} + C e^{- \mu t}.
\end{aligned}\) The initial condition \(u(0) = x_0^{1-r} = x_0^{-1}\), \(\begin{aligned}
u(0) & = \frac{1}{K} + C = \frac{1}{x_0}, \\
C & = \frac{1}{x_0} - \frac{1}{K}.
\end{aligned}\) The solution to the original Verhulst equation is
\(x = u^{-1}\), \(\begin{aligned}
x(t) & = \frac{1}{\frac{1}{K} + \bigl( \frac{1}{x_0} - \frac{1}{K}
\bigr) e^{-\mu t}} \\
& = \frac{x_0 K}{x_0 + (K- x_0) e^{- \mu t}}.
\end{aligned}\)
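The closed-form logistic solution can be compared against a direct numerical integration of the Verhulst equation (a sketch; the parameter values are arbitrary choices):

```python
import math

mu, K, x0 = 1.5, 10.0, 0.5  # arbitrary parameters and initial condition

def x_exact(t):
    """Closed-form logistic solution derived above."""
    return x0 * K / (x0 + (K - x0) * math.exp(-mu * t))

# explicit Euler integration of x' = mu x (1 - x/K)
x, t, dt = x0, 0.0, 1e-4
while t < 3.0:
    x += dt * mu * x * (1 - x / K)
    t += dt

assert abs(x - x_exact(3.0)) < 1e-2
```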

# Complex numbers

A complex number is a number that can be expressed in the form \(a + i b\), where \(a\) and \(b\) are real numbers, and the symbol \(i\) is called **imaginary
unit**. The imaginary unit satisfies the equation \(i^2 = -1\). Because no real number
satisfies this equation, this number is called *imaginary*.

For the complex number \(z = a + i
b\), \(a\) is called the
**real part** and \(b\) is
called the **imaginary part**. The real part of \(z\) is denoted \(\Re(z)\) (`\Re` in LaTeX) or
just \(\mathrm{Re}(z)\). The imaginary
part of \(z\) is denoted \(\Im(z)\) (`\Im` in LaTeX) or
just \(\mathrm{Im}(z)\). The set of all
complex numbers is denoted \(\mathbb{C}\) (`\mathbb{C}` in LaTeX).

We need complex numbers for solving polynomial equations. The fundamental theorem of algebra asserts that every polynomial equation with real or complex coefficients has complex solutions; a polynomial of degree \(n\) has exactly \(n\) complex roots, counted with multiplicity. These polynomial equations arise when computing the eigenvalues of matrices, something we need to do to solve linear differential equations, for instance.

Arithmetic rules that apply to real numbers also apply to complex numbers, using the rule \(i^2 = -1\): addition, subtraction, multiplication and division are associative, commutative and distributive.

Let \(u = a + i b\) and \(v = c + i d\) be two complex numbers, with real coefficients \(a,b,c,d\). Then

\(u + v = a + i b + c + i d = (a+c) + i (b+d)\).

\(uv = (a + i b)(c + i d) = ac + i a d + i b c + i^2 b d = ac - bd + i(ad + bc)\).

\(\frac{1}{v} = \frac{1}{c + i d} = \frac{c - id}{(c - id)(c + id)} = \frac{c - id}{c^2 + d^2} = \frac{c}{c^2 + d^2} - i \frac{d}{c^2+d^2}\).

\(u = v\) if and only if \(a = c\) and \(b = d\).
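Python's built-in `complex` type (with `1j` as the imaginary unit) implements exactly these rules; a minimal check with arbitrary sample values:

```python
# Python's built-in complex type implements these rules (1j is the imaginary unit i).
u = 2 + 3j   # a = 2, b = 3
v = 1 - 1j   # c = 1, d = -1
a, b, c, d = 2.0, 3.0, 1.0, -1.0

# Addition: (a + c) + i(b + d)
assert u + v == complex(a + c, b + d)

# Multiplication: (ac - bd) + i(ad + bc)
assert u * v == complex(a * c - b * d, a * d + b * c)

# Reciprocal: (c - id) / (c^2 + d^2)
assert abs(1 / v - complex(c, -d) / (c**2 + d**2)) < 1e-12
```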

It follows from the rule on \(i\) that

\(\frac 1i = -i.\) (Proof: \(\frac 1i = \frac{i}{i^2} = \frac{i}{-1} = -i\).)

Multiplying by the imaginary unit \(i\) is equivalent to a counterclockwise rotation by \(\pi/2\) (Figure 3) \(ui = (a + ib)i = ia + i^2 b = -b + ia.\)

Let \(z = a + ib\) a complex number
with real \(a\) and \(b\). The **conjugate** of
\(z\), denoted \(\bar z\), is \(a
- ib\). The conjugate of the conjugate of \(z\) is \(z\) (*reflection*, Figure 3). The **modulus**
of \(z\), denoted \(|z|\) is \(\sqrt{z \bar z}\). The product \(z \bar z = (a+ib)(a-ib)=a^2 + b^2 + i(-ab + ab) =
a^2 + b^2\). The modulus is the complex version of the absolute
value: if \(z\) is real (i.e. \(b = 0\)), then \(|z| =
\sqrt{a^2} = |a|\). It is always a real, nonnegative number, and
\(|z| = 0\) if and only if \(z = 0\). The modulus is also
the *length* of the complex number \(z\): if \(a\) and \(b\) are the legs of a right
triangle, then \(|z|\) is the
hypotenuse.

When simplifying a ratio with a complex \(v\) in the denominator, it is useful to make the denominator real by multiplying the ratio by \(\bar v/ \bar v\). For instance, if \(v \neq 0\), \(\frac{u}{v} = \frac{u \bar v}{v \bar v} = \frac{u \bar v}{|v|^2}.\) The denominator \(|v|^2\) is always a positive real number.

By allowing complex values, nonlinear functions of real numbers like
exponentials, logarithms and trigonometric functions can have their
domain extended to all real and complex numbers. The most useful
extension is the exponential function. Recall that the exponential
function \(e^x\), where \(e \approx 2.71828\) is Euler’s constant,
satisfies the relation \(e^{x + y} = e^{x}
e^{y}\). This remains true for complex numbers.
**Euler’s formula** relates the exponential of an imaginary
number to trigonometric functions. For a real number \(y\), \(e^{i y} =
\cos(y) + i \sin(y).\) Therefore, for any complex number \(z = a + i b\), the exponential \(e^{z} = e^{a + ib} = e^a e^{ib} = e^a \bigl(
\cos(b) + i \sin(b) \bigr).\)
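A short numerical check of Euler's formula with the standard `cmath` module (the sample values are arbitrary):

```python
import cmath
import math

# Euler's formula: e^{iy} = cos(y) + i sin(y).
for y in [0.0, 0.5, math.pi / 3, 2.0]:
    assert abs(cmath.exp(1j * y) - complex(math.cos(y), math.sin(y))) < 1e-12

# For z = a + ib, e^z = e^a (cos(b) + i sin(b)).
a, b = 1.2, -0.7
lhs = cmath.exp(complex(a, b))
rhs = math.exp(a) * complex(math.cos(b), math.sin(b))
assert abs(lhs - rhs) < 1e-12
```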

**Tips on complex numbers**

If \(x\) is real, \(ix\) is pure imaginary. If \(y\) is imaginary, \(iy\) is real.

\(|i| = 1\). For any real \(\theta\), \(|e^{i \theta}| = 1\).

\(|z_1 z_2| = |z_1| |z_2|\).

In particular, \(|iz| = |i||z| = |z|\). (Multiplying by \(i\) is a rotation in the complex plane, it does not change the modulus.)

## Roots of a complex number

For complex numbers, the equation \(z^n =
1\) has \(n\) solutions. They
are called the **roots of unity**. For \(n=2\), we have the well-known roots \(z = \pm 1\), which are real. What are the
roots of \(z^3 = 1\)? To find them, we
express \(z\) in polar coordinates:
\(z = r e^{i\theta}\). Then \(z^3 = (r e^{i\theta})^3 = r^3 e^{i 3\theta} =
1.\) The equation implies that \(z\) has modulus 1, so \(r = 1\). The remaining term \(e^{i 3\theta} = 1\) implies that \(3 \theta\) is a multiple of \(2 \pi\) because \(e^{i \omega} = 1\) if and only if \(\omega = 2 k \pi\), for some integer \(k\). Therefore \(\theta = \frac{2}{3} k \pi\), for \(k = 0, 1, 2, ...\). How many distinct
points do we have? Clearly, \(k = 3\)
is equivalent to \(k = 0\): \(e^{i \frac{2}{3} 3 \pi} = e^{i 2 \pi} = e^{i
0}\). In the same way \(k = 4\)
is equivalent to \(k = 1\), and so on.
Therefore, there are exactly three distinct solutions for \(\theta\): \(0,
\frac{2}{3} \pi, \frac{4}{3} \pi\) (Figure 5).
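The three cube roots of unity can be checked numerically (a Python sketch using `cmath`):

```python
import cmath
import math

# The three cube roots of unity: z_k = e^{i 2k pi / 3}, k = 0, 1, 2.
roots = [cmath.exp(2j * math.pi * k / 3) for k in range(3)]

for z in roots:
    assert abs(z**3 - 1) < 1e-12    # each root satisfies z^3 = 1
    assert abs(abs(z) - 1) < 1e-12  # each root lies on the unit circle

# k = 3 wraps around to the same point as k = 0.
assert abs(cmath.exp(2j * math.pi * 3 / 3) - roots[0]) < 1e-12
```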

## Exercises on complex numbers

**Exercise ** Let the complex number \(z = 2 + 3 i\). Compute \(\bar z\), \(|z|\), \(|\bar
z|\) (compare with \(|z|\)),
\(z^2\), \(\Re(\bar z)\) , \(\Im(\bar z)\), \(\frac{z + \bar z}{2}\) , \(\frac{z - \bar z}{2}\) , \(-z\), \(iz\).

**Correction** \(\bar z = 2 -
3i\), \(|z| = \sqrt{2^2 + 3^2} =
\sqrt{13}\), \(|\bar z| = \sqrt{2^2 +
(-3)^2} = \sqrt{13}\), we see that \(|z| = |\bar z|\), \(\Re(\bar z) = 2\), \(\Im(\bar z) = -3\), \(\frac{z + \bar z}{2} = (2 + 3i + (2 - 3i))/2 =
2\), \(\frac{z - \bar z}{2} = (2 + 3i
- (2 - 3i))/2 = 3i\), \(-z = -2 -
3i\), \(iz = 2i + 3i^2 = -3 +
2i\).

**Exercise ** Any complex number can be represented in
**polar form**: \(z = r (
\cos(\theta) + i \sin(\theta) )\).

Show that \(|z| = r\)

Show that \(z = r e^{i \theta}\)

Conclude that for any complex number \(z\), \(|z| = 1\) if and only if \(z\) can be expressed as \(z = e^{i \theta}\) for a real \(\theta\).

**Correction** The modulus of \(z\) is \(|z| =
\sqrt{r^2 \cos^2 (\theta) + r^2 \sin^2 (\theta)} = \sqrt{r^2} =
r.\) From Euler’s formula, we have \(\cos(\theta) + i\sin(\theta) =
e^{i\theta}\), so \(z =
re^{i\theta}.\) Therefore, for any complex number \(z = r e^{i\theta}\), \(|z| = 1\) if and only if \(r = 1\).

**Exercise ** Using Euler’s formula, show that \(\cos(a)\cos(b) - \sin(a)\sin(b) =
\cos(a+b)\). *(Use the property that \(e^{ia + ib} = e^{ia} e^{ib}\) and apply
Euler’s Formula)*.

**Correction** All trigonometric identities can be
obtained by applying Euler’s formula. Here we start from \(e^{ia + ib} = \cos(a+b) + i \sin(a+b)\). We
only want the real part, \(\begin{aligned}
\cos(a + b) & = && \frac{e^{ia + ib} + e^{-ia - ib}}{2}
\\
& = && \frac{e^{ia} e^{ib} + e^{-ia}
e^{-ib}}{2} \\
& = && \frac{(\cos(a) + i \sin(a))(\cos(b)+ i
\sin(b)) + (\cos(a) - i \sin(a))(\cos(b) - i \sin(b))}{2} \\
& = && \frac{\cos(a)\cos(b) + i^2
\sin(a)\sin(b) + i \cos(a)\sin(b) + i \cos(b)\sin(a)}{2} \\
& && + \frac{\cos(a)\cos(b) + i^2
\sin(a)\sin(b) - i \cos(a)\sin(b) - i \cos(b)\sin(a)}{2}
\end{aligned}\) The mixed cosine-sine terms cancel each other
while the other ones add up, resulting in \(\begin{aligned}
\cos(a + b) = \cos(a)\cos(b) - \sin(a)\sin(b).
\end{aligned}\)

**Exercise ** Show Euler’s identity: \(e^{i \pi} = -1\).

**Correction** This is a direct application of Euler’s
formula: \(e^{i \pi} = \cos(\pi) + i \sin(\pi)
= -1.\)

**Exercise ** What are the roots of the equation \(z^6 = 1\)?

**Correction** Writing \(z = r e^{i\theta}\), the modulus must satisfy \(r = 1\), and the angle must satisfy \(e^{i 6 \theta} = 1\). This means that \(\theta = \frac{2}{6} k \pi\), for \(k = 0, 1, ..., 5\). There are six distinct
roots.

**Exercise ** For a complex \(z\), find necessary and sufficient
conditions for \(e^{z t}\), \(t > 0\), to converge to 0.

**Correction** The exponential converges to zero if and
only if \(\Re(z) < 0\). A complex
number is close to zero if and only if its modulus is close to zero.
Therefore, to show that a quantity converges to zero, it is necessary
and sufficient to show that its modulus converges to zero. If \(z = a + ib\), the exponential \(e^{z t} = e^{(a + ib)t} = e^{at} e^{ibt}.\)
The modulus \(|e^{ibt}| = 1\), so \(|e^{zt}| = e^{at}\) (no need for absolute
values, the exponential of a real number is always positive). The
condition for convergence to zero is therefore a condition on the real
part of \(z\): \(e^{at} \to 0\) when \(t \to \infty\) if and only if \(a < 0\).
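A quick numerical illustration of this criterion (the value of \(z\) is an arbitrary choice with a negative real part):

```python
import cmath
import math

# |e^{zt}| = e^{Re(z) t}: only the real part of z controls convergence to 0.
z = -0.3 + 5j   # illustrative value with negative real part
for t in [1.0, 10.0, 50.0]:
    w = z * t
    assert abs(abs(cmath.exp(w)) - math.exp(w.real)) < 1e-12

# The modulus, and hence e^{zt} itself, goes to 0 as t grows.
assert abs(cmath.exp(z * 200)) < 1e-12
```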

**Exercise ** Let the complex number \(z = a + ib\) with real \(a\) and \(b\). Compute \(\sqrt{z}\) (that is, express \(s = \sqrt{z}\) as \(s = \alpha + i \beta\), with real \(\alpha\) and \(\beta\))

**Correction** The square root of a complex number
always exists. Express \(z\) in polar
form \(z = re^{i\theta}\), \(r \geq 0, \theta \in [0, 2\pi]\). The
square root \(\sqrt{z} =
\sqrt{r}\sqrt{e^{i\theta}} = \sqrt{r} e^{\frac 12 i \theta}\).
Using Euler’s formula, \(\sqrt{z} = \sqrt{r}
\cos(\theta/2) + i \sqrt{r} \sin(\theta/2)\). That is, the
square root is obtained by taking the square root of the modulus \(r\), and dividing the angle (the
**argument**) by 2. There is a problem with this solution,
because \(z\) can also be represented
by \(re^{i(\theta + 2\pi)}\), giving
\(\sqrt{z} = \sqrt{r} e^{i(\theta/2 +
\pi)}\), which is equivalent to dividing the angle by two in the
other direction. We define the **principal square root** as
the solution that makes the smallest change in angle: \(\sqrt{z} = \sqrt{r} e^{i\theta/2}\)
if \(\theta \in [0,\pi]\), and \(\sqrt{z} = \sqrt{r} e^{i(\theta/2 +
\pi)}\) if \(\theta \in
(\pi,2\pi]\). To express the solution in terms of the original
form of \(z = a + i b\), we express the
square root \(s = \alpha + i \beta\).
Then \(s^2 = \alpha^2 - \beta^2 + 2i \alpha
\beta = z = a + ib.\) By identifying the real and imaginary
parts, we get two equations: \(\alpha^2 -
\beta^2 = a\) and \(2 \alpha \beta =
b\). Denoting the modulus of \(z\) by \(r =
\sqrt{a^2 + b^2}\), we can obtain the solutions \(\alpha = \frac{1}{\sqrt{2}}\sqrt{a + r}, \quad
\beta = \mathrm{sign}(b) \frac{1}{\sqrt{2}} \sqrt{-a + r}.\)
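The formulas for \(\alpha\) and \(\beta\) can be checked against the library square root. A Python sketch (the helper name `principal_sqrt` and the sample points are illustrative):

```python
import cmath
import math

def principal_sqrt(a, b):
    # Square root of z = a + ib from the formulas above:
    # alpha = sqrt((a + r)/2), beta = sign(b) sqrt((r - a)/2), with r = |z|.
    r = math.hypot(a, b)  # modulus of z
    alpha = math.sqrt((a + r) / 2)
    beta = math.copysign(1.0, b) * math.sqrt((r - a) / 2)
    return complex(alpha, beta)

# Compare with the library's principal square root at a few sample points (b != 0).
for a, b in [(3.0, 4.0), (-1.0, 2.0), (0.5, -0.5)]:
    s = principal_sqrt(a, b)
    assert abs(s * s - complex(a, b)) < 1e-12          # s really is a square root
    assert abs(s - cmath.sqrt(complex(a, b))) < 1e-12  # and it is the principal one
```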

# Matrices in dimension 2

## Eigenvalues of a \(2 \times 2\) matrix

A \(2 \times 2\) matrix \(A\) is an array with 2 rows and 2 columns:
\(A = \begin{pmatrix}
a & b \\
c & d
\end{pmatrix}.\) Usually, the **coefficients** \(a, b, c, d\) are real numbers. The
**identity** matrix is the matrix \(I = \begin{pmatrix}
1 & 0 \\
0 & 1
\end{pmatrix}.\)

The **determinant** of \(A\), denoted \(\det A\) or \(|A|\) is the number \(ad - bc\). The **trace** of
\(A\), denoted \(\mathrm{tr}\,A\), is the sum of the main
**diagonal** of \(A\):
\(a + d\).

The **characteristic polynomial** of \(A\) is the second order polynomial in \(\lambda\) obtained by computing the
determinant of the matrix \(A - \lambda
I\) (Figure 6), \(\det ( A - \lambda I ) = (a-\lambda)(d-\lambda) -
bc = ad - bc - \lambda ( a + d ) + \lambda^2.\) The
characteristic polynomial \(p_A(\lambda)\) of \(A\) can be expressed in terms of its
determinant and trace: \(p_A(\lambda) = \det
A - \mathrm{tr}\,A \lambda + \lambda^2.\) The
**eigenvalues** of \(A\)
are the roots of the characteristic polynomial. By the fundamental
theorem of algebra, we know that the characteristic polynomial has
exactly two roots, counting multiple roots. These roots can be real, or
complex. The eigenvalues of \(A\) are
calculated using the quadratic formula: \(\lambda_{1,2} = \frac{1}{2} \Bigl( \mathrm{tr}\,A
\pm \sqrt{ (\mathrm{tr}\,A)^2 - 4 \det A } \Bigr).\)

From this formula, we can classify the eigenvalues of \(A\). Let \(\Delta = (\mathrm{tr}\,A)^2 - 4 \det A\)
the **discriminant** of the quadratic formula. The two
eigenvalues of \(A\) are real if and
only if \(\Delta \geq 0\), i.e. \((\mathrm{tr}\,A)^2 \geq 4 \det A\). Then we
have the following properties (Figure 7):

\(\Delta < 0\), complex eigenvalues

The two eigenvalues are complex conjugate: \(\lambda_1 = \bar \lambda_2\)

Their real part \(\Re(\lambda) = \frac{1}{2} \mathrm{tr}\,A\).

\(\Delta = 0\), there is a single root of multiplicity 2: \(\lambda = \frac{1}{2} \mathrm{tr}\,A\).

\(\Delta > 0, \det A > 0\), real, distinct eigenvalues of the same sign.

\(\mathrm{tr}\,A > 0\) and \(\det A > 0\). Then \(\lambda_{1,2}\) are distinct and positive.

\(\mathrm{tr}\,A < 0\) and \(\det A > 0\). Then \(\lambda_{1,2}\) are distinct and negative.

\(\det A < 0\), real distinct eigenvalues of opposite sign.

\(\lambda_1 < 0 < \lambda_2\).

\(\det A = 0\): one of the eigenvalues is zero, the other eigenvalue is \(\mathrm{tr}\,A\).

### Exercises on eigenvalues

**Exercise ** Properties of the eigenvalues of \(2 \times 2\) matrices. For each \(2 \times 2\) matrix, compute the
determinant, the trace, and the discriminant, and determine whether the
eigenvalues are real, complex, distinct, and the sign (negative,
positive, or zero) of the real parts.

\(A_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}, \quad A_4 = \begin{pmatrix} -1 & 2 \\ 1/2& 2 \end{pmatrix}.\)
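To check your answers, the relevant quantities can be computed mechanically. A Python sketch (the helper `classify` is illustrative, not part of the notes):

```python
# Check your answers: trace, determinant, and discriminant determine the nature
# of the eigenvalues of a 2x2 matrix [[a, b], [c, d]].
def classify(a, b, c, d):
    tr = a + d
    det = a * d - b * c
    disc = tr**2 - 4 * det
    if disc < 0:
        kind = "complex conjugate"
    elif disc == 0:
        kind = "repeated real"
    else:
        kind = "real distinct"
    return tr, det, disc, kind

print(classify(0, -1, 1, 0))    # A_1
print(classify(-2, 1, 1, -2))   # A_2
print(classify(1, -2, 0, 1))    # A_3
print(classify(-1, 2, 0.5, 2))  # A_4
```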

## Matrix-vector operations

A matrix defines a linear transformation between vector spaces. Given
a vector \(x\), the product \(Ax\) is a vector composed of linear
combinations of the coefficients of \(x\). For a \(2 \times 2\) matrix, the vector \(x\) must be a vector of size 2, and the
product \(Ax\) is a vector of size 2.
If \(x = (x_1, x_2)^t\) (the \({}^t\) stands for the transpose, because
\(x\) must be a column vector), and
\(A = [ a_{ij} ]_{i=1,2, \, j=1,2}\),
then \(Ax =
\begin{pmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{pmatrix}
\begin{pmatrix}
x_{1} \\
x_{2}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} x_{1} + a_{12} x_2 \\
a_{21} x_{1} + a_{22} x_2
\end{pmatrix}.\) Successive linear transformations can be
accomplished by applying several matrices. Given two matrices \(A, B\), the matrix product \(C = AB\) is also a matrix. The matrix \(C\) is the linear transformation that first
applies \(B\), then \(A\). Matrix product is *not*
commutative in general: \(AB \neq BA\).
*(If \(B\) means ’put on socks’ and
\(A\) means ’put on shoes’, then \(BA\) does not have the expected
result.)* The product of two matrices \(A
= [ a_{ij} ]_{i=1,2, \, j=1,2}\) and \(B = [ b_{ij} ]_{i=1,2, \, j=1,2}\) is \(AB =
\begin{pmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{pmatrix}
\begin{pmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21} b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{pmatrix}.\) The **sum of two matrices** \(A+B\) is performed element-wise: \(A+B = [a_{ij} + b_{ij}]_{i=1,2, \,
j=1,2}\). The **sum of two vectors** is defined
similarly. Addition is commutative. Matrix operations are associative
and distributive. \(\begin{aligned}
A + B & = B + A, \\
A(B + C) & = AB + AC, \\
A(BC) & = (AB)C.
\end{aligned}\) Matrices and vectors can be multiplied by a
scalar value (real or complex). **Multiplication by a
scalar** is associative, distributive, and commutative. The
result of the multiplication by a scalar is to multiply each coefficient
of the matrix or vector by the scalar. For example, if \(\lambda, \mu\) are scalars, \(\begin{aligned}
\lambda A & = A (\lambda I) = A \lambda, \\
\lambda ( A + B ) & = \lambda A + \lambda B, \\
(\lambda A) B & = \lambda (AB) = A (\lambda B), \\
(\mu + \lambda) A & = \mu A + \lambda A, \\
\mu (\lambda A) & = (\mu \lambda) A, ...
\end{aligned}\) The product between two column vectors is not
defined, because the sizes do not match. However, we can define the
**scalar product** between two column vectors \(x, y\) in the same way matrix product is
defined: \(x^ty \equiv x_1 y_1 + x_2
y_2.\)
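The formulas above translate directly into code. A minimal Python sketch with \(2 \times 2\) matrices as nested lists (the helpers `matvec` and `matmul` are illustrative):

```python
# Minimal 2x2 helpers mirroring the formulas above (matrices as nested lists).
def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

R = [[0, -1], [1, 0]]   # rotation by 90 degrees counterclockwise
F = [[1, 0], [0, -1]]   # reflection through the horizontal axis

# Matrix product is not commutative in general.
assert matmul(R, F) != matmul(F, R)

# Applying C = AB to x is the same as applying B first, then A.
x = [2, 3]
assert matvec(matmul(R, F), x) == matvec(R, matvec(F, x))
```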

If the vectors are complex-valued, we need also to conjugate the
transposed vector \(x^t\). The
conjugate-transpose is called the **adjoint** and is
denoted \({}^*\). Thus, if \(x\) is complex-valued, the adjoint \(x^*\) is the row vector \((\bar x_1, \bar x_2)\). The scalar product
for complex-valued vectors is denoted \(x^*y\). Since this notation also works for
real-valued vectors, we will use it most of the time.

Two vectors are **orthogonal** if their scalar product
is 0. In the plane, this means that they are oriented 90 degrees
apart. Orthogonal vectors are important because they can be used
to build the orthogonal bases needed for solving all sorts of
*linear problems*.

### Exercises on Matrix-vector and matrix-matrix operations

**Exercise ** Compute matrix-vector product \(\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}.\) What is the transformation given by this
matrix.

**Correction** \(\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
=
\begin{pmatrix}
-x_2 \\ x_1
\end{pmatrix}.\) The transformation is a 90 degree
counterclockwise rotation.

**Exercise ** Compute the matrix-matrix product \(\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}
\begin{pmatrix}
1 & 0 \\
0 & -1
\end{pmatrix}.\) Can you tell what transformation this is?

**Correction** \(\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}
\begin{pmatrix}
1 & 0 \\
0 & -1
\end{pmatrix} =
\begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix}.\) This matrix exchanges the coordinates of a
vector, this is a reflection through the axis \(x = y\).

**Exercise ** Now compute the product of the same
matrices, but in the inverse order \(\begin{pmatrix}
1 & 0 \\
0 & -1
\end{pmatrix}
\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}.\) Compare with the solution found in the previous
exercise. What is this transformation?

**Correction** \(\begin{pmatrix}
1 & 0 \\
0 & -1
\end{pmatrix}
\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix} =
\begin{pmatrix}
0 & -1 \\
-1 & 0
\end{pmatrix}.\) The product is not the same: the matrices do not
commute. The transformation is now a reflection through \(y = -x\).

**Exercise ** Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((a x_1, b x_2)^t\).

**Correction** The matrix is \(\begin{pmatrix}
a & 0 \\
0 & b
\end{pmatrix}.\)

**Exercise ** Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((x_2, x_1)^t\).

**Correction** The matrix is \(\begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix}.\)

**Exercise ** Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((x_2, 0)^t\).

**Correction** The matrix is \(\begin{pmatrix}
0 & 1 \\
0 & 0
\end{pmatrix}.\)

**Exercise ** Compute the successive powers \(A, A^2, A^3, ...\), for a diagonal matrix
\(A\): \(A
=\begin{pmatrix}
a & 0 \\
0 & b
\end{pmatrix}\)

**Correction** The power of a diagonal matrix is a
diagonal matrix \(A^k =
\begin{pmatrix}
a^k & 0 \\
0 & b^k
\end{pmatrix}.\)

**Exercise ** Compute the scalar product \(x^*y\) between \(x = (1 + 2i, 1 - i)^t\) and \(y = (0.5 - i, -0.5)^t\).

**Correction** The scalar product is \(\begin{aligned}
x^*y & = (1 - 2i, 1 + i) (0.5 - i, -0.5)^t \\
& = (1 - 2i)(0.5 - i) + (1 + i)(-0.5) \\
& = 0.5 + 2i^2 - 2(0.5)i - i -0.5 - 0.5i \\
& = (0.5 - 0.5 + 2i^2) + (-2(0.5) - 1 - 0.5)i \\
& = -2 - 2.5i.
\end{aligned}\)

**Exercise ** Now compute the scalar product \(y^*x\) and compare with the result with the
previous exercise.

**Correction** The scalar product is \(\begin{aligned}
y^*x & = (0.5 + i, -0.5) (1 + 2i, 1 - i)^t \\
& = (0.5 + i)(1 + 2i) + (-0.5)(1 - i) \\
& = 0.5 + 2i^2 + i + 2(0.5)i - 0.5 + 0.5i \\
& = -2 + 2.5i
\end{aligned}\) This is the conjugate: \(x^*y = (y^*x)^*.\)

**Exercise ** Compute the scalar product between \(z = (z_1, z_2)^t\) and itself, if \(z\) is a complex-valued vector. What can
you say about the result?

**Correction** The scalar product \(z^*z = (\bar z_1, \bar z_2) (z_1, z_2)^t = \bar
z_1 z_1 + \bar z_2 z_2 = |z_1|^2 + |z_2|^2.\) The scalar product
is the square of the norm of the vector \(z\).
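The scalar product of complex vectors can be checked numerically; the sketch below reuses the vectors from the exercises (the helper `dot` is illustrative):

```python
# Scalar product x*y of complex vectors: conjugate-transpose the first vector.
def dot(x, y):
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

x = [1 + 2j, 1 - 1j]
y = [0.5 - 1j, -0.5 + 0j]

assert abs(dot(x, y) - (-2 - 2.5j)) < 1e-12            # matches the exercise above
assert abs(dot(y, x) - dot(x, y).conjugate()) < 1e-12  # x*y is the conjugate of y*x

# z*z is the squared norm: a real, nonnegative number.
assert abs(dot(x, x) - (abs(x[0])**2 + abs(x[1])**2)) < 1e-12
```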

**Tips on eigenvalues** Some matrices have special
shapes that make it easier to compute the determinant, and the
eigenvalues. These are called eigenvalue-revealing shapes.

Diagonal matrices have their eigenvalues on the diagonal.

**Triangular matrices**, i.e. matrices that have zeros above the main diagonal (lower-triangular) or below it (upper-triangular), also have their eigenvalues on the diagonal.

A matrix with a row or a column of zeros has a determinant equal to zero. This implies that one of its eigenvalues is 0.

# Eigenvalue decomposition

In many applications, it is useful to decompose a matrix into a form
that makes complicated operations on it easier. For instance, we
might want to compute the powers of a matrix \(A\): \(A^2\), \(A^3\), \(A^4\). Multiplying matrices is
computationally intensive, especially when the size of the matrix
becomes large. The **power of a matrix** is \(A^k = AA...A\), \(k\) times. The zeroth power is the identity
matrix: \(A^0 = I\).

The **inverse** of a matrix \(A\), denoted by \(A^{-1}\) is the unique matrix such that
\(AA^{-1} = A^{-1}A = I\). The notation
is self-consistent with the positive powers of \(A\). The inverse does not always exist. A
matrix is **invertible** if and only if its determinant is
not 0. If \(A\) and \(B\) are invertible, then \(AB\) is invertible, and \((AB)^{-1} = B^{-1}A^{-1}\).

The **eigenvalue decomposition** is a decomposition of
the form \(A = X D X^{-1}\), where
\(D\) is a diagonal matrix, and \(X\) is an invertible matrix. If there
exists such a decomposition for \(A\),
then computing powers of \(A\) becomes
easy: \(\begin{aligned}
A^k & = (XDX^{-1})^k = XDX^{-1} \, XDX^{-1} \, ... XDX^{-1}, \\
& = XD(X^{-1}X)D(X^{-1}X) D ... (X^{-1}X) DX^{-1}, \\
& = XD^kX^{-1}.
\end{aligned}\)

The eigenvalue decomposition does not always exist, because it is not always possible to find an invertible matrix \(X\). When it exists, though, the columns of the matrix \(X\) are the eigenvectors of \(A\). When \(A\) is a \(2 \times 2\) matrix, it is enough to find 2 linearly independent eigenvectors \(x\) and \(y\) for the matrix \(X = \Biggl( \begin{array}{c|c} x_1 & y_1 \\ x_2 & y_2 \end{array} \Biggr)\) to be invertible.
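The identity \(A^k = XD^kX^{-1}\) can be illustrated numerically. In the Python sketch below, \(X\) and \(D\) are hand-picked for the demonstration (they are not taken from the text):

```python
# Illustrative check of A^k = X D^k X^{-1}; the decomposition below is hand-picked
# for the demonstration (it is not taken from the text).
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[1.0, 1.0], [0.0, 1.0]]
Xinv = [[1.0, -1.0], [0.0, 1.0]]  # inverse of X (det X = 1)
D = [[2.0, 0.0], [0.0, 3.0]]      # diagonal matrix of eigenvalues

A = matmul(matmul(X, D), Xinv)

# Left: A^3 by repeated multiplication.  Right: X D^3 X^{-1}.
A3 = matmul(matmul(A, A), A)
D3 = [[2.0**3, 0.0], [0.0, 3.0**3]]
B3 = matmul(matmul(X, D3), Xinv)

assert all(abs(A3[i][j] - B3[i][j]) < 1e-9 for i in range(2) for j in range(2))
```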

## Eigenvectors

The **eigenvectors** of a matrix \(A\) are the *nonzero* vectors \(x\) such that for an eigenvalue \(\lambda\) of \(A\), \(Ax =
\lambda x.\) If \(x\) is an
eigenvector, so is any \(\alpha x\) for
any scalar value \(\alpha\). If there
are two linearly independent eigenvectors \(x\) and \(y\) associated to an eigenvalue, \(\alpha x + \beta y\) is also an
eigenvector. There is at least one eigenvector for each distinct
eigenvalue, but there may be more than one when the eigenvalue is
repeated.

**Example ** Distinct, real eigenvalues The matrix \(A =\begin{pmatrix}
-1 & -2 \\
0 & 1
\end{pmatrix}\)

is upper-triangular; this is one of the eigenvalue-revealing shapes. The eigenvalues are \(-1\) and \(1\). These are distinct eigenvalues, so each eigenvalue possesses a single eigenvector (up to a scalar factor). The eigenvector \(x\) associated to \(\lambda_1 = -1\) is found by solving the eigensystem \(Ax = (-1)x.\)

The unknown quantity \(x\) appears on both sides of the equation. We can find a simpler form by noting that multiplying a vector by the identity matrix is neutral: \((-1)x = (-1) I x.\) The eigenproblem becomes \(\begin{aligned} A x & = (-1) I x, \\ A x - (-1) I x & = 0, \\ \bigl( A - (-1) I \bigr) x & = 0, \end{aligned}\)

that is, the eigenvector is a nonzero solution of the linear system \(\bigl( A - \lambda I \bigr) x = 0\). In general, if a matrix \(B\) is invertible, the only solution to \(Bx=0\) is \(x = 0\) (the vector of zeroes). But, by construction, \(A - \lambda I\) cannot be invertible if \(\lambda\) is an eigenvalue: its determinant is exactly the characteristic polynomial evaluated at one of its roots, so it is zero. This is why the eigensystem has nonzero solutions. Now, because \(A - \lambda I\) is not invertible, this means that at least one of its rows is a linear combination of the others. For \(2 \times 2\) matrices, this implies that the two rows are colinear, or redundant. For our example, the eigensystem reads \(\begin{aligned} \begin{pmatrix} -1 - (-1) & -2 \\ 0 & 1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & -2 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

We immediately see that the two rows \((0,-2)\) and \((0,2)\) are colinear, with a factor \(-1\). This leads to an underdetermined system: \(0 x_1 - 2 x_2 = 0\). The solution is \(x_2 = 0\) and we can take \(x_1\) to be any value, save 0. We choose \(x = (1, 0)^t\).

For the eigenvalue \(\lambda_2 = +1\), the eigensystem reads: \(\begin{aligned} \begin{pmatrix} -1 - (+1) & -2 \\ 0 & 1 - (+1) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} -2 & -2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

Again, the second row \((0,0)\) can
be neglected, and the solution is \(-2 y_1 - 2
y_2 = 0\), or \(y_1 = - y_2\).
It is customary to choose an eigenvector with norm 1. The
**norm** of a complex-valued vector \(y = (y_1, y_2)^t\) is \(||y|| = \sqrt{y^*y} = \sqrt{\bar y_1 y_1 + \bar
y_2 y_2} = \sqrt{|y_1|^2 + |y_2|^2}.\)

Here, the eigenvector is \(y = (y_1, - y_1)^t\), so \(||y|| = \sqrt{|y_1|^2 + |- y_1|^2} = \sqrt{2}\sqrt{|y_1|^2} = \sqrt{2}|y_1|.\) Taking \(||y|| = 1\) requires \(|y_1| = 1/\sqrt{2}.\) This means that we could take a negative or a complex value for \(y_1\), as long as \(|y_1| = 1/\sqrt{2}.\) Going for simplicity, we take \(y_1 = 1/\sqrt{2}\).

**Example ** Complex eigenvalues

The matrix \(A =\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix}\) is *not* diagonal, so we have to compute
the eigenvalues by hand. The trace of \(A\) is zero, the determinant is \(0 - (1)(-1) = 1\), and the discriminant is
\(-4\). A negative discriminant implies
complex eigenvalues, \(\lambda_{1,2} = \frac
12 \bigl( 0 \pm \sqrt{-4} \bigr) = \pm i.\) For the eigenvalue
\(\lambda_1 = +i\), the eigensystem
reads: \(\begin{aligned}
\begin{pmatrix}
- (+i) & -1 \\
1 & - (+i)
\end{pmatrix}
\begin{pmatrix}
x_1 \\
x_2 \\
\end{pmatrix}
& =
\begin{pmatrix}
0 \\
0 \\
\end{pmatrix}, \\
\begin{pmatrix}
-i & -1 \\
1 & -i
\end{pmatrix}
\begin{pmatrix}
x_1 \\
x_2 \\
\end{pmatrix}
& =
\begin{pmatrix}
0 \\
0 \\
\end{pmatrix}.
\end{aligned}\)

The two rows \((-i,-1)\) and \((1,-i)\) should be colinear, but this is not obvious with the complex coefficients. Multiplying the first row by \(i\) gives \(i(-i, -1) = (- i^2, - i) = (-(-1), -i) = (1, -i)\), which is indeed the second row. Having confirmed that the system is underdetermined, we can seek a solution to \(-i x_1 - x_2 = 0\). Solving for \(x_2 = -i x_1\), we obtain the eigenvector \(x = (x_1, -i x_1)^t\). Normalization of \(x\) imposes \(||x|| = \sqrt{|x_1|^2 + |-ix_1|^2} = \sqrt{|x_1|^2 + |x_1|^2} = \sqrt{2}|x_1| = 1.\) As in the previous example, we can choose \(x_1 = 1/\sqrt{2}.\)

The second eigenvector, associated with \(\lambda_2 = -i\), solves the eigensystem \(\begin{aligned} \begin{pmatrix} - (-i) & -1 \\ 1 & - (-i) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

The first row yields \(iy_1 - y_2 = 0\), so \(y = (y_1, iy_1)^t\). A normalized eigenvector can be \(y = (1/\sqrt{2}, i/\sqrt{2})^t\). We could also have chosen \(y = (i/\sqrt{2}, -1/\sqrt{2})^t\).
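We can verify both eigenpairs of this example numerically (a Python sketch):

```python
import math

# Checking Ax = lambda x for the rotation matrix and its two complex eigenpairs.
A = [[0, -1], [1, 0]]
s = 1 / math.sqrt(2)

pairs = [(1j, [s, -1j * s]),   # lambda_1 = +i, x = (1, -i)/sqrt(2)
         (-1j, [s, 1j * s])]   # lambda_2 = -i, y = (1, +i)/sqrt(2)

for lam, v in pairs:
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    assert all(abs(Av[k] - lam * v[k]) < 1e-12 for k in range(2))
    # The eigenvectors are normalized: |v_1|^2 + |v_2|^2 = 1.
    assert abs(abs(v[0])**2 + abs(v[1])**2 - 1) < 1e-12
```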

**Example ** Repeated eigenvalues 1

The matrix \(\begin{pmatrix} -1 & 0 \\ 2 & -1 \end{pmatrix}\)

is lower-triangular, with repeated eigenvalues on the diagonal, \(\lambda_{1,2} = -1\). The eigenvectors associated with \(-1\) satisfy the eigenproblem \(\begin{aligned} \begin{pmatrix} -1 - (-1) & 0 \\ 2 & -1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

The first row vanishes, and the second row means that \(x_1 = 0\), leaving for instance \(x_2 = 1\), and \(x = (0,1)^t\). There are no other linearly independent eigenvectors. This is not always the case, repeated eigenvalues can have more than one independent eigenvector, as in the next example.

**Example ** Repeated eigenvalues 2

The matrix \(\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}\) is diagonal, with repeated eigenvalues on the diagonal, \(\lambda_{1,2} = -1\). The eigenvectors associated with \(-1\) satisfy the eigenproblem \(\begin{aligned} \begin{pmatrix} -1 - (-1) & 0 \\ 0 & -1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\) Now, the two rows vanished, leaving no condition at all on \(x_1\) and \(x_2\). This means that all the vectors are eigenvectors! How many linearly independent eigenvectors can we find? Vectors of size 2 live in a vector space of dimension 2; we can find at most 2 linearly independent vectors. We can choose for instance the canonical basis: \(x = (1,0)^t\) and \(y = (0,1)^t\).

**Tips on eigenvalue decomposition**

A \(2 \times 2\) matrix (or any square matrix) admits an eigenvalue decomposition if all the eigenvalues are distinct. For \(2 \times 2\) matrices, eigenvalues are distinct if and only if the discriminant \(\Delta \neq 0\).

If the matrix has a repeated eigenvalue, it will admit an eigenvalue decomposition if the number of (linearly independent) eigenvectors is equal to the number of times the eigenvalue is repeated. The number of eigenvectors is called geometric multiplicity, and the number of repeats is called algebraic multiplicity.

The eigenproblem should be underdetermined; you should always be able to eliminate at least one row by linear combination. If you cannot, this means that there is an error, possibly an incorrect eigenvalue, or an arithmetic mistake in computing \(A - \lambda I\).

Because eigenvalues are in general complex, the eigenvectors will also be complex.

The eigenvector matrix \(X\) needs to be inverted. When the eigenvectors can be chosen so that they are orthogonal and normalized, the inverse is \(X^{-1} = X^*\) (i.e. the conjugate transpose of \(X\)). Symmetric matrices have orthogonal eigenvectors, so this class of matrices is especially easy to diagonalise.

Eigenvalue decomposition and invertibility are two different concepts. A matrix can be invertible without admitting an eigenvalue decomposition, and vice versa.

When a matrix does not admit an eigenvalue decomposition, it can still be triangularised. One such triangularisation is the Jordan decomposition: \(A = P(D+S)P^{-1}\), where \(P\) is invertible, \(D\) is the diagonal matrix of eigenvalues, and \(S\) is a **nilpotent** matrix, i.e. a nonzero matrix such that \(S^k = 0\) for \(k \geq k_0 > 1\).

## Exercises on eigenvalues decomposition

**Exercise ** Find, if there is any, an eigenvalue
decomposition of \(A = \begin{pmatrix}
-1 & 2 \\
2 & -1
\end{pmatrix}.\) To compute \(X^{-1}\), you can use the fact that because
\(A\) is real and symmetric, the
eigenvectors are orthogonal, meaning that \(X^{-1} = X^t\) if the eigenvectors are
normalized.

**Correction** We have \(\det
A = (-1)(-1) - (2)(2) = 1 - 4 = -3 < 0,\) \(\mathrm{tr}\,A = -1 -1 = -2,\) and \(\Delta = (-2)^2 - 4 (-3) = 4 + 12 = 16.\)
The eigenvalues are \(\lambda_{1,2} = \frac
12 \bigl( -2 \pm \sqrt{16} \bigr) = -1 \pm 2 = 1, -3.\) The two
eigenvalues are distinct, so the matrix \(A\) is diagonalisable. The eigenvector
associated with the eigenvalue \(\lambda_1 =
1\) is solution to the eigenproblem \(\bigl( A - \lambda_1 I ) x = 0.\) We look
for a solution \(\begin{aligned}
\begin{pmatrix}
-1 - (1) & 2 \\
2 & -1 - (1)
\end{pmatrix}
\begin{pmatrix}
x_1 \\
x_2
\end{pmatrix}
& = 0, \\
\begin{pmatrix}
-2 & 2 \\
2 & -2
\end{pmatrix}
\begin{pmatrix}
x_1 \\
x_2
\end{pmatrix}
& = 0.
\end{aligned}\) (We check that the two rows are collinear.) The
first row gives \(-2x_1 + 2x_2 = 0\),
or \(x_1 = x_2\). The norm of the
eigenvector \(x = (x_1, x_2)^t\) is \(\|x\| = \sqrt{x_1^2 +
x_2^2} = \sqrt{2}|x_1|.\) We choose \(x_1 = x_2 = 1/\sqrt{2}\) to have a normalized
eigenvector. We know that the second eigenvector is orthogonal to \(x\), so we can take \(y = (1/\sqrt{2},-1/\sqrt{2})^t\) for the
eigenvector associated to \(\lambda_2 =
-3\). To check that this is indeed an eigenvector, we solve the
eigenproblem \(\begin{aligned}
\begin{pmatrix}
-1 - (-3) & 2 \\
2 & -1 - (-3)
\end{pmatrix}
\begin{pmatrix}
y_1 \\
y_2
\end{pmatrix}
& = 0, \\
\begin{pmatrix}
2 & 2 \\
2 & 2
\end{pmatrix}
\begin{pmatrix}
y_1 \\
y_2
\end{pmatrix}
& = 0.
\end{aligned}\) The solutions are vectors that satisfy \(y_1 = - y_2\); this is the case for \(y\). The matrix \(X\) is composed of the column vectors \(x\) and \(y\): \(X = \bigl(
x | y \bigr)\) and its inverse is \(X^{-1} = X^t = \begin{pmatrix} x^t \\ \hline
y^t \end{pmatrix}
=
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 1 \\
1 & -1
\end{pmatrix}.\)
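A quick numerical confirmation of this decomposition (a sketch with NumPy):

```python
import numpy as np

# Check the decomposition found above: A = X D X^t with
# normalized, orthogonal eigenvectors.
A = np.array([[-1.0, 2.0],
              [ 2.0, -1.0]])
X = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)   # columns x and y
D = np.diag([1.0, -3.0])                   # eigenvalues 1 and -3

assert np.allclose(X.T @ X, np.eye(2))  # X^{-1} = X^t
assert np.allclose(X @ D @ X.T, A)      # reconstructs A
```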

# Linearisation of functions \(\mathbb{R}^2 \to \mathbb{R}^2\)

Nonlinear systems of ordinary differential equations are used to
describe the **dynamics** (evolution in time) of
concentrations of biochemical species, of population densities in ecological
systems, or of the electrophysiology of neurons.

Two dimensional systems are described by two ordinary differential
equations (ODEs) \(\begin{aligned}
\frac{dx_1}{dt} & = f_1(x_1,x_2), \\
\frac{dx_2}{dt} & = f_2(x_1,x_2), \\
\end{aligned}\) The variables \(x_1,
x_2\) are functions of time: \(x_1(t)\), \(x_2(t)\), and the functions \(f_1, f_2\) give their derivatives. We define
the two-dimensional vectors \(\boldsymbol{x} =
(x_1,x_2)^t\) (here we will use **bold** for
vectors), and \(\boldsymbol{f} = (f_1,
f_2)^t\). The ODEs can now be represented in vector format, \(\frac{d\boldsymbol{x}}{dt} =
\boldsymbol{f}(\boldsymbol{x}).\) Here we assume that there
exists a point in the 2D plane \(\bar{\boldsymbol{x}}\) such that the
derivative \(\boldsymbol{f}(\bar{\boldsymbol{x}}) = 0.\)
This point is called a **steady state** because the
derivatives are all zero there; the steady state is therefore a constant
solution to the system of ODEs.

We are interested in how \(\boldsymbol{f}\) behaves around the
steady state. To do that, we linearise the function \(\boldsymbol{f}\) at the steady state.
**Linearisation** is a first-order expansion. For a
function from \(\mathbb{R}^2 \to
\mathbb{R}^2\), a first-order expansion around a point \(\boldsymbol{x}_0\) is \(\begin{aligned}
\boldsymbol{f}(\boldsymbol{x}) \approx \boldsymbol{f}(\boldsymbol{x}_0)
+
\boldsymbol{D}\boldsymbol{f}(\boldsymbol{x}_0)
(\boldsymbol{x}-\boldsymbol{x}_0).
\end{aligned}\) When expanding around a steady state, the
constant term \(\boldsymbol{f}(\bar{\boldsymbol{x}}) = 0\).
In the second term, \(\boldsymbol{D}\boldsymbol{f}\) is a \(2 \times 2\) matrix, called the Jacobian
matrix, and often denoted \(\boldsymbol{J}\). The **Jacobian
matrix** for the function \(\boldsymbol{f}\) is defined as \(\begin{aligned}
\boldsymbol{J} = \boldsymbol{D}\boldsymbol{f} =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial
x_2} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial
x_2}
\end{pmatrix}.
\end{aligned}\) When evaluated at a steady state, the Jacobian
matrix can provide information on the dynamics of the nonlinear ODE
system. More precisely, the eigenvalues of the Jacobian matrix can
determine whether the steady state is stable (attracts solutions) or is
unstable. **Linearisation around a steady state means computing
the Jacobian matrix at the steady state.**

**Example ** Linearisation around a steady state

The Lotka-Volterra equations are a classical ODE system in mathematical biology. The equations read \(\begin{aligned} \frac{dx}{dt} = ax - xy, \\ \frac{dy}{dt} = xy - by, \\ \end{aligned}\) for \(a, b\) positive constants. The solution vector is \(\boldsymbol{x} = (x,y)^t\) and the derivatives are \(f_1(x,y) = ax - xy\) and \(f_2(x,y) = xy - by\). We first look for steady states: \(f_1 = ax - xy = 0, \quad f_2 = x y - b y = 0.\) If \(x\) and \(y\) are not zero, we have \(x = b\) and \(y = a\). If \(x = 0\), the second equation implies \(y = 0\). If \(y = 0\), the first equation implies \(x = 0\). Therefore there are two steady states, \(\bar{\boldsymbol{x}} = (b,a)^t\) and \(\hat{\boldsymbol{x}} = (0,0)^t\).

We have the following derivatives \(\begin{aligned} \frac{\partial f_1}{\partial x} (x,y) & = a - y, \\ \frac{\partial f_1}{\partial y} (x,y) & = - x, \\ \frac{\partial f_2}{\partial x} (x,y) & = y, \\ \frac{\partial f_2}{\partial y} (x,y) & = x - b. \end{aligned}\) The Jacobian matrix is \(\begin{aligned} \boldsymbol{J} = \begin{pmatrix} a - y & - x \\ y & x - b \end{pmatrix}. \end{aligned}\) Evaluated at the steady states \(\bar{\boldsymbol{x}} = (b,a)^t\) and \(\hat{\boldsymbol{x}} = (0,0)^t\), the Jacobian matrices are \(\begin{aligned} \boldsymbol{J}(\bar{\boldsymbol{x}}) = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix}, \quad \boldsymbol{J}(\hat{\boldsymbol{x}}) = \begin{pmatrix} a & 0 \\ 0 & -b \end{pmatrix}. \end{aligned}\)
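The analytic Jacobian can be spot-checked with a central finite difference; a minimal sketch (the parameter values are arbitrary illustrations):

```python
# Spot-check the Lotka-Volterra Jacobian at the nonzero steady state (b, a)
# with central finite differences. The parameter values are illustrative.
a, b = 1.5, 0.8

def f(x, y):
    return (a * x - x * y, x * y - b * y)

def jacobian_fd(x, y, h=1e-6):
    # central differences column by column
    return [[(f(x + h, y)[i] - f(x - h, y)[i]) / (2 * h),
             (f(x, y + h)[i] - f(x, y - h)[i]) / (2 * h)] for i in range(2)]

J = jacobian_fd(b, a)   # analytic result: [[0, -b], [a, 0]]
assert abs(J[0][0] - 0) < 1e-6 and abs(J[0][1] - (-b)) < 1e-6
assert abs(J[1][0] - a) < 1e-6 and abs(J[1][1] - 0) < 1e-6
```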

## Exercises on linearisation

**Exercise ** Let the function \(\boldsymbol{f} = (f_1, f_2)^t\), with \(f_1(x,y) = -d x + x \exp(-a xy), \quad f_2(x,y) =
x - y,\) \(d < 1\), \(a, d\) positive. Find the steady states (by
solving the equations \(f_1 = 0, f_2 =
0\)). Compute the Jacobian matrix, and evaluate the Jacobian
matrix at each steady state.

**Correction** The steady states are found by solving
\(\begin{aligned}
f_1(x,y) = -d x + x \exp(-a xy) & = 0, \\
f_2(x,y) = x - y & = 0.
\end{aligned}\) From the second equation, we have \(x = y\). The first equation is equivalent
to \(d x = x \exp( - a xy).\) We need
to distinguish two cases: (i) \(x =
0\), and (ii) \(x \neq 0\). Case
(i) leads to the solution \(\boldsymbol{x}^* =
(0, 0)^t\), our first steady state. Case (ii) means that we can
divide the first equation by \(x\):
\(d = \exp(- a xy).\) Replacing \(y = x\), and solving for \(x\): \(\begin{aligned}
d & = \exp(-a x y), \\
d & = \exp(-a x^2), \\
\ln d & = - a x^2, \\
- \frac{\ln d}{a} & = x^2, \quad (a > 0)
\end{aligned}\) The hypothesis \(d<1\) ensures that \(\ln d < 0\) and \(- \ln d > 0\). There are therefore two
real solutions for \(x\): \(x = \pm \sqrt{- \frac{\ln d}{a}}.\) The
two additional steady states are \(\bar{\boldsymbol{x}}_{1,2} =
\begin{pmatrix}
\pm \sqrt{- \frac{\ln d}{a}} \\ \pm \sqrt{- \frac{\ln d}{a}}
\end{pmatrix}.\)

The Jacobian matrix of \(\boldsymbol{f}\) is computed from the partial derivatives \(\begin{aligned} \frac{\partial f_1}{\partial x} (x,y) & = -d + (1 - a x y) \exp ( - axy ), \\ \frac{\partial f_1}{\partial y} (x,y) & = -a x^2 \exp ( -axy ), \\ \frac{\partial f_2}{\partial x} (x,y) & = 1, \\ \frac{\partial f_2}{\partial y} (x,y) & = -1. \end{aligned}\) (The first derivative follows from the product rule: \(\partial_x \bigl[ x \exp(-axy) \bigr] = \exp(-axy) + x (-ay) \exp(-axy)\).) The Jacobian matrix is \(\begin{aligned} \boldsymbol{J} & = \begin{pmatrix} -d + (1 - axy) \exp ( - axy ) & -a x^2 \exp ( -axy ) \\ 1 & -1 \end{pmatrix}. \end{aligned}\) The function \(f_2\) is linear. This is reflected in the Jacobian matrix, which has constant coefficients on the second row. The evaluation of the Jacobian matrix at the steady state \(\boldsymbol{x}^* = (0,0)^t\) is \(\boldsymbol{J}(\boldsymbol{x}^*) = \begin{pmatrix} 1 - d & 0 \\ 1 & -1 \end{pmatrix}.\) At the steady state \(\bar{\boldsymbol{x}}_1 = \bigl( \sqrt{- \frac{\ln d}{a}}, \sqrt{- \frac{\ln d}{a}} \bigr)^t\), we use the facts that steady states satisfy \(\exp( -a \bar x_1 \bar y_1 ) = d\) and \(a \bar x_1 \bar y_1 = a \bar x_1^2 = - \ln d\) to simplify: \(\begin{aligned} \boldsymbol{J}(\bar{\boldsymbol{x}}_1) & = \begin{pmatrix} -d + (1 + \ln d) d & (\ln d)\, d \\ 1 & -1 \end{pmatrix}, \\ & = \begin{pmatrix} d \ln d & d \ln d \\ 1 & -1 \end{pmatrix}. \end{aligned}\) Because the Jacobian depends on \((x,y)\) only through \(xy\) and \(x^2\), the steady state \(\bar{\boldsymbol{x}}_2 = -\bar{\boldsymbol{x}}_1\) gives exactly the same matrix: \(\boldsymbol{J}(\bar{\boldsymbol{x}}_2) = \boldsymbol{J}(\bar{\boldsymbol{x}}_1)\).
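The partial derivatives of \(\boldsymbol{f}\) can be spot-checked with central finite differences (mind the product rule in \(\partial f_1 / \partial x\)); a minimal sketch with arbitrary illustrative values of \(a\) and \(d < 1\):

```python
import math

# Spot-check the Jacobian of f1 = -d x + x exp(-a x y), f2 = x - y
# by central finite differences. Parameter values are illustrative, d < 1.
a, d = 2.0, 0.5

def f(x, y):
    return (-d * x + x * math.exp(-a * x * y), x - y)

def jacobian_fd(x, y, h=1e-6):
    return [[(f(x + h, y)[i] - f(x - h, y)[i]) / (2 * h),
             (f(x, y + h)[i] - f(x, y - h)[i]) / (2 * h)] for i in range(2)]

# At the steady state (0, 0): J = [[1 - d, 0], [1, -1]]
J0 = jacobian_fd(0.0, 0.0)
assert abs(J0[0][0] - (1 - d)) < 1e-5 and abs(J0[0][1]) < 1e-5

# At the nonzero steady state x = y = sqrt(-ln(d)/a):
# both first-row entries equal d ln d
xb = math.sqrt(-math.log(d) / a)
J1 = jacobian_fd(xb, xb)
assert abs(J1[0][0] - d * math.log(d)) < 1e-4
assert abs(J1[0][1] - d * math.log(d)) < 1e-4
```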

**Exercise ** Compute the Jacobian matrices of each of
the following functions of \((x,y)\).
All parameters are constants. You do not need to compute the steady
states, just the matrices.

van der Pol oscillator \(f_1(x,y) = \mu\bigl( (1-x^2)y - x \bigr), \; f_2(x,y) = y.\)

Two-compartment pharmacokinetics \(f_1(x,y) = a - k_{12} x + k_{21} y - k_1 x, \; f_2(x,y) = k_{12} x - k_{21} y.\)

SI epidemiological model \(f_1(x,y) = - \beta x y, \; f_2(x,y) = \beta x y - \gamma y.\)

# Solution of systems of linear differential equations in dimension 2

Linear differential equations have linear derivative parts, which can
be represented in matrix-vector format \(\frac{d\boldsymbol{x}(t)}{dt} = \boldsymbol{A}
\boldsymbol{x}(t),\) for a vector \(\boldsymbol{x}\) and a square matrix \(\boldsymbol{A}\). For the initial condition
\(\boldsymbol{x}(0) =
\boldsymbol{x}_0\), the **solution of the linear system of
ODEs** is \(\begin{aligned}
\boldsymbol{x}(t) = e^{\boldsymbol{A}t} \boldsymbol{x}_0.
\end{aligned}\) If we have at our disposal an eigenvalue
decomposition of \(\boldsymbol{A} =
\boldsymbol{X} \boldsymbol{D} \boldsymbol{X}^{-1}\), the
**exponential of the matrix** is \(\begin{aligned}
e^{\boldsymbol{A}t} & = \boldsymbol{X} e^{\boldsymbol{D}t}
\boldsymbol{X}^{-1}, \\
& = \boldsymbol{X}
\begin{pmatrix}
e^{ \lambda_1 t } & 0 \\
0 & e^{ \lambda_2 t }
\end{pmatrix}
\boldsymbol{X}^{-1}.
\end{aligned}\) Therefore, the long-time behavior of the
exponential is controlled by the eigenvalues \(\lambda_{1,2}\).
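This construction can be verified numerically; a minimal sketch where the matrix \(\boldsymbol{A}\) and time \(t\) are arbitrary illustrations, comparing the eigendecomposition route with the Taylor series of the matrix exponential:

```python
import numpy as np

# Sketch: compute exp(A t) from an eigenvalue decomposition A = X D X^{-1}
# and compare with the (truncated) Taylor series of the matrix exponential.
A = np.array([[0.0, -1.0],
              [2.0,  0.0]])
t = 0.7

lam, X = np.linalg.eig(A)                      # complex eigenpairs
expAt_eig = (X @ np.diag(np.exp(lam * t)) @ np.linalg.inv(X)).real

# exp(At) = sum_k (At)^k / k!, truncated
expAt_taylor = np.zeros((2, 2))
term = np.eye(2)
for k in range(1, 30):
    expAt_taylor += term
    term = term @ (A * t) / k

assert np.allclose(expAt_eig, expAt_taylor)
```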

**Example ** Solution of a linear system of ODEs

Consider the linear system of ODEs obtained by linearising the Lotka-Volterra model at its nonzero steady state \(\bar{\boldsymbol{x}} = (b,a)^t\): \(\begin{aligned} \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \quad \begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}. \end{aligned}\) This system approximates the nonlinear version near the steady state. In this linear system, the variables \((x,y)\) are deviations from the steady state; their solutions are "centered" around 0. To solve this linear system, we will diagonalise the matrix \(\begin{aligned} A = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix}. \end{aligned}\) The goal is to go slowly through every step once for this system. In general it is not necessary to solve the system completely by hand; knowledge of the eigenvalues is often sufficient in many applications.

We have \(\det A = 0 - a(-b) = ab >
0\), \(\mathrm{tr}\,A = 0\) and
\(\Delta = 0 - 4ab = -4ab < 0\). The
eigenvalues are therefore complex conjugates: \(\lambda_{1,2} = \pm i \sqrt{ab}.\) Distinct
eigenvalues mean that \(A\) is
diagonalisable. The eigenvector associated to \(\lambda_1 = i\sqrt{ab}\) is given by the
system \(\begin{aligned}
\Biggl(
\begin{array}{cc|c}
-i\sqrt{ab} & -b & 0 \\
a & -i\sqrt{ab} & 0
\end{array}
\Biggr)
\end{aligned}\) We have from the first row \(-i\sqrt{ab} x = b y\). Letting \(x = b\) and \(y =
-i\sqrt{ab}\), we obtain the non-normalized eigenvector \(\tilde x_1 = (b, -i\sqrt{ab})^t\).
Normalization is done by dividing by \(||\tilde x_1 || = \sqrt{|b|^2 + |-i\sqrt{ab}|^2} =
\sqrt{b^2 + ab}\) (for a complex vector, the norm uses the moduli of the components), to obtain the first eigenvector \(x =
\begin{pmatrix}
\frac{b}{\sqrt{b^2 + ab}} \\
\frac{-i\sqrt{ab}}{\sqrt{b^2 + ab}}
\end{pmatrix}
=
\begin{pmatrix}
\frac{b}{\sqrt{b}\sqrt{b + a}} \\
\frac{-i\sqrt{a}\sqrt{b}}{\sqrt{b}\sqrt{b + a}}
\end{pmatrix}
=
\begin{pmatrix}
\frac{\sqrt{b}}{\sqrt{b + a}} \\
\frac{-i\sqrt{a}}{\sqrt{b + a}}
\end{pmatrix}.\) The second eigenvector is computed the same way
(watch out for the slightly different signs!). The eigenproblem for the
eigenvalue \(\lambda = -i\sqrt{ab}\) is
\(\begin{aligned}
\Biggl(
\begin{array}{cc|c}
+i\sqrt{ab} & -b & 0 \\
a & +i\sqrt{ab} & 0
\end{array}
\Biggr)
\end{aligned}\) Given that the only change is \(-i \to +i\), the second eigenvector is
\(x_2 =
\begin{pmatrix}
\frac{\sqrt{b}}{\sqrt{b + a}} \\
\frac{i\sqrt{a}}{\sqrt{b + a}}
\end{pmatrix}.\) The solution to the linear ODE is \(\begin{pmatrix}
x(t) \\ y(t)
\end{pmatrix}
=
\boldsymbol{X} e^{\boldsymbol{D}t} \boldsymbol{X}^{-1}
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix},\) with \(\begin{aligned}
\boldsymbol{X} =
\frac{1}{\sqrt{b + a}}
\begin{pmatrix}
\sqrt{b} & \sqrt{b} \\
-i\sqrt{a} & i\sqrt{a}
\end{pmatrix}, \quad
\boldsymbol{D} =
\begin{pmatrix}
+i\sqrt{ab} & 0 \\
0 & -i\sqrt{ab}
\end{pmatrix}
\end{aligned}\) The **inverse of a \(2 \times 2\) matrix** with
coefficients \(a,b,c,d\) (generic symbols here, not the model parameters) is \(\begin{aligned}
\begin{pmatrix}
a & b \\
c & d
\end{pmatrix}^{-1} =
\frac{1}{ad-bc}
\begin{pmatrix}
d & -b \\
-c & a
\end{pmatrix}.
\end{aligned}\) This requires \(\det = ad - bc \neq 0\), of course. With
this formula, the inverse of \(\boldsymbol{X}\) is \(\boldsymbol{X}^{-1} =
\frac{1}{\sqrt{b + a}} \frac{1}{\det \boldsymbol{X}}
\begin{pmatrix}
i \sqrt{a} & -\sqrt{b} \\
i \sqrt{a} & \sqrt{b}
\end{pmatrix}.\) The determinant \(\det
\boldsymbol{X} = \frac{i \sqrt{b} \sqrt{a}}{b + a} + \frac{i \sqrt{a}
\sqrt{b}}{b + a} = 2i \frac{\sqrt{ab}}{b + a}.\) The inverse
reduces to \(\frac{1}{\sqrt{b + a}} \frac{a +
b}{2i\sqrt{ab}}
\begin{pmatrix}
i \sqrt{a} & -\sqrt{b} \\
i \sqrt{a} & \sqrt{b}
\end{pmatrix}
=
\frac{-i \sqrt{b + a}}{2\sqrt{ab}}
\begin{pmatrix}
i \sqrt{a} & -\sqrt{b} \\
i \sqrt{a} & \sqrt{b}
\end{pmatrix}
=
\frac{\sqrt{b + a}}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{a} & i\sqrt{b} \\
\sqrt{a} & -i\sqrt{b}
\end{pmatrix}.\) We have now obtained the eigenvalue decomposition
of \(\boldsymbol{A} = \boldsymbol{X}
\boldsymbol{D} \boldsymbol{X}^{-1}\). To solve the linear ODE, we
need to compute the product \(\begin{aligned}
\begin{pmatrix}
x(t) \\ y(t)
\end{pmatrix}
& =
\boldsymbol{X} e^{\boldsymbol{D}t} \boldsymbol{X}^{-1}
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\
& =
\frac{1}{\sqrt{b + a}}
\begin{pmatrix}
\sqrt{b} & \sqrt{b} \\
-i\sqrt{a} & i\sqrt{a}
\end{pmatrix}
\begin{pmatrix}
e^{+i\sqrt{ab}t} & 0 \\
0 & e^{-i\sqrt{ab}t}
\end{pmatrix}
\frac{\sqrt{b + a}}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{a} & i\sqrt{b} \\
\sqrt{a} & -i\sqrt{b}
\end{pmatrix}
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\
& =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{b} & \sqrt{b} \\
-i\sqrt{a} & i\sqrt{a}
\end{pmatrix}
\begin{pmatrix}
e^{+i\sqrt{ab}t} & 0 \\
0 & e^{-i\sqrt{ab}t}
\end{pmatrix}
\begin{pmatrix}
\sqrt{a} & i\sqrt{b} \\
\sqrt{a} & -i\sqrt{b}
\end{pmatrix}
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\
& =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{b} e^{+i\sqrt{ab}t} & \sqrt{b} e^{-i\sqrt{ab}t}\\
-i\sqrt{a} e^{+i\sqrt{ab}t} & i\sqrt{a} e^{-i\sqrt{ab}t}
\end{pmatrix}
\begin{pmatrix}
\sqrt{a}x_0+i\sqrt{b}y_0 \\
\sqrt{a}x_0-i\sqrt{b}y_0
\end{pmatrix}. \\
\end{aligned}\) To simplify the last steps of the calculation, we
will introduce the following notation. Using Euler’s formula, \(e^{\pm i\sqrt{ab}t} = \cos(\sqrt{ab}t) \pm i
\sin(\sqrt{ab}t)\). Let \(c =
\cos(\sqrt{ab}t)\), \(s =
\sin(\sqrt{ab}t)\), and \(C_1 =
\sqrt{a}x_0+i\sqrt{b}y_0\), \(C_2 =
\sqrt{a}x_0-i\sqrt{b}y_0\). The solution reads \(\begin{aligned}
\begin{pmatrix}
x(t) \\ y(t)
\end{pmatrix} & =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{b} e^{i\sqrt{ab}t} C_1 + \sqrt{b} e^{-i\sqrt{ab}t} C_2 \\
-i\sqrt{a}e^{i\sqrt{ab}t} C_1 + i\sqrt{a}e^{-i\sqrt{ab}t} C_2
\end{pmatrix}, \\
& =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{b} (c + is) C_1 + \sqrt{b} (c - is) C_2 \\
-i\sqrt{a}(c + is) C_1 + i\sqrt{a}(c - is) C_2
\end{pmatrix}, \\
& =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
\sqrt{b} c (C_1 + C_2) + i \sqrt{b} s (C_1 - C_2) \\
\sqrt{a} s (C_1 + C_2) + i \sqrt{a} c (-C_1 + C_2)
\end{pmatrix}, \\
& =
\frac{1}{2\sqrt{ab}}
\begin{pmatrix}
2 \sqrt{ab} \cos(\sqrt{ab}t) x_0 - 2 b \sin(\sqrt{ab}t) y_0 \\
2 a \sin(\sqrt{ab}t)x_0 + 2 \sqrt{ab} \cos(\sqrt{ab}t) y_0
\end{pmatrix}, \\
& =
\begin{pmatrix}
\cos(\sqrt{ab}t) x_0 - \sqrt{b/a} \sin(\sqrt{ab}t) y_0 \\
\sqrt{a/b} \sin(\sqrt{ab}t)x_0 + \cos(\sqrt{ab}t) y_0
\end{pmatrix}.
\end{aligned}\) And that’s it! We have obtained a solution to the
linear ODE (Figure 8).
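As a final sanity check (a sketch with arbitrary illustrative values), the closed-form solution can be differentiated numerically and compared with the right-hand side of the linear system:

```python
import math

# Check that the closed-form solution
#   x(t) = cos(w t) x0 - sqrt(b/a) sin(w t) y0
#   y(t) = sqrt(a/b) sin(w t) x0 + cos(w t) y0,   w = sqrt(a b),
# satisfies dx/dt = -b y and dy/dt = a x. Values are illustrative.
a, b = 1.3, 0.6
x0, y0 = 0.4, -0.2
w = math.sqrt(a * b)

def sol(t):
    x = math.cos(w * t) * x0 - math.sqrt(b / a) * math.sin(w * t) * y0
    y = math.sqrt(a / b) * math.sin(w * t) * x0 + math.cos(w * t) * y0
    return x, y

# initial condition
assert abs(sol(0.0)[0] - x0) < 1e-12 and abs(sol(0.0)[1] - y0) < 1e-12

# compare a central-difference derivative with the right-hand side A x
t, h = 0.9, 1e-6
x, y = sol(t)
dxdt = (sol(t + h)[0] - sol(t - h)[0]) / (2 * h)
dydt = (sol(t + h)[1] - sol(t - h)[1]) / (2 * h)
assert abs(dxdt - (-b * y)) < 1e-6
assert abs(dydt - a * x) < 1e-6
```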

# Glossary

French | English | Note |
---|---|---|
dérivable | differentiable | |
matrice jacobienne | Jacobian matrix | |
rang | rank | |
noyau | kernel | notation: \(\ker\) |
ensemble | set | |
espace vectoriel | vector space | |
sous-espace vectoriel | linear subspace | |
valeur propre | eigenvalue | |
vecteur propre | eigenvector | |
sous-espace propre | eigenspace | |
décomposition en valeurs propres | eigenvalue decomposition | |
décomposition en valeurs singulières | singular value decomposition | |
valeur singulière | singular value | |
trace | trace | |
déterminant | determinant | \(\det\) |
base | basis | |
application linéaire | linear map | |
application | map | |
dimension | dimension | |
moindres carrés | least-squares | |
produit scalaire | scalar product | |
Vect | Span | |
famille libre | linearly independent set | |
famille génératrice | spanning set | |