README

This document is intended for training and future reference. As a reference document, you may find it useful for Biomaths 1 (3BS, Fall semester), for linear algebra (3BIM Biomaths 2, Winter Semester) and for Advanced ODEs (3BIM Biomaths 3, Winter Semester).

When important concepts are encountered for the first time, they are highlighted in bold next to their definition. We tried to provide examples that are as complete as possible. This means that they are long; you could probably solve them faster. Exercises are important: they can introduce theory or techniques that will prove useful. Solutions to the exercises can be found at the end of the document. Examples and exercises (and their solutions!) will be added regularly, so feel free to download the most recent version (here). This is version V2023.09.07.

Functions, maps

A function is a relation, often denoted \(f\), that associates to each element \(x\) of a domain \(I\) at most one element \(y\) of the image \(J\). The domain \(I\) and image \(J\) are sets. Usually \(I, J \subset \mathbb{R}\).

Functions. (A) Function \(f\). (B) Not a function.

A map is a relation that associates each element of its domain to exactly one element of its image. Maps and functions are related but slightly different concepts. A function \(f\) is a map if it is defined for all elements of its domain \(I\). A map is always a function. The term map can also be used when the domain or the image are not sets of numbers (Figure 1).

The graph of a function \(f\), denoted \(\mathcal{G}(f)\) is the set of all pairs \((x, f(x))\) in the \(I \times J\) plane. For real-valued functions, the graph is represented as a curve in the Cartesian plane.

Functions are not numbers. Do not confuse

  • \(f\) the function

  • \(f(x)\) the evaluation of \(f\) at element \(x\); \(f(x)\) is an element of the image (usually a number)

  • \(\mathcal{G}(f)\) the graph of \(f\)

Consequently, do not write

  • \(f(x)\) is increasing... Instead write \(f\) is increasing...

  • \(f(x)\) is decreasing... Instead write \(f\) is decreasing...

  • \(f(x)\) is continuous... Instead write \(f\) is continuous...

Some usual maps

  • \(f:\mathbb{R} \to \mathbb{R}\), with \(x \to k\), \(k \in \mathbb{R}\) constant; \(x \to x\), identity map.

  • \(f:\mathbb{R}\backslash\{0\} \to \mathbb{R}\), with \(x \to \frac 1x\), inverse.

  • \(f:\mathbb{R} \to \mathbb{R}\), with \(x \to x^2\), parabola; \(x \to x^3\), cubic map.

  • \(f:\mathbb{R}^+ \to \mathbb{R}\), with \(x \to \sqrt{x} = x^{\frac 12}\), square root; more generally with \(x \to x^{\frac pq} = \sqrt[q]{x^p}\), fractional power.

  • \(f:\mathbb{R}\backslash\{-d/c\} \to \mathbb{R}\), with \(x \to \frac{ax + b}{cx + d}\).

  • \(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \exp(x)\), exponential.

  • \(f:\mathbb{R}^+\backslash\{0\} \to \mathbb{R}\), with \(x \to \ln(x)\), natural logarithm.

    On logarithms: For \(a, b >0\), \(n\) positive integer, \(\ln(ab) = \ln(a) + \ln(b)\), \(\ln(a^n) = n \ln(a)\), \(\ln(a/b) = \ln(a) - \ln(b)\).

  • \(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \cos(x)\), cosine; \(x \to \sin(x)\), sine; \(x \to \tan(x)\), tangent (the tangent is not defined at \(\pi/2 + k\pi\), \(k\) integer).

    A bit more on trigonometric functions. The diagram below shows the relationship between the sine, cosine and tangent, of an angle \(\theta \in [0,2\pi]\).

  • \(f:\mathbb{R} \to \mathbb{R}\), with \(x \to \cosh(x) = \frac 12 \bigl( e^{x} + e^{-x} \bigr)\), hyperbolic cosine; \(x \to \sinh(x) = \frac 12 \bigl( e^{x} - e^{-x} \bigr)\), hyperbolic sine; \(x \to \tanh(x) = \frac{\sinh(x)}{\cosh(x)}\), hyperbolic tangent.

Exercises on functions

More to come...

Derivatives

We call the derivative of the function \(f:I \to J\) (\(I,J \subset \mathbb{R}\)), at point \(a \in I\), the limit, if it exists, \(\lim_{x \to a} \frac{f(x) - f(a)}{x - a}.\) The derivative is denoted \(f'(a)\). An alternative representation of the limit is obtained by setting \(h = x - a\): \(f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.\) If the derivative exists for all elements \(a \in I\), we say that \(f\) is differentiable on \(I\).

  • If \(f\) is differentiable on \(I\), and \(f'(x) > 0\), then \(f\) is strictly increasing on \(I\).

  • If \(f\) is differentiable on \(I\), and \(f'(x) < 0\), then \(f\) is strictly decreasing on \(I\).

However, if \(f\) is strictly increasing, it does not mean that \(f'(x) > 0\). For example the function \(f\) with \(f(x) = x^3\) is strictly increasing on \(\mathbb{R}\), but \(f'(0) = 0\). Where the derivative exists, we can define the derivative function \(f':I \to \mathbb{R}\) of \(f\).
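
The limit definition lends itself to a quick numerical check. Below is a minimal Python sketch (the helper name is ours) that approximates \(f'(a)\) for \(f(x) = x^3\) with the difference quotient \((f(a+h) - f(a))/h\) for a small \(h\):

    def difference_quotient(f, a, h=1e-6):
        # (f(a+h) - f(a)) / h, an approximation of the limit defining f'(a)
        return (f(a + h) - f(a)) / h

    f = lambda x: x**3
    print(difference_quotient(f, 1.0))   # close to f'(1) = 3
    print(difference_quotient(f, 0.0))   # close to f'(0) = 0, even though f is strictly increasing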

The second derivative of a function \(f\), denoted \(f''\), is the derivative of \(f'\), where defined. If \(f''(x)\) exists and \(f''(x) > 0\) for all \(x \in I\), we say that \(f\) is convex (U-shaped). If \(f'(x) = 0\) and \(f''(x) > 0\), the point \(x\) is a minimum. If \(f'(x) = 0\) and \(f''(x) < 0\), the point \(x\) is a maximum. Maxima and minima are extrema. If \(f''(x) = 0\) and \(f''\) changes sign at \(x\), the point \(x\) is an inflection point (Figure 2).

Extrema, inflection points of the polynomial \(f(x) = (x+0.8)(x-0.5)(x-0.8).\)

List of common derivatives

The derivative has linear properties. If \(f\) and \(g\) are differentiable on \(I\), and \(a \in \mathbb{R}\),

  • \((f + g)' = f' + g'\).

  • \((af)' = a (f')\).

  • \((af + g)' = a (f') + g'\).

Let \(g:I \to J\) and \(f:J \to K\) be two functions. The composition of \(f\) and \(g\), denoted \(f \circ g\), is the function \(x \to f(g(x))\), i.e. \(f \circ g (x) = f(g(x))\). If \(f\) and \(g\) are differentiable, the composition \(f \circ g\) is also differentiable, and its derivative follows the rule of composed functions:

\(\bigl( f \circ g \bigr)'(x) = f'(g(x))g'(x).\)

Example  Let \(f:x \to x^2\) and \(g:x \to 3x + 1\) be two differentiable functions, with \(f'(x) = 2x\) and \(g'(x) = 3\). The derivative of the composed function \(f \circ g\) at \(x\) is \(f'(g(x))g'(x) = f'(3x+1)g'(x) = 2(3x+1) \cdot 3 = 6(3x+1) = 18x + 6.\) The derivative could have been obtained by first computing the composed function \(f(g(x)) = (3x+1)^2 = 9 x^2 + 6x + 1\), and then taking the derivative.
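
A computer algebra system can double-check this kind of computation. A minimal sketch with sympy (assuming the library is available):

    import sympy as sp

    x, u = sp.symbols('x u')
    f = u**2            # f(u) = u^2
    g = 3*x + 1         # g(x) = 3x + 1

    chain  = sp.diff(f, u).subs(u, g) * sp.diff(g, x)   # f'(g(x)) g'(x)
    direct = sp.diff(f.subs(u, g), x)                   # differentiate (3x+1)^2 directly
    print(sp.expand(chain), sp.expand(direct))          # both give 18*x + 6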

Example  Compute the derivative of \(f: x \to \sin(1/x)\). The function \(f\) is composed of a sine and an inverse function. To compute the derivative, we decompose the function \(f\) as \(f(x) = g(h(x))\) with \(g(x) = \sin(x)\) and \(h(x) = 1/x\). The derivatives are \(g'(x) = \cos(x)\) and \(h'(x) = -1/x^2\). Finally the derivative of \(f\) is

\(f'(x) = g'(h(x))h'(x) = \cos(1/x) \Bigl( \frac{-1}{x^2} \Bigr) = - \frac{\cos(1/x)}{x^2}.\)

Example  A function \(f:I \to I\) is bijective (invertible) on \(I\) if there exists a function, denoted \(f^{-1}\) and called inverse of \(f\), such that the compositions are equal \(f \circ f^{-1} = f^{-1} \circ f\), and are equal to the identity map. That is, \(f^{-1} \circ f (x) = f \circ f^{-1}(x) = x\) for all \(x \in I\). If \(f\) is differentiable and invertible, what is the derivative of \(f^{-1}\)?

We apply the derivative to \(f(f^{-1})\). Given that \(f(f^{-1}(x)) = x\) by definition, we have \(\bigl( f(f^{-1}) \bigr)' = 1,\) and \(\begin{aligned} \bigl( f(f^{-1}) \bigr)'(x) & = f'(f^{-1}(x))(f^{-1})'(x), \\ & = 1, \\ (f^{-1})'(x) & = \frac{1}{f'(f^{-1}(x))}. \end{aligned}\) Take for instance \(f(x) = x^2\) on \(x \in (0,1]\). The inverse of \(f\) is \(f^{-1}(x) = \sqrt{x}\). The derivative of \(f\) is \(f'(x) = 2x\) and the derivative \(\bigl(f^{-1}\bigr)'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{2(\sqrt{x})}.\)

Example  Compute a derivative when the independent variable is in the exponent. To compute the derivative of \(f: x \to 2^x\), we need to re-express the function in terms of the natural base \(e\). To do that, we use two properties of the natural logarithm

  • For any positive expression \(y\), \(y = e^{ \ln (y) }\) (\(\ln\) is the inverse of the exponential function).

  • \(\ln (a^b) = b \ln (a).\)

Then \(2^x = e^{ \ln (2^x) } = e^{ x \ln (2) }\). The derivative is \(\ln (2) e^{ x \ln (2) }\). Re-writing in terms of base 2, we obtain \(f'(x) = \ln (2) 2^x\).

Function Derivative Note
\(x^a\) \(ax^{a-1}\) \(a \in \mathbb{R}\)
\(\frac{1}{x}\) \(\frac{-1}{x^2}\)
\(x^{\frac 12}\) \(\frac{1}{2x^{\frac 12}}\)
\(\ln(x)\) \(\frac 1x\)
\(e^x\) \(e^x\)
\(\cosh(x)\) \(\sinh(x)\)
\(\sinh(x)\) \(\cosh(x)\)
\(\cos(x)\) \(-\sin(x)\)
\(\sin(x)\) \(\cos(x)\)
\(\tan(x)\) \(1 + \tan^2(x) = \frac{1}{\cos^2(x)}\)
\(\dfrac{u(x)}{v(x)}\) \(\dfrac{v(x)u'(x) - u(x)v'(x)}{v^2(x)}\)
\(u(x) v(x)\) \(u'(x)v(x) + u(x)v'(x)\)

Exercises on derivatives

Exercise  Compute the derivatives of the following functions

  • \(f_1 : x \to \sqrt{\cos x}.\)

  • \(f_2 : x \to \sin(3x + 2).\)

  • \(f_3 : x \to e^{\cos x}.\)

  • \(f_4 : x \to \ln\bigl(\sqrt{x} \bigr).\)

  • \(f_5 : x \to 2^{\ln x}.\)

Correction \(f_1'(x) = -\frac{\sin x}{2\sqrt{\cos x}}.\) \(f_2'(x) = 3 \cos(3x + 2).\) \(f_3'(x) = -\sin(x) e^{\cos x}.\) \(f_4'(x) = \frac{1}{2x}.\) \(f_5'(x) = \frac{\ln 2}{x} 2^{\ln x}.\)
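
The corrections can be verified symbolically; a short sketch with sympy:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    functions = [sp.sqrt(sp.cos(x)), sp.sin(3*x + 2), sp.exp(sp.cos(x)),
                 sp.log(sp.sqrt(x)), 2**sp.log(x)]
    for f in functions:
        # print each function next to its derivative
        print(f, '->', sp.simplify(sp.diff(f, x)))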

Taylor series and truncated expansions

If a function \(f\) is infinitely differentiable (i.e. the \(k\)-th derivative function \(f^{(k)}\) is continuous for any integer \(k\geq 0\)), the Taylor series of \(f\) at point \(a\) is the series \(\begin{aligned} f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!} (x-a)^2 + \frac{f'''(a)}{3!} (x - a)^3 + ... \end{aligned}\)

A function is analytic on an open interval \(I\) if and only if its Taylor series converges pointwise to the value of the function. Polynomials, the exponential and the trigonometric functions are analytic at every real point. The square root function is not analytic at 0.

The partial sums of the series (or truncated expansion) can be used to approximate a function, and to evaluate it numerically. The \(k\)th-order expansion of a function \(f\) is the polynomial \(\begin{aligned} f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!} (x-a)^2 + ... + \frac{f^{(k)}(a)}{k!} (x - a)^k. \end{aligned}\) Truncated expansions are used in implementations of common mathematical functions in computer programs.

Example  The Taylor series of the sine function at point \(a = 0\) is \(\sum_{n=0}^\infty \frac{1}{n!}\frac{d^n\sin}{dx^n}(0)\, x^n = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + ...\) The 3rd-order expansion is the cubic polynomial \(x - \frac{x^3}{3!}.\) How good is this cubic polynomial approximation to the original sine function? The error is the remainder of the terms of the Taylor series \(\begin{aligned} \Bigl| \frac{x^5}{5!} - \frac{x^7}{7!} + ... \Bigr|. \end{aligned}\) For \(|x| < 1\), the error is bounded by \(|x|^5/5!\). Given that \(5! = 120\), the error is less than \(1/120 \approx 0.008\). Truncated expansions are never good approximations when \(|x|\) becomes large, because polynomials are not bounded, but the approximation can be quite good over small intervals around the point at which the Taylor series is computed.

First, third and fifth order expansion of \(f(x) = \sin(x)\).
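
The error bound can be checked numerically; a small sketch with numpy:

    import numpy as np

    xs = np.linspace(-1, 1, 1001)
    cubic = xs - xs**3 / 6                     # 3rd-order expansion of sin at 0
    print(np.max(np.abs(np.sin(xs) - cubic)))  # about 0.008, below the bound 1/5! ~ 0.0083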

It is important to note that even if a function has a Taylor series, the series may not converge to the function.

Example  Let the function \(f(x) = \begin{cases} 0 & x \leq 0 \\ e^{-\frac 1x} & x > 0. \end{cases}\) At \(0\), all the derivatives of \(f\) exist and are equal to 0. The Taylor series of \(f\) at point \(x = 0\) is 0, but the function itself is different from the identically zero function.

Expansion of a function of two variables

For functions of two variables \(f : \mathbb{R}^2 \to \mathbb{R}\), the Taylor expansion at point \((x_0,y_0)^t \in \mathbb{R}^2\) is \(\begin{aligned} f(x,y) = f(x_0,y_0) & + \frac{\partial f(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f(x_0,y_0)}{\partial y} (y - y_0) \\ & + \frac{1}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial x^2} (x - x_0)^2 \\ & + \frac{1}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial y^2} (y - y_0)^2 \\ & + \frac{2}{2!} \frac{\partial^2 f(x_0,y_0)}{\partial x \partial y} (x - x_0) (y - y_0) + ... \end{aligned}\)

Example  The second-order truncated expansion of the function \(f(x,y) = y e^{-x}\) at \((0,0)^t\) is \(y - xy.\)
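
The expansion can be reproduced term by term with sympy, following the formula above (the point \((0,0)^t\) and the function are those of the example):

    import sympy as sp

    x, y = sp.symbols('x y')
    f = y * sp.exp(-x)
    at0 = {x: 0, y: 0}   # expansion point

    expansion = (f.subs(at0)
                 + sp.diff(f, x).subs(at0) * x + sp.diff(f, y).subs(at0) * y
                 + sp.Rational(1, 2) * sp.diff(f, x, 2).subs(at0) * x**2
                 + sp.Rational(1, 2) * sp.diff(f, y, 2).subs(at0) * y**2
                 + sp.diff(f, x, y).subs(at0) * x * y)
    print(sp.simplify(expansion))   # y - x*y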

Expansion of a function from \(\mathbb{R}^2 \to \mathbb{R}^2\)

Functions \(f: \mathbb{R}^2 \to \mathbb{R}^2\) have the form \(f(x,y) = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix}.\) To compute the Taylor series, we need to compute the Taylor series of each function \(f_1\) and \(f_2\). For a first-order expansion at \((x_0,y_0)^t\), we obtain the expansion \(\begin{aligned} \begin{pmatrix} f_1(x_0,y_0) \\ f_2(x_0,y_0) \end{pmatrix} + \begin{pmatrix} \frac{\partial f_1(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f_1(x_0,y_0)}{\partial y} (y - y_0) \\ \frac{\partial f_2(x_0,y_0)}{\partial x} (x - x_0) + \frac{\partial f_2(x_0,y_0)}{\partial y} (y - y_0) \end{pmatrix} \end{aligned}\) Using vector notation \(\begin{aligned} \boldsymbol{f}(\boldsymbol{x}) & = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix}, \\ D\boldsymbol{f} & = \begin{pmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{pmatrix}, \end{aligned}\) we can write a first-order (or linear) approximation of a function \(\boldsymbol{f}\) more compactly with \(\boldsymbol{f}(\boldsymbol{x}) \approx \boldsymbol{f}(\boldsymbol{a}) + D\boldsymbol{f}(\boldsymbol{a}) (\boldsymbol{x} - \boldsymbol{a}).\) The approximation is valid only in a neighbourhood of the point \(\boldsymbol{a}\).
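
The Jacobian matrix \(D\boldsymbol{f}\) and the linear approximation can be computed with sympy; a sketch for an illustrative function \(\boldsymbol{f}(x,y) = (y e^{-x}, x + y^2)^t\) (chosen here, not taken from the text), expanded around \(\boldsymbol{a} = (0,1)^t\):

    import sympy as sp

    x, y = sp.symbols('x y')
    F = sp.Matrix([y * sp.exp(-x), x + y**2])      # an example f: R^2 -> R^2
    J = F.jacobian([x, y])                         # the matrix Df of partial derivatives

    a = {x: 0, y: 1}                               # expansion point a = (0, 1)^t
    increment = sp.Matrix([x, y]) - sp.Matrix([0, 1])
    linear = F.subs(a) + J.subs(a) * increment     # f(a) + Df(a)(x - a)
    print(sp.simplify(linear))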

Integrals and primitives

Primitives

Let \(f:I \to \mathbb{R}\), (\(I \subset \mathbb{R}\)). A primitive \(F\) of \(f\) on \(I\) is a differentiable map such that \(F'(x) = f(x)\), \(x \in I\).

Example  If \(f(x) = x^2\), we look for \(F(x)\) such that \(F'(x) = x^2\). Take \(F(x) = x^3/3\), then \(F'(x) = 3x^2/3 = x^2.\) It also works for \(F(x) = x^3/3 + C\), for any constant \(C \in \mathbb{R}\).

We denote the primitive as \(F(x) = \int f(x) dx\). Primitives are unique up to a constant: if \(F_1\) and \(F_2\) are primitives of \(f\), then \(F_1'(x) = f(x)\) and \(F_2'(x) = f(x)\), which implies that \(F_1'(x) - F_2'(x) = 0\). Therefore the difference between \(F_1\) and \(F_2\), \(G : x \to F_1(x) - F_2(x)\) has a zero derivative: \(G'(x) = F_1'(x) - F_2'(x) = 0\). This means that \(G\) is a constant.

Primitives have linear properties. Let \(F\) be a primitive of \(f\), \(G\) be a primitive of \(g\) and \(a,b \in \mathbb{R}\). Then

  • \(F+G\) is a primitive of \(f+g\), or \(\int \bigl( f(x) + g(x) \bigr) dx = \int f(x)dx + \int g(x)dx.\)

  • \(aF\) is a primitive of \(af\), or \(\int af(x) dx = a \int f(x) dx.\)

Function Primitive Note
\(0\) \(C\) \(C \in \mathbb{R}\)
\(a\) \(ax + C\) \(a \in \mathbb{R}\)
\(x^a\) \(\frac{x^{a+1}}{a+1}\) \(a \neq -1\)
\(x^{-1} = \frac{1}{x}\) \(\ln |x| + C\) (\(a = -1\)), \(x \neq 0\)
\(e^x\) \(e^x + C\)
\(\cos(x)\) \(\sin(x) + C\)
\(\sin(x)\) \(-\cos(x) + C\)
\(f'(g(x))g'(x)\) \(f(g(x)) + C\)
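
Primitives can always be checked by differentiating them back; a minimal sympy sketch (sympy omits the constant \(C\)):

    import sympy as sp

    x = sp.symbols('x')
    F = sp.integrate(x**2, x)   # a primitive of x^2
    print(F)                    # x**3/3
    print(sp.diff(F, x))        # x**2, back to the original function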

Exercise  Compute \(\begin{aligned} \int { \frac{1}{\sqrt{3x + 5}} } dx \end{aligned}\)

Integrals

Let \(f:[a,b] \to \mathbb{R}\), and \(F\) a primitive of \(f\) on \([a,b]\). Then the definite integral is the value \(F(b) - F(a)\), and is denoted \(\begin{aligned} \int_a^b { f(x) dx } = \bigl[ F(x) \bigr]_a^b . \end{aligned}\)

The most important interpretation of the integral is the area under the curve (AUC). If \(f(x) \geq 0\), \(x \in [a,b]\), then \(\int_a^b { f(x) dx }\) is the area delimited by the graph of \(f\), the \(x\)-axis and the lines \(x = a\) and \(x = b\). Negative values integrate as negative areas.

If \(f\) is a rate (unit in something per time), and \(x\) is time, then the integral of \(f\) has unit something. This applies to speed (integral: displacement), production or degradation (integral: concentration), etc.

The integral has the following properties

  • Linearity. The integral of a sum is the sum of the integrals. \(\int_a^b { (f(x) + g(x))dx } = \int_a^b f(x) dx + \int_a^b g(x) dx.\)

  • Negative intervals. \(\int_a^b { f(x) dx } = - \int_b^a { f(x) dx }.\)

  • Additivity over intervals (Chasles' relation). \(\int_a^b { f(x)dx } = \int_a^c f(x)dx + \int_c^b f(x) dx.\) Notice that this works even if \(c\) is not between \(a\) and \(b\).

  • The expression \(\int_a^x { f(t) dt } = F(x) - F(a) = F_a(x)\) is a function of \(x\). Therefore, for any continuous function \(f\), we have \(\Bigl( \int_a^x f(t) dt \Bigr)' = f(x).\) This is the fundamental theorem of calculus.

  • Integration by parts \(\int_a^b f'(x)g(x)dx = f(x)g(x)|_a^b - \int_a^b f(x)g'(x)dx.\) Integration by parts is extremely useful for computing integrals beyond simple functions.

Exercise  Compute the integral \(\int_0^1 x e^x dx.\)

Correction The integrand is the product of two functions, so let us try integration by parts. Let \(g(x) = x\) and \(f'(x) = e^x\). Then \(g'(x) = 1\) and \(f(x) = e^x\). The integral becomes \(\begin{aligned} \int_0^1 x e^x dx & = \Bigl. e^x x \Bigr|_0^1 - \int_0^1 e^x dx, \\ & = e^1 - 0 - \Bigl. e^x \Bigr|_0^1, \\ & = e - (e^1 - e^0), \\ & = 0 + 1, \\ & = 1. \end{aligned}\)

Exercise  Compute the integral \(\int_0^\pi \cos x e^x dx.\)

Correction The integrand is the product of two functions, so we try integration by parts. Let \(g(x) = \cos x\) and \(f'(x) = e^x\). Then \(g'(x) = -\sin x\) and \(f(x) = e^x\). The integral becomes \(\begin{aligned} \int_0^\pi \cos x e^x dx & = \Bigl. e^x (\cos x) \Bigr|_0^\pi - \int_0^\pi e^x (-\sin x)dx, \\ & = - e^\pi - e^0 - \int_0^\pi e^x (-\sin x)dx, \\ & = - e^\pi - 1 - \int_0^\pi e^x (-\sin x)dx. \\ \end{aligned}\) We still have an integral of a product to compute. We apply integration by parts once more, \(\begin{aligned} & = - e^\pi - 1 - \bigl( \Bigl. e^x ( - \sin x ) \Bigr|_0^\pi + \int_0^\pi e^x (\cos x)dx \bigr), \\ & = - e^\pi - 1 - \int_0^\pi e^x (\cos x)dx. \\ \end{aligned}\) With this new integration, we just came back to our initial integral. We are looping, and further integrations by parts will not help us. To break the loop, we use the fact that the initial integral term is present on both sides of the equation and solve for it. If \(\begin{aligned} I = & \int_0^\pi \cos x e^x dx, \end{aligned}\) then \(\begin{aligned} I & = - e^\pi - 1 - I, \\ 2 I & = - e^\pi - 1, \\ I & = \frac{- e^\pi - 1}{2}. \end{aligned}\) Et voilà!
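
Both results (this one and the previous exercise) can be checked numerically; a sketch with scipy:

    import numpy as np
    from scipy.integrate import quad

    val1, _ = quad(lambda x: x * np.exp(x), 0, 1)
    print(val1)                                # 1.0
    val2, _ = quad(lambda x: np.cos(x) * np.exp(x), 0, np.pi)
    print(val2, -(np.exp(np.pi) + 1) / 2)      # both about -12.07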

Exercise  Compute the integral \(\int_0^1 { \sinh x e^x dx }.\)

Differential equations in 1D

An ordinary differential equation (ODE) of order \(n\) is a relation (i.e. an equation) between a real variable \(t \in I \subset \mathbb{R}\), an unknown function \(x: t \to x(t)\) and its derivatives \(x',x'',x''', ..., x^{(n)}\) at point \(t\) defined by \(F\bigl(t,x,x',x'',...,x^{(n)}\bigr) = 0,\) where \(F\) depends on \(x^{(n)}\). In general, \(x(t)\) takes values in \(\mathbb{R}^N\), i.e. it is a vector. We say that the equation is a scalar differential equation if \(N = 1\). (The expression \(x^{(i)}\) stands for the \(i\)-th derivative, not the \(i\)-th power.)

The normal form of a differential equation of order \(n\) is \(x^{(n)} = f\bigl(t,x,x',x'',...,x^{(n-1)}\bigr).\)

A differential equation is autonomous if it does not depend on \(t\), i.e. \(F\) has the form \(F\bigl(x,x',x'',...,x^{(n)}\bigr) = 0,\)

A linear differential equation is a differential equation for which \(F\) is a linear function in \(x,x',...x^{(n)}\). It can be expressed as \(a_n(t) x^{(n)} + a_{n-1}(t)x^{(n-1)} + ... a_1(t)x' + a_0(t)x = g(t),\) where the coefficients \(a_j(t)\) may depend on \(t\), and \(x\) and its derivatives \(x^{(i)}\) appear only as monomials of degree 1 (that is, linearly). If all coefficients are constants, including \(g\), the linear differential equation is autonomous.

Exercise  For the following differential equations, give the order \(n\), and determine whether they are autonomous, linear, and whether they are expressed under their normal form.

  1. \(x - t + 4 t x' = 0.\)

  2. \(\bigl( x'' \bigr)^2 - 2 x' t x = 0.\)

  3. \(x^{(3)} + \sin(x') = -5 x.\)

  4. \(x^{(4)} - x x' = 0 .\)

  5. \(3x'' - 4 x' + 6 x = 2.\)

  6. \(\ln x' + 3 t x = \sqrt{x}.\)

Correction

  1. order 1; linear, non-autonomous, not in normal form

  2. order 2; non-linear, non-autonomous, not in normal form

  3. order 3; non-linear, autonomous, not in normal form

  4. order 4; non-linear, autonomous, not in normal form

  5. order 2; linear, autonomous, not in normal form

  6. order 1; non-linear, non-autonomous, not in normal form

A solution or integral of a differential equation of order \(n\) for \(t\) in an interval \(I \subset \mathbb{R}\), is a map \(x: I \to \mathbb{R}\) that is \(n\) times differentiable for all \(t \in I\) and satisfies the differential equation.

An integral curve (or chronic) is the set of points \((t,x(t))\) for \(t \in I\). If \(x(t) \in \mathbb{R}^N\), the integral curve is in \(\mathbb{R}^{N+1}\). A trajectory or orbit is the set of points \(x(t)\) for \(t \in I\). This is a set in \(\mathbb{R}^N\). The space that contains the trajectories is called the phase space. The set of all trajectories is called the phase portrait.

It is always possible to write a differential equation of order \(n\) as a differential equation of order 1, by defining extra variables for the higher-order derivatives.

Example  We consider the differential equation \(a(t) x'' + b(t) x' + c(t) x = d(t)\). This is an equation of order 2. To reduce it to order 1, let \(z_1 = x\) and \(z_2 = x'\). Then \(x'' = z_2'\) and \(z_1' = z_2\). The second-order differential equation can be re-expressed as two first-order equations: \(z_1' = z_2, \quad a(t)z_2' + b(t)z_2 + c(t) z_1 = d(t).\) Often, the system of first-order equations can be re-expressed in normal form, by isolating the variables \(z_1'\) and \(z_2'\), \(z_1' = z_2, \text{ and } z_2' = \frac{d(t) - b(t)z_2 - c(t) z_1}{a(t)}.\) (We assume that \(a(t) \neq 0\).) In general, for the differential equation of order \(n\) \(F(t,x,x',...,x^{(n)}) = 0,\) with \(x : I \to \mathbb{R}^m\), we make a change of variables: \(z_1 = x, z_2 = x', ..., z_i = x^{(i-1)}\) until \(z_n = x^{(n-1)}\). Each variable \(z_i(t) \in \mathbb{R}^m\), so the new vector \(z = \bigl( z_1, z_2, ..., z_n \bigr)^t\) is in \(\mathbb{R}^{mn}\). With the change of variables, the differential equation now reads \(\begin{aligned} z_1' & = z_2, \\ z_2' & = z_3, \\ & ..., \\ z_i' & = z_{i+1}, \\ & ..., \\ F(t,&z_1,z_2,...,z_n,z_n') = 0. \\ \end{aligned}\)
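
This change of variables is exactly what numerical solvers expect. A sketch with scipy for the illustrative choice \(a(t) = 1\), \(b(t) = 0\), \(c(t) = 1\), \(d(t) = 0\), i.e. \(x'' + x = 0\), whose solution with \(x(0) = 1\), \(x'(0) = 0\) is \(\cos t\):

    import numpy as np
    from scipy.integrate import solve_ivp

    def rhs(t, z):
        # z = (z1, z2) = (x, x'); normal form: z1' = z2, z2' = -z1
        z1, z2 = z
        return [z2, -z1]

    sol = solve_ivp(rhs, (0, 10), [1.0, 0.0], t_eval=np.linspace(0, 10, 5))
    print(sol.y[0])        # numerical x(t)
    print(np.cos(sol.t))   # analytic solution, same values up to the solver tolerance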


Tips on Ordinary differential equations

  • The most frequently used differential equations are order 1, and they usually are represented in their normal form: \(x' = f(x)\) for autonomous equations, and \(x' = f(t,x)\) for non-autonomous equations

  • For a scalar, autonomous differential equation \(x' = f(x)\) with \(x(t) \in \mathbb{R}\), the trajectories are monotone: if \(x\) is a solution, then \(x\) is either increasing, decreasing, or constant.


Finding solutions of differential equations

We consider a first order scalar differential equation, \(a(t) x' + b(t) x = d(t),\) \(t \in I\) and \(a(t) \neq 0\) on \(I\), \(a(t)\) and \(b(t)\) continuous on \(I\). If \(d(t) = 0\), we call the equation homogeneous, and \(a(t) x' + b(t) x = 0.\)

First order scalar ODE. Homogeneous case, first method. Write the equation in normal form, \(x' = - \frac{b(t)}{a(t)} x.\) The solution \(x\) is either the constant \(x = 0\), or \(x(t) \neq 0\) for all \(t\in I\). We know that because of uniqueness of solutions, which implies that trajectories cannot cross. If \(x = 0\) we are done, so we can assume that \(x(t) \neq 0\). Dividing the equation by \(x\), \(\frac{x'}{x} = -\frac{b(t)}{a(t)}.\) The terms on both sides are functions of \(t\). We can compute their primitives \(\int {\frac{x'}{x} dt } = - \int { \frac{b(t)}{a(t)} dt}.\) The integrand of the left-hand side is \(x'/x\). This is a very common form called log-derivative and admits the primitive \(\ln |x|\). The right-hand side does not necessarily have a closed form, and we leave it as it is. With the integration constant, we obtain an implicit solution for \(x\), \(\ln |x| = - \int { \frac{b(t)}{a(t)} dt } + K.\) We would like an explicit solution \(x\), \(\begin{aligned} |x| & = e^{- \int { \frac{b(t)}{a(t)} dt } + K}. \end{aligned}\) Therefore, \(\begin{aligned} x & = \pm e^K e^{- \int { \frac{b(t)}{a(t)} dt }}. \end{aligned}\) Notice that \(t\) is a variable of integration, it does not exist outside the integral, and can be replaced by any other variable. To have \(x\) as a function of \(t\), we must define the bounds of the integral. When the domain of definition of \(t\) is the interval \(I = [t_0,t_1], t_0 < t_1,\) the solution \(x(t)\) is obtained by integrating from \(t_0\) to \(t\), \(x = \pm e^K e^{- \int_{t_0}^t { \frac{b(u)}{a(u)} du }},\) where we have replaced the variable of integration by \(u\). Then the definite integral is a function of \(t\). The constant \(K\) is determined by the initial condition on \(x\) at \(t_0\), \(x(t_0) = x_0 \in \mathbb{R},\) \(x(t_0) = \pm e^K e^{- \int_{t_0}^{t_0} { \frac{b(u)}{a(u)} du }} = \pm e^K = x_0.\) The sign is the sign of \(x_0\), and the constant \(K = \ln |x_0|\). The fixed bound at \(t_0\) is arbitrary, we could have chosen any other time of reference. However, it is very common to consider differential equations for which we know the value of the solution at the initial time \(t_0\). The problem of solving a differential equation with a given initial condition is called an initial value problem (IVP) or a Cauchy problem.

First order scalar ODE. Homogeneous case, second method. Assume that \(a(t) \neq 0\), and write the equation as \(x' + \frac{b(t)}{a(t)} x = 0.\) Multiply the equation by the term \(e^{\int { \frac{b(t)}{a(t)} dt }}\), \(x' e^{\int { \frac{b(t)}{a(t)} dt }} + \frac{b(t)}{a(t)} x e^{\int { \frac{b(t)}{a(t)} dt }} = 0.\) Notice that the first term has the form \(x' f\) and the second term the form \(x f'\), with \(f = e^{\int { \frac{b(t)}{a(t)} dt }}.\) The left-hand-side of the resulting equation, \(x' f + x f'\), is the derivative of the product \(xf\), so the differential equation can be integrated, \(\begin{aligned} x e^{\int { \frac{b(t)}{a(t)} dt }} & = K, \\ x & = K e^{ - \int { \frac{b(t)}{a(t)} dt }}. \end{aligned}\) The constant \(K\) is determined by a condition set on the solution, as in the first method.

Exercise  Solve the equation \(2x' + 6x = 0, \quad x(0) = 1.\)

Correction The initial condition \(x(0) = 1\) tells us to start the integration at \(t = 0\). Using the second method with \(a(t) = 2\) and \(b(t) = 6\), we have \(x(t) = K e^{- \int_0^t { \frac 62 du } } = K e^{ - \frac 62 t } = K e^{-3t}.\) The constant \(K\) is determined by the initial condition, \(x(0) = K e^0 = K = 1.\) The complete solution is \(x(t) = e^{- 3 t}.\)

Exercise  Solve the equation \(x' + \frac 1t x = 0, \quad x(1) = 1.\)

Correction The initial condition \(x(1) = 1\) tells us to start the integration at \(t = 1\). Using the second method with \(a(t) = 1\) and \(b(t) = 1/t\), we have \(x(t) = K e^{- \int_1^t { \frac 1u du } } = K e^{ - \ln u |_1^t } = K e^{ - \ln t}.\) The term \(e^{ - \ln t} = e^{\ln t^{-1}} = t^{-1}.\) The solution simplifies to \(x(t) = \frac Kt.\) The constant \(K\) is found with the initial condition \(x(1) = K = 1,\) for a complete solution \(x(t) = \frac 1t.\)
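
These closed-form solutions can be compared with a direct numerical integration; a sketch for the second exercise, \(x' + x/t = 0\), \(x(1) = 1\):

    import numpy as np
    from scipy.integrate import solve_ivp

    sol = solve_ivp(lambda t, x: -x / t, (1, 5), [1.0], t_eval=np.linspace(1, 5, 5))
    print(sol.y[0])   # numerical solution
    print(1 / sol.t)  # analytic solution x(t) = 1/t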

First order scalar ODE. Heterogeneous case. We now consider the more general differential equation \(a(t) x' + b(t) x = d(t).\) Using the strategy from the second method for the homogeneous case above, we divide the equation by \(a(t)\), again assuming that \(a(t) \neq 0\), and then multiply by \(e^{ \int { \frac{b(t)}{a(t)} dt} }.\) The resulting equation now has a non-zero right-hand-side, \(x'e^{ \int { \frac{b(t)}{a(t)} dt} } + \frac{b(t)}{a(t)} x e^{ \int { \frac{b(t)}{a(t)} dt} } = \frac{d(t)}{a(t)} e^{ \int { \frac{b(t)}{a(t)} dt} }.\) Nevertheless, the left-hand-side is still of the form \(x' f + x f'\), and the right-hand-side only depends on \(t\). By integrating both sides, we obtain \(x e^{ \int { \frac{b(t)}{a(t)} dt} } = \int { \frac{d(t)}{a(t)} e^{ \int { \frac{b(t)}{a(t)} dt}} dt } + K.\) The solution for \(x\) is then \(x = \Bigl( \int { \frac{d(t)}{a(t)} e^{ \int { \frac{b(t)}{a(t)} dt}} dt } + K \Bigr) e^{ - \int { \frac{b(t)}{a(t)} dt} }.\)

Example  We consider the differential equation \(\begin{aligned} x' + 3 x = 1+\sin(t), \quad x(0) = x_0 > 0. \end{aligned}\) The coefficients are \(a(t) = 1, b(t) = 3, d(t) = 1 + \sin(t).\) The term \(e^{ \int { \frac{b(t)}{a(t)} dt}} = e^{ \int { 3 dt } } = e^{3t}\). The general solution is \(\begin{aligned} x(t) & = \Bigl( \int { (1+\sin(t)) e^{3t} dt } + K \Bigr)e^{ - 3t } , \\ & = \frac 13 + \frac {1}{10} \bigl( 3\sin(t) - \cos(t) \bigr) + Ke^{-3t}. \end{aligned}\) At \(t = 0\), the equation \(x(0) = \frac 13 + \frac{1}{10} (0 - 1) + K = x_0\) is solved for \(K\) as \(K = x_0 - \frac{7}{30}\). The solution \(x(t)\) is \(x(t) = \frac 13 + \frac {1}{10} \bigl( 3 \sin(t) - \cos(t) \bigr) + \Bigl( x_0 - \frac{7}{30} \Bigr) e^{-3t}.\)

Solution of the initial value problem for the heterogeneous differential equation \(x' + 3x = 1 + \sin(t)\).

We now consider a nonlinear differential equation of Bernoulli type. Bernoulli equations are of the form \(\begin{aligned} x' + P(t) x + Q(t) x^r = 0, \quad t \in I \subset \mathbb{R}, \end{aligned}\) with continuous functions \(P, Q\), and \(r \in \mathbb{R}\). There is no general method to solve nonlinear differential equations, but it can be done in particular cases. If \(r = 0\) or \(r = 1\), the equation is linear, and we already know how to solve it. Suppose \(r\) is different from 0 and 1. We will look for positive solutions \(x(t) > 0\) for \(t \in I\). Dividing by \(x^r\), we get \(\begin{aligned} x' x^{-r} + P(t) x x^{-r} + Q(t) x^r x^{-r} & = 0, \\ x' x^{-r} + P(t) x^{1-r} + Q(t) & = 0. \end{aligned}\) We now set an auxiliary variable \(u = x^{1-r}\), (\(x = u^{1/(1-r)}\)). Then \(\begin{aligned} u' = (1-r)x^{1-r-1}x' = (1-r)x^{-r}x', \end{aligned}\) and, substituting in the differential equation, \(\begin{aligned} \frac{1}{1-r} u' + P(t) u + Q(t) & = 0, \\ u' + (1-r)P(t) u + (1-r) Q(t) & = 0. \end{aligned}\) We know how to solve this equation; this is a linear equation of the form \(\begin{aligned} a(t) u' + b(t) u = d(t), \end{aligned}\) with \(a(t) = 1, b(t) = (1-r)P(t), d(t) = - (1-r)Q(t)\).

Example  Verhulst equation (logistic equation) \(\begin{aligned} x' = \mu x \Bigl( 1 - \frac xK \Bigr). \end{aligned}\) We first rewrite the equation in Bernoulli form, \(\begin{aligned} x' - \mu x + \mu \frac{x^2}{K} = 0, \end{aligned}\) with \(P(t) = -\mu, Q(t) = \mu/K, r = 2\). The auxiliary equation reads \(\begin{aligned} u' + (1-2)(-\mu) u + (1-2) \frac{\mu}{K} & = 0, \\ u' + \mu u - \frac{\mu}{K} & = 0. \end{aligned}\) This is a scalar linear equation with \(a(t) = 1, b(t) = \mu, d(t) = \mu/K\). The general solution is \(\begin{aligned} u(t) & = \Bigl( \int { \frac{d(t)}{a(t)}e^{\int b(t)/a(t) dt}dt } + C \Bigr) e^{ - \int { b(t)/a(t) dt } }, \\ \int b(t)/a(t) dt & = \int \mu dt = \mu t, \\ u(t) & = \Bigl( \int { \frac{\mu}{K} e^{\mu t}dt } + C \Bigr) e^{-\mu t}, \\ & = \Bigl( \frac{\mu}{K} \frac{e^{\mu t}}{\mu} + C \Bigr) e^{- \mu t}, \\ & = \frac{1}{K} + C e^{- \mu t}. \end{aligned}\) The initial condition \(u(0) = x_0^{1-r} = x_0^{-1}\), \(\begin{aligned} u(0) & = \frac{1}{K} + C = \frac{1}{x_0}, \\ C & = \frac{1}{x_0} - \frac{1}{K}. \end{aligned}\) The solution to the original Verhulst equation is \(x = u^{-1}\), \(\begin{aligned} x(t) & = \frac{1}{\frac{1}{K} + \bigl( \frac{1}{x_0} - \frac{1}{K} \bigr) e^{-\mu t}} \\ & = \frac{x_0 K}{x_0 + (K- x_0) e^{- \mu t}}. \end{aligned}\)

Solution of the initial value problem for the Verhulst equation \(x' = \mu x (1 - x/K)\), for \(x(0) = 0.1\) and \(x(0) = 1.5\). The Verhulst equation is a type of Bernoulli equation, and can be solved analytically.
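
A sketch comparing the analytic solution with a direct numerical integration of the Verhulst equation (parameter values chosen for illustration):

    import numpy as np
    from scipy.integrate import solve_ivp

    mu, K, x0 = 1.0, 1.0, 0.1
    sol = solve_ivp(lambda t, x: mu * x * (1 - x / K), (0, 10), [x0],
                    t_eval=np.linspace(0, 10, 6))
    analytic = x0 * K / (x0 + (K - x0) * np.exp(-mu * sol.t))
    print(sol.y[0])   # numerical solution
    print(analytic)   # analytic solution, same values up to the solver tolerance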

Complex numbers

A complex number is a number that can be expressed in the form \(a + i b\), where \(a\) and \(b\) are real numbers, and the symbol \(i\) is called imaginary unit. The imaginary unit satisfies the equation \(i^2 = -1\). Because no real number satisfies this equation, this number is called imaginary.

For the complex number \(z = a + i b\), \(a\) is called the real part and \(b\) is called the imaginary part. The real part of \(z\) is denoted \(\Re(z)\) (\Re in LaTeX) or just \(\mathrm{Re}(z)\). The imaginary part of \(z\) is denoted \(\Im(z)\) (\Im in LaTeX) or just \(\mathrm{Im}(z)\). The set of all complex numbers is denoted \(\mathbb{C}\) (\mathbb{C} in LaTeX).

We need complex numbers for solving polynomial equations. The fundamental theorem of algebra asserts that a polynomial equation of degree \(n \geq 1\) with real or complex coefficients has \(n\) complex solutions, counted with multiplicity. These polynomial equations arise when trying to compute the eigenvalues of matrices, something we need to do to solve linear differential equations for instance.

Arithmetic rules that apply on real numbers also apply on complex numbers, by using the rule \(i^2 = -1\): addition, subtraction, multiplication and division are associative, commutative and distributive.

Let \(u = a + i b\) and \(v = c + i d\) be two complex numbers, with real coefficients \(a,b,c,d\). Then

  • \(u + v = a + i b + c + i d = (a+c) + i (b+d)\).

  • \(uv = (a + i b)(c + i d) = ac + i a d + i b c + i^2 b d = ac - bd + i(ad + bc)\).

  • \(\frac{1}{v} = \frac{1}{c + i d} = \frac{c - id}{(c - id)(c + id)} = \frac{c - id}{c^2 + d^2} = \frac{c}{c^2 + d^2} - i \frac{d}{c^2+d^2}\).

  • \(u = v\) if and only if \(a = c\) and \(b = d\).

    It follows from the rule on \(i\) that

  • \(\frac 1i = -i.\) (Proof: \(\frac 1i = \frac{i}{i^2} = \frac{i}{-1} = -i\).)

Multiplying by the imaginary unit \(i\) is equivalent to a counterclockwise rotation by \(\pi/2\) (Figure 3) \(ui = (a + ib)i = ia + i^2 b = -b + ia.\)

Rotation in the complex plane. Multiplication by \(i\) is a 90 degree counterclockwise rotation.

Let \(z = a + ib\) be a complex number with real \(a\) and \(b\). The conjugate of \(z\), denoted \(\bar z\), is \(a - ib\). The conjugate of the conjugate of \(z\) is \(z\) (reflection, Figure 3). The modulus of \(z\), denoted \(|z|\), is \(\sqrt{z \bar z}\). The product \(z \bar z = (a+ib)(a-ib)=a^2 + b^2 + i(-ab + ab) = a^2 + b^2\). The modulus is the complex version of the absolute value, for if \(z\) is real (i.e. \(b = 0\)), \(|z| = \sqrt{a^2} = |a|\). It is always a real, positive number, and \(|z| = 0\) if and only if \(z = 0\). The modulus is also the length of the complex number \(z\): if \(a\) and \(b\) are the sides of a right triangle, then \(|z|\) is the hypotenuse.

When simplifying a ratio with a complex number \(v\) in the denominator, it is important to convert the denominator to a real number by multiplying the ratio by \(\bar v/ \bar v\). For instance, if \(v \neq 0\), \(\frac{u}{v} = \frac{u \bar v}{v \bar v} = \frac{u \bar v}{|v|^2}.\) The denominator \(|v|^2\) is always a positive real number.

By allowing complex values, nonlinear functions of real numbers like exponentials, logarithms and trigonometric functions can have their domain extended to all real and complex numbers. The most useful extension is the exponential function. Recall that the exponential function \(e^x\), where \(e \approx 2.71828\) is Euler’s constant, satisfies the relation \(e^{x + y} = e^{x} e^{y}\). This remains true for complex numbers. Euler’s formula relates the exponential of an imaginary number with trigonometric functions. For a real number \(y\), \(e^{i y} = \cos(y) + i \sin(y).\) Therefore, for any complex number \(z = a + i b\), the exponential \(e^{z} = e^{a + ib} = e^a e^{ib} = e^a \bigl( \cos(b) + i \sin(b) \bigr).\)
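
Python's built-in complex numbers follow these rules, which gives a quick way to check Euler's formula numerically; a small sketch with cmath:

    import cmath

    z = 1 + 2j                      # a = 1, b = 2
    print(cmath.exp(z))             # e^{1+2i}
    print(cmath.exp(1) * (cmath.cos(2) + 1j * cmath.sin(2)))  # e^1 (cos 2 + i sin 2), same value
    print(abs(cmath.exp(0.7j)))     # |e^{iy}| = 1 for real y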

Complex plane

Tips on complex numbers

  • If \(x\) is real, \(ix\) is pure imaginary. If \(y\) is imaginary, \(iy\) is real.

  • \(|i| = 1\). For any real \(\theta\), \(|e^{i \theta}| = 1\).

  • \(|z_1 z_2| = |z_1| |z_2|\).

  • In particular, \(|iz| = |i||z| = |z|\). (Multiplying by \(i\) is a rotation in the complex plane, it does not change the modulus.)


Roots of a complex number

For complex numbers, the equation \(z^n = 1\) has \(n\) solutions. They are called the roots of unity. For \(n=2\), we have the well-known roots \(z = \pm 1\), which are real. What are the roots of \(z^3 = 1\)? To find them, we express \(z\) in polar coordinates: \(z = r e^{i\theta}\). Then \(z^3 = (r e^{i\theta})^3 = r^3 e^{i 3\theta} = 1.\) The equation implies that \(z\) has modulus 1, so \(r = 1\). The remaining term \(e^{i 3\theta} = 1\) implies that \(3 \theta\) is a multiple of \(2 \pi\), because \(e^{i \omega} = 1\) if and only if \(\omega = 2 k \pi\), for some integer \(k\). Therefore \(\theta = \frac{2}{3} k \pi\), for \(k = 0, 1, 2, ...\). How many distinct points do we have? Clearly, \(k = 3\) is equivalent to \(k = 0\): \(e^{i \frac{2}{3} 3 \pi} = e^{i 2 \pi} = e^{i 0}\). In the same way \(k = 4\) is equivalent to \(k = 1\), and so on. Therefore, there are exactly three distinct solutions for \(\theta\): \(0, \frac{2}{3} \pi, \frac{4}{3} \pi\) (Figure 5).

The roots of \(z^3 - 1\).
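
The three roots can be recovered either from a polynomial root finder or from the formula \(e^{i 2k\pi/3}\); a sketch with numpy:

    import numpy as np

    print(np.roots([1, 0, 0, -1]))                # roots of z^3 - 1 = 0
    print(np.exp(2j * np.pi * np.arange(3) / 3))  # e^{i 2k pi/3}, k = 0, 1, 2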

Exercises on complex numbers

Exercise  Let the complex number \(z = 2 + 3 i\). Compute \(\bar z\), \(|z|\), \(|\bar z|\) (compare with \(|z|\)), \(z^2\), \(\Re(\bar z)\) , \(\Im(\bar z)\), \(\frac{z + \bar z}{2}\) , \(\frac{z - \bar z}{2}\) , \(-z\), \(iz\).

Correction \(\bar z = 2 - 3i\), \(|z| = \sqrt{2^2 + 3^2} = \sqrt{13}\), \(|\bar z| = \sqrt{2^2 + (-3)^2} = \sqrt{13}\), we see that \(|z| = |\bar z|\), \(z^2 = 4 + 12i + 9i^2 = -5 + 12i\), \(\Re(\bar z) = 2\), \(\Im(\bar z) = -3\), \(\frac{z + \bar z}{2} = (2 + 3i + (2 - 3i))/2 = 2\), \(\frac{z - \bar z}{2} = (2 + 3i - (2 - 3i))/2 = 3i\), \(-z = -2 - 3i\), \(iz = 2i + 3i^2 = -3 + 2i\).

Exercise  Any complex number can be represented in polar form: \(z = r ( \cos(\theta) + i \sin(\theta) )\).

  • Show that \(|z| = r\)

  • Show that \(z = r e^{i \theta}\)

  • Conclude that for any complex number \(z\), \(|z| = 1\) if and only if \(z\) can be expressed as \(z = e^{i \theta}\) for a real \(\theta\).

Correction The modulus of \(z\) is \(|z| = \sqrt{r^2 \cos^2 (\theta) + r^2 \sin^2 (\theta)} = \sqrt{r^2} = r.\) From Euler’s formula, we have \(\cos(\theta) + i\sin(\theta) = e^{i\theta}\), so \(z = re^{i\theta}.\) Therefore, for any complex number \(z = r e^{i\theta}\), \(|z| = 1\) if and only if \(r = 1\).

Exercise  Using Euler’s formula, show that \(\cos(a)\cos(b) - \sin(a)\sin(b) = \cos(a+b)\). (Use the property that \(e^{ia + ib} = e^{ia} e^{ib}\) and apply Euler’s Formula).

Correction All trigonometric identities can be obtained by applying Euler’s formula. Here we start from \(e^{ia + ib} = \cos(a+b) + i \sin(a+b)\). We only want the real part, \(\begin{aligned} \cos(a + b) & = && \frac{e^{ia + ib} + e^{-ia - ib}}{2} \\ & = && \frac{e^{ia} e^{ib} + e^{-ia} e^{-ib}}{2} \\ & = && \frac{(\cos(a) + i \sin(a))(\cos(b)+ i \sin(b)) + (\cos(a) - i \sin(a))(\cos(b) - i \sin(b))}{2} \\ & = && \frac{\cos(a)\cos(b) + i^2 \sin(a)\sin(b) + i \cos(a)\sin(b) + i \cos(b)\sin(a)}{2} \\ & && + \frac{\cos(a)\cos(b) + i^2 \sin(a)\sin(b) - i \cos(a)\sin(b) - i \cos(b)\sin(a)}{2} \end{aligned}\) The mixed cosine-sine terms cancel each other while the other ones add up, resulting in \(\begin{aligned} \cos(a + b) = \cos(a)\cos(b) - \sin(a)\sin(b). \end{aligned}\)

Exercise  Show Euler’s identity: \(e^{i \pi} = -1\).

Correction This is a direct application of Euler’s formula: \(e^{i \pi} = \cos(\pi) + i \sin(\pi) = -1.\)

Exercise  What are the roots of the equation \(z^6 = 1\)?

Correction The roots must satisfy \(e^{i 6 \theta} = 1\). This means that \(\theta = \frac{2}{6} k \pi\), for \(k = 0, 1, ..., 5\). There are six distinct roots.

Exercise  For a complex \(z\), find necessary and sufficient conditions for \(e^{z t}\), \(t > 0\), to converge to 0 as \(t \to \infty\).

Correction The exponential converges to zero if and only if \(\Re(z) < 0\). A complex number is close to zero if and only if its modulus is close to zero. Therefore, to show that a quantity converges to zero, it is necessary and sufficient to show that its modulus converges to zero. If \(z = a + ib\), the exponential \(e^{z t} = e^{(a + ib)t} = e^{at} e^{ibt}.\) The modulus \(|e^{ibt}| = 1\), so \(|e^{zt}| = e^{at}\) (no need for absolute values, the exponential of a real number is always positive). The condition for convergence to zero is therefore a condition on the real part of \(z\): \(e^{at} \to 0\) when \(t \to \infty\) if and only if \(a < 0\).

Exercise  Let the complex number \(z = a + ib\) with real \(a\) and \(b\). Compute \(\sqrt{z}\) (that is, express \(s = \sqrt{z}\) as \(s = \alpha + i \beta\), with real \(\alpha\) and \(\beta\))

Correction The square root of a complex number always exists. Express \(z\) in polar form \(z = re^{i\theta}\), \(r \geq 0, \theta \in [0, 2\pi]\). The square root \(\sqrt{z} = \sqrt{r}\sqrt{e^{i\theta}} = \sqrt{r} e^{\frac 12 i \theta}\). Using Euler’s formula, \(\sqrt{z} = \sqrt{r} \cos(\theta/2) + i \sqrt{r} \sin(\theta/2)\). That is, the square root is obtained by taking the square root of the modulus \(r\), and dividing the angle (the argument) by 2. There is a problem with this solution, because \(z\) can also be represented by \(re^{i(\theta + 2\pi)}\), giving \(\sqrt{z} = \sqrt{r} e^{i(\frac{\theta}{2} + \pi)}\), which is equivalent to dividing the angle by two in the other direction. We define the principal square root as the solution that makes the smallest change in angle: \(\sqrt{z} = \sqrt{r} e^{\frac 12 i\theta}\) if \(\theta \in [0,\pi]\), and \(\sqrt{z} = \sqrt{r} e^{i(\frac{\theta}{2} + \pi)}\) if \(\theta \in (\pi,2\pi]\). To express the solution in terms of the original form of \(z = a + i b\), we express the square root as \(s = \alpha + i \beta\). Then \(s^2 = \alpha^2 - \beta^2 + 2i \alpha \beta = z = a + ib.\) By identifying the real and imaginary parts, we get two equations: \(\alpha^2 - \beta^2 = a\) and \(2 \alpha \beta = b.\) Denoting the modulus of \(z\) by \(r = \sqrt{a^2 + b^2}\), we can obtain the solutions \(\alpha = \frac{1}{\sqrt{2}}\sqrt{a + r}, \quad \beta = \mathrm{sign}(b) \frac{1}{\sqrt{2}} \sqrt{-a + r}.\)
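
The formulas for \(\alpha\) and \(\beta\) can be checked against the library square root; a small sketch (the value of \(z\) is arbitrary):

    import cmath, math

    z = 3 - 4j
    a, b = z.real, z.imag
    r = abs(z)                                       # modulus of z
    alpha = math.sqrt((a + r) / 2)                   # (1/sqrt(2)) sqrt(a + r)
    beta = math.copysign(math.sqrt((r - a) / 2), b)  # sign(b) (1/sqrt(2)) sqrt(-a + r)
    print(alpha + 1j * beta)   # (2-1j)
    print(cmath.sqrt(z))       # (2-1j), the principal square root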

Matrices in dimension 2

Eigenvalues of a \(2 \times 2\) matrix

A \(2 \times 2\) matrix \(A\) is an array with 2 rows and 2 columns: \(A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.\) Usually, the coefficients \(a, b, c, d\) are real numbers. The identity matrix is the matrix \(I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.\)

The determinant of \(A\), denoted \(\det A\) or \(|A|\) is the number \(ad - bc\). The trace of \(A\), denoted \(\mathrm{tr}\,A\), is the sum of the main diagonal of \(A\): \(a + d\).

The characteristic polynomial of \(A\) is the second order polynomial in \(\lambda\) obtained by computing the determinant of the matrix \(A - \lambda I\) (Figure 6), \(\det ( A - \lambda I ) = (a-\lambda)(d-\lambda) - bc = ad - bc - \lambda ( a + d ) + \lambda^2.\) The characteristic polynomial \(p_A(\lambda)\) of \(A\) can be expressed in terms of its determinant and trace: \(p_A(\lambda) = \det A - \mathrm{tr}\,A \lambda + \lambda^2.\) The eigenvalues of \(A\) are the roots of the characteristic polynomial. By the fundamental theorem of algebra, we know that the characteristic polynomial has exactly two roots, counting multiple roots. These roots can be real, or complex. The eigenvalues of \(A\) are calculated using the quadratic formula: \(\lambda_{1,2} = \frac{1}{2} \Bigl( \mathrm{tr}\,A \pm \sqrt{ (\mathrm{tr}\,A)^2 - 4 \det A } \Bigr).\)
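
The quadratic formula can be checked against a numerical eigenvalue routine; a sketch with numpy (the matrix is an arbitrary example):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    tr, det = np.trace(A), np.linalg.det(A)
    disc = tr**2 - 4 * det
    print((tr + np.sqrt(disc)) / 2, (tr - np.sqrt(disc)) / 2)  # quadratic formula
    print(np.linalg.eigvals(A))                                # same eigenvalues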

Graph of the characteristic polynomial. (a, green) polynomial with two negative roots, \(p(\lambda) = 0.08 + 0.8\lambda + \lambda^2\). (b, gray) polynomial with two complex roots, \(p(\lambda) = 0.1 - 0.3\lambda + \lambda^2\). (c, red) polynomial with two positive roots, \(p(\lambda) = 0.05 - 0.5\lambda + \lambda^2\). (d, purple) polynomial with a negative and a positive root, \(p(\lambda) = -0.1 - 0.3\lambda + \lambda^2\).

From this formula, we can classify the eigenvalues of \(A\). Let \(\Delta = (\mathrm{tr}\,A)^2 - 4 \det A\) be the discriminant of the quadratic formula. The two eigenvalues of \(A\) are real if and only if \(\Delta \geq 0\), i.e. \((\mathrm{tr}\,A)^2 \geq 4 \det A\). Then we have the following properties (Figure 7):

  1. \(\Delta < 0\), complex eigenvalues

    • The two eigenvalues are complex conjugate: \(\lambda_1 = \bar \lambda_2\)

    • Their real part \(\Re(\lambda) = \frac{1}{2} \mathrm{tr}\,A\).

  2. \(\Delta = 0\), there is a single root of multiplicity 2: \(\lambda = \frac{1}{2} \mathrm{tr}\,A\).

  3. \(\Delta > 0, \det A > 0\), real, distinct eigenvalues of the same sign.

    • \(\mathrm{tr}\,A > 0\) and \(\det A > 0\). Then \(\lambda_{1,2}\) are distinct and positive.

    • \(\mathrm{tr}\,A < 0\) and \(\det A > 0\). Then \(\lambda_{1,2}\) are distinct and negative.

  4. \(\det A < 0\), real distinct eigenvalues of opposite sign.

    • \(\lambda_1 < 0 < \lambda_2\).

  5. \(\det A = 0\), one of the eigenvalues is zero, the other eigenvalue is \(\mathrm{tr}\,A\).

Properties of the eigenvalues of a \(2 \times 2\) matrix \(A\) with respect to \(\det A\) and \(\mathrm{tr}\,A\).

Exercises on eigenvalues

Exercise  Properties of the eigenvalues of \(2 \times 2\) matrices. For each \(2 \times 2\) matrix, compute the determinant, the trace, and the discriminant, and determine whether the eigenvalues are real, complex, distinct, and the sign (negative, positive, or zero) of the real parts.

\(A_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}, \quad A_4 = \begin{pmatrix} -1 & 2 \\ 1/2& 2 \end{pmatrix}.\)

Matrix-vector operations

A matrix defines a linear transformation between vector spaces. Given a vector \(x\), the product \(Ax\) is a vector composed of linear combinations of the coefficients of \(x\). For a \(2 \times 2\) matrix, the vector \(x\) must be a vector of size 2, and the product \(Ax\) is a vector of size two. If \(x = (x_1, x_2)^t\) (the \({}^t\) stands for the transpose, because \(x\) must be a column vector), and \(A = [ a_{ij} ]_{i=1,2, \, j=1,2}\), then \(Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} = \begin{pmatrix} a_{11} x_{1} + a_{12} x_2 \\ a_{21} x_{1} + a_{22} x_2 \end{pmatrix}.\) Successive linear transformations can be accomplished by applying several matrices. Given two matrices \(A, B\), the matrix product \(C = AB\) is also a matrix. The matrix \(C\) is the linear transformation that first applies \(B\), then \(A\). Matrix product is not commutative in general: \(AB \neq BA\). (If \(B\) means ’put on socks’ and \(A\) means ’put on shoes’, then \(BA\) does not have the expected result.) The product of two matrices \(A = [ a_{ij} ]_{i=1,2, \, j=1,2}\) and \(B = [ b_{ij} ]_{i=1,2, \, j=1,2}\) is \(AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21} b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.\) The sum of two matrices \(A+B\) is performed element-wise: \(A+B = [a_{ij} + b_{ij}]_{i=1,2, \, j=1,2}\). The sum of two vectors is defined similarly. Addition is commutative. Matrix operations are associative and distributive. \(\begin{aligned} A + B & = B + A, \\ A(B + C) & = AB + AC, \\ A(BC) & = (AB)C. \end{aligned}\) Matrices and vectors can be multiplied by a scalar value (real or complex). Multiplication by a scalar is associative, distributive, and commutative. The result of the multiplication by a scalar is to multiply each coefficient of the matrix or vector by the scalar. For example, if \(\lambda, \mu\) are scalars, \(\begin{aligned} \lambda A & = A (\lambda I) = A \lambda, \\ \lambda ( A + B ) & = \lambda A + \lambda B, \\ (\lambda A) B & = \lambda (AB) = A (\lambda B), \\ (\mu + \lambda) A & = \mu A + \lambda A, \\ \mu (\lambda A) & = (\mu \lambda) A, ... \end{aligned}\) The product between two column vectors is not defined, because the sizes do not match. However, we can define the scalar product between two column vectors \(x, y\) in the same way matrix product is defined: \(x^ty \equiv x_1 y_1 + x_2 y_2.\)

If the vectors are complex-valued, we also need to conjugate the transposed vector \(x^t\). The conjugate-transpose is called the adjoint and is denoted \({}^*\). Thus, if \(x\) is complex-valued, the adjoint \(x^*\) is the row vector \((\bar x_1, \bar x_2)\). The scalar product for complex-valued vectors is denoted \(x^*y\). Since this notation also works for real-valued vectors, we will use it most of the time.

Two vectors are orthogonal if their scalar product is 0. In the plane, this means that they are oriented at 90 degrees from each other. Orthogonal vectors are super important because they can be used to build orthogonal bases that are necessary for solving all sorts of linear problems.
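
These operations map directly onto numpy arrays; a short sketch:

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[0, -1],
                  [1,  0]])
    x = np.array([1, 2])

    print(A @ x)            # matrix-vector product: [ 5 11]
    print(A @ B)            # matrix-matrix product (apply B first, then A)
    print(B @ A)            # in general different from A @ B
    print(A + B, 3 * A)     # element-wise sum, multiplication by a scalar
    print(x @ x)            # scalar product x^t x = 5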

Exercises on Matrix-vector and matrix-matrix operations

Exercise  Compute the matrix-vector product \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.\) What is the transformation given by this matrix?

Correction \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}.\) The transformation is a 90 degree counterclockwise rotation.

Exercise  Compute the matrix-matrix product \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.\) Can you tell what transformation this is?

Correction \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.\) This matrix exchanges the coordinates of a vector, this is a reflection through the axis \(x = y\).

Exercise  Now compute the product of the same matrices, but in the inverse order \(\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.\) Compare with the solution found in the previous exercise. What is this transformation?

Correction \(\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.\) The product is not the same, the matrices do not commute. The transformation is now a reflection through the line \(y = -x\).

Exercise  Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((a x_1, b x_2)^t\).

Correction The matrix is \(\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}.\)

Exercise  Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((x_2, x_1)^t\).

Correction The matrix is \(\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.\)

Exercise  Find the matrix that takes a vector \(x = (x_1,x_2)^t\) and returns \((x_2, 0)^t\).

Correction The matrix is \(\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.\)

Exercise  Compute the successive powers \(A, A^2, A^3, ...\), for a diagonal matrix \(A\): \(A =\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}\)

Correction The power of a diagonal matrix is a diagonal matrix \(A^k = \begin{pmatrix} a^k & 0 \\ 0 & b^k \end{pmatrix}.\)

Exercise  Compute the scalar product \(x^*y\) between \(x = (1 + 2i, 1 - i)^t\) and \(y = (0.5 - i, -0.5)^t\).

Correction The scalar product is \(\begin{aligned} x^*y & = (1 - 2i, 1 + i) (0.5 - i, -0.5)^t \\ & = (1 - 2i)(0.5 - i) + (1 + i)(-0.5) \\ & = 0.5 + 2i^2 - 2(0.5)i - i -0.5 - 0.5i \\ & = (0.5 - 0.5 + 2i^2) + (-2(0.5) - 1 - 0.5)i \\ & = -2 - 2.5i. \end{aligned}\)

Exercise  Now compute the scalar product \(y^*x\) and compare with the result with the previous exercise.

Correction The scalar product is \(\begin{aligned} y^*x & = (0.5 + i, -0.5) (1 + 2i, 1 - i)^t \\ & = (0.5 + i)(1 + 2i) + (-0.5)(1 - i) \\ & = 0.5 + 2i^2 + i + 2(0.5)i - 0.5 + 0.5i \\ & = -2 + 2.5i \end{aligned}\) This is the conjugate: \(x^*y = (y^*x)^*.\)

Exercise  Compute the scalar product between \(z = (z_1, z_2)^t\) and itself, if \(z\) is a complex-valued vector. What can you say about the result?

Correction The scalar product \(z^*z = (\bar z_1, \bar z_2) (z_1, z_2)^t = \bar z_1 z_1 + \bar z_2 z_2 = |z_1|^2 + |z_2|^2.\) The scalar product is the square of the norm of the vector \(z\).
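
numpy's vdot conjugates its first argument, which is exactly the scalar product \(x^* y\); a sketch reproducing the values above:

    import numpy as np

    x = np.array([1 + 2j, 1 - 1j])
    y = np.array([0.5 - 1j, -0.5])
    print(np.vdot(x, y))   # (-2-2.5j), the scalar product x*y
    print(np.vdot(y, x))   # (-2+2.5j), its conjugate
    print(np.vdot(x, x))   # (7+0j), the squared norm |1+2i|^2 + |1-i|^2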


Tips on eigenvalues Some matrices have special shapes that make it easier to compute the determinant, and the eigenvalues. These are called eigenvalue-revealing shapes.

  • Diagonal matrices have their eigenvalues on the diagonal.

  • Triangular matrices, i.e. matrices that have zeros above (lower-triangular matrix) or below (upper-triangular matrix) the main diagonal, also have their eigenvalues on the diagonal.

  • A matrix with a row or a column of zeros has its determinant equal to zero. This implies that one of its eigenvalues is 0.


Eigenvalue decomposition

In many applications, it is useful to decompose a matrix into a form that makes it easier to perform complicated operations on it. For instance, we might want to compute the powers of a matrix \(A\): \(A^2\), \(A^3\), \(A^4\). Multiplying matrices is computationally intensive, especially when the size of the matrix becomes large. The power of a matrix is \(A^k = AA...A\), \(k\) times. The zeroth power is the identity matrix: \(A^0 = I\).

The inverse of a matrix \(A\), denoted by \(A^{-1}\), is the unique matrix such that \(AA^{-1} = A^{-1}A = I\). The notation is self-consistent with the positive powers of \(A\). The inverse does not always exist. A matrix is invertible if and only if its determinant is not 0. If \(A\) and \(B\) are invertible, then \(AB\) is invertible, and \((AB)^{-1} = B^{-1}A^{-1}\).

The eigenvalue decomposition is a decomposition of the form \(A = X D X^{-1}\), where \(D\) is a diagonal matrix, and \(X\) is an invertible matrix. If there exists such a decomposition for \(A\), then computing powers of \(A\) becomes easy: \(\begin{aligned} A^k & = (XDX^{-1})^k = XDX^{-1} \, XDX^{-1} \, ... XDX^{-1}, \\ & = XD(X^{-1}X)D(X^{-1}X) D ... (X^{-1}X) DX^{-1}, \\ & = XD^kX^{-1}. \end{aligned}\)

The eigenvalue decomposition does not always exist, because it is not always possible to find an invertible matrix \(X\). When it exists, though, the columns of the matrix \(X\) are the eigenvectors of \(A\). When \(A\) is a \(2 \times 2\) matrix, it is enough to find 2 linearly independent eigenvectors \(x\) and \(y\) for the matrix \(X = \Biggl( \begin{array}{c|c} x_1 & y_1 \\ x_2 & y_2 \end{array} \Biggr)\) to be invertible.
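
A sketch of the decomposition and of the fast power computation with numpy, using the upper-triangular matrix of the example below (np.linalg.eig returns the eigenvalues and the matrix \(X\) whose columns are eigenvectors):

    import numpy as np

    A = np.array([[-1.0, -2.0],
                  [ 0.0,  1.0]])
    lam, X = np.linalg.eig(A)       # eigenvalues and eigenvectors, so that A = X D X^{-1}
    D = np.diag(lam)
    print(np.allclose(A, X @ D @ np.linalg.inv(X)))              # True
    print(np.allclose(np.linalg.matrix_power(A, 5),
                      X @ np.diag(lam**5) @ np.linalg.inv(X)))   # A^5 = X D^5 X^{-1}: True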

Eigenvectors

The eigenvectors of a matrix \(A\) are the nonzero vectors \(x\) such that for an eigenvalue \(\lambda\) of \(A\), \(Ax = \lambda x.\) If \(x\) is an eigenvector, so is any \(\alpha x\) for any scalar value \(\alpha\). If there are two linearly independent eigenvectors \(x\) and \(y\) associated to an eigenvalue, \(\alpha x + \beta y\) is also an eigenvector. There is at least one eigenvector for each distinct eigenvalue, but there may be more than one when the eigenvalue is repeated.

Example  Distinct, real eigenvalues The matrix \(A =\begin{pmatrix} -1 & -2 \\ 0 & 1 \end{pmatrix}\)

is upper-triangular; this is one of the eigenvalue-revealing shapes. The eigenvalues are \(-1\) and \(1\). These are distinct eigenvalues, so each eigenvalue possesses a single eigenvector (up to a scalar multiple). The eigenvector \(x\) associated to \(\lambda_1 = -1\) is found by solving the eigensystem \(Ax = (-1)x.\)

The unknown quantity \(x\) appears on both sides of the equation. We can find a simpler form by noting that multiplying a vector by the identity matrix is neutral: \((-1)x = (-1) I x.\) The eigenproblem becomes \(\begin{aligned} A x & = (-1) I x, \\ A x - (-1) I x = 0, \\ \bigl( A - (-1) I \bigr) x = 0, \end{aligned}\)

that is, the eigenvector is a nonzero solution of the linear system \(\bigl( A - \lambda I \bigr) x = 0\). In general, if a matrix \(B\) is invertible, the only solution to \(Bx=0\) is \(x = 0\) (the vector of zeroes). But, by construction, \(A - \lambda I\) cannot be invertible if \(\lambda\) is an eigenvalue: its determinant is exactly the characteristic polynomial evaluated at one of its roots, so it is zero. This is why the eigensystem has nonzero solutions. Now, because \(A - \lambda I\) is not invertible, this means that at least one of its rows is a linear combination of the others. For \(2 \times 2\) matrices, this implies that the two rows are colinear, or redundant. For our example, the eigensystem reads \(\begin{aligned} \begin{pmatrix} -1 - (-1) & -2 \\ 0 & 1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & -2 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

We immediately see that the two rows \((0,-2)\) and \((0,2)\) are colinear, with a factor \(-1\). This leads to an underdetermined system: \(0 x_1 - 2 x_2 = 0\). The solution is \(x_2 = 0\) and we can take \(x_1\) to be any value, save 0. We choose \(x = (1, 0)^t\).

For the eigenvalue \(\lambda_2 = +1\), the eigensystem reads: \(\begin{aligned} \begin{pmatrix} -1 - (+1) & -2 \\ 0 & 1 - (+1) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} -2 & -2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

Again, the second row \((0,0)\) can be neglected, and the first row gives \(-2 y_1 - 2 y_2 = 0\), or \(y_1 = - y_2\). It is customary to choose an eigenvector with norm 1. The norm of a complex-valued vector \(y = (y_1, y_2)^t\) is \(||y|| = \sqrt{y^*y} = \sqrt{\bar y_1 y_1 + \bar y_2 y_2} = \sqrt{|y_1|^2 + |y_2|^2}.\)

Here, the eigenvector is \(y = (y_1, - y_1)^t\), so \(||y|| = \sqrt{|y_1|^2 + |- y_1|^2} = \sqrt{2}\sqrt{|y_1|^2} = \sqrt{2}|y_1|.\) Requiring \(||y|| = 1\) gives \(|y_1| = 1/\sqrt{2}.\) This means that we could take a negative or a complex value for \(y_1\), as long as \(|y_1| = 1/\sqrt{2}.\) Going for simplicity, we take \(y_1 = 1/\sqrt{2}\), so that \(y = (1/\sqrt{2}, -1/\sqrt{2})^t\).
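The hand computation above can be checked numerically: the eigenvectors associated with an eigenvalue \(\lambda\) span the null space of \(A - \lambda I\). A minimal sketch, assuming numpy and scipy are available:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[-1.0, -2.0],
              [ 0.0,  1.0]])
I = np.eye(2)

# For each eigenvalue, the eigenvectors span the null space of A - lambda*I.
for lam in (-1.0, 1.0):
    ns = null_space(A - lam * I)   # columns form an orthonormal basis
    print(lam, ns.ravel())
# lambda = -1 gives a multiple of (1, 0)^t;
# lambda = +1 gives a multiple of (1, -1)^t / sqrt(2).
```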

Example  Complex eigenvalues

The matrix \(A =\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\) is neither diagonal nor triangular, so we have to compute the eigenvalues by hand. The trace of \(A\) is zero, the determinant is \(0 - (1)(-1) = 1\), and the discriminant is \(-4\). A negative discriminant implies complex eigenvalues, \(\lambda_{1,2} = \frac 12 \bigl( 0 \pm \sqrt{-4} \bigr) = \pm i.\) For the eigenvalue \(\lambda_1 = +i\), the eigensystem reads: \(\begin{aligned} \begin{pmatrix} - (+i) & -1 \\ 1 & - (+i) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

The two rows \((-i,-1)\) and \((1,-i)\) should be colinear, but this is not obvious with the complex coefficients. Multiplying the first row by \(i\) gives \(i(-i, -1) = (- i^2, - i) = (-(-1), -i) = (1, -i)\), which is indeed the second row. Having confirmed that the system is underdetermined, we can seek a solution to \(-i x_1 - x_2 = 0\). Solving for \(x_2 = -i x_1\), we obtain the eigenvector \(x = (x_1, -i x_1)^t\). Normalization of \(x\) imposes \(||x|| = \sqrt{|x_1|^2 + |-ix_1|^2} = \sqrt{|x_1|^2 + |x_1|^2} = \sqrt{2}|x_1| = 1.\) As in the previous example, we can choose \(x_1 = 1/\sqrt{2}.\)

The second eigenvector, associated with \(\lambda_2 = -i\), solves the eigensystem \(\begin{aligned} \begin{pmatrix} - (-i) & -1 \\ 1 & - (-i) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

The first row yields \(iy_1 - y_2 = 0\), so \(y = (y_1, iy_1)^t\). A normalized eigenvector can be \(y = (1/\sqrt{2}, i/\sqrt{2})^t\). We could also have chosen \(y = (i/\sqrt{2}, -1/\sqrt{2})^t\).
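Numerically, the complex case is handled the same way as the real one; a small sketch with numpy, for checking purposes only (the eigenvectors returned may differ from the hand computation by a complex scalar factor):

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# numpy handles the complex case transparently.
eigvals, X = np.linalg.eig(A)
print(eigvals)   # [0.+1.j 0.-1.j] (up to ordering)

# Each column of X is a normalized (complex) eigenvector.
for lam, v in zip(eigvals, X.T):
    print(np.allclose(A @ v, lam * v), np.isclose(np.linalg.norm(v), 1.0))
```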

Example  Repeated eigenvalues 1

The matrix \(\begin{pmatrix} -1 & 0 \\ 2 & -1 \end{pmatrix}\)

is lower-triangular, with repeated eigenvalues on the diagonal, \(\lambda_{1,2} = -1\). The eigenvectors associated with \(-1\) satisfy the eigenproblem \(\begin{aligned} \begin{pmatrix} -1 - (-1) & 0 \\ 2 & -1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\)

The first row vanishes, and the second row means that \(x_1 = 0\), leaving \(x_2\) free; taking for instance \(x_2 = 1\) gives \(x = (0,1)^t\). There are no other linearly independent eigenvectors. This is not always the case: repeated eigenvalues can have more than one independent eigenvector, as in the next example.

Example  Repeated eigenvalues 2

The matrix \(\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}\) is diagonal, with repeated eigenvalues on the diagonal, \(\lambda_{1,2} = -1\). The eigenvectors associated with \(-1\) satisfy the eigenproblem \(\begin{aligned} \begin{pmatrix} -1 - (-1) & 0 \\ 0 & -1 - (-1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}, \\ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ \end{pmatrix}. \end{aligned}\) Now, both rows vanish, leaving no condition at all on \(x_1\) and \(x_2\). This means that all nonzero vectors are eigenvectors! How many linearly independent eigenvectors can we find? Vectors of size 2 live in a vector space of dimension 2; we can find at most 2 linearly independent vectors. We can choose for instance the canonical basis: \(x = (1,0)^t\) and \(y = (0,1)^t\).
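The contrast between the last two examples can be checked numerically by computing the dimension of the null space of \(A - \lambda I\), i.e. the geometric multiplicity of the repeated eigenvalue. A minimal sketch, assuming numpy and scipy:

```python
import numpy as np
from scipy.linalg import null_space

# Both matrices have the repeated eigenvalue lambda = -1.
A_defective = np.array([[-1.0, 0.0],
                        [ 2.0, -1.0]])   # one independent eigenvector
A_diagonal = -np.eye(2)                  # every nonzero vector is an eigenvector

for A in (A_defective, A_diagonal):
    ns = null_space(A - (-1.0) * np.eye(2))
    # Number of columns = geometric multiplicity of the eigenvalue -1.
    print(ns.shape[1])
# Prints 1 for the first matrix and 2 for the second.
```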


Tips on eigenvalue decomposition

  • A \(2 \times 2\) matrix (or any square matrix) admits an eigenvalue decomposition if all the eigenvalues are distinct. For \(2 \times 2\) matrices, eigenvalues are distinct if and only if the discriminant \(\Delta \neq 0\).

  • If the matrix has a repeated eigenvalue, it will admit an eigenvalue decomposition if the number of (linearly independent) eigenvectors is equal to the number of times the eigenvalue is repeated. The number of eigenvectors is called geometric multiplicity, and the number of repeats is called algebraic multiplicity.

  • The eigenproblem should be underdetermined; you should always be able to eliminate at least one row by linear combination. If you cannot, this means that there is an error, possibly an incorrect eigenvalue, or an arithmetic mistake in computing \(A - \lambda I\).

  • Because eigenvalues are in general complex, the eigenvectors will also be complex.

  • The eigenvector matrix \(X\) needs to be inverted. When the eigenvectors can be chosen so that they are orthogonal and normalized, the inverse is \(X^{-1} = X^*\) (i.e. the conjugate transpose of \(X\)). Real symmetric matrices have orthogonal eigenvectors, so this class of matrices is especially easy to diagonalise.

  • Eigenvalue decomposition and invertibility are two different concepts. A matrix can be invertible without admitting an eigenvalue decomposition, and vice versa.

  • When a matrix does not admit an eigenvalue decomposition, it can still be triangularised. One such triangularisation is the Jordan decomposition: \(A = P(D+S)P^{-1}\), where \(P\) is invertible, \(D\) is the diagonal matrix of eigenvalues, and \(S\) is a nilpotent matrix, i.e. a nonzero matrix such that \(S^k = 0\) for all \(k \geq k_0\), for some integer \(k_0 \geq 2\).


Exercises on eigenvalue decomposition

Exercise  Find, if there is any, an eigenvalue decomposition of \(A = \begin{pmatrix} -1 & 2 \\ 2 & -1 \end{pmatrix}\). To compute \(X^{-1}\), you can use the fact that because \(A\) is real and symmetric, the eigenvectors are orthogonal, meaning that \(X^{-1} = X^t\) if the eigenvectors are normalized.

Correction We have \(\det A = (-1)(-1) - (2)(2) = 1 - 4 = -3 < 0,\) \(\mathrm{tr}\,A = -1 -1 = -2,\) and \(\Delta = (-2)^2 - 4 (-3) = 4 + 12 = 16.\) The eigenvalues are \(\lambda_{1,2} = \frac 12 \bigl( -2 \pm \sqrt{16} \bigr) = -1 \pm 2 = 1, -3.\) The two eigenvalues are distinct, so the matrix \(A\) is diagonalisable. The eigenvector associated with the eigenvalue \(\lambda_1 = 1\) is solution to the eigenproblem \(\bigl( A - \lambda_1 I ) x = 0.\) We look for a solution \(\begin{aligned} \begin{pmatrix} -1 - (1) & 2 \\ 2 & -1 - (1) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} & = 0, \\ \begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} & = 0. \end{aligned}\) (We check that the two rows are colinear.) The first row gives \(-2x_1 + 2x_2 = 0\), or \(x_1 = x_2\). The norm of the eigenvector \(x = (x_1, x_2)^t\) is \(||x|| = \sqrt{x_1^2 + x_2^2} = \sqrt{2}|x_1|.\) We choose \(x_1 = 1/\sqrt{2}\) to have a normalized eigenvector. We know that the second eigenvector is orthogonal to \(x\), so we can take \(y = (1/\sqrt{2},-1/\sqrt{2})^t\) for the eigenvector associated to \(\lambda_2 = -3\). To check that this is indeed an eigenvector, we solve the eigenproblem \(\begin{aligned} \begin{pmatrix} -1 - (-3) & 2 \\ 2 & -1 - (-3) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} & = 0, \\ \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} & = 0. \end{aligned}\) The solutions are vectors that satisfy \(y_1 = - y_2\); this is the case for \(y\). The matrix \(X\) is composed of the column vectors \(x\) and \(y\): \(X = \bigl( x | y \bigr)\) and its inverse is \(X^{-1} = X^t = \begin{pmatrix} x^t \\ \hline y^t \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.\)
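For this real symmetric matrix, the orthogonality of the eigenvectors (and hence \(X^{-1} = X^t\)) can be checked numerically; a short sketch assuming numpy, using numpy.linalg.eigh, which is intended for symmetric matrices and returns orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[-1.0, 2.0],
              [ 2.0, -1.0]])

# eigh returns orthonormal eigenvectors, so X is orthogonal: X^{-1} = X^t.
eigvals, X = np.linalg.eigh(A)
print(eigvals)                                      # [-3.  1.]
print(np.allclose(X.T @ X, np.eye(2)))              # True: X^t X = I
print(np.allclose(X @ np.diag(eigvals) @ X.T, A))   # True: A = X D X^t
```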

Linearisation of functions \(\mathbb{R}^2 \to \mathbb{R}^2\)

Nonlinear systems of ordinary differential equations are used to describe the dynamics (evolution in time) of concentrations of biochemical species, of population densities in ecological systems, or of the electrophysiology of neurons.

Two-dimensional systems are described by two ordinary differential equations (ODEs) \(\begin{aligned} \frac{dx_1}{dt} & = f_1(x_1,x_2), \\ \frac{dx_2}{dt} & = f_2(x_1,x_2). \\ \end{aligned}\) The variables \(x_1, x_2\) are functions of time: \(x_1(t)\), \(x_2(t)\), and \(f_1, f_2\) are the functions defining their derivatives. We define the two-dimensional vectors \(\boldsymbol{x} = (x_1,x_2)^t\) (here we will use bold for vectors), and \(\boldsymbol{f} = (f_1, f_2)^t\). The ODEs can now be represented in vector format, \(\frac{d\boldsymbol{x}}{dt} = \boldsymbol{f}(\boldsymbol{x}).\) Here we assume that there exists a point \(\bar{\boldsymbol{x}}\) in the 2D plane such that the derivative \(\boldsymbol{f}(\bar{\boldsymbol{x}}) = 0.\) This point is called a steady state because the derivatives are all zero; the steady state is therefore a (constant) solution to the system of ODEs.

We are interested in how \(\boldsymbol{f}\) behaves near the steady state. To do that, we linearise the function \(\boldsymbol{f}\) at the steady state. Linearisation is a first-order expansion. For a function from \(\mathbb{R}^2 \to \mathbb{R}^2\), a first-order expansion around a point \(\boldsymbol{x}_0\) is \(\begin{aligned} \boldsymbol{f}(\boldsymbol{x}) \approx \boldsymbol{f}(\boldsymbol{x}_0) + \boldsymbol{D}\boldsymbol{f}(\boldsymbol{x}_0) (\boldsymbol{x}-\boldsymbol{x}_0) \end{aligned}\) When expanding around a steady state, the constant term vanishes: \(\boldsymbol{f}(\bar{\boldsymbol{x}}) = 0\). In the second term, \(\boldsymbol{D}\boldsymbol{f}\) is a \(2 \times 2\) matrix, called the Jacobian matrix, and often denoted \(\boldsymbol{J}\). The Jacobian matrix for the function \(\boldsymbol{f}\) is defined as \(\begin{aligned} \boldsymbol{J} = \boldsymbol{D}\boldsymbol{f} = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} \end{pmatrix}. \end{aligned}\) When evaluated at a steady state, the Jacobian matrix can provide information on the dynamics of the nonlinear ODE system. More precisely, the eigenvalues of the Jacobian matrix can determine whether the steady state is stable (attracts solutions) or unstable. Linearisation around a steady state means computing the Jacobian matrix at the steady state.

Example  Linearisation around a steady state

The Lotka-Volterra equations are a classical ODE system in mathematical biology. The equations read \(\begin{aligned} \frac{dx}{dt} = ax - xy, \\ \frac{dy}{dt} = xy - by, \\ \end{aligned}\) for \(a, b\) positive constants. The solution vector is \(\boldsymbol{x} = (x,y)^t\) and the derivatives are \(f_1(x,y) = ax - xy\) and \(f_2(x,y) = xy - by\). We first look for steady states: \(f_1 = ax - xy = 0, \quad f_2 = x y - b y = 0.\) If \(x\) and \(y\) are not zero, we have \(x = b\) and \(y = a\). If \(x = 0\), the second equation implies \(y = 0\). If \(y = 0\), the first equation implies \(x = 0\). Therefore there are two steady states, \(\bar{\boldsymbol{x}} = (b,a)^t\) and \(\hat{\boldsymbol{x}} = (0,0)^t\).

We have the following derivatives \(\begin{aligned} \frac{\partial f_1}{\partial x} (x,y) & = a - y, \\ \frac{\partial f_1}{\partial y} (x,y) & = - x, \\ \frac{\partial f_2}{\partial x} (x,y) & = y, \\ \frac{\partial f_2}{\partial y} (x,y) & = x - b. \\ \end{aligned}\) The Jacobian matrix is \(\begin{aligned} \boldsymbol{J} = \begin{pmatrix} a - y & - x \\ y & x - b \end{pmatrix}. \end{aligned}\) Evaluated at the steady states \(\bar{\boldsymbol{x}} = (b,a)^t\) and \(\hat{\boldsymbol{x}} = (0,0)^t\), the Jacobian matrices are \(\begin{aligned} \boldsymbol{J}(\bar{\boldsymbol{x}}) = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix}, \quad \boldsymbol{J}(\hat{\boldsymbol{x}}) = \begin{pmatrix} a & 0 \\ 0 & -b \end{pmatrix}. \end{aligned}\)
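These Jacobian matrices can be checked numerically with central finite differences; a minimal sketch assuming numpy (the helper jacobian_fd and the parameter values \(a\), \(b\) are illustrative, not part of the model):

```python
import numpy as np

def jacobian_fd(f, x0, h=1e-6):
    """Approximate the Jacobian of f: R^2 -> R^2 at x0 by central differences."""
    x0 = np.asarray(x0, dtype=float)
    J = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        # Column j holds the partial derivatives with respect to x_j.
        J[:, j] = (f(x0 + e) - f(x0 - e)) / (2 * h)
    return J

a, b = 0.1, 0.4   # illustrative parameter values
f = lambda v: np.array([a * v[0] - v[0] * v[1],
                        v[0] * v[1] - b * v[1]])

print(jacobian_fd(f, [b, a]))   # approximately [[0, -b], [a, 0]]
print(jacobian_fd(f, [0, 0]))   # approximately [[a, 0], [0, -b]]
```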

Exercises on linearisation

Exercise  Let the function \(\boldsymbol{f} = (f_1, f_2)^t\), with \(f_1(x,y) = -d x + x \exp(-a xy), \quad f_2(x,y) = x - y,\) \(d < 1\), \(a, d\) positive. Find the steady states (by solving the equations \(f_1 = 0, f_2 = 0\)). Compute the Jacobian matrix, and evaluate the Jacobian matrix at each steady state.

Correction The steady states are found by solving \(\begin{aligned} f_1(x,y) = -d x + x \exp(-a xy) & = 0, \\ f_2(x,y) = x - y & = 0. \end{aligned}\) From the second equation, we have \(x = y\). The first equation is equivalent to \(d x = x \exp( - a xy).\) We need to distinguish two cases: (i) \(x = 0\), and (ii) \(x \neq 0\). Case (i) leads to the solution \(\boldsymbol{x}^* = (0, 0)^t\), our first steady state. Case (ii) means that we can cancel \(x\) from the first equation: \(d = \exp(- a xy).\) Replacing \(y = x\), and solving for \(x\): \(\begin{aligned} d & = \exp(-a x y), \\ d & = \exp(-a x^2), \\ \ln d & = - a x^2, \\ - \frac{\ln d}{a} & = x^2, \quad (a > 0) \end{aligned}\) The hypothesis \(d<1\) ensures that \(\ln d < 0\) and \(- \ln d > 0\). There are therefore two real solutions for \(x\): \(x = \pm \sqrt{- \frac{\ln d}{a}}.\) The two additional steady states are \(\bar{\boldsymbol{x}}_{1,2} = \begin{pmatrix} \pm \sqrt{- \frac{\ln d}{a}} \\ \pm \sqrt{- \frac{\ln d}{a}} \end{pmatrix}.\)

The Jacobian matrix of \(\boldsymbol{f}\) is computed from the partial derivatives \(\begin{aligned} \frac{\partial f_1}{\partial x} (x,y) & = -d + (1 - a x y) \exp ( - axy ), \\ \frac{\partial f_1}{\partial y} (x,y) & = (-a x^2) \exp ( -axy ), \\ \frac{\partial f_2}{\partial x} (x,y) & = 1, \\ \frac{\partial f_2}{\partial y} (x,y) & = -1. \\ \boldsymbol{J} & = \begin{pmatrix} -d + (1 - a x y) \exp ( - axy ) & -a x^2 \exp ( -axy ) \\ 1 & -1 \end{pmatrix}. \end{aligned}\) (Note the product rule in \(\partial f_1/\partial x\): the factor \(x\) multiplies the exponential.) The function \(f_2\) is linear. This is reflected in the Jacobian matrix, which has constant coefficients on the second row. The evaluation of the Jacobian matrix at the steady state \(\boldsymbol{x}^* = (0,0)^t\) is \(\boldsymbol{J}(\boldsymbol{x}^*) = \begin{pmatrix} 1 - d & 0 \\ 1 & -1 \end{pmatrix}.\) The evaluation of the Jacobian matrix at the steady state \(\bar{\boldsymbol{x}}_1 = \bigl( \sqrt{- \frac{\ln d}{a}}, \sqrt{- \frac{\ln d}{a}} \bigr)^t\) is \(\boldsymbol{J}(\bar{\boldsymbol{x}}_1) = \begin{pmatrix} -d + (1 - a \bar x_1 \bar y_1) \exp( -a \bar x_1 \bar y_1 ) & - a \bar x_1^2 \exp ( - a \bar x_1 \bar y_1 ) \\ 1 & -1 \end{pmatrix}.\) Here, we use the fact that the nonzero steady states satisfy the equation \(\exp( -a xy ) = d\) to simplify the exponential terms: \(\boldsymbol{J}(\bar{\boldsymbol{x}}_1) = \begin{pmatrix} -d + (1 - a \bar x_1 \bar y_1) d & - a \bar x_1^2 d \\ 1 & -1 \end{pmatrix}.\) Replacing \(\bar x_1\) and \(\bar y_1\) by \(\sqrt{- \frac{\ln d}{a}}\), so that \(a \bar x_1 \bar y_1 = a \bar x_1^2 = - \ln d\), we obtain \(\begin{aligned} \boldsymbol{J}(\bar{\boldsymbol{x}}_1) & = \begin{pmatrix} -d + (1 + \ln d) d & (\ln d)\, d \\ 1 & -1 \end{pmatrix}, \\ & = \begin{pmatrix} d \ln d & d \ln d \\ 1 & -1 \end{pmatrix}. \end{aligned}\) (Both entries of the first row are negative, since \(0 < d < 1\) implies \(\ln d < 0\).) The same lines of calculation for the steady state \(\bar{\boldsymbol{x}}_2\) lead to the same matrix, because the Jacobian depends on the steady state only through the products \(\bar x \bar y\) and \(\bar x^2\), which take the same values at \(\bar{\boldsymbol{x}}_1\) and \(\bar{\boldsymbol{x}}_2\): \(\boldsymbol{J}(\bar{\boldsymbol{x}}_2) = \boldsymbol{J}(\bar{\boldsymbol{x}}_1).\)

Exercise  Compute the Jacobian matrices of each of the following functions of \((x,y)\). All parameters are constants. You do not need to compute the steady states, just the matrices.

  • van der Pol oscillator \(f_1(x,y) = \mu\bigl( (1-x^2)y - x \bigr), \; f_2(x,y) = y.\)

  • Two-compartment pharmacokinetics \(f_1(x,y) = a - k_{12} x + k_{21} y - k_1 x, \; f_2(x,y) = k_{12} x - k_{21} y.\)

  • SI epidemiological model \(f_1(x,y) = - \beta x y, \; f_2(x,y) = \beta x y - \gamma y.\)

Solution of systems of linear differential equations in dimension 2

In linear differential equations, the derivatives are linear functions of the variables, so the system can be represented in matrix-vector format \(\frac{d\boldsymbol{x}(t)}{dt} = \boldsymbol{A} \boldsymbol{x}(t),\) for a vector \(\boldsymbol{x}\) and a square matrix \(\boldsymbol{A}\). For the initial condition \(\boldsymbol{x}(0) = \boldsymbol{x}_0\), the solution of the linear system of ODEs is \(\begin{aligned} \boldsymbol{x}(t) = e^{\boldsymbol{A}t} \boldsymbol{x}_0. \end{aligned}\) If we have at our disposal an eigenvalue decomposition of \(\boldsymbol{A} = \boldsymbol{X} \boldsymbol{D} \boldsymbol{X}^{-1}\), the exponential of the matrix is \(\begin{aligned} e^{\boldsymbol{A}t} & = \boldsymbol{X} e^{\boldsymbol{D}t} \boldsymbol{X}^{-1}, \\ & = \boldsymbol{X} \begin{pmatrix} e^{ \lambda_1 t } & 0 \\ 0 & e^{ \lambda_2 t } \end{pmatrix} \boldsymbol{X}^{-1}. \end{aligned}\) Therefore, the long-time behavior of the exponential is controlled by the eigenvalues \(\lambda_{1,2}\).
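A minimal numerical sketch of this solution formula, assuming numpy and scipy are available. The matrix corresponds to the linearised Lotka-Volterra model with the illustrative values \(a = 0.1\), \(b = 0.4\) (the initial condition and the time are hypothetical):

```python
import numpy as np
from scipy.linalg import expm

a, b = 0.1, 0.4
A = np.array([[0.0, -b],
              [a,  0.0]])
x0 = np.array([1.0, 0.0])   # hypothetical initial condition
t = 2.5

# Direct matrix exponential: x(t) = exp(A t) x0.
x_t = expm(A * t) @ x0

# Same result through the eigenvalue decomposition A = X D X^{-1}.
eigvals, X = np.linalg.eig(A)
x_t_eig = (X @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(X) @ x0).real

print(np.allclose(x_t, x_t_eig))   # True
```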

Example  Solution of a linear system of ODEs

Consider the linear system of ODEs given by the Lotka-Volterra model linearised at its nonzero steady state \(\bar{\boldsymbol{x}} = (b,a)^t\): \(\begin{aligned} \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \quad \begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}. \label{eq_linearequation} \end{aligned}\) This system approximates the nonlinear version near the steady state. In this linear system, the variables \((x,y)\) are deviations from the steady state; their solutions are "centered" around 0. To solve this linear system, we will diagonalise the matrix \(\begin{aligned} A = \begin{pmatrix} 0 & - b \\ a & 0 \end{pmatrix}. \end{aligned}\) The goal is to go slowly through every step once for this system. In general it is not necessary to solve the system completely by hand; in many applications, knowledge of the eigenvalues is sufficient.

We have \(\det A = 0 - a(-b) = ab > 0\), \(\mathrm{tr}\,A = 0\) and \(\Delta = 0 - 4ab = -4ab < 0\). The eigenvalues are therefore complex conjugates: \(\lambda_{1,2} = \pm i \sqrt{ab}.\) Distinct eigenvalues mean that \(A\) is diagonalisable. The eigenvector associated to \(\lambda_1 = i\sqrt{ab}\) is given by the system \(\begin{aligned} \Biggl( \begin{array}{cc|c} -i\sqrt{ab} & -b & 0 \\ a & -i\sqrt{ab} & 0 \end{array} \Biggr) \end{aligned}\) We have from the first row \(-i\sqrt{ab} x = b y\). Letting \(x = b\) and \(y = -i\sqrt{ab}\), we obtain the non-normalized eigenvector \(\tilde x = (b, -i\sqrt{ab})^t\). Normalization is done by dividing by \(||\tilde x || = \sqrt{b^2 + |-i\sqrt{ab}|^2} = \sqrt{b^2 + ab},\) to obtain the first eigenvector \(x = \begin{pmatrix} \frac{b}{\sqrt{b^2 + ab}} \\ \frac{-i\sqrt{ab}}{\sqrt{b^2 + ab}} \end{pmatrix} = \begin{pmatrix} \frac{b}{\sqrt{b}\sqrt{b + a}} \\ \frac{-i\sqrt{a}\sqrt{b}}{\sqrt{b}\sqrt{b + a}} \end{pmatrix} = \begin{pmatrix} \frac{\sqrt{b}}{\sqrt{b + a}} \\ \frac{-i\sqrt{a}}{\sqrt{b + a}} \end{pmatrix}.\) The second eigenvector is computed the same way (watch out for the slightly different signs!). The eigenproblem for the eigenvalue \(\lambda = -i\sqrt{ab}\) is \(\begin{aligned} \Biggl( \begin{array}{cc|c} +i\sqrt{ab} & -b & 0 \\ a & +i\sqrt{ab} & 0 \end{array} \Biggr) \end{aligned}\) Given that the only change is \(-i \to +i\), the second eigenvector is \(x_2 = \begin{pmatrix} \frac{\sqrt{b}}{\sqrt{b + a}} \\ \frac{i\sqrt{a}}{\sqrt{b + a}} \end{pmatrix}.\) The solution to the linear ODE is \(\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \boldsymbol{X} e^{\boldsymbol{D}t} \boldsymbol{X}^{-1} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix},\) with \(\begin{aligned} \boldsymbol{X} = \frac{1}{\sqrt{b + a}} \begin{pmatrix} \sqrt{b} & \sqrt{b} \\ -i\sqrt{a} & i\sqrt{a} \end{pmatrix}, \quad \boldsymbol{D} = \begin{pmatrix} +i\sqrt{ab} & 0 \\ 0 & -i\sqrt{ab} \end{pmatrix} \end{aligned}\) The inverse of a \(2 \times 2\) matrix with coefficients \(a,b,c,d\) is \(\begin{aligned} \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \end{aligned}\) This is conditional on \(\det = ad - bc \neq 0\), of course. With this formula, the inverse of \(\boldsymbol{X}\) is \(\boldsymbol{X}^{-1} = \frac{1}{\sqrt{b + a}} \frac{1}{\det \boldsymbol{X}} \begin{pmatrix} i \sqrt{a} & -\sqrt{b} \\ i \sqrt{a} & \sqrt{b} \end{pmatrix}.\) The determinant \(\det \boldsymbol{X} = \frac{i \sqrt{b} \sqrt{a}}{b + a} + \frac{i \sqrt{a} \sqrt{b}}{b + a} = 2i \frac{\sqrt{ab}}{b + a}.\) The inverse reduces to \(\frac{1}{\sqrt{b + a}} \frac{a + b}{2i\sqrt{ab}} \begin{pmatrix} i \sqrt{a} & -\sqrt{b} \\ i \sqrt{a} & \sqrt{b} \end{pmatrix} = \frac{-i \sqrt{b + a}}{2\sqrt{ab}} \begin{pmatrix} i \sqrt{a} & -\sqrt{b} \\ i \sqrt{a} & \sqrt{b} \end{pmatrix} = \frac{\sqrt{b + a}}{2\sqrt{ab}} \begin{pmatrix} \sqrt{a} & i\sqrt{b} \\ \sqrt{a} & -i\sqrt{b} \end{pmatrix}.\) We have now obtained the eigenvalue decomposition of \(\boldsymbol{A} = \boldsymbol{X} \boldsymbol{D} \boldsymbol{X}^{-1}\).
To solve the linear ODE, we need to compute the product \(\begin{aligned} \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} & = \boldsymbol{X} e^{\boldsymbol{D}t} \boldsymbol{X}^{-1} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\ & = \frac{1}{\sqrt{b + a}} \begin{pmatrix} \sqrt{b} & \sqrt{b} \\ -i\sqrt{a} & i\sqrt{a} \end{pmatrix} \begin{pmatrix} e^{+i\sqrt{ab}t} & 0 \\ 0 & e^{-i\sqrt{ab}t} \end{pmatrix} \frac{\sqrt{b + a}}{2\sqrt{ab}} \begin{pmatrix} \sqrt{a} & i\sqrt{b} \\ \sqrt{a} & -i\sqrt{b} \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\ & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} \sqrt{b} & \sqrt{b} \\ -i\sqrt{a} & i\sqrt{a} \end{pmatrix} \begin{pmatrix} e^{+i\sqrt{ab}t} & 0 \\ 0 & e^{-i\sqrt{ab}t} \end{pmatrix} \begin{pmatrix} \sqrt{a} & i\sqrt{b} \\ \sqrt{a} & -i\sqrt{b} \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \\ & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} \sqrt{b} e^{+i\sqrt{ab}t} & \sqrt{b} e^{-i\sqrt{ab}t}\\ -i\sqrt{a} e^{+i\sqrt{ab}t} & i\sqrt{a} e^{-i\sqrt{ab}t} \end{pmatrix} \begin{pmatrix} \sqrt{a}x_0+i\sqrt{b}y_0 \\ \sqrt{a}x_0-i\sqrt{b}y_0 \end{pmatrix}. \\ \end{aligned}\) To simplify the last steps of the calculation, we will introduce the following notation. Using Euler’s formula, \(e^{\pm i\sqrt{ab}t} = \cos(\sqrt{ab}t) \pm i \sin(\sqrt{ab}t)\). Let \(c = \cos(\sqrt{ab}t)\), \(s = \sin(\sqrt{ab}t)\), and \(C_1 = \sqrt{a}x_0+i\sqrt{b}y_0\), \(C_2 = \sqrt{a}x_0-i\sqrt{b}y_0\). The solution reads \(\begin{aligned} \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} \sqrt{b} e^{i\sqrt{ab}t} C_1 + \sqrt{b} e^{-i\sqrt{ab}t} C_2 \\ -i\sqrt{a}e^{i\sqrt{ab}t} C_1 + i\sqrt{a}e^{-i\sqrt{ab}t} C_2 \end{pmatrix}, \\ & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} \sqrt{b} (c + is) C_1 + \sqrt{b} (c - is) C_2 \\ -i\sqrt{a}(c + is) C_1 + i\sqrt{a}(c - is) C_2 \end{pmatrix}, \\ & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} \sqrt{b} c (C_1 + C_2) + i \sqrt{b} s (C_1 - C_2) \\ \sqrt{a} s (C_1 + C_2) + i \sqrt{a} c (-C_1 + C_2) \end{pmatrix}, \\ & = \frac{1}{2\sqrt{ab}} \begin{pmatrix} 2 \sqrt{ab} \cos(\sqrt{ab}t) x_0 - 2 b \sin(\sqrt{ab}t) y_0 \\ 2 a \sin(\sqrt{ab}t)x_0 + 2 \sqrt{ab} \cos(\sqrt{ab}t) y_0 \end{pmatrix}, \\ & = \begin{pmatrix} \cos(\sqrt{ab}t) x_0 - \sqrt{b/a} \sin(\sqrt{ab}t) y_0 \\ \sqrt{a/b} \sin(\sqrt{ab}t)x_0 + \cos(\sqrt{ab}t) y_0 \end{pmatrix}. \end{aligned}\) And that’s it! We have obtained a solution to the linear ODE (Figure 8).

Solution of the linear system of ODEs above, with \(a = 0.1\), \(b = 0.4\).
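The closed-form solution can also be checked against a direct matrix-exponential computation; a short sketch assuming numpy and scipy, with the values of Figure 8 and a hypothetical initial condition:

```python
import numpy as np
from scipy.linalg import expm

a, b = 0.1, 0.4            # values used in Figure 8
x0, y0 = 1.0, 0.5          # hypothetical initial condition
A = np.array([[0.0, -b],
              [a,  0.0]])
w = np.sqrt(a * b)

for t in (0.0, 1.0, 5.0, 20.0):
    # Closed-form solution obtained above.
    x_cf = np.cos(w * t) * x0 - np.sqrt(b / a) * np.sin(w * t) * y0
    y_cf = np.sqrt(a / b) * np.sin(w * t) * x0 + np.cos(w * t) * y0
    # Reference: direct matrix exponential.
    x_ref, y_ref = expm(A * t) @ np.array([x0, y0])
    print(np.allclose([x_cf, y_cf], [x_ref, y_ref]))   # True each time
```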

Glossary

French English Note
dérivable differentiable
matrice jacobienne Jacobian matrix
rang rank
noyau kernel notation: \(\ker\)
ensemble set
espace vectoriel vector space
sous-espace vectoriel linear subspace
valeur propre eigenvalue
vecteur propre eigenvector
sous-espace propre eigenspace
décomposition en valeurs propres eigenvalue decomposition
décomposition en valeurs singulières singular value decomposition
valeur singulière singular value
trace trace
déterminant determinant \(\det\)
base basis
application linéaire linear map
application map
dimension dimension
moindres carrés least-squares
produit scalaire scalar product
Vect Span
famille libre linearly independent set
famille génératrice spanning set