Inner Product Spaces

\(\def\vs#1{\mathsf{#1}}\) In this section, let \(\vs{V}\) denote a vector space over the field \(\mathbb{R}\), let \(\vec{u}, \vec{v},\) and \(\vec{w}\) be vectors from \(\vs{V}\), and \(a, b, c,\) and \(d\) be scalars from the field \(\mathbb{R}\).

Definition 35 (Real inner product)

Suppose that to each pair of vectors \(\vec{u},\vec{v} \in \vs{V}\) there is assigned a real number, denoted \(\langle \vec{u}, \vec{v} \rangle\). This assignment is called a real inner product on \(\vs{V}\) if it satisfies the following axioms.

Axiom 17 (Linearity)

\[\langle a \vec{u}_1 + b \vec{u}_2 , \vec{v} \rangle = a \langle \vec{u}_1, \vec{v} \rangle + b \langle \vec{u}_2, \vec{v} \rangle\]

Axiom 18 (Symmetry)

\[\langle \vec{u}, \vec{v} \rangle = \langle \vec{v}, \vec{u} \rangle\]

Axiom 19 (Positive-definiteness)

\[\langle \vec{u}, \vec{u} \rangle \geq 0 \text{ and } \langle \vec{u}, \vec{u} \rangle = 0 \text{ iff } \vec{u} = 0.\]

Definition 36 (Real Inner Product Space)

A vector space \(\vs{V}\) on which a real inner product is defined is called a real inner product space.

Definition 37 (Vector norm)

The third axiom of the inner product requires that \(\langle \vec{u}, \vec{u} \rangle\) be non-negative. This allows the definition of the norm,

\[||\vec{u}|| = \sqrt{ \langle \vec{u}, \vec{u} \rangle }\]

which is a measure of the length of the vector.

There are numerous examples of inner product spaces, from Euclidean \(n\)-space (perhaps the most familiar example) to function and polynomial spaces, matrix spaces, and Hilbert spaces.
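As a concrete illustration, here is a minimal numerical sketch (assuming NumPy is available; the vectors, functions, and grid are our own illustrative choices) of two such inner products: the Euclidean dot product on \(\mathbb{R}^3\), and the function-space inner product \(\langle f, g \rangle = \int_0^1 f(t) g(t) \, dt\) approximated by a Riemann sum.

    import numpy as np

    # Euclidean inner product on R^3: the ordinary dot product.
    u = np.array([1.0, 2.0, 3.0])
    v = np.array([-1.0, 0.5, 2.0])
    euclidean_ip = np.dot(u, v)
    print(euclidean_ip)            # 1*(-1) + 2*0.5 + 3*2 = 6.0

    # Function-space inner product on [0, 1]: <f, g> = integral of f(t) g(t) dt,
    # approximated here by a simple Riemann sum on a fine grid.
    t = np.linspace(0.0, 1.0, 100001)
    f, g = t**2, 1.0 - t
    function_ip = np.sum(f * g) * (t[1] - t[0])
    print(function_ip)             # integral of t^2 (1 - t) dt = 1/12 ≈ 0.0833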

Definition 38 (Cauchy-Schwarz Inequality)

For any vectors \(\vec{u}\) and \(\vec{v}\) in an inner product space \(\vs{V}\),

\[\langle \vec{u}, \vec{v} \rangle^2 \leq \langle \vec{u}, \vec{u} \rangle \langle \vec{v}, \vec{v} \rangle \quad \text{or} \quad | \langle \vec{u}, \vec{v} \rangle | \leq ||\vec{u}|| \, ||\vec{v}||\]

[Properties of the Norm] Let \(\vs{V}\) be an inner product space. Then the norm in \(\vs{V}\) satisfies the following properties

  1. \(||\vec{v}|| \geq 0\) and \(||\vec{v}||=0\) if and only if \(\vec{v}=0\).

  2. \(||k \vec{v}|| = |k| ||\vec{v}||\)

  3. \(||\vec{u} + \vec{v}|| \leq ||\vec{u}|| + ||\vec{v}||\)
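The Cauchy-Schwarz inequality and these norm properties can be checked numerically; a short sketch, assuming NumPy, with arbitrary randomly chosen vectors:

    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.standard_normal(5)
    v = rng.standard_normal(5)

    def norm(x):
        # ||x|| = sqrt(<x, x>) with the Euclidean inner product.
        return np.sqrt(np.dot(x, x))

    # Cauchy-Schwarz: |<u, v>| <= ||u|| ||v||.
    assert abs(np.dot(u, v)) <= norm(u) * norm(v)
    # Triangle inequality: ||u + v|| <= ||u|| + ||v||.
    assert norm(u + v) <= norm(u) + norm(v)
    # Homogeneity: ||k v|| = |k| ||v||.
    assert np.isclose(norm(-2.5 * v), 2.5 * norm(v))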

[Orthogonal Vectors] Two vectors \(\vec{u}, \vec{v} \in \vs{V}\) are orthogonal iff

\[\langle \vec{u}, \vec{v} \rangle = 0\]

[Orthonormal Vectors] A set of vectors \(\vec{u}_1, \dots, \vec{u}_n \in \vs{V}\) is orthonormal iff

\[\begin{split}\langle \vec{u}_i, \vec{u}_j \rangle = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases}\end{split}\]

It follows that a sequence of non-zero orthogonal (or orthonormal) vectors is linearly independent, and so can form a basis. To orthogonalise a basis, use the Gram-Schmidt Process. Let \(\vec{u}_1, \vec{u}_2\) be linearly independent vectors with an angle \(\theta\) between them. Then

\[\cos\theta = \frac{\langle \vec{u}_1, \vec{u}_2 \rangle}{||\vec{u}_1|| \, || \vec{u}_2||}\]

so \(\vec{u}_2\) can be expressed as the sum of two vectors: one in the direction of \(\vec{u}_1\), and one in the direction of a new vector, \(\vec{v}_2\), orthogonal to \(\vec{u}_1\). For convenience, let \(\vec{v}_1 = \vec{u}_1\).

[Gram-Schmidt Process] For a sequence of linearly independent vectors \(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_n\), an orthonormal sequence \(\hat{v}_1, \hat{v}_2, \dots, \hat{v}_n\) spanning the same subspace can be produced as follows:

  1. Let \(\vec{v}_1 = \vec{u}_1\).

  2. Set

    \[\vec{v}_k = \vec{u}_k - \sum_{i=1}^{k-1} \frac{\langle \vec{v}_i, \vec{u}_k \rangle}{\langle \vec{v}_i, \vec{v}_i \rangle} \, \vec{v}_i \qquad \text{for } k = 2, \dots, n.\]
  3. Normalise each vector by

    \[\hat{v}_i = \frac{\vec{v}_i}{||\vec{v}_i||}\]
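A minimal NumPy sketch of this process (our own illustrative implementation, using the Euclidean inner product and assuming the input vectors are linearly independent):

    import numpy as np

    def gram_schmidt(vectors):
        # Orthonormalise a sequence of linearly independent vectors.
        orthogonal = []
        for u in vectors:
            v = np.array(u, dtype=float)
            # Subtract the projection of u onto each previously found v_i.
            for w in orthogonal:
                v -= (np.dot(w, u) / np.dot(w, w)) * w
            orthogonal.append(v)
        # Normalise: v_hat_i = v_i / ||v_i||.
        return [v / np.linalg.norm(v) for v in orthogonal]

    basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    # The inner products <v_hat_i, v_hat_j> should be 1 on the diagonal, 0 otherwise.
    print(np.round([[np.dot(a, b) for b in basis] for a in basis], 10))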

[Complex inner product spaces] Let \(\vs{V}\) be a vector space over \(\mathbb{C}\). Suppose that to each pair of vectors \(\vec{u}, \vec{v} \in \vs{V}\) there is assigned a complex number \(\langle \vec{u}, \vec{v} \rangle\). This assignment is a complex inner product on \(\vs{V}\) if it satisfies the linearity and positive-definiteness axioms above, with symmetry replaced by conjugate symmetry, \(\langle \vec{u}, \vec{v} \rangle = \overline{\langle \vec{v}, \vec{u} \rangle}\).

Roots of Polynomials

Any polynomial \(p(t) \in \mathbb{C}[t]\) of positive degree \(n\) can be factored into \(n\) linear factors

\[p(t) = c(t-\alpha_1)(t-\alpha_2)\cdots (t-\alpha_n)\]

where \(c \neq 0\), and the roots \(\alpha_1, \alpha_2, \dots, \alpha_n\) are uniquely determined up to their order. Then, gathering repeated factors,

\[p(t) = c(t-\beta_1)^{r_1} (t-\beta_2)^{r_2} \cdots (t-\beta_m)^{r_m}\]

where \(\beta_1, \beta_2, \dots, \beta_m\) are the distinct complex roots of \(p(t)\), and each \(r_k \ge 1\) is the algebraic multiplicity of \(\beta_k\).

Let

\[p(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0\]

be a monic polynomial which can be factored

\[p(t) = (t-\alpha_1)(t-\alpha_2)\cdots (t-\alpha_n)\]

where \(\alpha_1, \dots, \alpha_n\) are its complex roots. Then

\[\sum_{i=1}^n \alpha_i = - a_{n-1} \qquad \prod_{i=1}^n \alpha_i = (-1)^n a_0\]
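These relations between the roots and the coefficients can be checked numerically; a small sketch assuming NumPy, with an arbitrary monic cubic as the example:

    import numpy as np

    # p(t) = t^3 + 2 t^2 - 5 t + 6, so n = 3, a_{n-1} = 2 and a_0 = 6.
    coefficients = [1, 2, -5, 6]
    roots = np.roots(coefficients)

    print(np.sum(roots))    # expect -a_{n-1} = -2
    print(np.prod(roots))   # expect (-1)^n a_0 = -6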

Diagonalisation

[Similarity] Let \(A,B\) be square matrices with entries from \(F\). Then \(A\) is similar to \(B\) if

\[B = P^{-1} A P\]

for some invertible square matrix \(P\) with entries from \(F\).

[Properties of similar matrices]

  1. Any square matrix \(A\) is similar to itself.

  2. If \(A\) is similar to \(B\), then \(B\) is similar to \(A\).

  3. If \(A\) is similar to \(B\), and \(B\) is similar to \(C\), then \(A\) is similar to \(C\).

[Characteristic Polynomial of similar matrices] Let \(A, B\) be similar square matrices. Then \(A\) and \(B\) have the same characteristic polynomial, and hence the same trace, determinant, and eigenvalues.

[Diagonalisability] Let \(A\) be a square matrix over \(F\). If \(A\) is similar to a diagonal matrix over \(F\), then \(A\) is diagonalisable over \(F\).

Let \(A\) be an \(n \times n\) matrix over \(F\). Then \(A\) is diagonalisable over \(F\) iff there exists an invertible matrix \(P\) over \(F\) whose columns are \(n\) linearly independent eigenvectors of \(A\). That is,

\[P^{-1} A P = D\]

where \(D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)\) is the diagonal matrix of the corresponding eigenvalues of \(A\).
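A short numerical sketch of this result (assuming NumPy; the matrix is an arbitrary example): np.linalg.eig returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, which plays the role of \(P\).

    import numpy as np

    A = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])

    eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
    D = np.linalg.inv(P) @ A @ P        # P^{-1} A P

    print(eigenvalues)                  # eigenvalues 3 and 1 (order may vary)
    print(np.round(D, 10))              # diagonal matrix of the eigenvalues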

[Jordan-normal Form] A matrix is said to be in Jordan-normal form if its only non-zero entries off the main diagonal lie immediately above the main diagonal, and each such entry has identical diagonal entries to its left and below it.

Not all matrices are diagonalisable, but over \(\mathbb{C}\) it is always possible to find an invertible matrix \(P\) such that

\[P^{-1} A P = J\]

for a matrix \(J\) which is in Jordan-normal form.
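The Jordan-normal form can be computed symbolically; a minimal sketch assuming SymPy is available, whose jordan_form method returns a pair \(P, J\) with \(A = P J P^{-1}\):

    from sympy import Matrix

    # A defective matrix: eigenvalue 2 has algebraic multiplicity 2 but only a
    # one-dimensional eigenspace, so A is not diagonalisable.
    A = Matrix([[2, 1],
                [0, 2]])

    P, J = A.jordan_form()
    print(J)                        # the Jordan block [[2, 1], [0, 2]]
    print(P * J * P.inv() == A)     # True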

[Row Rank] Let \(F\) be a field, and \(A \in M_{m \times n}(F)\) be an \(m \times n\) matrix over \(F\). The row rank of \(A\) is the number of non-zero rows in the reduced row echelon form of \(A\).

[Column Rank] Let \(F\) be a field, and \(A \in M_{m \times n}(F)\) be an \(m \times n\) matrix over \(F\). The column rank of \(A\) is the number of non-zero columns in the reduced column echelon form of \(A\).

For any \(m \times n\) matrix, \(A\), the row rank and the column rank are equal.

[Rank of a Matrix] The rank of a matrix \(A\) is the rank of the linear map \(f\) that it represents,

\[\rank(A) = \rank(f) = \dim(\img(f))\]
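A brief numerical illustration (assuming NumPy; the matrix is arbitrary) that the rank of a matrix and of its transpose agree, i.e. that row rank equals column rank:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],    # a multiple of the first row
                  [1.0, 0.0, 1.0]])

    print(np.linalg.matrix_rank(A))     # 2
    print(np.linalg.matrix_rank(A.T))   # also 2: row rank = column rank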

Let \(A\) be an \(n \times n\) complex matrix, and let \(\lambda \in \mathbb{C}\) be an eigenvalue of \(A\). Then

\[S = \{ \vec{x} \in \mathbb{C}^n : A \vec{x} = \lambda \vec{x} \}\]

is a subspace of \(\mathbb{C}^n\).

[Eigenspace] Let \(A\) be an \(n \times n\) complex matrix, and let \(\lambda \in \mathbb{C}\) be an eigenvalue of \(A\). Then the \(\lambda\)-eigenspace of \(A\) is

\[\Eig_A(\lambda) = \{ \vec{x} \in \mathbb{C}^n : A \vec{x} = \lambda \vec{x} \}\]

Let \(A\) be an \(n \times n\) complex matrix, and let \(\lambda \in \mathbb{C}\) be an eigenvalue of \(A\). The geometric multiplicity of \(\lambda\) is \(\dim \Eig_A(\lambda)\), i.e. the dimension of the \(\lambda\)-eigenspace of \(A\).

The geometric multiplicity of an eigenvalue \(\lambda\) is less than or equal to the algebraic multiplicity of \(\lambda\) in the characteristic polynomial of \(A\).
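For example, the matrix below has an eigenvalue with algebraic multiplicity 2 but geometric multiplicity 1; a sketch assuming SymPy, whose eigenvects method reports, for each eigenvalue, its algebraic multiplicity and a basis of its eigenspace:

    from sympy import Matrix

    A = Matrix([[5, 1, 0],
                [0, 5, 0],
                [0, 0, 2]])

    for eigenvalue, algebraic_mult, eigenspace_basis in A.eigenvects():
        geometric_mult = len(eigenspace_basis)   # dimension of the eigenspace
        print(eigenvalue, algebraic_mult, geometric_mult)
    # eigenvalue 2 has multiplicities (1, 1); eigenvalue 5 has (2, 1).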

Let \(A\) be an \(n \times n\) matrix over \(\mathbb{C}\). Suppose \(\lambda_1, \dots, \lambda_k\) are distinct eigenvalues of \(A\), with associated eigenvectors \(\vec{v}_1, \dots, \vec{v}_k\). Then,

  • The eigenvectors are linearly independent.

  • The sum of the eigenspaces \(\Eig_A(\lambda_1) + \cdots + \Eig_A(\lambda_k)\) is a direct sum, i.e.

    \[\Eig_A(\lambda_1) + \cdots + \Eig_A(\lambda_k) \equiv \Eig_A(\lambda_1) \oplus \cdots \oplus \Eig_A(\lambda_k)\]

Lagrange’s equations for oscillating systems

Equations of motion in generalised coordinates, \(q_i\), can be expressed in terms of the Lagrangian, \({\cal L}\), of the system. Each generalised coordinate has an associated generalised velocity, \(\dot{q}_i = \dv{q_i}{t}\), a generalised force, \(\pdv{{\cal L}}{q_i}\), and a generalised momentum, \(\pdv{{\cal L}}{\dot{q}_i}\). This allows the equations of motion of the system to be expressed as

\[\dv{t} \qty( \pdv{{\cal L}}{\dot{q}_i} ) = \pdv{{\cal L}}{q_i}\]

These can be treated in a matrix formulation. In an oscillating system the kinetic energy is quadratic with respect to the generalised velocities, and the potential is quadratic with respect to the generalised coordinates. We thus have,

\[{\cal L} = \half m \, \dot{q}^{\rm T} T \dot{q} - \half k \, q^{\rm T} V q\]

with \(T\) and \(V\) being matrices for the kinetic and potential energies respectively.

The characteristic frequencies of a system of springs.

[Figure: two equal masses \(m\) connected in series by three springs of spring constant \(k\) between two fixed walls.]


Here \(x\) and \(y\), the displacements of the two masses from equilibrium, are generalised coordinates for the system. The kinetic energy is
\[T = \half m \qty( \dot{x}^2 + \dot{y}^2 )\]

and the potential is

\[V = \half k x^2 + \half k y^2 + \half k (x-y)^2 = \half k \qty(2 x^2 + 2y^2 - 2xy )\]

In matrix form this is
\[\begin{split}\begin{aligned} T &= \half m \begin{pmatrix} \dot{x} & \dot{y} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} \\ V &= \half k \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \end{aligned}\end{split}\]

This can be solved by diagonalising the matrix
\[\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}\]

To do this we find the eigenvalues and eigenvectors of \(V\); the characteristic equation is

\[\begin{split}\begin{aligned} \begin{vmatrix} 2 - \mu & -1 \\ -1 & 2 - \mu \end{vmatrix} &= 0 \\ (2 - \mu)(2-\mu)-1 &=0 \\ \mu^2 - 4 \mu + 3 &=0 \\ (\mu - 1)(\mu - 3) &= 0 \\ \mu &= 1 \quad \text{ or } \quad 3\end{aligned}\end{split}\]

We have the change of variables

\[\begin{pmatrix} x \\ y \end{pmatrix} = C \begin{pmatrix} x^{\prime} \\ y^{\prime}\end{pmatrix}\]

where \(C\) is the matrix of eigenvectors of \(V\), so that \(C^{-1} V C = D\) with

\[D = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}\]

In the new coordinates the potential becomes

\[V = \half k \qty( x^{\prime 2} + 3 y^{\prime 2} )\]

\(\dot{x}\) and \(\dot{y}\) transform in the same way as \(x\) and \(y\), and since \(T = I\),

\[C^{-1} T C = C^{-1} I C = C^{-1} C = I\]

is unchanged, and so

\[T = \half m \qty( \dot{x}^{\prime 2} + \dot{y}^{\prime 2} )\]

and

\[{\cal L} = T - V = \half m \qty( \dot{x}^{\prime 2} + \dot{y}^{\prime 2} ) - \half k \qty( x^{\prime 2} + 3 y^{\prime 2})\]
The two Lagrange equations are uncoupled (there are no \(x^{\prime} y^{\prime}\) cross terms), so they can be solved separately,
\[\begin{split}\begin{aligned} \dv{t} \qty( \pdv{{\cal L}}{\dot{x}^{\prime}} ) - \pdv{{\cal L}}{x^{\prime}} &= 0 \\ \dv{t} \qty( \half m \cdot 2 \dot{x}^{\prime} ) - \qty( - \half k \cdot 2 x^{\prime} ) &= 0 \\ m \ddot{x}^{\prime} + k x^{\prime} &= 0 \\ x^{\prime} &= A \sin(\omega_x t + \alpha) \\ \omega_x &= \sqrt{\frac{k}{m}}\end{aligned}\end{split}\]
\[\begin{split}\begin{aligned} \dv{t} \qty( \pdv{{\cal L}}{\dot{y}^{\prime}} ) - \pdv{{\cal L}}{y^{\prime}} &= 0 \\ \dv{t} \qty( \half m \cdot 2 \dot{y}^{\prime} ) - \qty( - \half k \cdot 6 y^{\prime} ) &= 0 \\ m \ddot{y}^{\prime} + 3 k y^{\prime} &= 0 \\ y^{\prime} &= B \sin(\omega_y t + \beta) \\ \omega_y &= \sqrt{\frac{3k}{m}}\end{aligned}\end{split}\]

These normal modes then correspond to (\(\mu=1\)),
\[\begin{split}\begin{aligned} \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} &= \mu \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \\ 2x - y &= x \\ y &= x\end{aligned}\end{split}\]

and (\(\mu=3\)),
\[\begin{split}\begin{aligned} \begin{pmatrix}2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} &= 3 \begin{pmatrix} x \\ y \end{pmatrix} \\ 2x - y &= 3x \\ y &= -x\end{aligned}\end{split}\]
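The normal-mode frequencies found above can be recovered numerically; a minimal sketch (assuming NumPy, with illustrative values \(m = k = 1\)), using the fact that the equations of motion reduce to \(m \ddot{q} + k V q = 0\), so the squared angular frequencies are \(k/m\) times the eigenvalues of \(V\):

    import numpy as np

    m, k = 1.0, 1.0
    V = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])       # potential-energy matrix

    mu, modes = np.linalg.eigh(V)     # eigenvalues (ascending) and eigenvectors
    omegas = np.sqrt(k * mu / m)

    print(omegas)                     # [1.0, 1.732...] = [sqrt(k/m), sqrt(3k/m)]
    print(np.round(modes, 3))         # columns ≈ (1, 1)/√2 and (1, -1)/√2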