Basic Operations

Vector Equality

Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$ with:

$$x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}, \quad y = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix}$$

Then $x = y$ if and only if:

$$x_0 = y_0; \quad x_1 = y_1; \quad \ldots \quad x_{n-1} = y_{n-1};$$

That is, the corresponding elements of $x$ and $y$ are equal.

Vector Addition

Vector addition $x + y$ (the sum of two vectors) is defined by:

$$x + y = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} + \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} = \begin{pmatrix} x_0 + y_0 \\ x_1 + y_1 \\ \vdots \\ x_{n-1} + y_{n-1} \end{pmatrix}$$

In other words, the vectors are added element-wise, yielding a new vector of the same size.
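
As a minimal sketch of these element-wise definitions (assuming NumPy, which any array library could stand in for):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([-1.0, 2.0, 1.0])

# Vector equality: all corresponding elements must match.
equal = np.array_equal(x, y)           # False here

# Vector addition: element-wise, yielding a vector of the same size.
s = x + y                              # array([0., 0., 4.])
```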

Geometric Interpretation

In this interpretation, let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$. These two vectors lie in a common plane, and if we view them in that plane we can observe the result of $x + y$. If we lay the vectors "heel to toe", e.g. if we place the heel of $y$ (where $y$ starts) at the toe of $x$ (where $x$ ends), then $x + y$ is given by going from the heel of $x$ to the toe of $y$. The reverse order works just as well.

Another geometric interpretation: if we lay $y$'s heel at $x$'s toe, and also $x$'s heel at $y$'s toe, this creates a parallelogram. The result of $x + y$ is given by the diagonal of the parallelogram. This is known as the parallelogram method.

Vector Scaling

Scaling a vector is the same as stretching it. Let $x \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Then vector scaling is given by:

$$\alpha x = \alpha \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} = \begin{pmatrix} \alpha x_0 \\ \alpha x_1 \\ \vdots \\ \alpha x_{n-1} \end{pmatrix}$$

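A minimal sketch of scaling, again assuming NumPy:

```python
import numpy as np

alpha = 2.0
x = np.array([1.0, -2.0, 3.0])

# Scaling multiplies every element by alpha.
scaled = alpha * x                     # array([ 2., -4.,  6.])
```
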
Vector Subtraction

Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$. Then vector subtraction is given by:

$$x - y = x + (-y) = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} - \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} = \begin{pmatrix} x_0 - y_0 \\ x_1 - y_1 \\ \vdots \\ x_{n-1} - y_{n-1} \end{pmatrix}$$

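And a minimal sketch of subtraction, under the same NumPy assumption:

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([-1.0, 2.0, 1.0])

# Subtraction is addition of the negated vector: x - y == x + (-y).
d = x - y                              # array([ 2., -4.,  2.])
```
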
Geometric Interpretation

Taking the negative of $y$ means it points in the opposite direction. We then have two vectors, $x$ and $-y$, laid heel to toe, and $x - y$ is given by going from the heel of $x$ to the toe of $-y$.

As with vector addition, the parallelogram method applies here as well: $x - y$ is the other diagonal of the parallelogram formed by $x$ and $y$.

Scaled Vector Addition

One of the most commonly encountered operations when implementing more complex linear algebra operations is the scaled vector addition, which (given $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$) computes:

$$y := \alpha x + y = \alpha \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} + \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} = \begin{pmatrix} \alpha x_0 + y_0 \\ \alpha x_1 + y_1 \\ \vdots \\ \alpha x_{n-1} + y_{n-1} \end{pmatrix}$$

This operation is often referred to as the AXPY operation, which stands for "alpha times x plus y". We emphasize that it is typically used in situations where the output vector overwrites the input vector $y$.
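
A minimal sketch of AXPY in plain Python, written as an explicit loop because the flop and memop counts in the next section refer to this element-wise form (the function name axpy here is just illustrative):

```python
def axpy(alpha, x, y):
    """Overwrite y with alpha * x + y, element by element."""
    for i in range(len(x)):
        y[i] = alpha * x[i] + y[i]    # one multiply and one add per element
    return y

y = axpy(2.0, [1.0, -2.0, 3.0], [-1.0, 2.0, 1.0])   # [1.0, -2.0, 7.0]
```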

Counting Flops & Memops

Consider the following scaled vector addition:

$$y := \alpha x + y = 2 \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix} + \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}$$

Now, imagine that all the data for this scaled vector addition starts out stored in the main memory of a computer. To compute, data must be brought into registers, and only a few registers are available.

We know that the scaling factor 2 in the above example is going to be used many times. So, we store it in a register and keep it there during the entire computation. This costs us 1 memory operation (memop).

Further, for each pair of one element from $x$ and one element from $y$, we need to read those elements and write the result back to $y$. That means 2 memops for reading and 1 more for writing.

Finally, for each such pair we perform a multiplication and then an addition, which is 2 floating point operations (flops).

So, if the size of the vectors is $n$, the scaled vector addition costs $3n + 1$ memops and $2n$ flops.

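As a sanity check, a tiny sketch that tallies these counts for the example above; the counting rules are the ones just stated, and the helper name axpy_cost is ours:

```python
def axpy_cost(n):
    """Memops and flops for y := alpha*x + y on vectors of size n."""
    memops = 1 + 3 * n    # 1 to load alpha, then read x[i], read y[i], write y[i]
    flops = 2 * n         # one multiply and one add per element
    return memops, flops

print(axpy_cost(3))       # (10, 6) for the size-3 example above
```
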
Vector Function

A vector (valued) function is a mathematical function of one or more scalars and/or vectors whose output is a vector.

For example,

$$f(\alpha, \beta) = \begin{pmatrix} \alpha + \beta \\ \alpha - \beta \end{pmatrix}$$

Another example can be,

$$f\left(\alpha, \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix}\right) = \begin{pmatrix} \chi_0 + \alpha \\ \chi_1 + \alpha \\ \chi_2 + \alpha \end{pmatrix}$$

The AXPY and DOT operations are also examples of vector functions.
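
A minimal sketch of the two example functions above, NumPy assumed (the names f_scalars and f_shift are illustrative):

```python
import numpy as np

def f_scalars(alpha, beta):
    # Maps two scalars to a vector in R^2.
    return np.array([alpha + beta, alpha - beta])

def f_shift(alpha, x):
    # Maps a scalar and a vector to a vector: adds alpha to every element.
    return x + alpha

print(f_scalars(3.0, 1.0))                      # [4. 2.]
print(f_shift(1.0, np.array([1.0, 2.0, 3.0])))  # [2. 3. 4.]
```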

Vector Functions That Map a Vector to a Vector

We will now focus on vector functions as functions that map a vector to another vector:

$$f: \mathbb{R}^n \rightarrow \mathbb{R}^m$$

For example,

$$f\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \alpha + \beta \\ \alpha - \beta \end{pmatrix}$$

Another example can be,

$$f\begin{pmatrix} \alpha \\ \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \end{pmatrix} = g\begin{pmatrix} \alpha \\ \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} = \begin{pmatrix} \chi_0 + \alpha \\ \chi_1 + \alpha \\ \chi_2 + \alpha \end{pmatrix}$$

Do note the following:

If $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$, $\lambda \in \mathbb{R}$, and $x \in \mathbb{R}^n$, then $f(\lambda x) = \lambda f(x)$ holds only sometimes.

If $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$ and $x, y \in \mathbb{R}^n$, then $f(x + y) = f(x) + f(y)$ holds only sometimes.

(Functions for which both properties always hold are called linear.)

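A quick sketch contrasting a function for which these properties hold with one for which they fail (NumPy assumed; the function names are illustrative):

```python
import numpy as np

def linear(x):
    return 2.0 * x                 # f(lam * x) == lam * f(x) always holds

def nonlinear(x):
    return x + 1.0                 # the constant shift breaks both properties

x, lam = np.array([1.0, 2.0]), 3.0
print(np.allclose(linear(lam * x), lam * linear(x)))        # True
print(np.allclose(nonlinear(lam * x), lam * nonlinear(x)))  # False
```
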
Matrix Product

For given matrices $A$ and $B$ (where the number of columns of $A$ equals the number of rows of $B$), their product is given by:

$$C = A \times B, \qquad C_{ij} = \sum_k A_{ik} B_{kj}$$

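A minimal sketch of this triple-loop definition, checked against NumPy's built-in product (the helper name matmul is ours):

```python
import numpy as np

def matmul(A, B):
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "columns of A must match rows of B"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]   # C_ij = sum_k A_ik * B_kj
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
print(np.allclose(matmul(A, B), A @ B))        # True
```
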
Hadamard Product

The Hadamard product is element-wise multiplication. The following equation shows how this product is calculated for given matrices $A$ and $B$ of the same shape:

$$C = A \odot B, \qquad C_{ij} = A_{ij} B_{ij}$$

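A minimal sketch, assuming NumPy, where * on arrays is already element-wise:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Hadamard product: multiply matching entries; shapes must agree.
C = A * B                              # [[ 5., 12.], [21., 32.]]
```
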
Dot (Inner) Product

The dot (inner) product between two vectors $a$ and $b$ is given by:

$$\vec{a} \cdot \vec{b} = \alpha = a^T b = b^T a, \qquad \alpha := \sum_i a_i b_i$$

Alternatively,

$$\alpha := (((0 + a_0 b_0) + a_1 b_1) + \ldots) + a_{n-1} b_{n-1}$$

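A minimal sketch in plain Python that follows the left-to-right accumulation order written above (the function name dot is ours):

```python
def dot(a, b):
    alpha = 0.0
    for ai, bi in zip(a, b):
        alpha += ai * bi              # accumulate a_i * b_i left to right
    return alpha

print(dot([1.0, -2.0, 3.0], [-1.0, 2.0, 1.0]))   # -2.0
```
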
Cost

The cost of a dot product with vectors of size $n$ is $2n$ memops and $2n$ flops.

Transpose

The transpose of a matrix $A$ is given by:

$$B = A^T, \qquad B_{ji} = A_{ij}$$

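A minimal sketch, NumPy assumed:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Transpose swaps row and column indices: B[j, i] == A[i, j].
B = A.T                                # shape (3, 2)
```
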
Inverse

The following equation provides the definition of the inverse of a matrix:

$$I = A^{-1} A = A A^{-1}$$

Not all square matrices have an inverse.

For such cases, and for non-square matrices, a "pseudoinverse" is defined.
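
A minimal sketch using NumPy's np.linalg.inv and np.linalg.pinv (the latter computes the Moore-Penrose pseudoinverse):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])               # invertible square matrix
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))              # True

M = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])    # non-square matrix
M_pinv = np.linalg.pinv(M)                            # pseudoinverse
print(M_pinv.shape)                                   # (2, 3)
```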
