#import "@preview/unequivocal-ams:0.1.1": ams-article, theorem, proof #show: ams-article.with( title: [A Digression on Abstract Linear Algebra], authors: ( ( name: "Youwen Wu", organization: [University of California, Santa Barbara], email: "youwen@ucsb.edu", url: "https://youwen.dev", ), ), bibliography: bibliography("refs.bib"), ) = Introduction Many introductory linear algebra classes focus on _application_. In general, this is a red herring and is engineer-speak for "we will teach you how to crunch numbers with no regard for conceptual understanding." If you are a math major (or math-adjacent, such as Computer Science), this class is essentially useless for you. You will learn how to perform trivial numerical operations such as the _matrix multiplication_, _matrix-vector multiplication_, _row reduction_, and other trite tasks better suited for computers. If you are taking this course, you might as well learn linear algebra properly. Otherwise, you will have to re-learn it later on, anyways. Completing a math course without gaining a theoretical appreciation for the topics at hand is an unequivocal waste of time. I have prepared this brief crash course designed to fill in the theoretical gaps left by this class. = Basic Notions == Vector spaces Before we can understand vectors, we need to first discuss _vector spaces_. Thus far, you have likely encountered vectors primarily in physics classes, generally in the two-dimensional plane. You may conceptualize them as arrows in space. For vectors of size $>3$, a hand waving argument is made that they are essentially just arrows in higher dimensional spaces. It is helpful to take a step back from this primitive geometric understanding of the vector. Let us build up a rigorous idea of vectors from first principles. === Vector axioms The so-called _axioms_ of a _vector space_ (which we'll call the vector space $V$) are as follows: #enum[ Commutativity: $u + v = v + u, " " forall u,v in V$ ][ Associativity: $(u + v) + w = u + (v + w), " " forall u,v,w in V$ ][ Zero vector: $exists$ a special vector, denoted $0$, such that $v + 0 = v, " " forall v in V$ ][ Additive inverse: $forall v in V, " " exists w in V "such that" v + w = 0$. Such an additive inverse is generally denoted $-v$ ][ Multiplicative identity: $1 v = v, " " forall v in V$ ][ Multiplicative associativity: $(alpha beta) v = alpha (beta v) " " forall v in V, "scalars" alpha, beta$ ][ Distributive property for vectors: $alpha (u + v) = alpha u + alpha v " " forall u,v in V, "scalars" alpha$ ][ Distributive property for scalars: $(alpha + beta) v = alpha v + beta v " " forall v in V, " scalars" alpha, beta$ ] It is easy to show that the zero vector $0$ and the additive inverse $-v$ are _unique_. We leave the proof of this fact as an exercise. These may seem difficult to memorize, but they are essentially the same familiar algebraic properties of numbers you know from high school. The important thing to remember is which operations are valid for what objects. For example, you cannot add a vector and scalar, as it does not make sense. _Remark_. For those of you versed in computer science, you may recognize this as essentially saying that you must ensure your operations are _type-safe_. Adding a vector and scalar is not just false, it is an _invalid question_ entirely because vectors and scalars and different types of mathematical objects. See #cite(, form: "prose") for more. 
=== Vectors big and small

In order to begin your descent into what mathematicians colloquially recognize as _abstract vapid nonsense_, let's discuss the fields over which vector spaces can be defined.

We have the familiar space where all scalars are real numbers, $RR$. We generally discuss 2-D or 3-D vectors, corresponding to vectors of length 2 or 3; in our case, $RR^2$ and $RR^3$. However, vectors over $RR$ can really be of any length. Discard your primitive conception of vectors as arrows in space. Vectors are simply arbitrary-length lists of numbers (for the computer science folk: think C++ `std::vector`).

_Example_.
$ vec(1,2,3,4,5,6,7,8,9) $

Moreover, vectors need not be over $RR$ at all. Recall that a vector space need only satisfy the aforementioned _axioms of a vector space_.

_Example_. The vector space $CC$ is similar to $RR$, except it includes complex numbers. Every complex vector space is also a real vector space (as you can simply restrict the scalars to the real numbers), but not the other way around.

In general, we can have a vector space where the scalars are in an arbitrary field $FF$, as long as the axioms are satisfied.

_Example_. The vector space of all polynomials of degree at most 3, or $PP^3$. It is not yet clear what such a vector may look like. We shall return to this example once we discuss _basis_.

== Vector addition and multiplication

Vector addition, represented by $+$, and entrywise multiplication, represented by the $dot$ (dot) operator, are both performed entrywise.

_Example._
$ vec(1,2,3) + vec(4,5,6) = vec(1 + 4, 2 + 5, 3 + 6) = vec(5,7,9) $
$ vec(1,2,3) dot vec(4,5,6) = vec(1 dot 4, 2 dot 5, 3 dot 6) = vec(4,10,18) $

(Note that this entrywise product is not the _dot product_ you may have seen in physics, which would instead sum these products into the single scalar $4 + 10 + 18 = 32$.)

This is simple enough to understand. Again, the difficulty is simply ensuring that you always perform operations with the correct _types_. For example, once we introduce matrices, it doesn't make sense to multiply or add vectors and matrices in this fashion.

== Vector-scalar multiplication

Multiplying a vector by a scalar simply results in each entry of the vector being multiplied by the scalar.

_Example_.
$ beta vec(a, b, c) = vec(beta dot a, beta dot b, beta dot c) $

== Matrices

Before discussing any properties of matrices, let's simply reiterate what we learned in class about their notation. We say a matrix with $m$ rows and $n$ columns (so each row has length $n$, and each column has height $m$) is an $m times n$ matrix. Given a matrix

$ A = mat(1,2,3;4,5,6;7,8,9) $

we refer to the entry in row $j$ and column $k$ as $A_(j,k)$. For example, $A_(2,3) = 6$ above.

=== Matrix transpose

A formalism that is useful later on is the _transpose_, which we obtain from a matrix $A$ by swapping its rows and columns. More precisely, each row of $A$ becomes a column of $A^T$. We use the notation $A^T$ to represent the transpose of $A$.

$ mat(1,2,3;4,5,6)^T = mat(1,4;2,5;3,6) $

Formally, we can say $(A^T)_(j,k) = A_(k,j)$.
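As a quick check of this identity on a concrete matrix, write $B$ for the $2 times 3$ matrix above (the label $B$ is introduced just for this example):

_Example_. With $B = mat(1,2,3;4,5,6)$, the formula gives $(B^T)_(3,1) = B_(1,3) = 3$, and indeed the entry in row 3, column 1 of the transpose above is $3$.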