Learn Linear Algebra

Theorem

Let $A$, $B$, and $C$ be matrices of the same size, and let $r$ and $s$ be scalars.

  1. $A + B = B + A$
  2. $(A + B) + C = A + (B + C)$
  3. $A + 0 = A$
  4. $r(A + B) = rA + rB$
  5. $(r + s)A = rA + sA$
  6. $r(sA) = (rs)A$
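Before proving these identities column by column, it can help to see them hold numerically. Below is a quick sanity check with NumPy (a sketch; the matrices are arbitrary small integer examples, so all arithmetic is exact):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (3, 4)).astype(float)
B = rng.integers(-5, 5, (3, 4)).astype(float)
C = rng.integers(-5, 5, (3, 4)).astype(float)
r, s = 2.0, -3.0
Z = np.zeros((3, 4))  # the zero matrix of the same size

assert np.array_equal(A + B, B + A)                # property 1
assert np.array_equal((A + B) + C, A + (B + C))    # property 2
assert np.array_equal(A + Z, A)                    # property 3
assert np.array_equal(r * (A + B), r * A + r * B)  # property 4
assert np.array_equal((r + s) * A, r * A + s * A)  # property 5
assert np.array_equal(r * (s * A), (r * s) * A)    # property 6
print("all six properties hold for this example")
```

A passing run is not a proof, of course; the column-wise arguments below establish the identities in general.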

Proof (a):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$ and $B = \begin{bmatrix} \vec{b}_1 & \vec{b}_2 & \dots & \vec{b}_n \end{bmatrix}$.

$$A + B = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix} + \begin{bmatrix} \vec{b}_1 & \vec{b}_2 & \dots & \vec{b}_n \end{bmatrix}$$

$$= \begin{bmatrix} \vec{a}_1 + \vec{b}_1 & \vec{a}_2 + \vec{b}_2 & \dots & \vec{a}_n + \vec{b}_n \end{bmatrix}$$

$$= \begin{bmatrix} \vec{b}_1 + \vec{a}_1 & \vec{b}_2 + \vec{a}_2 & \dots & \vec{b}_n + \vec{a}_n \end{bmatrix} \quad \text{(vector addition is commutative)}$$

$$= B + A.$$

Proof (b):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$, $B = \begin{bmatrix} \vec{b}_1 & \vec{b}_2 & \dots & \vec{b}_n \end{bmatrix}$, and $C = \begin{bmatrix} \vec{c}_1 & \vec{c}_2 & \dots & \vec{c}_n \end{bmatrix}$.

$$(A + B) + C = \begin{bmatrix} \vec{a}_1 + \vec{b}_1 & \vec{a}_2 + \vec{b}_2 & \dots & \vec{a}_n + \vec{b}_n \end{bmatrix} + \begin{bmatrix} \vec{c}_1 & \vec{c}_2 & \dots & \vec{c}_n \end{bmatrix}$$

$$= \begin{bmatrix} (\vec{a}_1 + \vec{b}_1) + \vec{c}_1 & (\vec{a}_2 + \vec{b}_2) + \vec{c}_2 & \dots & (\vec{a}_n + \vec{b}_n) + \vec{c}_n \end{bmatrix}$$

$$= \begin{bmatrix} \vec{a}_1 + (\vec{b}_1 + \vec{c}_1) & \vec{a}_2 + (\vec{b}_2 + \vec{c}_2) & \dots & \vec{a}_n + (\vec{b}_n + \vec{c}_n) \end{bmatrix} \quad \text{(vector addition is associative)}$$

$$= \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix} + \left( \begin{bmatrix} \vec{b}_1 & \vec{b}_2 & \dots & \vec{b}_n \end{bmatrix} + \begin{bmatrix} \vec{c}_1 & \vec{c}_2 & \dots & \vec{c}_n \end{bmatrix} \right)$$

$$= A + (B + C).$$

Proof (c):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$.

$$A + 0 = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix} + \begin{bmatrix} \vec{0} & \vec{0} & \dots & \vec{0} \end{bmatrix}$$

$$= \begin{bmatrix} \vec{a}_1 + \vec{0} & \vec{a}_2 + \vec{0} & \dots & \vec{a}_n + \vec{0} \end{bmatrix}$$

$$= \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$$

$$= A.$$

Proof (d):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$ and $B = \begin{bmatrix} \vec{b}_1 & \vec{b}_2 & \dots & \vec{b}_n \end{bmatrix}$.

$$r(A + B) = r\left( \begin{bmatrix} \vec{a}_1 + \vec{b}_1 & \vec{a}_2 + \vec{b}_2 & \dots & \vec{a}_n + \vec{b}_n \end{bmatrix} \right)$$

$$= \begin{bmatrix} r(\vec{a}_1 + \vec{b}_1) & r(\vec{a}_2 + \vec{b}_2) & \dots & r(\vec{a}_n + \vec{b}_n) \end{bmatrix}$$

$$= \begin{bmatrix} r\vec{a}_1 + r\vec{b}_1 & r\vec{a}_2 + r\vec{b}_2 & \dots & r\vec{a}_n + r\vec{b}_n \end{bmatrix}$$

$$= rA + rB.$$

Proof (e):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$.

$$(r + s)A = (r + s)\begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$$

$$= \begin{bmatrix} (r + s)\vec{a}_1 & (r + s)\vec{a}_2 & \dots & (r + s)\vec{a}_n \end{bmatrix}$$

$$= \begin{bmatrix} r\vec{a}_1 + s\vec{a}_1 & r\vec{a}_2 + s\vec{a}_2 & \dots & r\vec{a}_n + s\vec{a}_n \end{bmatrix}$$

$$= \begin{bmatrix} r\vec{a}_1 & r\vec{a}_2 & \dots & r\vec{a}_n \end{bmatrix} + \begin{bmatrix} s\vec{a}_1 & s\vec{a}_2 & \dots & s\vec{a}_n \end{bmatrix}$$

$$= rA + sA.$$

Proof (f):

Let $A = \begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix}$.

$$r(sA) = r\left(s\begin{bmatrix} \vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_n \end{bmatrix} \right)$$

$$= r\begin{bmatrix} s\vec{a}_1 & s\vec{a}_2 & \dots & s\vec{a}_n \end{bmatrix}$$

$$= \begin{bmatrix} r(s\vec{a}_1) & r(s\vec{a}_2) & \dots & r(s\vec{a}_n) \end{bmatrix}$$

$$= \begin{bmatrix} (rs)\vec{a}_1 & (rs)\vec{a}_2 & \dots & (rs)\vec{a}_n \end{bmatrix}$$

$$= (rs)A.$$

Theorem

Let $A$ and $B$ denote matrices whose sizes are appropriate for the following sums and products.

  1. $(A^T)^T = A$
  2. $(A + B)^T = A^T + B^T$
  3. For any scalar $r$, $(rA)^T = rA^T$
  4. $(AB)^T = B^T A^T$
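Again, a quick numeric sanity check with NumPy before the entrywise proofs (a sketch; the matrices are arbitrary small integer examples):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-5, 5, (3, 4)).astype(float)
B = rng.integers(-5, 5, (3, 4)).astype(float)   # same size as A
M = rng.integers(-5, 5, (4, 2)).astype(float)   # so the product A @ M is defined
r = 3.0

assert np.array_equal(A.T.T, A)              # (a): transposing twice gives A back
assert np.array_equal((A + B).T, A.T + B.T)  # (b): transpose distributes over sums
assert np.array_equal((r * A).T, r * A.T)    # (c): scalars pass through the transpose
assert np.array_equal((A @ M).T, M.T @ A.T)  # (d): note the reversed order of factors
print("all four transpose properties hold for this example")
```

Property (d) is the one that trips people up: the factors must swap, since $(AB)^T$ has the shape of $B^T A^T$, not $A^T B^T$.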

Proof (a):

Let $A$ be an $m \times n$ matrix. By the definition of the transpose, the $(i, j)$-th entry of $A^T$ is $(A^T)_{ij} = A_{ji}$.

Taking the transpose again, the $(i, j)$-th entry of $(A^T)^T$ is $((A^T)^T)_{ij} = (A^T)_{ji} = A_{ij}$.

Since every entry agrees, $(A^T)^T = A$.

Proof (b):

Let $A$ and $B$ be $m \times n$ matrices. The $(i, j)$-th entry of $A + B$ is $(A + B)_{ij} = A_{ij} + B_{ij}$.

Taking the transpose, the $(i, j)$-th entry of $(A + B)^T$ is $((A + B)^T)_{ij} = (A + B)_{ji} = A_{ji} + B_{ji}$.

The $(i, j)$-th entry of $A^T + B^T$ is also $(A^T + B^T)_{ij} = A_{ji} + B_{ji}$.

So $(A + B)^T = A^T + B^T$.

Proof (c):

Let $A$ be an $m \times n$ matrix and $r$ a scalar. The $(i, j)$-th entry of $rA$ is $(rA)_{ij} = r A_{ij}$.

Taking the transpose, the $(i, j)$-th entry of $(rA)^T$ is $((rA)^T)_{ij} = (rA)_{ji} = r A_{ji}$.

The $(i, j)$-th entry of $rA^T$ is $(rA^T)_{ij} = r(A^T)_{ij} = r A_{ji}$.

Then $(rA)^T = rA^T$.

Proof (d):

Let $A$ be an $m \times n$ matrix and $B$ an $n \times p$ matrix. The $(i, j)$-th entry of $AB$ is $(AB)_{ij} = \sum_{k=1}^n A_{ik}B_{kj}$.

Taking the transpose, the $(i, j)$-th entry of $(AB)^T$ is $((AB)^T)_{ij} = (AB)_{ji} = \sum_{k=1}^n A_{jk}B_{ki}$.

The $(i, j)$-th entry of $B^T A^T$ is $(B^T A^T)_{ij} = \sum_{k=1}^n (B^T)_{ik}(A^T)_{kj}$.

By the definition of the transpose, $(B^T)_{ik} = B_{ki}$ and $(A^T)_{kj} = A_{jk}$.

Then $(B^T A^T)_{ij} = \sum_{k=1}^n B_{ki}A_{jk} = \sum_{k=1}^n A_{jk}B_{ki} = ((AB)^T)_{ij}$.

Since the entries agree, $(AB)^T = B^T A^T$.
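The summation identity at the heart of this proof can be checked entry by entry (a sketch; explicit loops mirror the sums in the derivation, and the matrices are arbitrary small integer examples):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.integers(-4, 4, (2, 3)).astype(float)  # m x n
B = rng.integers(-4, 4, (3, 4)).astype(float)  # n x p
m, n = A.shape
p = B.shape[1]

# ((AB)^T)_{ij} = (AB)_{ji} = sum_k A_{jk} B_{ki}, which is (B^T A^T)_{ij}
for i in range(p):
    for j in range(m):
        lhs = (A @ B).T[i, j]
        rhs = sum(B[k, i] * A[j, k] for k in range(n))
        assert lhs == rhs

assert np.array_equal((A @ B).T, B.T @ A.T)
print("(AB)^T = B^T A^T verified entrywise for this example")
```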

Theorem

Let $A$ be an $m \times n$ matrix, and let $B$ and $C$ have sizes for which the indicated sums and products are defined.

  1. $A(BC) = (AB)C$ (associative law of multiplication)
  2. $A(B + C) = AB + AC$ (left distributive law)
  3. $(B + C)A = BA + CA$ (right distributive law)
  4. $r(AB) = (rA)B = A(rB)$ for any scalar $r$
  5. $I_m A = A = A I_n$ (identity for matrix multiplication)
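As with the earlier theorems, a numeric sanity check is easy to run before working through the proofs (a sketch; the shapes below are arbitrary but chosen so every product is defined):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.integers(-5, 5, (2, 3)).astype(float)  # m x n
B = rng.integers(-5, 5, (3, 4)).astype(float)  # n x p
C = rng.integers(-5, 5, (4, 2)).astype(float)  # p x q
D = rng.integers(-5, 5, (3, 4)).astype(float)  # same size as B
r = 2.0

assert np.array_equal(A @ (B @ C), (A @ B) @ C)       # (a) associativity
assert np.array_equal(A @ (B + D), A @ B + A @ D)     # (b) left distributivity
assert np.array_equal((B + D) @ C, B @ C + D @ C)     # (c) right distributivity
assert np.array_equal(r * (A @ B), (r * A) @ B)       # (d) scalar moves to A...
assert np.array_equal(r * (A @ B), A @ (r * B))       #     ...or to B
assert np.array_equal(np.eye(2) @ A, A)               # (e) I_m A = A
assert np.array_equal(A @ np.eye(3), A)               #     A I_n = A
print("all five multiplication properties hold for this example")
```

Note that commutativity is deliberately absent from this list: in general $AB \neq BA$, even when both products are defined.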

Proof (a):

Let $A$ be an $m \times n$ matrix, $B$ an $n \times p$ matrix, and $C$ a $p \times q$ matrix.

The $(k, j)$-th entry of the product $BC$ is
$$(BC)_{kj} = \sum_{l=1}^p b_{kl} c_{lj}.$$
Now multiply $A$ with $BC$. The $(i, j)$-th entry of $A(BC)$ is
$$(A(BC))_{ij} = \sum_{k=1}^n a_{ik} (BC)_{kj} = \sum_{k=1}^n a_{ik} \left( \sum_{l=1}^p b_{kl} c_{lj} \right) = \sum_{k=1}^n \sum_{l=1}^p a_{ik} b_{kl} c_{lj}.$$
Next, compute $AB$ first. The $(i, l)$-th entry of $AB$ is
$$(AB)_{il} = \sum_{k=1}^n a_{ik} b_{kl}.$$
Now multiply $AB$ with $C$. The $(i, j)$-th entry of $(AB)C$ is
$$((AB)C)_{ij} = \sum_{l=1}^p (AB)_{il} c_{lj} = \sum_{l=1}^p \left( \sum_{k=1}^n a_{ik} b_{kl} \right) c_{lj} = \sum_{k=1}^n \sum_{l=1}^p a_{ik} b_{kl} c_{lj}.$$
Since the entries of $A(BC)$ and $(AB)C$ are identical, $A(BC) = (AB)C$.
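The double-sum rearrangement above can be mirrored with explicit loops (a sketch; the function names are illustrative, and the integer-valued entries make the comparison exact):

```python
import numpy as np

def entry_A_BC(A, B, C, i, j):
    # (A(BC))_{ij} = sum_k a_{ik} * (sum_l b_{kl} * c_{lj})
    n, p = B.shape
    return sum(A[i, k] * sum(B[k, l] * C[l, j] for l in range(p))
               for k in range(n))

def entry_AB_C(A, B, C, i, j):
    # ((AB)C)_{ij} = sum_l (sum_k a_{ik} * b_{kl}) * c_{lj}
    n, p = B.shape
    return sum(sum(A[i, k] * B[k, l] for k in range(n)) * C[l, j]
               for l in range(p))

rng = np.random.default_rng(3)
A = rng.integers(-3, 3, (2, 3)).astype(float)
B = rng.integers(-3, 3, (3, 4)).astype(float)
C = rng.integers(-3, 3, (4, 2)).astype(float)

for i in range(2):
    for j in range(2):
        assert entry_A_BC(A, B, C, i, j) == entry_AB_C(A, B, C, i, j)
print("entrywise agreement illustrates A(BC) = (AB)C for this example")
```

The two functions compute the same triple sum $\sum_k \sum_l a_{ik} b_{kl} c_{lj}$, just grouped differently, which is exactly what the proof exploits.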

Proof (b):

Let $A$ be an $m \times n$ matrix and let $B$ and $C$ be $n \times p$ matrices, so that $A(B + C)$ is defined.

By the definition of matrix addition, $(B + C)_{kj} = b_{kj} + c_{kj}$. The $(i, j)$-th entry of $A(B + C)$ is
$$(A(B + C))_{ij} = \sum_{k=1}^n a_{ik}(b_{kj} + c_{kj}).$$
Distributing:
$$(A(B + C))_{ij} = \sum_{k=1}^n a_{ik}b_{kj} + \sum_{k=1}^n a_{ik}c_{kj} = (AB)_{ij} + (AC)_{ij}.$$
Therefore $A(B + C) = AB + AC$.

Proof (c):

Let A A , B B , and C C be matrices such that (B+C)A (B + C)A is defined.

B+C=[bij+cij]. B + C = \begin{bmatrix} b_{ij} + c_{ij} \end{bmatrix}. (B+C)A=[bij+cij]A=[j=1n(bij+cij)ajk]. (B + C)A = \begin{bmatrix} b_{ij} + c_{ij} \end{bmatrix}A = \begin{bmatrix} \sum_{j=1}^n (b_{ij} + c_{ij})a_{jk} \end{bmatrix}. Distributing: (B+C)A=[j=1nbijajk]+[j=1ncijajk]=BA+CA. (B + C)A = \begin{bmatrix} \sum_{j=1}^n b_{ij}a_{jk} \end{bmatrix} + \begin{bmatrix} \sum_{j=1}^n c_{ij}a_{jk} \end{bmatrix} \\= BA + CA.

Proof (d):

Let A A and B B be matrices such that AB AB is defined, and let r r be a scalar.

By definition of scalar multiplication: r(AB)=r[j=1naijbjk]=[rj=1naijbjk]. r(AB) = r\begin{bmatrix} \sum_{j=1}^n a_{ij}b_{jk} \end{bmatrix} = \begin{bmatrix} r\sum_{j=1}^n a_{ij}b_{jk} \end{bmatrix}. Distributing r r : r(AB)=[j=1n(raij)bjk]=(rA)B, r(AB) = \begin{bmatrix} \sum_{j=1}^n (ra_{ij})b_{jk} \end{bmatrix} = (rA)B, and similarly: r(AB)=[j=1naij(rbjk)]=A(rB). r(AB) = \begin{bmatrix} \sum_{j=1}^n a_{ij}(rb_{jk}) \end{bmatrix} = A(rB). Thus, r(AB)=(rA)B=A(rB) r(AB) = (rA)B = A(rB) .

Proof (e):

Let $A$ be an $m \times n$ matrix, and let $I_m$ and $I_n$ be the identity matrices of size $m \times m$ and $n \times n$, respectively. Write $A = \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$, and note that the $j$-th column of $A$ is $A\vec{e}_j = \vec{a}_j$, where $\vec{e}_j$ is the $j$-th standard basis vector of $\mathbb{R}^n$.

For $I_m A$: the $j$-th column of $I_m A$ is $I_m \vec{a}_j = \vec{a}_j$, since $I_m \vec{x} = \vec{x}$ for every $\vec{x} \in \mathbb{R}^m$. Hence
$$I_m A = \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix} = A.$$
For $A I_n$: since $I_n = \begin{bmatrix} \vec{e}_1 & \cdots & \vec{e}_n \end{bmatrix}$, the $j$-th column of $A I_n$ is $A\vec{e}_j = \vec{a}_j$. Hence
$$A I_n = \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix} = A.$$
Thus $I_m A = A = A I_n$.
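The column view used in this proof, where $A\vec{e}_j$ picks out the $j$-th column of $A$, can be checked directly (a NumPy sketch; the matrix is an arbitrary small integer example):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 4
A = rng.integers(-5, 5, (m, n)).astype(float)
I_m, I_n = np.eye(m), np.eye(n)

# The j-th column of I_n is the standard basis vector e_j,
# and A @ e_j is exactly the j-th column of A.
for j in range(n):
    e_j = I_n[:, j]
    assert np.array_equal(A @ e_j, A[:, j])

assert np.array_equal(I_m @ A, A)  # I_m A = A
assert np.array_equal(A @ I_n, A)  # A I_n = A
print("I_m A = A = A I_n for this example")
```

Note the two different identity sizes: $I_m$ multiplies on the left, $I_n$ on the right, matching the $m \times n$ shape of $A$.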