Rotation matrix to angle-axis

This time I'm going to go through the process of extracting the rotation axis and angle from a rotation matrix R \in \mathbb{R}^{3\times 3}. Let's recap some properties of the rotation matrix that we will be using.

  • It is a square matrix.
  • It is normalized: its column and row vectors have unit length.
  • It is orthogonal: its column vectors are mutually orthogonal, and the same holds for its rows. In other words, the dot product of any two distinct rows (or any two distinct columns) is 0.
  • For an orthogonal matrix, RR^T=I \Rightarrow R^T = R^{-1}
  • det(R)=1 for a proper rotation; an orthogonal matrix with det(R)=-1 includes a reflection.
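
These properties are easy to check numerically before trying to extract anything. Below is a minimal sketch in plain Python for 3x3 matrices stored as nested lists. The helper names (`is_rotation`, `det3`, and so on) and the tolerance are my own choices for illustration, not part of any library.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation(m, tol=1e-9):
    """True if m is orthogonal (m m^T = I) and det(m) = +1."""
    mmt = matmul(m, transpose(m))
    orthogonal = all(
        abs(mmt[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(3) for j in range(3)
    )
    return orthogonal and abs(det3(m) - 1.0) <= tol

R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]    # 120 degrees about (1, 1, 1)
M = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # mirror across the xy-plane
print(is_rotation(R), is_rotation(M))    # True False
```

The mirror matrix M is orthogonal but has determinant -1, so it is rejected.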

To find the axis and angle of rotation we will call on eigenvectors and eigenvalues for help. The eigenvalues of a rotation matrix must satisfy one of the following:

  1. All eigenvalues are 1. This is the identity, \phi=0
  2. One eigenvalue is 1 and the other two are -1. This corresponds to rotation angle \phi=\pi
  3. One eigenvalue is 1 and the other two are complex conjugates of the form e^{i\phi} and e^{-i\phi}

In every case the rotation axis is the eigenvector corresponding to the eigenvalue 1.

Angle of rotation from eigenvalues

The trace (the sum of the diagonal elements) of a matrix does not change under a change of basis, so it is independent of the coordinate system used. For a rotation matrix this means the trace is independent of the axis of rotation; it depends only on the rotation angle. We also know that the sum of the eigenvalues equals the trace of the matrix.

    \[tr(R) = 1+e^{i\phi}+e^{-i\phi} \]

    \[tr(R) = 1+\cos\phi+i\sin\phi+\cos\phi-i\sin\phi = 1+2\cos\phi\]

    \[tr(R) = 1+2\cos\phi \quad \Rightarrow \quad \cos\phi = \frac{tr(R)-1}{2}\]

so the rotation angle is

    \[\phi = \cos^{-1}\left(\frac{tr(R)-1}{2}\right)\]
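
In code this is a one-liner plus a safety clamp. The sketch below assumes R is a 3x3 nested list that is already a valid rotation matrix. The function name `rotation_angle` is mine, and the clamp is an addition to guard against round-off pushing the cosine slightly outside [-1, 1].

```python
import math

def rotation_angle(R):
    """Angle phi in [0, pi] from cos(phi) = (tr(R) - 1) / 2."""
    cos_phi = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    cos_phi = max(-1.0, min(1.0, cos_phi))  # guard against round-off
    return math.acos(cos_phi)

# Cyclic permutation of the axes: trace 0, so cos(phi) = -1/2.
R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(math.degrees(rotation_angle(R)))  # approximately 120
```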

Angle of rotation from Rodrigues' formula

We can also derive this equation from Rodrigues' formula. By doing so we will also check that the result agrees with the eigenvalues 1, \quad e^{i\phi}, \quad e^{-i\phi}.

    \[R = I + [n]_\times \sin \phi +[n]_\times^2\left(1-\cos \phi \right)\]

As we said before, the trace is independent of the axis of rotation; it depends only on the rotation angle. Let's take the trace of both sides of this equation.

    \[tr(R) = tr(I + [n]_\times \sin \phi +[n]_\times^2\left(1-\cos \phi \right))\]

The trace of the identity is tr(I) = 3, and the trace of any skew-symmetric matrix is zero, so tr([n]_\times)=0. We also substitute [n]_\times^2=nn^T-I, which holds for a unit vector n. The reason for the substitution is that tr(nn^T)=n\cdot n=1, which helps us during the simplification.

    \[tr(R) = tr(I + [n]_\times \sin \phi + (nn^T-I)\left(1-\cos \phi \right))\]

expand the (nn^T-I)\left(1-\cos \phi \right)

    \[tr(R) = tr(I + [n]_\times \sin \phi + nn^T-I-nn^T\cos \phi + I\cos \phi )\]

and simplify...

    \[tr(R) = 3 + 1 - 3 - \cos \phi + 3\cos \phi = 1  + 2\cos \phi\]

Earlier we saw that the sum of the eigenvalues is

    \[tr(R) = 1+e^{i\phi}+e^{-i\phi} \]

From Rodrigues' formula we showed that

    \[tr(R)=1  + 2\cos \phi\]

And again we get the same equation for the rotation angle

    \[\phi = \cos^{-1}\left(\frac{tr(R)-1}{2}\right)\]

The two results agree, so the sum of the eigenvalues equals 1  + 2\cos \phi

    \[1  + 2\cos \phi = 1+e^{i\phi}+e^{-i\phi} \]
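
The trace identity can be checked numerically. The sketch below builds R from Rodrigues' formula for an arbitrarily chosen unit axis and angle (both values are just test inputs I picked) and compares the trace with 1 + 2\cos\phi and with the sum of the three eigenvalues. The helper names `skew`, `matmul` and `rodrigues` are hypothetical.

```python
import cmath
import math

def skew(n):
    """Cross-product matrix [n]_x of a 3-vector n."""
    x, y, z = n
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(n, phi):
    """R = I + sin(phi) [n]_x + (1 - cos(phi)) [n]_x^2, for a unit axis n."""
    k = skew(n)
    k2 = matmul(k, k)
    s, c = math.sin(phi), math.cos(phi)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + (1.0 - c) * k2[i][j]
             for j in range(3)] for i in range(3)]

n = (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))  # unit axis
phi = 0.7                                                   # test angle
R = rodrigues(n, phi)

trace = R[0][0] + R[1][1] + R[2][2]
eig_sum = 1 + cmath.exp(1j * phi) + cmath.exp(-1j * phi)

print(trace, 1 + 2 * math.cos(phi), eig_sum.real)  # three matching values
```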

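There is one issue here. If you invert the rotation axis and negate the rotation angle you get the same rotation. This means two angle/axis pairs, (n, \phi) and (-n, -\phi), correspond to the same rotation. Since \cos^{-1} only returns angles in [0, \pi], the sign of the axis has to be chosen to match.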
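
To see why, substitute the inverted axis and negated angle into Rodrigues' formula. Writing R(n,\phi) as shorthand for the matrix built from axis n and angle \phi (my notation, not used elsewhere in this post), and using [-n]_\times = -[n]_\times together with \sin(-\phi)=-\sin\phi and \cos(-\phi)=\cos\phi:

    \[R(-n,-\phi) = I + [-n]_\times \sin(-\phi) + [-n]_\times^2\left(1-\cos(-\phi)\right)\]

    \[R(-n,-\phi) = I + (-[n]_\times)(-\sin\phi) + [n]_\times^2\left(1-\cos\phi\right) = R(n,\phi)\]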

Rotation axis

There are several ways we can find the rotation axis. Perhaps the easiest to understand is the following.

When we multiply the rotation matrix by a vector that is aligned with its rotation axis, the product is the same vector. This vector does not move or get scaled. Let's label this vector n

    \[Rn = n\]

By moving things around we get

    \[0=In - Rn = (I-R)n\]

Let's define a new matrix B=(I-R). We can see that when the matrix B gets multiplied by the vector n the product is 0.

    \[Bn=0 \quad \text{for} \quad n \not= 0\]

We say the vector n belongs to the null space (kernel) of the matrix B. If B were invertible the only solution would be n=0 (the trivial solution), which is not what we want. The matrix B is I-R, where R is an orthogonal matrix with determinant 1, so R has the eigenvalue 1 and B is singular: the determinant of B is zero. In this case we can find a solution with n \not= 0. This is called a non-trivial solution and it is our axis of rotation.

If the singular matrix B \in \mathbb{R}^{3\times 3} has two linearly independent rows, then the cross product of these rows is a vector in the null space of B. In other words, the cross product of two vectors in \mathbb{R}^3 always lies in the null space of the matrix with those vectors as rows. Why is that? Let's look inside the multiplication of the matrix B and the vector n

    \[Bn = \begin{bmatrix}a & b & c\\d&e&f\\g&h&i\end{bmatrix} \begin{bmatrix}x\\y\\z \end{bmatrix}=\begin{bmatrix}ax + by + cz\\dx+ey+fz\\gx+hy+iz\end{bmatrix}\]

If B_{j} is the j-th row of B, then the dot product of each row with the vector n must be zero.

    \[Bn = \begin{bmatrix}B_{1}\cdot n\\B_{2}\cdot n\\B_{3}\cdot n\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix}\]

This means that the vector n must be orthogonal to each row of the matrix B. Our matrix B is singular, det(B)=0, so its rows are linearly dependent: at least one row is a linear combination of the other two. In fact all row vectors lie in a plane. It should not be a surprise that this holds for the columns as well, since we started with an orthogonal matrix: R^T n = R^{-1} n = n, so B^T n = 0 too.

We said that the vector n must be orthogonal to each row vector of B to satisfy Bn=0. Now, this is quite easy to find: since all row vectors lie in a plane, we just need a vector perpendicular to that plane. To get it we take the cross product of two linearly independent rows, for example the first two, and then normalize it.

    \[n = B_{1}\times B_{2}\]

and normalize it

    \[\hat n = \frac{n}{|n|}\]

The \hat n is our normalized axis of rotation.
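
The cross-product recipe can be sketched as follows. This is a minimal illustration, not a robust implementation. Instead of always using the first two rows, it tries all three row pairs and keeps the longest cross product, because a particular pair can be (nearly) parallel. That selection step, the function names, and the `eps` threshold are my own additions. The sign of the returned axis is not fixed by this method, so it may come out as n or -n.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def rotation_axis(R, eps=1e-12):
    """Unit vector in the null space of B = I - R (sign not determined)."""
    B = [[(1.0 if i == j else 0.0) - R[i][j] for j in range(3)]
         for i in range(3)]
    # Any two independent rows of B span the plane perpendicular to the axis.
    candidates = [cross(B[0], B[1]), cross(B[1], B[2]), cross(B[2], B[0])]
    n = max(candidates, key=norm)
    length = norm(n)
    if length < eps:
        raise ValueError("R is (numerically) the identity; axis is undefined")
    return [x / length for x in n]

R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # 120 degrees about (1, 1, 1)
print(rotation_axis(R))                 # parallel to (1, 1, 1) / sqrt(3)
```

For very small rotation angles the rows of B are themselves tiny, so the cross product loses precision. The R-R^T approach further down behaves better in that regime.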

What we have just done is, in essence, extract the eigenvector belonging to the eigenvalue 1. As we said, the axis of rotation has this property

    \[Rn = n\]

At first glance you can see that the vector n is an eigenvector with the corresponding eigenvalue \lambda=1.

    \[Rn = \lambda n\]

Using Cayley transformation

To get the axis of rotation we can use the Cayley formula

    \[[n]_\times= (R - I)(R + I)^{-1}\]

Where [n]_\times is a skew-symmetric matrix representing the axis of rotation. Writing n_{i,j} for the entry of [n]_\times in row i and column j, the axis has components n=(x,y,z)=(-n_{2,3}, n_{1,3}, -n_{1,2}). We can also go back to the rotation matrix

    \[R= (I + [n]_\times)(I - [n]_\times)^{-1} \]

The matrix [n]_\times obtained this way is scaled by \tan(\phi / 2), which we can use to normalize the axis. Note that R+I is singular for \phi=\pi, so the formula cannot be used there.
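
Here is a small sketch of the Cayley route in plain Python. It assumes R is a valid rotation with \phi not equal to 0 or \pi. The 3x3 inverse is written out by hand with cross products to keep the example self-contained, and all function names are hypothetical.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inverse3(m, eps=1e-12):
    """Inverse of a 3x3 matrix; columns are cross products of rows / det."""
    r0, r1, r2 = m
    c0, c1, c2 = cross(r1, r2), cross(r2, r0), cross(r0, r1)
    det = sum(r0[k] * c0[k] for k in range(3))
    if abs(det) < eps:
        raise ValueError("matrix is singular (phi = pi for R + I)")
    return [[c0[i] / det, c1[i] / det, c2[i] / det] for i in range(3)]

def cayley_axis_angle(R, eps=1e-12):
    """Axis and angle from A = (R - I)(R + I)^-1 = tan(phi/2) [n]_x."""
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    r_minus = [[R[i][j] - eye[i][j] for j in range(3)] for i in range(3)]
    r_plus = [[R[i][j] + eye[i][j] for j in range(3)] for i in range(3)]
    A = matmul(r_minus, inverse3(r_plus))
    v = [-A[1][2], A[0][2], -A[0][1]]        # tan(phi/2) * n
    t = math.sqrt(sum(x * x for x in v))     # tan(phi/2)
    if t < eps:
        raise ValueError("R is (numerically) the identity; axis is undefined")
    return [x / t for x in v], 2.0 * math.atan(t)

R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
axis, phi = cayley_axis_angle(R)
print(axis, math.degrees(phi))  # about (1, 1, 1) / sqrt(3) and 120 degrees
```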

Rotation axis from Rodrigues' formula

Another option is to use this relation

    \[R-R^T  = I + [n]_\times \sin \phi +[n]_\times^2\left(1-\cos \phi \right) - (I + [n]_\times \sin \phi +[n]_\times^2\left(1-\cos \phi \right))^T\]

    \[R-R^T  = I + [n]_\times \sin \phi +[n]_\times^2\left(1-\cos \phi \right) - I - [n]_\times^T \sin \phi - ([n]_\times^2)^T\left(1-\cos \phi \right)\]

    \[R-R^T  = ([n]_\times-[n]_\times^T) \sin \phi + ([n]_\times^2- ([n]_\times^2)^T)\left(1-\cos \phi \right) \]

Since [n]_\times is skew-symmetric, [n]_\times = -[n]_\times^T. This rule helps us to simplify the following

    \[[n]_\times-[n]_\times^T = 2[n]_\times\]

We also know that the outer product is

    \[nn^T=[n]_\times^2+I \quad \Rightarrow \quad [n]_\times^2=nn^T-I\]

Using these substitutions we get

    \[R-R^T  = 2[n]_\times \sin \phi + (nn^T-I-(nn^T-I)^T)\left(1-\cos \phi \right) \]

Since nn^T is symmetric then nn^T = (nn^T)^T

    \[R-R^T  = 2[n]_\times \sin \phi + (nn^T-(nn^T)^T)\left(1-\cos \phi \right) =  2[n]_\times \sin \phi\]

And to get the normalized axis of rotation

    \[[n]_\times = \frac{R-R^T}{2\sin(\phi)}\]

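From this we can see that the axis of rotation is carried entirely by R-R^T: reading its entries as a vector gives 2\sin(\phi)\,n, whose length is 2|\sin(\phi)|.

If we don't know \phi we can use that length to normalize the axis vector. This breaks down when \sin(\phi)=0, that is for \phi=0 or \phi=\pi, where R-R^T is the zero matrix.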

Now we need to convert the skew symmetric matrix back to the vector representation [n]_\times \rightarrow n. We know

    \[[n]_\times=\begin{pmatrix}0 & -n_3 & n_2\\n_3 & 0 & -n_1\\-n_2 & n_1 & 0\end{pmatrix}\]

so n is

    \[n= [-[n]_\times(2,3),  [n]_\times(1,3), -[n]_\times(1,2)]\]
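
Putting this section together, here is a minimal sketch that reads the axis straight from R-R^T. As an extra step that is my own choice rather than something derived above, it recovers the angle with atan2 from 2\sin\phi (the vector length) and 2\cos\phi (the trace minus one), so the axis sign and the angle stay consistent. The function name is hypothetical.

```python
import math

def axis_angle_from_skew_part(R, eps=1e-12):
    """Axis and angle from R - R^T = 2 sin(phi) [n]_x."""
    # Entries of R - R^T read off as a vector: 2 sin(phi) * n.
    v = [R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1]]
    length = math.sqrt(sum(x * x for x in v))       # 2 |sin(phi)|
    trace = R[0][0] + R[1][1] + R[2][2]             # 1 + 2 cos(phi)
    if length < eps:
        raise ValueError("sin(phi) is ~0 (phi = 0 or pi); use another method")
    phi = math.atan2(length, trace - 1.0)           # angle in (0, pi)
    return [x / length for x in v], phi

R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
axis, phi = axis_angle_from_skew_part(R)
print(axis, math.degrees(phi))  # about (1, 1, 1) / sqrt(3) and 120 degrees
```

Because the angle is taken in (0, \pi), a rotation by -\phi about n is reported as a rotation by \phi about -n, which is the same rotation.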
