When the camera rotates only around its optical center, the homography transformation has a really simple form: it's basically a rotation matrix, multiplied by the camera matrix and its inverse because a homography works in image pixel space. As a first step, we factor out the camera parameters from the homography matrix. What remains should be a rotation matrix (up to scale). Since there might be noise in the homography parameters, the resulting matrix might not be a proper rotation matrix, that is, an orthogonal matrix with a determinant equal to one. That's why we construct the closest rotation matrix (in the Frobenius norm) using a singular value decomposition.
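The steps described above can be sketched in plain NumPy. This is a minimal illustration, not the recipe's own code: it assumes the convention `H = s * K * R * inv(K)` for some unknown scale `s`, and the function names `rotation_from_homography` and `rotation_vector`, the camera matrix `K`, and the test rotation are all made up for the example.

```python
import numpy as np

def rotation_from_homography(H, K):
    """Recover a rotation from a rotation-only homography (assumes H ~ K R K^-1)."""
    # Factor out the camera parameters; M should now be s * R for some scale s.
    M = np.linalg.inv(K) @ H @ K
    # A homography is defined up to scale, including sign; make det(M) positive.
    if np.linalg.det(M) < 0:
        M = -M
    # The closest orthogonal matrix in the Frobenius norm is U @ Vt.
    # Since det(M) > 0, this product has determinant +1, i.e. it is a rotation.
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

def rotation_vector(R):
    """Axis-angle vector of R (valid away from rotation angles near pi)."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(angle, 0.0):
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]]) / (2.0 * np.sin(angle))
    return angle * axis

# Synthetic example: hypothetical intrinsics and a 0.1 rad rotation about the y axis.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, 0.0, s],
                   [0.0, 1.0, 0.0],
                   [-s, 0.0, c]])
H = 3.0 * K @ R_true @ np.linalg.inv(K)  # arbitrary scale factor

R_est = rotation_from_homography(H, K)
print(rotation_vector(R_est))
```

With OpenCV available, `cv2.Rodrigues(R_est)[0]` could replace the hand-written `rotation_vector` helper. The values in this sketch are synthetic and unrelated to the recipe's data.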
The following shows the expected results:
Rotation vector: ...