Almost Orthogonal Matrices

\(\operatorname{AlmostOrthogonal}(n,k,\lambda)\) is the manifold of matrices with singular values in the interval \((1-\lambda, 1+\lambda)\) for a \(\lambda \in [0,1]\).

\[\operatorname{AlmostOrthogonal}(n,k,\lambda) = \{X \in \mathbb{R}^{n\times k}\:\mid\:\left|1-\sigma_i(X)\right| < \lambda,\ i=1, \dots, k\}\]

It is realized via an SVD-like factorization:

\[\begin{aligned} \pi \colon \operatorname{St}(n,k) \times \mathbb{R}^k \times \operatorname{SO}(k) &\to \operatorname{AlmostOrthogonal}(n,k,\lambda) \\ (U, \Sigma, V) &\mapsto Uf_\lambda(\Sigma) V^\intercal \end{aligned}\]

where we have identified the vector \(\Sigma\) with a diagonal matrix in \(\mathbb{R}^{k \times k}\). The function \(f_\lambda\colon \mathbb{R} \to (1-\lambda, 1+\lambda)\) is obtained from a function \(f\colon \mathbb{R} \to (-1, +1)\) by rescaling it to take values in \((1-\lambda, 1+\lambda)\) as

\[f_\lambda(x) = 1+\lambda f(x).\]

The function \(f_\lambda\) is then applied element-wise to the diagonal of \(\Sigma\).
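To make the factorization concrete, the following is a minimal sketch for the special case \(n = k = 2\), written in plain Python so that it has no dependencies. It is not the library's implementation: the helper names (`f_lam`, `rotation`, `almost_orthogonal_2x2`, `singular_values_2x2`) are hypothetical, \(U\) and \(V\) are taken to be rotations, and \(f = \sin\) is assumed. It builds \(U f_\lambda(\Sigma) V^\intercal\) and checks the defining condition \(\left|1-\sigma_i(X)\right| < \lambda\).

```python
import math

def f_lam(x, lam, f=math.sin):
    """Rescaled map f_lambda(x) = 1 + lam * f(x)."""
    return 1.0 + lam * f(x)

def rotation(theta):
    """2x2 rotation matrix, an element of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    """Transpose of a 2x2 matrix."""
    return [[A[j][i] for j in range(2)] for i in range(2)]

def singular_values_2x2(X):
    """Closed-form singular values of a 2x2 matrix, largest first.

    They are the square roots of the eigenvalues of X X^T.
    """
    (a, b), (c, d) = X
    s1 = a * a + b * b + c * c + d * d
    s2 = math.sqrt((a * a + b * b - c * c - d * d) ** 2
                   + 4.0 * (a * c + b * d) ** 2)
    return [math.sqrt((s1 + s2) / 2.0), math.sqrt(max(s1 - s2, 0.0) / 2.0)]

def almost_orthogonal_2x2(theta_u, sigma, theta_v, lam):
    """pi(U, Sigma, V) = U f_lam(Sigma) V^T with U, V rotations (assumption)."""
    U, V = rotation(theta_u), rotation(theta_v)
    # f_lam is applied element-wise to the diagonal of Sigma.
    D = [[f_lam(sigma[0], lam), 0.0], [0.0, f_lam(sigma[1], lam)]]
    return matmul(matmul(U, D), transpose(V))

lam = 0.5
X = almost_orthogonal_2x2(0.3, [2.0, -1.0], 1.1, lam)
svals = singular_values_2x2(X)
# Membership test taken directly from the definition of the manifold.
in_manifold = all(abs(1.0 - s) < lam for s in svals)
```

In this sketch the singular values of `X` coincide with the entries of \(f_\lambda(\Sigma)\), since multiplying by the orthogonal factors does not change them.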

If \(\lambda = 1\) is chosen, the resulting space is not a manifold, although this should not hurt optimization in practice.

Warning

In the limit \(\lambda = 0\), the resulting manifold is exactly the special orthogonal group. The algorithm in this class becomes numerically unstable for very small \(\lambda\), so we discourage the use of small values of \(\lambda\) and recommend using geotorch.SO instead in this scenario.

Note

There are no restrictions in place for the image of the function \(f\). For a function \(f\) with image \([a,b]\), the function \(f_\lambda(x) = 1+\lambda f(x)\) takes values in \([1+\lambda a, 1+\lambda b]\). Hence, by rescaling the function \(f\), one may use this class to perform optimization with singular values constrained to any prescribed interval of \(\mathbb{R}_{\geq 0}\).
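As an illustration of this note, here is a hedged sketch of how one might build such a rescaled \(f\). It relies only on the relation \(f_\lambda(x) = 1+\lambda f(x)\); the helper `make_f_for_interval` and its arguments are hypothetical and not part of geotorch, and \(\tanh\) is assumed as the base map into \((-1, 1)\). Whether the resulting callable behaves as intended when passed as `f` should be checked against the library.

```python
import math

def make_f_for_interval(lo, hi, lam, base=math.tanh):
    """Build f so that 1 + lam * f(x) lies in (lo, hi).

    `base` is assumed to map the real numbers into (-1, 1).
    """
    if not 0.0 <= lo < hi:
        raise ValueError("expected 0 <= lo < hi")
    mid = 0.5 * (lo + hi)    # centre of the target interval
    half = 0.5 * (hi - lo)   # radius of the target interval

    def f(x):
        # Solve 1 + lam * f(x) = mid + half * base(x) for f(x).
        return (mid - 1.0 + half * base(x)) / lam

    return f

lam = 0.5
f = make_f_for_interval(2.0, 5.0, lam)
# Values of f_lambda at a few sample points; all lie in (2, 5).
values = [1.0 + lam * f(x) for x in (-10.0, -1.0, 0.0, 1.0, 10.0)]
```

With this choice the dependence on \(\lambda\) cancels, so the singular values end up in the target interval regardless of the value of `lam` used.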

class geotorch.AlmostOrthogonal(size, lam, f='sin', triv='expm')[source]

Manifold of matrices with singular values in the interval \((1-\lambda, 1+\lambda)\).

The possible default maps are the \(\sin,\,\tanh\) functions and a scaled sigmoid. The sigmoid is scaled as \(\operatorname{scaled\_sigmoid}(x) = 2\sigma(x) - 1\) where \(\sigma\) is the usual sigmoid function. This is done so that the image of the scaled sigmoid is \((-1, 1)\).
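The three default maps can be compared with a short dependency-free sketch. The names `scaled_sigmoid`, `DEFAULT_MAPS` and `f_lam` below are illustrative only and are not claimed to match the library's internals; the code simply evaluates \(f_\lambda(x) = 1+\lambda f(x)\) for each choice of \(f\).

```python
import math

def scaled_sigmoid(x):
    """2 * sigmoid(x) - 1, whose image is (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

# Illustrative lookup table mirroring the three string options.
DEFAULT_MAPS = {
    "sin": math.sin,
    "tanh": math.tanh,
    "scaled_sigmoid": scaled_sigmoid,
}

def f_lam(x, lam, f):
    """Rescaled map f_lambda(x) = 1 + lam * f(x)."""
    return 1.0 + lam * f(x)

lam = 0.25
xs = [-3.0, -0.5, 0.0, 0.5, 3.0]
# For each map, the values of f_lambda at the sample points.
table = {name: [f_lam(x, lam, f) for x in xs]
         for name, f in DEFAULT_MAPS.items()}
```

All three maps send \(0\) to \(0\), so a zero entry of \(\Sigma\) corresponds to a singular value of exactly \(1\).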

Parameters:

size (torch.Size) – Size of the tensor to be parametrized
lam (float) – Radius of the interval. A float in the interval \((0, 1]\)
f (str or callable or pair of callables) –
Optional. Either:
- One of ["scaled_sigmoid", "tanh", "sin"]
- A callable that maps real numbers to the interval \((-1, 1)\)
- A pair of callables such that the first maps the real numbers to \((-1, 1)\) and the second is a (right) inverse of the first
Default: "sin"
triv (str or callable) – Optional. A map that maps skew-symmetric matrices onto the orthogonal matrices surjectively. This is used to optimize the \(U\) and \(V\) in the SVD. It can be one of ["expm", "cayley"] or a custom callable. Default: "expm"
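To illustrate what such a map does, the sketch below evaluates the matrix exponential and one common form of the Cayley map, \((I - A)^{-1}(I + A)\), in closed form on a \(2 \times 2\) skew-symmetric matrix. This is an assumption-laden toy: the function names are hypothetical, and the exact Cayley convention used by the library (for example, any scaling of \(A\)) is not stated here.

```python
import math

def expm_skew_2x2(b):
    """exp([[0, b], [-b, 0]]) in closed form: a rotation by angle b."""
    c, s = math.cos(b), math.sin(b)
    return [[c, s], [-s, c]]

def cayley_skew_2x2(b):
    """(I - A)^{-1} (I + A) for A = [[0, b], [-b, 0]], in closed form."""
    d = 1.0 + b * b
    return [[(1.0 - b * b) / d, 2.0 * b / d],
            [-2.0 * b / d, (1.0 - b * b) / d]]

def is_special_orthogonal(Q, tol=1e-12):
    """Check Q Q^T = I and det(Q) = 1 for a 2x2 matrix."""
    (a, b), (c, d) = Q
    gram_ok = (abs(a * a + b * b - 1.0) < tol
               and abs(c * c + d * d - 1.0) < tol
               and abs(a * c + b * d) < tol)
    det_ok = abs(a * d - b * c - 1.0) < tol
    return gram_ok and det_ok

Q_exp = expm_skew_2x2(0.7)
Q_cayley = cayley_skew_2x2(0.7)
```

Both results are rotation matrices, but for the same input they are in general different rotations, which is why the choice of `triv` changes the parametrization without changing the manifold.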

sample(distribution='uniform', init_=None)[source]

Returns a randomly sampled orthogonal matrix according to the specified distribution. The options are:

"uniform": Samples a tensor distributed according to the Haar measure on \(\operatorname{SO}(n)\)

"torus": Samples a block-diagonal skew-symmetric matrix. The blocks are of the form \(\begin{pmatrix} 0 & b \\ -b & 0\end{pmatrix}\) where \(b\) is distributed according to init_. This matrix will be then projected onto \(\operatorname{SO}(n)\) using self.triv

The output of this method can be used to initialize a parametrized tensor that has been parametrized with this or any other manifold as:

>>> import torch
>>> from torch import nn
>>> from geotorch import AlmostOrthogonal
>>> layer = nn.Linear(20, 20)
>>> M = AlmostOrthogonal(layer.weight.size(), lam=0.5)
>>> torch.nn.utils.parametrize.register_parametrization(layer, "weight", M)
>>> layer.weight = M.sample()

Parameters:

distribution (string) – Optional. One of ["uniform", "torus"]. Default: "uniform"
init_ (callable) – Optional. To be used with the "torus" option. A function that takes a tensor and fills it in place according to some distribution. See torch.nn.init. Default: \(\operatorname{Uniform}(-\pi, \pi)\)
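The "torus" option can be pictured with the following plain-Python sketch. It is only an illustration under stated assumptions: `torus_skew` and `expm_block_diagonal` are hypothetical helpers, the angles are drawn from \(\operatorname{Uniform}(-\pi, \pi)\) with the standard library's `random` module rather than a tensor initializer, and the projection uses the matrix exponential, which for this block structure has a closed form.

```python
import math
import random

def torus_skew(n, rng, low=-math.pi, high=math.pi):
    """Block-diagonal skew-symmetric n x n matrix.

    Each 2x2 block has the form [[0, b], [-b, 0]] with b ~ Uniform(low, high).
    For odd n the last row and column are left at zero.
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(0, n - 1, 2):
        b = rng.uniform(low, high)
        A[i][i + 1] = b
        A[i + 1][i] = -b
    return A

def expm_block_diagonal(A):
    """Exponential of such a matrix: each block becomes a rotation by b."""
    n = len(A)
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(0, n - 1, 2):
        b = A[i][i + 1]
        c, s = math.cos(b), math.sin(b)
        Q[i][i], Q[i][i + 1] = c, s
        Q[i + 1][i], Q[i + 1][i + 1] = -s, c
    return Q

rng = random.Random(0)          # seeded for reproducibility
A = torus_skew(5, rng)          # skew-symmetric, block-diagonal
Q = expm_block_diagonal(A)      # orthogonal, block-diagonal rotations
```

The resulting matrix is a product of commuting planar rotations, which is the sense in which the sample lies on a torus inside \(\operatorname{SO}(n)\).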

in_manifold(X, eps=1e-05)

Checks that a given matrix is in the manifold.

Parameters:

X (torch.Tensor or tuple) – The input matrix or matrices of shape (*, n, k).
eps (float) – Optional. Threshold at which the singular values are considered to be zero. Default: 1e-5