I’ll try to give examples of efficient gaussian process computation here, like the vec trick (Kronecker product trick), efficient toeliptz and circulant matrix computations, RTS smoothing and Kalman filtering using state space representations, and so on.

Fast Toeplitz Matrix-Vector Products and Solving

Circulant matrix-vector products can be fast due to the way circulant matrices can be decomposed using their fourier transforms. Toeplitz matrices can be ‘embedded’ into a circulant matrix and their matrix-vector products can be computed efficiently too. Sample code is shown below.

Main reference:

J. Dongarra, P. Koev, and X. Li, “Matrix-Vector and Matrix-Matrix Multiplications”. In Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, editors. “Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide”. SIAM, Philadelphia, 2000. Available online.

This can then be used to compute multivariate log-densities quickly. Furthermore, this multiplication can then be used in conjugate gradient solvers to efficiently compute inverse-matrix-vector products in \(O(n \log n)\) time and \(O(n)\) space.

Python Code
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matmul(c, r = None, v = None):
	
	if v is None: v = np.zeros(c.shape)
	if r is None: r = c.conj()
        if len(v.shape) == 1: v = v.reshape(-1, 1)

	n = c.shape[0]; assert c.shape == r.shape
	
	embeded_col = np.hstack((c, np.flip(r[1:])))
	padded_v = np.vstack((v, np.zeros((n - 1, v.shape[1]))))
	
	fft_T = np.fft.fft(embeded_col, axis = 0).reshape(-1, 1)
	fft_v = np.fft.fft(padded_v, axis = 0)
	return np.fft.ifft(fft_T*fft_v, axis = 0).real[:n, :]

I’m trying to contribute some code to SciPy, pull request can be found here.

Toeplitz matrices can be inverted in quadratic time using the Levinson algorithm, that also has the capability of returning a determinant as a side product using no more computation (as I understand).

Some code that calculates this log determinant can be found here:

Marano S, Edwards B, Ferrari G and Fah D (2017), “Fitting Earthquake Spectra: Colored Noise and Incomplete Data”, Bulletin of the Seismological Society of America., January, 2017. Vol. 107(1), pp. 276-291. Implementation available online..

The fast matrix-vector multiplication can be used with a preconditioned conjugate gradient algorithm to implement fast inverse-vector products.

Toeplitz Matrix Cholesky Decomposition

… and also circulant matrix solving in the comments (using scipy and ctypes).

I got the toeplitz_cholesky library from here and compiled it. I’m going to check out toeblitz in the future. I believe that the following algorithm calculates the cholesky decomposition in \(O(n^2)\) time.

C/Python Code
# in bash: clang -shared -fpic toeplitz_cholesky.c -o tc.dylib -O3

import time
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("ggplot"); plt.ion()
# sc.linalg.solve_circulant, append t[(n - 2):0:-1]

dll = np.ctypeslib.load_library("tc", ".")
np_poin = np.ctypeslib.ndpointer

def kernel(n = 100):
	k = np.linspace(0, 5, n)
	k = np.exp(-(k - k[0])**2)
	k[0] += 1e-10
	return k

def toep_chol_prepare(n):
	type_input_1 = np.ctypeslib.ctypes.c_int64
	type_input_2 = np_poin(dtype = np.double, ndim = 1, shape = n)
	type_output = np_poin(dtype = np.double, ndim = 2, shape = (n, n))

	dll.t_cholesky_lower.argtypes = [type_input_1, type_input_2]
	dll.t_cholesky_lower.restype = type_output
	return dll.t_cholesky_lower

def timer(n = 100):
	func_ptr = toep_chol_prepare(n)
	tic = time.time()
	L = func_ptr(n, kernel(n))
	toc = time.time()
	return toc - tic, L

if __name__ == "__main__":
	runtime, L = timer(10000) # half a second!
	x = np.matmul(L.T, np.random.normal(size = 10000))
	plt.plot(x)
	input("Press the enter key to quit.")

Efficient Inference of GP Covariances

Sometimes, it is easier to do parameter inference in the frequency domain. Here, I use SymPy to get the theoretical spectrum of a GP (using a Fourier transform of the covariance - note that to get from samples to the PSD, the PSD is defined as the expected value of the series squared due to Wiener-Khinchin) and we use the fact that the empirical spectrum divided by the theoretical spectrum has an \(Exp(1)\) distribution (\(\chi^2_2 \stackrel{d}{=} 0.5Exp(0.5)\)) to get to the likelihood.

I was writing a Gaussian Process Vocoder that synthesizes speech from mel spectrograms (using an LSTM to get from the mel spectrograms to the spectral kernel’s parameters), but the whole thing looks too similar to Tokuda & Zen (Directly Modelling Speech Waveforms …) - which I discovered after writing a good chuck of the code. I might complete it at some point, it uses the spectral kernel to obtain a zero mean GP with the right frequencies and block-stationary treatments of these non-stationary GPs to synthesize audio. Tokuda & Zen used an LSTM-RNN to obtain the cepstral coefficients, while I used it to obtain spectral parameters.

Have a look at Gaussian Process Speech Synthesis (Draft) for code.

Sampling Periodic GPs using the State Space Representation

The representation is due to Solin & Sarkka (2014).

Python Code
import numpy as np
from scipy.special import iv

"""
Adapted from matlab code from Arno Solin and Simo Sarkka (2014).
Explicit Link Between Periodic Covariance Functions and
State Space Models. Available at https://users.aalto.fi/~asolin/.

magnSigma2   : Magnitude scale parameter
lengthScale  : Distance scale parameter
period       : Length of repetition period
mlengthScale : Matern lengthScale
N            : Degree of approximation
nu           : Matern smoothness parameter
mN           : Degree of approximation if squared exonential
valid        : If false, uses Bessel functions

df(t)/dt = F f(t) + L w(t)
spectral denisty of w(t) is Qc
y_k = H f(t_k) + r_k, r_k ~ N(0, R)

"""

def reformat(conv_function):
    def wrapper(*args, **kwargs):
        F, L, Qc, H, Pinf = conv_function(*args, **kwargs)
        if len(H.shape) == 1: H = H.reshape(1, -1)
        if type(Qc) is not np.ndarray: Qc = np.array(Qc).reshape(1, 1)
        if len(Qc.shape) == 2: Qc = np.diag(Qc)
        return F, L, Qc, H, Pinf
    return wrapper

class StateSpaceModels:

    @staticmethod
    @reformat
    def Periodic(magnSigma2 = 1, lengthScale = 1, period = 1, N = 6):

        q2 = 2 * magnSigma2 * np.exp(-1/lengthScale**2) *\
            iv(np.arange(0, N + 1), 1/lengthScale**2)
        q2[0] *= 0.5

        w0 = 2*np.pi/period
        F = np.kron(np.diag(np.arange(0, N + 1)), [[0, -w0], [w0, 0]])
        L = np.eye(2 * (N + 1))
        Qc = np.zeros(2 * (N + 1))
        Pinf = np.kron(np.diag(q2), np.eye(2))
        H = np.kron(np.ones((1, N + 1)), [1, 0])

        return F, L, Qc, H, Pinf

    @staticmethod
    @reformat
    def Matern52(magnSigma2 = 1, lengthScale = 1):

        lambda_ = np.sqrt(5) / lengthScale
        F = np.array([
            [          0,             1,          0],
            [          0,             0,          1],
            [-lambda_**3, -3*lambda_**2, -3*lambda_]])
        L = np.array([[0], [0], [1]])
        Qc = (400*np.sqrt(5)*lengthScale*magnSigma2) / (3 * lengthScale**6)
        H = np.array([1, 0, 0])

        kappa = 5*magnSigma2 / (3*lengthScale**2)
        Pinf = np.array([
        	[magnSigma2, 0    , -kappa],
        	[0         , kappa,      0],
        	[-kappa    , 0    , 25*magnSigma2/lengthScale**4]])

        return F, L, Qc, H, Pinf

    @staticmethod
    @reformat
    def Quasiperiodic(magnSigma2 = 1, lengthScale = 1,
        period = 1, mlengthScale = 1, N = 6, nu = 3/2, mN = 6):

        F1, L1, Qc1, H1, Pinf1 = StateSpaceModels.Periodic(magnSigma2, lengthScale, period, N)
        F2, L2, Qc2, H2, Pinf2 = StateSpaceModels.Matern52(1, mlengthScale)

        F = np.kron(F1, np.eye(len(F2))) + np.kron(np.eye(len(F1)), F2)
        L = np.kron(L1, L2)
        Qc = np.kron(Pinf1, Qc2)

        Pinf = np.kron(Pinf1, Pinf2)
        H = np.kron(H1, H2)

        return F, L, Qc, H, Pinf
library(Rcpp)
library(reticulate)

SSM = import('ssm')$StateSpaceModels

example_ssm = SSM$Periodic()
F  = example_ssm[[1]]
L  = example_ssm[[2]]
Qc = example_ssm[[3]]
H  = example_ssm[[4]]

cppFunction('
NumericVector sim(arma::mat F,
                  arma::mat L,
                  arma::vec Qc,
                  arma::vec H,
                  double dt = 0.01,
                  size_t n = 10000) {

    NumericVector x(n);
    arma::vec f(F.n_rows);
    arma::vec u(L.n_rows);
    f = rnorm(F.n_rows);

    arma::mat Q_root = sqrt(Qc * dt);

    for (int i = 0; i < n; i++) {
        u = rnorm(L.n_rows);
        f = f + F*f*dt + (L*Q_root)%u;
    	x[i] = arma::dot(H, f);
    }
    return x;
}', depends = 'RcppArmadillo')

Efficient inverse-matrix multiplication

import numpy as np

naive = np.linalg.solve(A, b)

L = np.linalg.cholesky(A)
efficient = np.linalg.solve(L.T, np.linalg.solve(L, b))

End

2020

Efficient Gaussian Process Computation

I’ll try to give examples of efficient gaussian process computation here, like the vec trick (Kronecker product trick), efficient toeliptz and circulant matrix computations, RTS smoothing and Kalman filtering using state space representations, and so on.

4 min read

Gaussian Process Speech Synthesis (Draft)

Very untidy first working draft of the idea mentioned on the efficient computation page. Here, I fit a spectral mixture to some audio data to build a “generative model” for audio. I’ll implement efficient sampling later, and I’ll replace the arbitrary way this is trained with an LSTM-RNN to go straight from text/spectrograms to waveforms.

4 min read
Back to Top ↑

2019

Gaussian Process Middle C

First of my experiments on audio modelling using gaussian processes. Here, I construct a GP that, when sampled, plays middle c the way a grand piano would.

1 min read
Back to Top ↑

2018

Back to Top ↑