Kernels¶
George comes equipped with a suite of standard covariance functions or kernels that can be combined to build more complex models. The standard kernels fall into the following categories:
Stationary kernels — functions that depend only on the radial distance between points in some userdefined metric, and
Nonstationary kernels — functions that depend on the value of the input coordinates themselves.
Combining kernels describes how to combine kernels to build more sophisticated models and Implementing new kernels explains how you would go about incorporating a custom kernel.
Common parameters¶
Every kernel accepts the two keyword arguments ndim
and axes
. By
default, kernels are only one dimensional so you must specify the ndim
argument if you want the kernel to work with higher dimensional inputs.
By default, higher dimensional kernels are applied to every dimension but you
can restrict the evaluation to a subspace using the axes
argument.
For example, if you have a 3 dimensional input space but you want one of the
kernels to only act in the first dimension, you would do the following:
from george import kernels
kernel = 10.0 * kernels.Matern32Kernel(1.0, ndim=3, axes=0)
Similarly, if you wanted the kernel to act on only the second and third dimensions, you could do something like:
kernel = 10.0 * kernels.ExpSquaredKernel([1.0, 0.5], ndim=3, axes=[1, 2])
Finally, all of the stationary kernels can be “blocked”. This means that the
kernel will only be applied within some parameter range. In practice, the
covariance matrix will have a block diagonal structure. To use this feature,
you use the block
keyword argument:
kernel = 10.0 * kernels.ExpSquaredKernel(1.0, block=(1.0, 1.0))
# or...
kernel = kernels.ExpSquaredKernel([1.0, 1.5], ndim=2,
block=[(1.0, 1.0), (0.5, 1.5)])
Implementation details & modeling interface¶
It’s worth understanding how these kernels are implemented. Most of the hard work is done at a low level (in C++) and the Python is only a thin wrapper to this functionality. This makes the code fast and consistent across interfaces but it also means that it isn’t currently possible to implement new kernel functions without recompiling the code. Almost every kernel has hyperparameters that you can set to control its behavior and these are controlled using the modeling.
k = 2.0 * kernels.Matern32Kernel(5.0)
print(k.get_parameter_names())
# ['k1:log_constant', 'k2:metric:log_M_0_0']
print(k.get_parameter_vector())
# [ 0.69314718 1.60943791]
You’ll notice that, in this case, the parameter vector is the logarithm of the parameters given when building the kernel. This will be the case for any strictly positive parameters because it is always better to fit in the logarithm of these types of parameters. You probably also noticed that the parameters have names. This opens up a few interesting features. For example, if you want to change any of the parameters, you can do it as follows:
import numpy as np
k["k1:log_constant"] = np.log(10.0)
print(k.get_parameter_vector())
# [ 2.30258509 1.60943791]
# ... or:
k[0] = np.log(2.0)
print(k.get_parameter_vector())
# [ 0.69314718 1.60943791]
Finally, if you want to update the entire vector, you can use the
set_vector()
method:
k.set_parameter_vector(k.get_parameter_vector() + np.random.randn(2))
Another feature common to the kernels is that you can “freeze” and “thaw” parameters by name. For example, let’s say that you want to keep the amplitude of your kernel fixed and fit for only the scale length:
k = 2.0 * kernels.Matern32Kernel(5.0)
k.freeze_parameter("k1:log_constant")
print(k.get_parameter_names())
# ['k2:metric:log_M_0_0']
print(k.get_parameter_vector())
# [ 1.60943791]
Bringing a parameter back into the fold is as easy as
k.thaw_parameter("k1:log_constant")
print(k.get_parameter_names())
# ['k1:log_constant', 'k2:log_M_0_0']
print(k.get_vector())
# [ 0.69314718 1.60943791]
Stationary kernels¶
Stationary kernels are a class of functions that depend on the input coordinates \(\mathbf{x}_i\) and \(\mathbf{x}_j\) through their squared distance under some metric \(C\):
The currently supported metrics are:
“isotropic” — the scale length is equal in all dimensions,
“axisaligned” — there is a different scale length in each dimension, and
“general” — arbitrary covariances between dimensions are allowed.
The “isotropic” and “axisaligned” metrics are parameterized by the logarithms of their scale lengths. For example:
from george.metrics import Metric
m = Metric(2.0, ndim=2)
print(m.get_parameter_vector())
# [ 0.69314718]
gives a twodimensional isotropic metric with
and
m = Metric([2.0, 4.0], ndim=2)
print(m.get_parameter_vector())
# [ 0.69314718 1.38629436]
specifies the following matrix
Note
Another way to define the isotropic metric is that it scales the square of the distance between points such that the following equality holds for a kernel evaluated at two points a distance \(r\): apart: \(k(r^2;\,\textrm{metric}=\lambda) = k(r^2 / \lambda;\,\mathrm{metric}=1)\).
In the “general” case, the matrix is parameterized by the elements of the Cholesky decomposition \(C = L\,L^\mathrm{T}\) with logarithms along the diagonal. For example:
m = Metric([[2.0, 0.1], [0.1, 4.0]], ndim=2)
print(m.get_parameter_vector())
# [ 0.34657359 0.07071068 0.69252179]
All the stationary kernels take the metric
specification as a keyword
argument:
k = kernels.ExpSquaredKernel(metric=[[5.0, 0.1], [0.1, 4.0]], ndim=2)
print(k.get_parameter_vector())
# [ 0.80471896 0.04472136 0.69289712]
The currently available stationary kernels are:

class
george.kernels.
RationalQuadraticKernel
(log_alpha=None, metric=None, metric_bounds=None, lower=True, block=None, bounds=None, ndim=1, axes=None)¶ This is equivalent to a “scale mixture” of
ExpSquaredKernel
kernels with different scale lengths drawn from a gamma distribution. See R&W for more info but here’s the equation:\[k(r^2) = \left[1  \frac{r^2}{2\,\alpha} \right]^\alpha\] Parameters
log_alpha – The Gamma distribution parameter.

class
george.kernels.
Matern52Kernel
(metric=None, metric_bounds=None, lower=True, block=None, bounds=None, ndim=1, axes=None)¶ The Matern5/2 kernel is stationary kernel where the value at a given radius \(r^2\) is given by:
\[k(r^2) = \left( 1+\sqrt{5\,r^2}+ \frac{5\,r^2}{3} \right)\, \exp \left (\sqrt{5\,r^2} \right )\]

class
george.kernels.
ExpSquaredKernel
(metric=None, metric_bounds=None, lower=True, block=None, bounds=None, ndim=1, axes=None)¶ The exponentialsquared kernel is a stationary kernel where the value at a given radius \(r^2\) is given by:
\[k(r^2) = \exp \left ( \frac{r^2}{2} \right )\]

class
george.kernels.
ExpKernel
(metric=None, metric_bounds=None, lower=True, block=None, bounds=None, ndim=1, axes=None)¶ The exponential kernel is a stationary kernel where the value at a given radius \(r^2\) is given by:
\[k(r^2) = \exp \left ( \sqrt{r^2} \right )\]

class
george.kernels.
Matern32Kernel
(metric=None, metric_bounds=None, lower=True, block=None, bounds=None, ndim=1, axes=None)¶ The Matern3/2 kernel is stationary kernel where the value at a given radius \(r^2\) is given by:
\[k(r^2) = \left( 1+\sqrt{3\,r^2} \right)\, \exp \left (\sqrt{3\,r^2} \right )\]
Nonstationary kernels¶
Nonstationary kernels are specified by a (symmetric) function of the input
coordinates themselves.
They are applied identically to every axis so the axes
keyword argument
will probably come in handy.
For example, to implement a quasiperiodic kernel with a threedimensional input space where you only want to apply the periodicity along the first (e.g. time) dimension, you would use something like:
k = kernels.ExpSine2Kernel(gamma=0.1, log_period=5.0, ndim=3, axes=0)
k *= 10.0 * kernels.ExpSquaredKernel(metric=5.0, ndim=3, axes=0)
k += 4.0 * kernels.Matern32Kernel(metric=4.0, ndim=3, axes=[1, 2])
The currently available nonstationary kernels are:

class
george.kernels.
ConstantKernel
(log_constant=None, bounds=None, ndim=1, axes=None)¶ This kernel returns the constant
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = c\]where \(c\) is a parameter.
 Parameters
log_constant – The log of \(c\) in the above equation.

class
george.kernels.
ExpSine2Kernel
(gamma=None, log_period=None, bounds=None, ndim=1, axes=None)¶ The expsinesquared kernel is a commonly used periodic kernel. Unlike the
CosineKernel
, this kernel never has negative covariance which might be useful for your problem. Here’s the equation:\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \exp \left( \Gamma\,\sin^2\left[ \frac{\pi}{P}\,\leftx_ix_j\right \right] \right)\] Parameters
gamma – The scale \(\Gamma\) of the correlations.
log_period – The log of the period \(P\) of the oscillation (in the same units as \(\mathbf{x}\)).

class
george.kernels.
LocalGaussianKernel
(location=None, log_width=None, bounds=None, ndim=1, axes=None)¶ A local Gaussian kernel.
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \exp\left( \frac{(x_i  x_0)^2 + (x_j  x_0)^2}{2\,w} \right))\] Parameters
location – The location \(x_0\) of the Gaussian.
log_width – The (squared) width \(w\) of the Gaussian.

class
george.kernels.
PolynomialKernel
(log_sigma2=None, order=None, bounds=None, ndim=1, axes=None)¶ A polynomial kernel
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + \sigma^2)^P\] Parameters
order – The power \(P\). This parameter is a constant; it is not included in the parameter vector.
log_sigma2 – The variance \(\sigma^2 > 0\).

class
george.kernels.
DotProductKernel
(bounds=None, ndim=1, axes=None)¶ The dot product kernel
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j\]with no parameters.

class
george.kernels.
EmptyKernel
(bounds=None, ndim=1, axes=None)¶ This kernel is a noop

class
george.kernels.
CosineKernel
(log_period=None, bounds=None, ndim=1, axes=None)¶ The simplest periodic kernel. This
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \cos\left( \frac{2\,\pi\,x_i  x_j}{P} \right)\]where the parameter \(P\) is the period of the oscillation. This kernel should probably always be multiplied be a stationary kernel (e.g.
ExpSquaredKernel
) to allow quasiperiodic variations. Parameters
log_period – The period of the oscillation.

class
george.kernels.
LinearKernel
(log_gamma2=None, order=None, bounds=None, ndim=1, axes=None)¶ The linear regression kernel
\[k(\mathbf{x}_i,\,\mathbf{x}_j) = \frac{(\mathbf{x}_i \cdot \mathbf{x}_j)^P}{\gamma^2}\] Parameters
order – The power \(P\). This parameter is a constant; it is not included in the parameter vector.
log_gamma2 – The scale factor \(\gamma^2\).
Combining kernels¶
More complicated kernels can be constructed by algebraically combining the basic kernels listed in the previous sections. In particular, all the kernels support addition and multiplication. For example, an exponentialsquared kernel with a nontrivial variance can be constructed as follows:
from george import kernels
kernel = 1e3 * kernels.ExpSquaredKernel(3.4)
This is equivalent to:
from math import log
kernel = kernels.Product(kernels.ConstantKernel(log_constant=log(1e3)),
kernels.ExpSquaredKernel(3.4))
As demonstrated in Hyperparameter optimization, a mixture of kernels can be implemented with addition:
k1 = 1e3 * kernels.ExpSquaredKernel(3.4)
k2 = 1e4 * kernels.Matern32Kernel(14.53)
kernel = k1 + k2
Implementing new kernels¶
As mentioned previously, because of technical limitations, new kernels can only be implemented by recompiling george. See Implementing new kernels for a detailed example of implementing a new kernel.