How to use a fixed noise Gaussian likelihood in a multi-task setting
Howdy folks,
GPyTorch provides Gaussian likelihood objects for fixed noise (`FixedNoiseGaussianLikelihood`) and for multi-task models (`MultitaskGaussianLikelihood`). I was wondering if someone could provide me some guidance on how to get a fixed noise multi-task Gaussian likelihood?
Thanks in advance
Galto
@Galto2000 I think we'd just need to implement `FixedNoiseMultitaskGaussianLikelihood`. Basically, you'd specify an `n x t` matrix of noises rather than a length-`n` vector of noises, and the interface would otherwise be the same.
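(For illustration, a minimal sketch of that shape convention; the two flattenings correspond to the interleaved/non-interleaved orderings discussed further down the thread:)

import torch

n, t = 4, 2                  # 4 data points, 2 tasks
noise = torch.rand(n, t)     # one fixed noise variance per (point, task) pair

# how the n x t matrix would map onto the nt-length diagonal of the noise covariance:
interleaved = noise.reshape(-1)          # (x1,t1), (x1,t2), (x2,t1), ... - tasks vary fastest
non_interleaved = noise.t().reshape(-1)  # (x1,t1), (x2,t1), ..., (x1,t2), ... - points vary fastest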
@jacobrgardner, would you please give a little more guidance, perhaps something like a high-level recipe, on how I would go about implementing something like a `FixedNoiseMultitaskGaussianLikelihood`? :)
Also, if you have some time, would you also please provide me with some clarification regarding my other issue (https://github.com/cornellius-gp/gpytorch/issues/890) - it's kind of related to this one.
I feel that the chips are starting to fall into place, but I just need an extra nudge from you. I think this would be a great challenge for me to wrap my head around some of the implementation details in GPyTorch.
Thanks in advance
Galto
This is something that I'd like as well, let me see if I can find some time to work on this this week.
@Balandat, that would be great, thank you.
@Galto2000 I'm assuming what you'd like to do here is provide the noise for the different tasks, but not the cross-task covariance - this should still be inferred. Is this correct?
@Balandat , yes, I think that is correct.
For instance, I am interested in doing multi-task, multi-sensor fusion; i.e. condition a model posterior on observations from different types of sensors (each of which has different noise) where the sensors output vector quantities, which makes it multi-task.
Thank you
Galto
FWIW, I put up an early draft for this in 49e810b1a4f29fe1e0a102ad6f5963e90ae0dbdd - will have to do some cleaning up and testing before I make this a PR.
Hmm, I'm realizing that the fact that `MultitaskMultivariateNormal` can use either an interleaved or a non-interleaved representation significantly complicates things here. It'll take a little bit of work to iron this out.
It seems that we should address #539 first in order to make this less of a pain to implement.
I'm trying to wrap my head around the interleaved concept as well, and around how multi-tasking is achieved in GPyTorch in general. Is there any relevant literature that I can reference?
yeah, basically if you have `n` points and `t` tasks, gpytorch represents the joint covariance as an `nt x nt` matrix. You can represent that in different ways: either as `K_{data} \kron K_{task}`, in which case you have `n` blocks of size `t x t` on the diagonal (i.e. "interleaved" w.r.t. the data points), or as `K_{task} \kron K_{data}`, in which case you have `t` blocks of size `n x n` on the diagonal. Depending on the use case, one or the other representation may make more sense, hence the suggestion in #539.

Here `K_{data}` is the data covariance that depends on the hyperparameters, and `K_{task}` is a learned (often low-rank) correlation matrix. See #912 for some changes to the parameterization of that.
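(A tiny numeric illustration of the two orderings using `torch.kron` - a sketch with toy matrices, not gpytorch's actual lazy-tensor code:)

import torch

n, t = 3, 2
K_data = torch.eye(n) + 0.5 * torch.ones(n, n)   # toy n x n data covariance
K_task = torch.tensor([[1.0, 0.3], [0.3, 1.0]])  # toy t x t task covariance

interleaved = torch.kron(K_data, K_task)      # n blocks of size t x t on the diagonal
non_interleaved = torch.kron(K_task, K_data)  # t blocks of size n x n on the diagonal
# both are nt x nt and contain the same entries, just permuted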
Hi @Balandat ,
One of my goals is to do "sensor fusion" (or data fusion) using GPs. In my case I have two different sensors measuring the same vector entity (a velocity in 2D). The sensors have different noise characteristics: `sigma1` and `sigma2`.
I read a paper (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.1154&rep=rep1&type=pdf) that solves this from a heteroskedastic point of view.
I found your notebook with a heteroskedastic example: test_HadamardMultitaskMultiOutputHeteroskedasticLikelihood_universal.ipynb.txt
Firstly, I am trying to wrap my head around the example: mostly the last one, where a GP is passed to the `noise_covar` of a likelihood - if you have some time, could you perhaps explain this to me?
Secondly: if you have some time, would you perhaps have some suggestions on how I could achieve "sensor fusion" by treating my two different data sets as one heteroskedastic data set, where the noise levels are quantized: data with noise corresponding to the first sensor and data with noise corresponding to the second sensor?
Thanks in advance
Galto
actually I meant this one: test_MultitaskHeteroskedasticLikelihood.ipynb.txt The very last example in the series
Hi @Balandat ,
In regards to the fixed noise Gaussian likelihood in a multi-task setting, I saw your draft - how do I get these changes? I pip-installed GPyTorch - do I need to clone from GitHub now?
Cheers
Galto
@Galto2000 sorry I haven't gotten to work much on this - the draft isn't really in a usable state at this point, so unless you plan on actively developing it's probably not worth checking it out (which you would do by cloning the repo and checking out that branch). I'll try to get back to this soon-ish.
> actually I meant this one: test_MultitaskHeteroskedasticLikelihood.ipynb.txt The very last example in the series.
Sorry what series do you mean exactly? Can you link to this?
> Firstly, I am trying to wrap my head around the example: mostly the last one, where a GP is passed to the noise_covar of a likelihood - if you have some time could you perhaps explain this to me?
It's pretty straightforward: if you have noise observations, you can build a separate noise model. Typically this is fit on log-transformed data to ensure positivity and model multiplicative uncertainty. The prediction of that model at the input `X` is then used as the noise level (rather than using fixed noises or a constant one). This has two benefits: (i) it regularizes the noise levels, in case these are themselves subject to observation noise [which they typically will be], and (ii) it allows out-of-sample noise predictions, which is important for some more advanced acquisition functions in Bayesian Optimization. We have such a model checked in to BoTorch: https://github.com/pytorch/botorch/blob/master/botorch/models/gp_regression.py#L224
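(For reference, a minimal usage sketch of that BoTorch model; the data and shapes here are made up:)

import torch
from botorch.models import HeteroskedasticSingleTaskGP

train_X = torch.rand(30, 1)
train_Y = torch.sin(6 * train_X) + 0.2 * torch.randn(30, 1)
train_Yvar = 0.01 + 0.1 * train_X  # observed noise variances, one per point

# internally fits a GP to log(train_Yvar) and uses its exponentiated
# predictions as the (heteroskedastic) noise level at train and test inputs
model = HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar)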
Thanks @Balandat for your reply.
I was referring to this little bit of code that you posted some time ago:
import math
import torch
from gpytorch.likelihoods import MultitaskGaussianLikelihood
from gpytorch.likelihoods.multitask_gaussian_likelihood import _MultitaskGaussianLikelihoodBase
from gpytorch.likelihoods.noise_models import HeteroskedasticNoise

# training data: two tasks with opposite, input-dependent noise levels
train_x = torch.linspace(0, 1, 75)
sem_y1 = 0.05 + (0.75 - 0.05) * torch.linspace(0, 1, 75)
sem_y2 = 0.75 - (0.75 - 0.05) * torch.linspace(0, 1, 75)
train_y = torch.stack([
    torch.sin(train_x * (2 * math.pi)) + sem_y1 * torch.randn(train_x.size()),
    torch.cos(train_x * (2 * math.pi)) + sem_y2 * torch.randn(train_x.size()),
], -1)

# auxiliary GP fit to the log noise variances (MultitaskGPModel is the model
# class from the gpytorch multitask example)
train_y_log_var = torch.stack([(s ** 2).log() for s in (sem_y1, sem_y2)], -1)
log_noise_model = MultitaskGPModel(
    train_x,
    train_y_log_var,
    MultitaskGaussianLikelihood(num_tasks=2),
    num_tasks=2,
)

# main likelihood uses the noise model's (exponentiated) predictions as the noise
likelihood = _MultitaskGaussianLikelihoodBase(
    num_tasks=2,
    noise_covar=HeteroskedasticNoise(log_noise_model),
)
model = MultitaskGPModel(train_x, train_y, likelihood, num_tasks=2, rank=2)
I was wondering whether I could do something analogous to get around the issue of not yet having `MultitaskFixedGaussianNoise` available.
In my case I have two sets of observations of `Y` over `X`, but at two different noise levels. So I have observations `y1` over `x1` with known noise `n1`, and observations `y2` over `x2` with known noise `n2`, and as such I concatenate or stack the tensors as follows:

`Y = [y1, y2]`, `X = [x1, x2]` and `N = [n1, n2]`
Now, pass `N` and `X` to a GP model with a linear kernel and pass that as the `noise_covar` in a `_MultitaskGaussianLikelihoodBase`, which will serve as the likelihood for a GP model that takes `Y` and `X` as its inputs. Multi-sensor fusion using GPs is my goal here.

It's going to be less computationally efficient than a `MultitaskFixedGaussianNoise`, but at this time that wouldn't bother me, since the data is relatively small and it would be temporary until `MultitaskFixedGaussianNoise` comes online.

Do you see any issues with this approach?
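(A minimal sketch of the stacking described above, with made-up data; `n1`/`n2` here are the known per-sensor noise variances:)

import torch

# two sensors observing the same quantity at different known noise levels
x1, x2 = torch.rand(30, 1), torch.rand(40, 1)
y1 = torch.sin(6 * x1) + 0.05 * torch.randn(30, 1)
y2 = torch.sin(6 * x2) + 0.30 * torch.randn(40, 1)
n1 = torch.full((30, 1), 0.05 ** 2)  # known noise variance, sensor 1
n2 = torch.full((40, 1), 0.30 ** 2)  # known noise variance, sensor 2

X = torch.cat([x1, x2])  # stacked inputs
Y = torch.cat([y1, y2])  # stacked observations
N = torch.cat([n1, n2])  # quantized ("per-sensor") noise levels for the noise model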
Cheers
Galto
Hello, happy new year!
I was wondering if there is an ETA for the `MultitaskFixedGaussianNoise`?
I tried the "heteroskedastic approach", in order to instil some fixed noise behavior in a multi-task setting, but there are many issues with doing it that way.
I am currently circumventing not having a `MultitaskFixedGaussianNoise` by using model lists and (single-task) `FixedGaussianNoise`, assuming the outcomes are independent, in order to make progress and in the hope that when `MultitaskFixedGaussianNoise` becomes available it would be a relatively simple change at the end.
Cheers
Galto
I know that Max is out for the next week. BoTorch has support for MTGPs with fixed noise... would something like https://botorch.org/v/0.1.0/api/models.html#fixednoisemultitaskgp help?
@eytan Thanks for pointing out the BoTorch fixed noise multitask model - I'll check it out.
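(For anyone landing here, a minimal usage sketch of that BoTorch model; it uses the long/Hadamard format, with the task index appended as a feature column - the data here is made up, argument names per the linked docs:)

import torch
from botorch.models import FixedNoiseMultiTaskGP

# long format: each row is one (input, task) observation
train_x = torch.rand(20, 1)
task_idx = torch.randint(0, 2, (20, 1)).to(train_x)
train_X = torch.cat([train_x, task_idx], dim=-1)   # task index as the last column
train_Y = torch.sin(6 * train_x) + 0.1 * torch.randn(20, 1)
train_Yvar = torch.full_like(train_Y, 0.05)        # fixed per-observation noise

model = FixedNoiseMultiTaskGP(train_X, train_Y, train_Yvar, task_feature=-1)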
Is there any news on this thread? It'd be a useful thing to have!
@wjmaddox, @qingfeng10 I know you are/were thinking about this as well. Are you working on this / planning to work on this in the near future?
It depends on the type of noise that's desired here. For a model that will have missing observations for some tasks, it's probably preferable to just use a BoTorch `FixedNoiseMTGP`.

It is possible (although it will be slow for large `n`) to just drop a fixed noise Gaussian likelihood directly into an MTGP, as in this code snippet:
import torch
import math
from botorch.models import KroneckerMultiTaskGP
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood

train_x = torch.linspace(0, 1, 75)
sem_y1 = 0.05 + (0.75 - 0.05) * torch.linspace(0, 1, 75)
sem_y2 = 0.75 - (0.75 - 0.05) * torch.linspace(0, 1, 75)
train_y = torch.stack([
    torch.sin(train_x * (2 * math.pi)) + sem_y1 * torch.randn(train_x.size()),
    torch.cos(train_x * (2 * math.pi)) + sem_y2 * torch.randn(train_x.size()),
], -1)
train_y_log_var = torch.stack([(s ** 2).log() for s in (sem_y1, sem_y2)], -1)

likelihood = FixedNoiseGaussianLikelihood(noise=train_y_log_var.exp().view(-1))

# KroneckerMultiTaskGP is basically the same model class as a MultitaskGP
# in the gpytorch example
mtgp = KroneckerMultiTaskGP(
    train_x.unsqueeze(-1),
    train_y,
    num_tasks=2,
)
mtgp.likelihood = likelihood

# NOTE that I didn't check interleaving here to see if this is returning the
# correct noise, just verifying that this implementation works;
# if you need to interleave, you should be able to transpose and then squeeze
mtgp.likelihood(mtgp(train_x))
# returns MultitaskMultivariateNormal(loc: torch.Size([150]))

# can train with something like below
from botorch.optim.fit import fit_gpytorch_torch
from gpytorch.mlls import ExactMarginalLogLikelihood

mll = ExactMarginalLogLikelihood(mtgp.likelihood, mtgp)
fit_gpytorch_torch(mll)
I have a multitask problem where the data is transformed so that the noise is the same on all of the tasks and I can reasonably assume cross-task noise = 0. Is there a simple way that I can wrap FixedNoiseGaussianLikelihood to make it play nice with multitask in this context?
Yes, that's basically what's done with BoTorch's `FixedNoiseMTGP` (https://botorch.org/v/0.1.0/api/models.html#fixednoisemultitaskgp).
Alternatively, if you really need the Kronecker structure, you could just set the likelihood noise to be what you want it to be and detach it:
import torch
from gpytorch.likelihoods import MultitaskGaussianLikelihood

# rank=0 gives independent (diagonal) task noises
likelihood = MultitaskGaussianLikelihood(num_tasks=4, rank=0)
likelihood.noise = 0.1  # for example
likelihood.task_noises = torch.tensor([1., 1., 1., 1.])
# for example, although both of these need to be non-zero

# detach the raw parameters so they are not updated during training
likelihood.raw_noise.detach_()
likelihood.raw_task_noises.detach_()
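(A quick sanity check for the detaching trick above: once detached, the raw parameters no longer require gradients, so an optimizer built over the likelihood's parameters leaves them fixed:)

assert not likelihood.raw_noise.requires_grad
assert not likelihood.raw_task_noises.requires_grad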
The example a couple of comments above is for noise that is potentially different across tasks and observations.
@wjmaddox Sorry, I meant that the noise is per-observation, but the same across tasks for a given observation. Appreciate the help though!
Are you observing all tasks for each observation?
@wjmaddox yes.
Something like this ought to work for you (and I probably ought to clean this up as a PR at some point):
import torch
# using botorch.models.KroneckerMultiTaskGP here but the API should be the same
# for a MTGP like in https://docs.gpytorch.ai/en/stable/examples/03_Multitask_Exact_GPs/Multitask_GP_Regression.html
from botorch.models import KroneckerMultiTaskGP
from gpytorch.likelihoods.multitask_gaussian_likelihood import _MultitaskGaussianLikelihoodBase
from gpytorch.likelihoods.noise_models import FixedGaussianNoise
from gpytorch.lazy import ConstantDiagLazyTensor, KroneckerProductLazyTensor

train_x = torch.randn(10, 2)
train_y = torch.randn(10, 4)
train_y_var = torch.rand(10).exp()

class FixedTaskNoiseMultitaskLikelihood(_MultitaskGaussianLikelihoodBase):
    def __init__(self, noise, *args, **kwargs):
        noise_covar = FixedGaussianNoise(noise=noise)
        super().__init__(noise_covar=noise_covar, *args, **kwargs)
        self.has_global_noise = False
        self.has_task_noise = False

    def _shaped_noise_covar(self, shape, add_noise=True, *params, **kwargs):
        if not self.has_task_noise:
            data_noise = self.noise_covar(*params, shape=torch.Size((shape[:-2],)), **kwargs)
            eye = torch.ones(1, device=data_noise.device, dtype=data_noise.dtype)
            # TODO: add in a shape for batched models
            task_noise = ConstantDiagLazyTensor(
                eye, diag_shape=torch.Size((self.num_tasks,))
            )
            return KroneckerProductLazyTensor(data_noise, task_noise)
        else:
            # TODO: copy over pieces from MultitaskGaussianLikelihood
            raise NotImplementedError("Task noises not supported yet.")

# setup is that the covariance is `D \kron I` where `D` is user supplied
likelihood = FixedTaskNoiseMultitaskLikelihood(num_tasks=4, noise=train_y_var, rank=0)
model = KroneckerMultiTaskGP(
    train_x,
    train_y,
    likelihood=likelihood,
)

# now test the posterior
test_x = torch.randn(20, 2)
model.eval()
model(test_x).rsample(torch.Size((32,))).shape
Let me know if that has issues or needs to be expanded somehow.
Has anyone solved this problem?