dask-ml
Incremental Wrapper ValueError: Layer not in the HighLevelGraph's layers:
What happened: I get a ValueError when I wrap my PyTorch model (built with skorch) in the Incremental wrapper.
ValueError: Layer ('fit-cb2461b73d6a9abbf8a6eacb8d7c983d', 3) not in the HighLevelGraph's layers: ['original-array-ca0ac32624c3fb9bb23d2ca89c89e186', 'array-ca0ac32624c3fb9bb23d2ca89c89e186', 'transpose-c840e9da8930d4f38c6497326ee6e59d', 139711252509824, 'getitem-4f71595287eec8f0490287973829e6f2', 'reshape-fe8d57675329c0348f3a175d62f073d0']
What you expected to happen: I expected the neural network to begin training incrementally on my dataset.
Minimal Complete Verifiable Example:
import dask.array as da
import torch.nn as nn
from skorch import NeuralNetRegressor
import torch.optim as optim
from dask_ml.wrappers import Incremental
from dask_ml.datasets import make_regression


class GRU(nn.Module):
    def __init__(self, inputsize, outputsize):
        super(GRU, self).__init__()
        self.inputsize = inputsize
        self.outputsize = outputsize
        self.hiddenlayers = nn.GRU(self.inputsize, self.inputsize, num_layers=2, batch_first=True)
        self.outputlayer = nn.Linear(self.inputsize, self.outputsize)

    def forward(self, x):
        output, hidden = self.hiddenlayers(x)
        x = hidden[-1]
        x = self.outputlayer(x)
        return x


niceties = {
    "callbacks": False,
    "warm_start": False,
    "train_split": None,
    "max_epochs": 1,
}

model = NeuralNetRegressor(
    module=GRU,
    module__inputsize=10,
    module__outputsize=1,
    criterion=nn.L1Loss(),
    optimizer=optim.SGD,
    optimizer__lr=0.01,
    optimizer__momentum=0.9,
    batch_size=100,
    **niceties
)

inc = Incremental(model, scoring="r2")

# Trains the model incrementally using chunked data
X, y = make_regression(n_samples=10000, n_features=10, n_targets=1, chunks=100)

# Creates data for github example
y = X[:, 1].reshape(-1, 1)

# reshapes the input for sequential model
X = da.stack([X, X, X], axis=1)
print(X.shape)
print(y.shape)

inc.fit(X, y)
print(inc.score(X, y))
Anything else we need to know?:
Environment: Anaconda 3
- Dask version: 1.7.0
- Python version: 3.9.7
- Operating System: Pop OS 21.10
- Install method (conda, pip, source): conda
Can you post the full traceback?
Does NeuralNetRegressor implement partial_fit?
Traceback (most recent call last):
  File "/coolstuff.py", line 66, in <module>
    inc.fit(X, y)
  File "/home/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/wrappers.py", line 493, in fit
    self._fit_for_estimator(estimator, X, y, **fit_kwargs)
  File "/home/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/wrappers.py", line 477, in _fit_for_estimator
    result = fit(
  File "/home/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/_partial.py", line 136, in fit
    value = Delayed((name, nblocks - 1), new_dsk)
  File "/home/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/delayed.py", line 497, in __init__
    raise ValueError(
ValueError: Layer ('fit-3a23b2bee452186733b389b1c5a56db7', 99) not in the HighLevelGraph's layers: ['stack-a184c924722ff5ca46fb6c0ef85f0d12', 'normal-862d9ba0e2bc042085f36c63710cfe0b', 140630240623552, 'getitem-0d109ff96323bf4206bd2b008436a85a', 'reshape-84d7afd23da422ab179600d3f13f97b8']
The neural net regressors from skorch do implement partial_fit.
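For anyone wanting to check this on their own estimator: the Incremental wrapper trains by calling partial_fit on the wrapped estimator, so a duck-typing check like the sketch below answers the question quickly. The helper name and both classes are hypothetical stand-ins written for illustration; they are not part of skorch or dask-ml.

```python
def supports_partial_fit(estimator) -> bool:
    """Return True if the estimator exposes a callable partial_fit method."""
    return callable(getattr(estimator, "partial_fit", None))


class BatchOnlyModel:
    """Hypothetical estimator that can only be fit in one shot."""

    def fit(self, X, y):
        return self


class StreamingModel(BatchOnlyModel):
    """Hypothetical estimator that also accepts data one batch at a time."""

    def partial_fit(self, X, y):
        return self


print(supports_partial_fit(BatchOnlyModel()))   # False
print(supports_partial_fit(StreamingModel()))   # True
```

With skorch installed, the same helper can be called on the NeuralNetRegressor instance from the example.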
Dask-ML v1.7.0 is from Sep. 2020. Try upgrading; I think this has been fixed since then (the main branch certainly raises an error here): https://github.com/dask/dask-ml/blob/67d28b15dfff7869e9a04def203aa129a3540b27/dask_ml/_partial.py#L96-L98
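Before and after upgrading, it can help to confirm which versions are actually installed in the active environment. A minimal sketch using only the standard library follows; the helper name is made up for illustration, and the package names passed to it are the distribution names I would expect on PyPI, so adjust them if your environment differs.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the packages involved in this issue.
for name in ("dask", "dask-ml", "skorch", "torch"):
    print(name, installed_version(name) or "not installed")
```

Posting this output alongside a traceback makes it easier to tell whether a fix on the main branch has reached the installed release.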
That fixed one issue I was having. However, there is still one error with the Incremental wrapper:
Traceback (most recent call last):
  File "/home/christopherwoolford/Documents/Research/Deep learning research with Dr. Aledhari/Gene Regulatory Elements Prediction/GRU/GRUStuff/GRU/GRUwithDask.py", line 89, in <module>
    train = inc.fit(X, y)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/wrappers.py", line 579, in fit
    self._fit_for_estimator(estimator, X, y, **fit_kwargs)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/wrappers.py", line 563, in _fit_for_estimator
    result = fit(
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/_partial.py", line 137, in fit
    return value.compute()
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/base.py", line 290, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/base.py", line 573, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/threaded.py", line 81, in get
    results = get_async(
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/local.py", line 506, in get_async
    raise_exception(exc, tb)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/local.py", line 314, in reraise
    raise exc
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/local.py", line 219, in execute_task
    result = _execute_task(task, data)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/dask_ml/_partial.py", line 20, in _partial_fit
    model.partial_fit(x, y, **kwargs)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/skorch/net.py", line 1174, in partial_fit
    self.fit_loop(X, y, **fit_params)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/skorch/net.py", line 1074, in fit_loop
    self.check_data(X, y)
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/skorch/regressor.py", line 69, in check_data
    if get_dim(y) == 1:
  File "/home/christopherwoolford/anaconda3/envs/Torch/lib/python3.9/site-packages/skorch/utils.py", line 193, in get_dim
    return y.dim()
AttributeError: 'tuple' object has no attribute 'dim'
There appears to be a bug in the wrapper: the data passed in is not converted into a torch tensor. The traceback shows skorch's get_dim receiving y as a tuple. When I modified the code to use skorch on its own, without the wrapper, training ran fine, so the Incremental wrapper does not seem to be converting the input data into the form skorch needs.
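For reference, running without the wrapper amounts to feeding the estimator one block at a time yourself. The sketch below shows only that loop; it is an illustration, not the code I ran and not how dask-ml implements Incremental. The function name and the recording model are hypothetical stand-ins so the example runs without torch or dask installed.

```python
from typing import Any, Iterable


def fit_blockwise(model: Any, x_blocks: Iterable[Any], y_blocks: Iterable[Any]) -> Any:
    """Call model.partial_fit once per (X, y) block pair, in order."""
    for x_block, y_block in zip(x_blocks, y_blocks):
        model.partial_fit(x_block, y_block)
    return model


class RecordingModel:
    """Hypothetical stand-in for an estimator; records the block sizes it sees."""

    def __init__(self):
        self.seen = []

    def partial_fit(self, X, y):
        self.seen.append((len(X), len(y)))
        return self


# Two blocks of in-memory data: one with two samples, one with one sample.
model = fit_blockwise(RecordingModel(), [[1, 2], [3]], [[10, 20], [30]])
print(model.seen)  # [(2, 2), (1, 1)]
```

With Dask arrays chunked only along the first axis, the blocks could be produced with something like `(X.blocks[i].compute() for i in range(X.numblocks[0]))` and the same expression for y, assuming each block fits in memory. That hands the estimator NumPy arrays rather than graph keys, but I have not verified it against the GRU example in this thread.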