diffxpy
BUG: `struct.error` when calling `de.test.wald(...)`
I was using diffxpy to find marker genes for one cluster versus all other clusters in an scRNA-seq dataset stored in adata.
When performing the following:
# Performing differential expression to find the markers for
# cluster 20, which was defined as CD33+ myeloid by HGA (Human Gene Atlas).
adata.obs['twty_All'] = [
'group 1' if int(i) == 20 else 'group 2' for i in adata.obs['leiden']
]
cl20_test = de.test.wald(
data=adata,
formula_loc="~ 1 + twty_All",
factor_loc_totest="twty_All"
)
The following error is produced, which I do not know how to resolve:
training location model: False
training scale model: True
iter 0: ll=23072758620.633224
iter 1: ll=23072758620.633224, converged: 0.00% (loc: 100.00%, scale update: False), in 0.00sec
Fitting 26542 dispersion models: (progress not available with multiprocessing)
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/diffxpy/testing/tests.py", line 736, in wald
**kwargs,
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/diffxpy/testing/tests.py", line 244, in _fit
**train_args
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/batchglm/models/base/estimator.py", line 124, in train_sequence
self.train(**d, **kwargs)
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/batchglm/train/numpy/base_glm/estimator.py", line 112, in train
nproc=nproc
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/batchglm/train/numpy/base_glm/estimator.py", line 351, in b_step
nproc=nproc
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/site-packages/batchglm/train/numpy/base_glm/estimator.py", line 478, in _b_step_loop
) for j in idx_update]
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
put(task)
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "[USERNAME]/anaconda3/envs/scMachineLearning1/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
I have noted that there are similar issues involving multiprocessing, but I haven't had the time to check them out in detail.
(de.__version__ = 'v0.7.4', sc.__version__ = '1.4.6')
I had the same issue and I think I narrowed it down to the Python version. I was running an older version of Python (3.6); in newer versions of Python this issue was fixed, specifically by this commit.
Yes, I heard this should be addressed in Python 3.8, but we are also working on mitigating it internally. I will keep you posted!
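For anyone landing here, the last frame of the traceback can be reproduced on its own, without diffxpy or batchglm. The sketch below is only an illustration: the helper name `fits_in_int32_header` and the payload sizes are made up for the example. It shows that `struct.pack("!i", n)` fails as soon as `n` (the byte length of the pickled task that multiprocessing tries to send to a worker) exceeds 2147483647, i.e. roughly 2 GiB.

```python
import struct
import sys

# Largest value the "!i" (big-endian signed 32-bit) format can encode.
INT32_MAX = 2**31 - 1


def fits_in_int32_header(n_bytes: int) -> bool:
    """Return True if a message of n_bytes can be described by a "!i" header.

    Hypothetical helper for illustration; it mirrors the struct.pack call
    shown in the traceback but is not part of any library.
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


# Made-up payload sizes chosen to straddle the limit.
small_payload = 1_000_000
large_payload = INT32_MAX + 1

print("small payload fits:", fits_in_int32_header(small_payload))
print("large payload fits:", fits_in_int32_header(large_payload))

# The comments above suggest the interpreter version matters, so report it.
print("running Python >= 3.8:", sys.version_info >= (3, 8))
```

If the second check fails on your machine as well, the error most likely comes from the size of the data being pickled for the worker processes rather than from the model formula itself.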
Hi, I have the same issue. Are there any solutions already?