PaddleNLP

[Question]: Taskflow batch_size question

Open SIKtt opened this issue 2 years ago • 2 comments

Please ask your question

When using Taskflow with the default batch_size parameter, I ran into the error below. Setting batch_size=1 avoids it. What causes this error?

Taskflow(task='information_extraction', schema=[], task_path='./model', device_id=args.device)
Python 3.8.5
----
paddle               1.0.2
paddle-bfloat        0.1.7
paddle2onnx          1.0.1
paddlefsl            1.1.0
paddlenlp            2.4.0
paddlepaddle         2.3.2
paddlepaddle-gpu     2.3.2
----
CUDA: 11.2
cuDNN Version: 8.1
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/python3/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "./.local/lib/python3.8/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 217, in _thread_loop
    batch = self._dataset_fetcher.fetch(indices,
  File "./.local/lib/python3.8/site-packages/paddle/fluid/dataloader/fetcher.py", line 134, in fetch
    data = self.collate_fn(data)
  File "./.local/lib/python3.8/site-packages/paddle/fluid/dataloader/collate.py", line 77, in default_collate_fn
    return [default_collate_fn(fields) for fields in zip(*batch)]
  File "./.local/lib/python3.8/site-packages/paddle/fluid/dataloader/collate.py", line 77, in <listcomp>
    return [default_collate_fn(fields) for fields in zip(*batch)]
  File "./.local/lib/python3.8/site-packages/paddle/fluid/dataloader/collate.py", line 58, in default_collate_fn
    batch = np.stack(batch, axis=0)
  File "<__array_function__ internals>", line 5, in stack
  File "./.local/lib/python3.8/site-packages/numpy/core/shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
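
The traceback ends in `np.stack` inside `default_collate_fn`, which can only combine samples that all have the same shape. That is consistent with `batch_size=1` avoiding the error, since a batch of one sample has nothing to mismatch against. The thread does not confirm why the samples differ in shape here, so the following is only a minimal plain-Python sketch of the general failure mode and the usual remedy (padding to a common length). The helpers `can_stack` and `pad_batch` and the token ids are hypothetical and are not part of PaddleNLP or Paddle.

```python
from typing import List, Sequence


def can_stack(batch: Sequence[Sequence[int]]) -> bool:
    """Return True when every sample has the same length, i.e. the batch is rectangular."""
    return len({len(sample) for sample in batch}) <= 1


def pad_batch(batch: Sequence[Sequence[int]], pad_value: int = 0) -> List[List[int]]:
    """Right-pad every sample with pad_value up to the longest sample in the batch."""
    max_len = max((len(sample) for sample in batch), default=0)
    return [list(sample) + [pad_value] * (max_len - len(sample)) for sample in batch]


# Two hypothetical token-id sequences of different lengths.
batch = [[101, 7, 8, 102], [101, 9, 102]]

# Unequal lengths: this is the situation in which stacking raises ValueError.
print(can_stack(batch))

# A single sample is always stackable, which matches the batch_size=1 workaround.
print(can_stack(batch[:1]))

# After padding, every sample has the same length and the batch can be stacked.
padded = pad_batch(batch)
print(can_stack(padded))
```

Padding keeps the throughput benefit of batching, whereas `batch_size=1` sidesteps the mismatch at the cost of processing one sample at a time.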

SIKtt, Oct 14 '22 09:10

Quick question: is the schema empty when you run this?

wawltor, Oct 14 '22 11:10

Yes, it is empty. I call the set_schema method afterwards.

SIKtt, Oct 14 '22 13:10

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot], Dec 14 '22 00:12

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Dec 28 '22 10:12