finetune/dataset.py | TypeError: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind'

Open JamePeng opened this issue 1 year ago • 8 comments

Error info:

  File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 208, in <module>
    train()
  File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 203, in train
    trainer.train()
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2291, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2721, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 3572, in evaluate
    output = eval_loop(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 3747, in evaluation_loop
    for step, inputs in enumerate(dataloader):
  File "/root/miniconda3/lib/python3.10/site-packages/accelerate/data_loader.py", line 464, in __iter__
    next_batch = next(dataloader_iter)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 46, in __getitem__
    ret = preprocess(
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 276, in preprocess
    input_dict = conversation_to_ids(conversation, tokenizer, llm_type)
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 113, in conversation_to_ids
    ids = torch.from_numpy(np.hstack(input_ids, dtype=np.int32))
  File "/root/miniconda3/lib/python3.10/site-packages/numpy/core/shape_base.py", line 357, in hstack
    return _nx.concatenate(arrs, 0, dtype=dtype, casting=casting)
TypeError: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind'

Finetune log here: log.txt

JamePeng • May 21 '24 15:05

Can you print input_ids or locate the failing sample? It shouldn't be dtype=float64 after tokenizer.encode.

YuzaChongyi • May 22 '24 02:05
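One way to locate a failing sample is to index the dataset directly, outside the trainer, and record which indices raise. The sketch below is only an illustration: `find_failing_indices` and `ToyDataset` are hypothetical names, and the toy class stands in for the real dataset whose `__getitem__` appears in the traceback.

```python
def find_failing_indices(dataset, max_failures=5):
    """Index every item and collect (index, error) pairs for items that raise."""
    failures = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]  # runs the same __getitem__ path the DataLoader uses
        except Exception as exc:
            failures.append((idx, f"{type(exc).__name__}: {exc}"))
            if len(failures) >= max_failures:
                break  # stop early so a badly broken dataset does not flood the log
    return failures


class ToyDataset:
    """Hypothetical stand-in: raises on empty rows, like a sample that cannot be preprocessed."""

    def __init__(self, rows):
        self.rows = rows

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        if not row:
            raise TypeError("empty row")
        return row


print(find_failing_indices(ToyDataset([[1, 2], [], [3]])))
```

With the real dataset object in place of `ToyDataset`, the returned indices point at the JSON entries worth inspecting by hand.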

error image: 3c92ab3c46017e79

error dataset json data: { "id": "8887_zh", "image": "./train_val_images/train_images/3c92ab3c46017e79.jpg", "conversations": [ { "role": "user", "content": "<image>\n请描述这张图片。" }, { "role": "assistant", "content": "描述: \n图片显示了指示合并和车道关闭的路标,道路上有交通锥和警用摩托车,暗示着道路施工或交通情况。\n\n文本:\n左侧:\"前方左车道封闭\"\n中心附近:一个橙色菱形标志,有两个黑色箭头指向对方,象征车道合并,无文本。\n右侧:一个橙色菱形标志,带有两个车道合并的符号和文字,“前方左车道封闭”。" }, { "role": "user", "content": "请提供更多关于图片中文本的细节。" }, { "role": "assistant", "content": "左边:前方左侧车道封闭\n\n中间附近:一个橙色的菱形标志,有两个黑色箭头相互指向,象征着车道合并,没有文字。\n\n右边:一个带有两个车道合并符号和文字“前方左侧车道封闭”的橙色菱形标志。" }, { "role": "user", "content": "请简述这张图片。" }, { "role": "assistant", "content": "图片显示了指示合并和车道封闭的道路标志,道路上有交通锥和警用摩托车,表明正在进行道路施工或存在某种交通状况。" }, { "role": "user", "content": "请进一步简要概述图片内容。" }, { "role": "assistant", "content": "图片捕捉了一条道路施工或交通场景,展示了多个路标和指示标志:左侧的标志警告前方“左侧车道封闭”,中间显示了一个橙色菱形标志,两个箭头指向对方,表示无伴文的车道合并;右侧的标志与左侧行为相同,带有相同的文字信息和两个车道合并的图形。道路上可见交通锥和警用摩托车,强调了活跃的交通控制。" } ] }, { "id": "8887_en", "image": "./train_val_images/train_images/3c92ab3c46017e79.jpg", "conversations": [ { "role": "user", "content": "<image>\nDescribe this image." }, { "role": "assistant", "content": "DESCRIPTION\nThe image shows road signs indicating a merge and lane closure, with traffic cones and police motorcycles on the road, suggesting roadwork or a traffic situation.\n\nTEXT\nTo the left: \"LEFT LANE CLOSED AHEAD\"\nNear the center: An orange diamond-shaped sign with two black arrows pointing towards each other, symbolizing lane merging, without text.\nTo the right: An orange diamond-shaped sign with the symbol of two lanes merging and the text, \"LEFT LANE CLOSED AHEAD\"" }, { "role": "user", "content": "Provide more details about the text in the image." 
}, { "role": "assistant", "content": "To the left: \"LEFT LANE CLOSED AHEAD\"\nNear the center: An orange diamond-shaped sign with two black arrows pointing towards each other, symbolizing lane merging, without text.\nTo the right: An orange diamond-shaped sign with the symbol of two lanes merging and the text, \"LEFT LANE CLOSED AHEAD\"" }, { "role": "user", "content": "Summarize this image." }, { "role": "assistant", "content": "The image shows road signs indicating a merge and lane closure, with traffic cones and police motorcycles on the road, suggesting roadwork or a traffic situation." }, { "role": "user", "content": "Give a brief overview of the image content." }, { "role": "assistant", "content": "The image captures a roadwork or traffic scene featuring several road signs and indications: the left sign warns of a \"LEFT LANE CLOSED AHEAD,\" the center shows an orange diamond-shaped sign with two arrows pointing towards each other indicating lane merging without accompanying text, and the right sign mirrors the left with the same message and a graphic of two lanes merging. Traffic cones and police motorcycles are visible on the road, emphasizing the active traffic control." } ] },


Error caught here:

ERROR:root:Error in conversation_to_ids: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind' conversation_to_ids input_ids: [[95396, 4194, 95388], [101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 111, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 5, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 112, 5, 34075, 1536, 3766, 72], [95396, 10850, 95388], [95355, 95361, 95335, 16094, 95342, 1696, 1417, 1571, 95361, 95323, 2889, 12663, 1450, 1457, 3334, 72], [95396, 4194, 95388], [94434, 1753, 4812, 1835, 1358, 3180, 1377, 1358, 3766, 72], [95396, 10850, 95388], [], [95396, 4194, 95388], [90351, 1370, 1851, 1536, 3766, 72], [95396, 10850, 95388], [95355, 95361, 95335, 16094, 95342, 1696, 1417, 1571, 95361, 95323, 2889, 12663, 1450, 1457, 3334, 72], [95396, 4194, 95388], [65486, 1348, 10338, 17474, 1379, 1358, 3766, 3660, 72], [95396, 10850, 95388], [2219, 3766, 4580, 5772, 1919, 6625, 3660, 1421, 7032, 1384, 7920, 1385, 7309, 2025, 1348, 10338, 3180, 6822, 24615, 1458, 41124, 1385, 6679, 1450, 1348, 3334, 72, 1900, 10074, 1457, 1358, 3766, 1982, 7595, 1358, 6761, 1379, 1348, 
15543, 1508, 1458, 18401, 3455, 15353, 1457, 1358, 13035, 2434, 1508, 3516, 5432, 1441, 47171, 72, 1507, 3766, 5156, 1649, 1348, 4065, 13021, 1450, 1348, 15135, 6882, 1385, 39977, 1358, 4290, 72, 2]]

conversation_to_ids type(input_ids): <class 'list'>

Traceback (most recent call last):
  File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 208, in <module>
    train()
  File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 203, in train
    trainer.train()
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2291, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2721, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 3572, in evaluate
    output = eval_loop(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 3747, in evaluation_loop
    for step, inputs in enumerate(dataloader):
  File "/root/miniconda3/lib/python3.10/site-packages/accelerate/data_loader.py", line 464, in __iter__
    next_batch = next(dataloader_iter)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 46, in __getitem__
    ret = preprocess(
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 291, in preprocess
    input_dict = conversation_to_ids(conversation, tokenizer, llm_type)
  File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 114, in conversation_to_ids
    ids = torch.from_numpy(np.hstack(input_ids, dtype=np.int32))
  File "/root/miniconda3/lib/python3.10/site-packages/numpy/core/shape_base.py", line 357, in hstack
    return _nx.concatenate(arrs, 0, dtype=dtype, casting=casting)
TypeError: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind'

JamePeng • May 22 '24 15:05

{'loss': 1.034, 'grad_norm': 6.085077285766602, 'learning_rate': 5e-07, 'epoch': 0.19}
{'loss': 1.1023, 'grad_norm': 3.602856397628784, 'learning_rate': 5e-07, 'epoch': 0.19}
61%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 4881/8000 [3:19:00<1:20:43, 1.55s/it]ERROR:root:Error in conversation_to_ids: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind' conversation_to_ids input_ids: [[95396, 4194, 95388], [101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 111, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 5, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 102, 112, 5, 34075, 1536, 3766, 72], [95396, 10850, 95388], [95355, 95361, 95335, 16094, 95342, 1696, 1417, 1571, 95361, 95323, 2889, 1358, 3660, 1449, 95361, 1352, 11405, 1421, 72], [95396, 4194, 95388], [94434, 1753, 4812, 1835, 1358, 3180, 1377, 1358, 3766, 72], [95396, 10850, 95388], [], [95396, 4194, 95388], [90351, 1370, 1851, 1536, 3766, 72], [95396, 10850, 95388], [95355, 95361, 95335, 16094, 95342, 1696, 1417, 1571, 95361, 95323, 2889, 1358, 3660, 1449, 95361, 1352, 11405, 1421, 72], [95396, 4194, 95388], [65486, 1348, 10338, 17474, 1379, 1358, 3766, 3660, 72], [95396, 10850, 95388], [2219, 3766, 1410, 13899, 1385, 1441, 
9955, 1520, 5336, 3660, 1467, 1358, 3334, 1421, 1932, 7032, 1649, 1843, 24097, 1450, 1458, 28563, 4360, 6914, 72, 22093, 1835, 1358, 3766, 6288, 95342, 2754, 1842, 66969, 5860, 1476, 1982, 7309, 95342, 6246, 84695, 3642, 1385, 1358, 72877, 1385, 24426, 1358, 3766, 95361, 95328, 3660, 72, 8513, 95342, 1919, 6625, 14966, 2213, 1467, 10057, 95342, 19657, 95342, 7936, 95342, 1508, 1842, 3543, 3180, 3019, 1358, 3766, 1502, 4580, 95342, 11231, 1358, 91990, 1379, 1358, 3766, 14301, 26698, 72, 2]] conversation_to_ids type(input_ids): <class 'list'> Traceback (most recent call last): File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 208, in train() File "/root/git_projects/MiniCPM-V/finetune/finetune.py", line 203, in train trainer.train() File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train return inner_training_loop( File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2178, in _inner_training_loop for step, inputs in enumerate(epoch_iterator): File "/root/miniconda3/lib/python3.10/site-packages/accelerate/data_loader.py", line 464, in iter next_batch = next(dataloader_iter) File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in next data = self._next_data() File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data data = self._dataset_fetcher.fetch(index) # may raise StopIteration File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch data = [self.dataset[idx] for idx in possibly_batched_index] File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in data = [self.dataset[idx] for idx in possibly_batched_index] File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 46, in getitem ret = preprocess( File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 291, in preprocess input_dict = 
conversation_to_ids(conversation, tokenizer, llm_type) File "/root/git_projects/MiniCPM-V/finetune/dataset.py", line 114, in conversation_to_ids ids = torch.from_numpy(np.hstack(input_ids, dtype=np.int32)) File "/root/miniconda3/lib/python3.10/site-packages/numpy/core/shape_base.py", line 357, in hstack return _nx.concatenate(arrs, 0, dtype=dtype, casting=casting) TypeError: Cannot cast array data from dtype('float64') to dtype('int32') according to the rule 'same_kind'

JamePeng • May 23 '24 00:05

I noticed that one of your id lists is []. Is it possible that one of your inputs has empty content?

YuzaChongyi • May 23 '24 10:05
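For context, an empty list is enough to produce the float64 in the error message: NumPy has no values to infer a type from, so it falls back to float64, and the 'same_kind' casting rule refuses to turn float64 into int32. The following is a minimal sketch of that behavior and of one possible workaround. It uses `np.concatenate` to mirror the inner call shown in the traceback; the helper names are hypothetical and are not part of the MiniCPM-V code.

```python
import numpy as np

# Token-id chunks shaped like the logged input_ids: one chunk is empty.
chunks = [[95396, 4194, 95388], [], [95396, 10850, 95388]]

# An empty Python list carries no type information, so NumPy defaults to float64.
print(np.asarray([]).dtype)  # float64


def concat_like_hstack(chunks):
    """Mimic the failing call: convert chunks without a dtype, then concatenate to int32."""
    arrays = [np.asarray(c) for c in chunks]
    return np.concatenate(arrays, 0, dtype=np.int32, casting="same_kind")


def concat_with_explicit_dtype(chunks):
    """Possible workaround (an assumption, not the project's fix): type each chunk up front."""
    arrays = [np.asarray(c, dtype=np.int32) for c in chunks]
    return np.concatenate(arrays)


try:
    concat_like_hstack(chunks)
except TypeError as exc:
    print("fails:", exc)

ids = concat_with_explicit_dtype(chunks)
print(ids.dtype, ids.tolist())
```

The explicit dtype only avoids the cast error. An empty chunk still means a conversation turn produced no tokens, which is worth fixing in the data itself.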

This is the result of decoding your input ids. There is an <AI> immediately followed by <用户>, with no content in between:

<用户><image><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk></image><slice><image><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk></image><image><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk></image> \n<image><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk></image><image><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk></image></slice> \nDescribe this image.<AI>I'm sorry, but I can't provide assistance with that request.<用户>Provide more details about the text in the image.<AI><用户>Summarize this image.<AI>I'm sorry, but I can't provide assistance with that request.<用户>Give a brief overview of the image content.<AI>The image provided 
contains no visual content for description and appears to contain only a brief text statement expressing an inability to assist with a request. This suggests that the image may serve the purpose of a placeholder or an automated response indicating that the desired information or action cannot be fulfilled. The image likely has a simple layout with a plain background to emphasize the message.</s>

YuzaChongyi • May 23 '24 10:05

So what is the cause of this bug? My dataset looks fine to me.

JamePeng • May 23 '24 15:05

Are you trying to fine-tune version 2.5 of MiniCPM-V? I think the error means the input ends up empty for some reason. One possible cause is that the correct 'conversation_to_ids' function was not used. Please check.

Cuiunbo • May 23 '24 17:05

According to the decoded result, your input has an empty content field: the <AI> turn after "Provide more details about the text in the image." decodes to nothing.

YuzaChongyi • May 24 '24 02:05
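To check a dataset file for this kind of problem before training, the conversations can be scanned for turns whose content is missing or blank. This is a rough sketch: the field names "id", "conversations", "role" and "content" are taken from the sample posted above, `find_empty_turns` is a hypothetical helper, and treating whitespace-only content as empty is an assumption.

```python
import json


def find_empty_turns(samples):
    """Return (sample id, turn index, role) for every turn with missing or blank content."""
    problems = []
    for sample in samples:
        for i, turn in enumerate(sample.get("conversations", [])):
            content = turn.get("content")
            if not isinstance(content, str) or not content.strip():
                problems.append((sample.get("id"), i, turn.get("role")))
    return problems


# Tiny in-memory example; a real run would load the training JSON file instead.
samples = json.loads('''[
  {"id": "ok", "conversations": [
    {"role": "user", "content": "<image>\\nDescribe this image."},
    {"role": "assistant", "content": "A road sign."}]},
  {"id": "bad", "conversations": [
    {"role": "user", "content": "Provide more details."},
    {"role": "assistant", "content": ""}]}
]''')

print(find_empty_turns(samples))  # [('bad', 1, 'assistant')]
```

Reporting the role alongside the index makes it easier to tell whether the empty turn is a user prompt or an assistant reply.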