data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Results 117 data-juicer issues

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

### Before Reporting - [x] I have pulled the latest code of the main branch and run it again, and the bug still exists. - [x] I have read the...

bug

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

The files in ./data-juicer/my_pretrained_method were all obtained by running git clone on the projects with the corresponding names.

good first issue
dj:multimodal
dj:op

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question

### Before Asking - [x] I have read the [README](https://github.com/alibaba/data-juicer/blob/main/README.md) carefully. - [x] I have pulled the latest code of main branch to run again and...

question