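Help needed
After training finishes, I cannot train the feature index.
Only this is displayed.
The console reports 'AsyncRequest' object has no attribute '_json_response_data' and the process appears to be stuck, yet my dataset is only a little over 10 minutes of speech: 5.86 MB in 117 files.
After I click "Train feature index", Task Manager shows CPU usage spiking to around 50%, then dropping to about 10% after a while. After that, disk usage is 0 most of the time, occasionally jumping to 1% and immediately falling back to 0.
Does anyone know how to solve this?
Clicking "Train feature index" did generate trained_IVF822_Flat_nprobe_1_test_v2.index; there is just no index file whose name starts with add.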
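To check quickly which stage the index build reached, something like the following could be run against the output folder. This is only a sketch: the file-name pattern is inferred from the single name quoted above (`trained_IVF822_Flat_nprobe_1_test_v2.index`) and may not hold for other versions, and the helper names `describe_index` and `index_status` are made up for illustration. The number after `IVF` is the number of inverted lists (clusters) in the Faiss index description.

```python
import re
from pathlib import Path

# Pattern inferred from the one file name quoted in this thread (an assumption);
# other builds of the project may name their index files differently.
INDEX_NAME = re.compile(
    r"^(?P<stage>trained|added)_IVF(?P<nlist>\d+)_Flat"
    r"_nprobe_(?P<nprobe>\d+)_(?P<rest>.+)\.index$"
)


def describe_index(filename):
    """Split an index file name into its parts, or return None if it does not match."""
    match = INDEX_NAME.match(filename)
    if match is None:
        return None
    return {
        "stage": match.group("stage"),         # "trained" or "added"
        "nlist": int(match.group("nlist")),    # number of IVF clusters
        "nprobe": int(match.group("nprobe")),
        "rest": match.group("rest"),           # remaining suffix, e.g. "test_v2"
    }


def index_status(folder):
    """List the trained_* and added_* index files found directly in `folder`."""
    found = {"trained": [], "added": []}
    for path in sorted(Path(folder).glob("*.index")):
        info = describe_index(path.name)
        if info is not None:
            found[info["stage"]].append(path.name)
    return found
```

If `index_status(...)["added"]` comes back empty while `"trained"` is not, the run stopped somewhere between writing the trained index and writing the added one, which matches the symptom described here.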
hi! did you solve this?
No. I can only generate the required index files by halving the dataset. Maybe my 4060 graphics card has too little video memory (only 8 GB)? Or is it another hardware issue? My CPU is a 13th Gen Intel(R) Core(TM) i7-13650HX (2.60 GHz), with 16 GB of RAM.
How did you discover this, and why do you think halving the dataset works? I have a dataset of about 45 minutes. I somehow got an added_-prefixed .index file once, but I have no idea how; I can't reproduce it and now only get the trained_*.index file. I am on a setup with 8 L4s (24 GB of VRAM each), 400 GB of RAM, and 96 CPUs, so memory is definitely not an issue.
I am not sure how to get the added_*.index file and can't find any guide on it. Let me know if you figure it out! @zeng-hao123
I am just a novice myself and am only trying this project out of personal interest, so I don't really understand the principles involved; I only use some of its features. As for how I knew to halve the dataset: I asked an AI, which told me that my dataset was in compressed formats like OGG and MP3 and would become much larger once decoded, far exceeding the memory of my graphics card. It told me to halve the dataset, and after doing so I successfully obtained the file. When you click "Train feature index", the ultimate goal of the process is to obtain the added_*.index file. The trained_*.index file is just a by-product of that process; once it is complete, that file is only useful for sharing or retraining. That's my understanding.
sorry if i missed it, but how large was your dataset? and you're right, the ultimate goal is to get the added_*.index file.
I initially used a 5.86 MB dataset (mostly Ogg, with some MP3, 11 minutes and 57 seconds long). I later reduced it to a 3.13 MB dataset (mostly Ogg, with some MP3, 6 minutes and 15 seconds long) and got results. Adding more data did not yield results. I ran this project on a gaming laptop with 16 GB of RAM and a 4060 GPU with 8 GB of VRAM.
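For anyone who wants to try the same workaround, here is one possible way to build a reduced copy of a dataset folder. It is only a sketch: the post above does not say how the files were chosen, so keeping every second file is an assumption, as are the function name `copy_every_nth` and the list of audio suffixes. Note that halving the file count only roughly halves the total duration.

```python
import shutil
from pathlib import Path

# Suffixes treated as audio here are an assumption; adjust to your dataset.
AUDIO_SUFFIXES = {".ogg", ".mp3", ".wav", ".flac"}


def copy_every_nth(src, dst, step=2):
    """Copy every `step`-th audio file (sorted by name) from `src` into `dst`.

    The source folder is left untouched. Returns the names of the copied files.
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    files = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )
    kept = files[::step]  # step=2 keeps the 1st, 3rd, 5th, ... file
    for path in kept:
        shutil.copy2(path, dst / path.name)
    return [path.name for path in kept]
```

Calling `copy_every_nth("dataset", "dataset_half")` (both folder names are placeholders) and pointing the training run at the new folder keeps the original data intact, so the full set can be retried later.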
you were right! reducing the dataset by a huge amount does the job. i saw this in the FAQs, confirming the solution:
"The lack of an 'added' index file after One-click training may be due to the training set being too large, causing the addition of the index to get stuck"
the solution is still pretty weird, and it doesn't really make sense to me yet why it works.