xlearn
Transform data to binary format before training a model
Hi! Could you please help me understand the package API? Specifically, can I convert data from libsvm/libffm format to binary format before training a model?
Thanks in advance.
@orenov Hi, xLearn converts libsvm/libffm data to binary automatically. For example, if you have a text file called train.txt and run
./xlearn_train ./train.txt
you will find a new file called train.txt.bin in the current path.
Before training, xLearn automatically checks whether a .bin file already exists in the current path.
Yes. Hi @aksnzhy. Thanks for the quick response.
I'd like to convert the data to binary format before training, i.e. fully separate data preparation from the training phase. OK.
Can you please give me some intuition about what happens if no *.bin file exists yet and I start 4 separate training runs with different hyperparameters on the same data file? Will the *.bin file be created correctly? None of the 4 scripts will find a *.bin file, so each of them will start the procedure to create it.
I think you can make 4 separate copies of the data with different file names; xLearn will then create 4 different binary files correctly.
Also, xLearn's cross-validation can split a big data file into 4 small files automatically, if you need that.
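The "separate file names" suggestion above can be sketched in a few lines of Python. This is only an illustration, not part of xLearn: the helper name make_per_run_copies and the train.runN.txt naming scheme are made up here, and the sketch assumes (based on the train.txt.bin example in this thread) that the binary cache is named after the input file, so distinct input names give distinct .bin files.

```python
import shutil
import tempfile
from pathlib import Path


def make_per_run_copies(data_file, n_runs):
    """Create one uniquely named copy of data_file per training run.

    Hypothetical helper: the naming scheme is an assumption, chosen so
    that concurrent runs never read or write the same cache file.
    """
    src = Path(data_file)
    copies = []
    for i in range(n_runs):
        # train.txt -> train.run0.txt, train.run1.txt, ...
        dst = src.with_name(f"{src.stem}.run{i}{src.suffix}")
        shutil.copyfile(src, dst)
        copies.append(dst)
    return copies


if __name__ == "__main__":
    # Demo on a tiny libsvm-style file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        train = Path(tmp) / "train.txt"
        train.write_text("1 1:0.5 2:0.3\n0 1:0.1 3:0.9\n")
        for path in make_per_run_copies(train, 4):
            print(path.name)
```

Each of the 4 runs would then be pointed at its own copy. The trade-off is disk space and repeated conversion work: every copy is converted to binary separately.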
Is training faster with a .bin file?
Once the file gets large, it feels very slow to run. Any suggestions?
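Even after the format conversion, the training process itself is still fairly slow. Looking at the code, there doesn't seem to be a separate transform step at the moment, right? For large files, converting the data to binary format in advance would be much faster.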