DeepMicrobes

Something about "subset*.tfrec"

Open yongrenr opened this issue 1 year ago • 7 comments

Hello! I think your work is great. I'd love to try running your code, but after following your steps I get the error shown in the screenshots. Can you tell me how to fix it? Looking forward to your reply! [two screenshots]

yongrenr, May 21 '24 11:05

Hi, the issue seems to be caused by parallel, not DeepMicrobes. Please try installing parallel yourself first (rather than using the copy provided here) and make sure the installation works.
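One way to check that installation is sketched below. It only looks up which `parallel` executable is first on `PATH` and asks it for its version. The helper name `describe_tool` is hypothetical, and the sketch assumes the tool accepts a `--version` flag (GNU parallel does); a different program that happens to be named `parallel` may print an error message instead.

```python
import shutil
import subprocess


def describe_tool(name="parallel"):
    """Return (path, first line printed by `<name> --version`), or None if not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable exists but could not be run to completion.
        return path, ""
    lines = (result.stdout or result.stderr).strip().splitlines()
    return path, (lines[0] if lines else "")


if __name__ == "__main__":
    # Prints None when no `parallel` is installed, otherwise its path and version line.
    print(describe_tool("parallel"))
```

If the reported path points to a copy you did not install yourself, adjusting `PATH` so that your own installation comes first is one thing to try.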

MicrobeLab, May 22 '24 07:05

Thank you for the quick reply! I have solved this problem; the cause was a problem with Linux itself. I have now run into another problem. What is the expected format of the input file, and how is it read? I tried to input my own FASTA file, but the following error occurred: [three screenshots]

I tried changing label_id = int(identifier.split('|')[1]) to label_id = int(identifier.split(' ')[1]), but the ASCII error above still occurred. Looking forward to your reply!

yongrenr, May 22 '24 07:05

Hi, I'm not sure how to parse the seq_id in your file. This issue is not related to the DeepMicrobes code. The index error shows that the code could not get the second field of the header, which it expects to be a number.
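To illustrate the failure, here is a minimal sketch built around the line quoted earlier in this thread, `label_id = int(identifier.split('|')[1])`. The function name `parse_label_id` and the example headers are hypothetical and are not taken from the DeepMicrobes code; the sketch only shows why a header without a numeric second field cannot be parsed.

```python
def parse_label_id(identifier, sep="|"):
    """Return the integer found in the second `sep`-delimited field of a header."""
    fields = identifier.split(sep)
    if len(fields) < 2:
        # The bare expression identifier.split(sep)[1] would raise IndexError here.
        raise ValueError(f"no second field in header {identifier!r} using {sep!r}")
    # int() itself raises ValueError when the second field is not a number.
    return int(fields[1])


if __name__ == "__main__":
    print(parse_label_id("label|42"))  # hypothetical header with a numeric second field
    for header, sep in [("seq1 some description", "|"), ("seq1 some description", " ")]:
        try:
            parse_label_id(header, sep)
        except ValueError as err:
            print("cannot parse:", err)
```

Changing the separator to a space only helps if the second space-separated field of every header is the integer label.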

MicrobeLab, May 22 '24 07:05

Hello, I am very interested in your work. I have a few simple questions:

  1. I want to reproduce your final classification experiment. Is the data you used DeepMicrobes/mag_reads_150bp_1w? How do I run those files in batches?
  2. If I have two files, label.txt and sequence.txt, how do I use your model for training and classification? label.txt contains the processed classification labels, and sequence.txt contains the sequences from the FASTA file.

Looking forward to hearing from you!

yongrenr, May 23 '24 03:05

Hi,

  1. Sorry, I'm not sure what you mean by "in batches".
  2. I would recommend formatting your fastq headers the same way as ours.
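For the two-file case asked about above, one possible approach is to merge the labels and sequences into a single FASTA file whose headers carry the label. The sketch below is an assumption-laden illustration, not the DeepMicrobes format: the only detail taken from this thread is that the label is read from the second `|`-separated field, while the function name `write_labeled_fasta`, the `label` prefix, and the `read<i>` suffix are hypothetical. It also assumes one label and one sequence per line, in matching order.

```python
def write_labeled_fasta(labels, sequences, out_path, prefix="label"):
    """Write FASTA records whose headers look like '>label|<label_id>|read<i>'."""
    if len(labels) != len(sequences):
        raise ValueError("labels and sequences must have the same length")
    with open(out_path, "w") as out:
        for i, (label, seq) in enumerate(zip(labels, sequences)):
            # int() validates that every label is numeric before it is written.
            out.write(f">{prefix}|{int(label)}|read{i}\n{seq.strip()}\n")


def read_lines(path):
    """Return the non-empty, stripped lines of a text file."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    # Assumed inputs: label.txt and sequence.txt in the current directory.
    write_labeled_fasta(read_lines("label.txt"), read_lines("sequence.txt"), "labeled.fa")
```

Compare the resulting headers against the example files shipped with the repository before converting anything to TFRecord, since the actual expected header layout is not stated in this thread.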

MicrobeLab, May 24 '24 03:05

Hello, sorry that I did not express myself clearly. In the line seq, label_id = training_set_read_parser(rec) shown in the image, is label_id the label corresponding to seq? If so, I think my approach should work. What do you think? Looking forward to hearing from you! [screenshot]

yongrenr, May 24 '24 09:05

Feel free to change the code as long as it does not introduce any bugs :)

MicrobeLab, May 24 '24 09:05