bedtools map: **** ERROR: illegal number "1.00000". Exiting...

Open tjakobi opened this issue 2 years ago • 6 comments

Using bedtools map I get this strange error that is different from the results with previous bedtools versions (i.e. 2.28.0):

tjakobi@t470-tjakobi> bedtools --version
bedtools v2.30.0
tjakobi@t470-tjakobi>  bedtools map -c 10 -o max -a A_FILE -b B_FILE
***** ERROR: illegal number "0.4900000". Exiting...
tjakobi@t470-tjakobi> bedtools --version
bedtools v2.28.0
tjakobi@t470-tjakobi>  bedtools map -c 10 -o max -a A_FILE -b B_FILE
chr1    2066786 2072522 intron_chr1:2066786-2072522     1       +       1       5736    5736   1.000000      XYZ     .       .

A_FILE:

chr1    2066786 2072522 intron_chr1:2066786-2072522     1       +       1       5736    5736    1.000000       XYZ     .

B_FILE:

chr1    32672415        32672602        intron_chr1:32672415-32672602   1       +       6       150     187     0.8021390

Looking at the source code of bedtools, the error must originate from ParseTools.cpp:

CHRPOS str2chrPos(const char * __restrict str, size_t ulen) {

	if (ulen == 0) {
		ulen = strlen(str);
	}

	const char* endpos = str;
	long long result = 0;
	bool neg = false;
	char last = 0;

	if(*endpos == '-') neg = true, endpos ++;

	for(;(last = *endpos); endpos ++) {
		if(last < '0' || last > '9') break;
		result = result * 10 + last - '0';
	}

	if(last) {
		if(*endpos == 'e' || *endpos == 'E') {
			char* endpos = NULL;
			CHRPOS ret = (CHRPOS)strtod(str, &endpos);

			if(endpos && *endpos == 0) {
				return ret;
			}
		}
		fprintf(stderr, "***** ERROR: illegal number \"%s\". Exiting...\n", str);
		exit(1);
	}

	return neg?-result:result;
}

However, this seems to be a function to convert a string to a position?

Is this expected behavior? I could not find any other issue regarding this specific error.
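
To make the failure easier to trace, here is a rough Python port of the parsing logic quoted above. It is only a sketch: the name `parse_position` is hypothetical, the `strtod` branch is approximated with a regular expression, and a `ValueError` stands in for the `exit(1)` call. It shows that the digit loop stops at the first `.`, and since that character is neither `e` nor `E`, the value is rejected.

```python
import re

# Approximates the strtod() branch: digits, then an exponent, and nothing else.
_SCI_INT = re.compile(r"-?\d+[eE][+-]?\d+")


def parse_position(text: str) -> int:
    """Hypothetical port of str2chrPos; raises ValueError where the C++ exits."""
    pos = 1 if text.startswith("-") else 0
    negative = pos == 1
    result = 0

    # Consume leading ASCII digits, as the C++ for-loop does.
    while pos < len(text) and "0" <= text[pos] <= "9":
        result = result * 10 + int(text[pos])
        pos += 1

    if pos < len(text):
        # Only exponent notation such as "1e3" gets a second chance.
        if text[pos] in "eE" and _SCI_INT.fullmatch(text):
            return int(float(text))
        raise ValueError(f'illegal number "{text}"')

    return -result if negative else result


for value in ("2066786", "1e3", "1.000000"):
    try:
        print(value, "->", parse_position(value))
    except ValueError as err:
        print(value, "->", err)
```

So any decimal value reaching this function triggers the error, whichever column it came from.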

Thank you,

Tobias

tjakobi avatar Mar 07 '22 23:03 tjakobi

Can you test this with the latest version on github by cloning the repository and compiling it? I believe this has been fixed. cc @38

arq5x avatar Mar 08 '22 11:03 arq5x

I just tried bedtools v2.30.0-48-g868a9a24 after cloning and compiling, the same error persists.

tjakobi avatar Mar 08 '22 16:03 tjakobi

After looking into it, I think the root cause is that the input is not a valid BED12 file.

Bedtools uses the number of fields to determine which variant of the BED format it is parsing. In your case, bedtools tries to parse the input as BED12, and "1.000000" sits where the block count is located, which must be an integer.

str2chrPos is used at this point only because of some legacy, low-quality code, which we plan to fix in the future. That is not directly related to this issue, though, because the block count must be an integer by definition.
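
A quick pre-flight check can flag such rows before bedtools sees them. The sketch below is an illustration, not part of bedtools: `find_bad_block_counts` is a hypothetical helper, and it assumes tab-separated input and that a row with exactly 12 fields will be treated as BED12, as described above. It only inspects column 10 (the block count) and ignores the other BED12 fields.

```python
import re

_INTEGER = re.compile(r"-?\d+")


def find_bad_block_counts(lines):
    """Return (line_number, value) for 12-field rows whose column 10 is not an integer."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Skip blank lines and common header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumption: exactly 12 fields means the row is parsed as BED12.
        if len(fields) == 12 and not _INTEGER.fullmatch(fields[9]):
            problems.append((number, fields[9]))
    return problems


# The A_FILE row from this thread, rebuilt with explicit tabs.
a_row = "\t".join([
    "chr1", "2066786", "2072522", "intron_chr1:2066786-2072522",
    "1", "+", "1", "5736", "5736", "1.000000", "XYZ", ".",
])
print(find_bad_block_counts([a_row]))
```

In a real script the rows would come from an open file handle, for example `find_bad_block_counts(open(path))`.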

38 avatar Mar 10 '22 17:03 38

If you add an additional field to your input, you should get the expected output:

chr1    2066786 2072522 intron_chr1:2066786-2072522     1       +       1       5736    5736    1.000000        XYZ     .       .
chr1    32672415        32672602        intron_chr1:32672415-32672602   1       +       6       150     187     0.8021390       .

This is because the additional field prevents bedtools from identifying the input as BED12.

38 avatar Mar 10 '22 18:03 38

That makes sense, thank you for looking into that. It's a BED file produced by a third party tool; given that it worked with an older bedtools version I did not check the actual BED file validity.

Thank you for the detailed response!

tjakobi avatar Mar 10 '22 21:03 tjakobi

I've got the same issue with a script that incrementally adds columns with non-integer values to a BED file using bedtools map. The workaround was to count the number of columns throughout the script, add an artificial column when reaching 12 before running bedtools map, and then remove this 13th column at the end of the script. The most difficult part was figuring out what the issue was, which brought me to this page.
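
For anyone scripting the same workaround, a minimal sketch of the pad-then-strip round trip might look like the following. The helper names and the `.` placeholder are assumptions made for illustration, and the input is assumed to be tab-separated. The bedtools map call itself is not shown; it is simulated here by appending a value to the padded row.

```python
def pad_bed12_lookalikes(lines, placeholder="."):
    """Append a placeholder column to rows that have exactly 12 fields."""
    padded = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 12:
            fields.append(placeholder)  # 13 fields: no longer treated as BED12
        padded.append("\t".join(fields))
    return padded


def drop_column(lines, index):
    """Remove the column at 0-based `index` from every row that has it."""
    stripped = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > index:
            del fields[index]
        stripped.append("\t".join(fields))
    return stripped


row = "\t".join([
    "chr1", "2066786", "2072522", "intron_chr1:2066786-2072522",
    "1", "+", "1", "5736", "5736", "1.000000", "XYZ", ".",
])
padded = pad_bed12_lookalikes([row])
# Stand-in for the column that bedtools map would append to each row.
mapped = [line + "\t0.5" for line in padded]
# The placeholder sits at 0-based index 12 (the 13th column).
print(drop_column(mapped, 12)[0])
```

Padding only the rows with exactly 12 fields keeps narrower and wider files untouched, so the strip step should run only when padding was actually applied.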

This script was working fine with bedtools 2.25 as far as I remember.

If this is feasible, avoiding the error and treating the file as BED6 when columns 6-12 do not match the BED12 format could be a good option here (with a potential warning, maybe).

Best -Gael

retrogenomics avatar Jan 24 '23 17:01 retrogenomics