Text classification: Target 4294967295 is out of bounds
System Information (please complete the following information):
- ML.NET Version: 2.0
Describe the bug
Detailed info can be found in https://github.com/dotnet/machinelearning-modelbuilder/issues/2369
It seems that if a label's value is NaN, or any value that does not appear in the ValueToKey term mapper, that label is mapped to 0.
https://github.com/dotnet/machinelearning/blob/9d798f1bb3fb17fe97eba77a694c35e2cb46a4b7/src/Microsoft.ML.Data/Transforms/ValueToKeyMappingTransformerImpl.cs#L720
So maybe a fix could be to preprocess the data and filter out the rows where the label is null before training in text classification?
Yeah, the TorchSharp model is 1-based. I don't know if they have an "unknown" value or not. If they do, we need to map 0 to it correctly. If not, we need to filter those rows out before we pass the data in.
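The number in the error message is consistent with this reading. As a minimal sketch, assuming (this is not confirmed in the thread) that the key is decremented by one as an unsigned 32-bit integer before being handed to the model as the target index, the missing key 0 wraps around to exactly 4294967295. Python is used here only to illustrate the arithmetic; `key_to_target` is a hypothetical helper, not an ML.NET or TorchSharp API.

```python
UINT32_MASK = 0xFFFFFFFF  # keep results within the unsigned 32-bit range


def key_to_target(key: int) -> int:
    """Convert a 1-based key to a 0-based target index with uint32 wraparound.

    Assumption: key 0 means "missing/unknown"; keys 1..N are real classes.
    """
    return (key - 1) & UINT32_MASK


print(key_to_target(1))  # 0, the first real class
print(key_to_target(0))  # 4294967295, the missing key wraps around
```

If that assumption holds, any row whose label falls outside the term mapper would produce the out-of-bounds target reported above.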
Hi guys!
Could you please clarify how to work around the problem temporarily?
To work with labels, I defined an enum in advance and tried having its elements start at both 0 and 1. When starting from 0, I get the error: System.Runtime.InteropServices.ExternalException (0x80004005): Target 4294967295 is out of bounds. When starting from 1, I get the error: System.Runtime.InteropServices.ExternalException (0x80004005): Target 39 is out of bounds.
The fix on our side is to filter out of the dataset the rows where the label value is empty or null.
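As a rough illustration of that filtering step, here is a minimal sketch in Python rather than ML.NET code. The row layout, the `"Label"` field name, and the helpers `has_usable_label` and `drop_unlabeled` are all hypothetical; the same idea would need to be expressed with whatever data-loading code the project actually uses.

```python
import math
from typing import Optional, Union

Label = Optional[Union[str, float]]


def has_usable_label(label: Label) -> bool:
    """Return True unless the label is None, NaN, or an empty/blank string."""
    if label is None:
        return False
    if isinstance(label, float) and math.isnan(label):
        return False
    if isinstance(label, str) and not label.strip():
        return False
    return True


def drop_unlabeled(rows: list[dict]) -> list[dict]:
    """Keep only rows whose "Label" field is usable (field name is hypothetical)."""
    return [row for row in rows if has_usable_label(row.get("Label"))]


rows = [
    {"Text": "great product", "Label": "positive"},
    {"Text": "no label here", "Label": ""},
    {"Text": "missing label", "Label": None},
    {"Text": "bad parse", "Label": float("nan")},
]
print(drop_unlabeled(rows))  # only the "great product" row remains
```

Running the filter before training means no row reaches the label mapper with a value that would be mapped to 0.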