Support string sorting
Describe the feature and the current behavior/state.
Currently string sorting in the graph is unsupported:
Does not work:
tf.sort(['d', 'a', 'b'])
InvalidArgumentError: Value for attr 'T' of string is not in the list of allowed values: bfloat16, half, float, double, int8, int16, int32, int64, complex64, complex128
; NodeDef: {{node Neg}}; Op<name=Neg; signature=x:T -> y:T; attr=T:type,allowed=[DT_BFLOAT16, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_INT8, DT_INT16, DT_INT32, DT_INT64, DT_COMPLEX64, DT_COMPLEX128]> [Op:Neg]
Works (but not in the graph):
>>> x = tf.constant(['d', 'a', 'b'])
>>> tf.convert_to_tensor(sorted(x.numpy()), tf.string)
<tf.Tensor: shape=(3,), dtype=string, numpy=array([b'a', b'b', b'd'], dtype=object)>
Can tf.sort be amended to allow string sorting?
Will this change the current api? How?
No
Who will benefit with this feature?
Anyone who might need to string-sort in the graph
Any Other info.
I checked the code of operation, the problem is that tf.sort actually call the TOPK OP that does not support tf.string input type but only real number type.
REGISTER_OP("TopKV2")
.Input("input: T")
.Input("k: int32")
.Output("values: T")
.Output("indices: int32")
.Attr("sorted: bool = true")
.Attr("T: realnumbertype") // real num type only
.SetShapeFn(TopKShapeFn);
If change of string type is acceptable with interface design and the scope of change ? if it is acceptable, I'd like to commit the c++ code of impl.
@jvishnuvardhan
I checked the code of operation, the problem is that tf.sort actually call the TOPK OP that does not support tf.string input type but only real number type.
REGISTER_OP("TopKV2") .Input("input: T") .Input("k: int32") .Output("values: T") .Output("indices: int32") .Attr("sorted: bool = true") .Attr("T: realnumbertype") // real num type only .SetShapeFn(TopKShapeFn);If change of string type is acceptable with interface design and the scope of change ? if it is acceptable, I'd like to commit the c++ code of impl.
May be have it included in tf.strings like
tf.strings.sort(['d', 'b', 'a'])
>> ['a', 'b', 'c']
Hi,
Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may not be relevant with the current state of the code base.
The Tensorflow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings, with all the debugging information which could help us investigate.
Please follow the release notes to stay up to date with the latest developments which are happening in the Tensorflow space.
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.