How to find label from the estimator predict call for ranker eval purposes

Open vitalyli opened this issue 2 years ago • 2 comments

The eval_input_fn feeds a tuple of (features, label) from an ELWC dataset. Could someone share how to extract the corresponding label from the estimator predict call for ranker eval purposes?

for prediction in estimator.predict(eval_input_fn):
    doc_count = len(prediction)

    print(prediction)

    #  [ 2.556894   2.7441869  1.4277446 -0.6233128 -2.2694526 -5.4062233
    #    -2.8042    -3.6856973 -3.3688583 -6.5343776 -6.943048  -5.972501 ]

    smax = stable_softmax(prediction)
    rank_idx = np.argmax(smax)

    print(rank_idx)

    # 1
  
    # what was the label for this rank group?

vitalyli avatar Jul 30 '21 05:07 vitalyli

@vitalyli the predict call returns the scores for each example in the ELWC. It neither takes the label as input nor returns it. If you want the label, you can read it directly from the eval_input_fn.

ramakumar1729 avatar Aug 03 '21 18:08 ramakumar1729
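One way to act on this suggestion is to make a second pass over the same input and pair its labels with the predict output. The following is only a sketch under a strong assumption: both passes yield queries in the same order (for example, no shuffling and deterministic file reading). The helper name `pair_scores_with_labels` and the stand-in lists are hypothetical and not part of TF-Ranking or the Estimator API. In practice the two streams would come from `estimator.predict(eval_input_fn)` and from iterating over the labels that `eval_input_fn` produces.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel used to detect streams of different length


def pair_scores_with_labels(score_stream, label_stream):
    """Yield (scores, labels) per query, failing loudly on misalignment.

    Assumes both streams visit the queries in the same order.
    """
    paired = zip_longest(score_stream, label_stream, fillvalue=_MISSING)
    for i, (scores, labels) in enumerate(paired):
        if scores is _MISSING or labels is _MISSING:
            raise ValueError(f"streams differ in length at query {i}")
        if len(scores) != len(labels):
            raise ValueError(
                f"query {i}: {len(scores)} scores but {len(labels)} labels"
            )
        yield list(scores), list(labels)


# Stand-in data (hypothetical): two queries with 3 and 2 documents.
fake_scores = [[2.5, 2.7, 1.4], [0.1, -0.3]]
fake_labels = [[0.0, 1.0, 0.0], [1.0, 0.0]]

pairs = list(pair_scores_with_labels(fake_scores, fake_labels))
```

The length checks catch only gross misalignment. Two queries with the same number of documents could still be swapped silently, so this is no substitute for a deterministic input pipeline.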

The eval_input_fn is a function pointer, right? The Estimator base class takes eval_input_fn as an input and starts pulling a stream of samples from my test set. Each item in that stream is a tuple: (features, label). The issue with the predict call is that the Estimator discards the label and returns predictions only. I understand why it hides the label: to keep everyone honest. However, I want to run predict, softmax, and argmax to get the index of the predicted top-ranked document, and then compare that with the label index. The Estimator currently doesn't allow overriding this behavior in any way.

Meanwhile, eval_input_fn produces samples through a dataset that reads them from the N files it finds on disk. It is not clear whether the order in which the Estimator processes samples can be reproduced by running a second loop outside the Estimator in order to align labels with predictions. My ask is for the predict call to have an option to return the labels as supplied by eval_input_fn. The default and only mode currently discards labels, with no way to get to them.

vitalyli avatar Aug 03 '21 19:08 vitalyli
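Once scores and labels for a query are available side by side, the comparison described above could look like the sketch below. It is plain Python and makes several assumptions: `stable_softmax` here is one common definition (the thread does not show the original), the label list is invented for illustration, and negative labels are treated as padded entries to skip. Since softmax is monotonic, taking the argmax of the raw scores gives the same index as taking it after softmax.

```python
import math


def stable_softmax(scores):
    """Softmax with the max subtracted first to avoid overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def top1_hit(scores, labels):
    """True if the top-scored document carries the highest label.

    Entries with a negative label are assumed to be padding and ignored.
    Returns None when the query has no valid entries.
    """
    valid = [i for i, label in enumerate(labels) if label >= 0]
    if not valid:
        return None
    predicted = max(valid, key=lambda i: scores[i])
    best_label = max(labels[i] for i in valid)
    return labels[predicted] == best_label


# Scores copied from the question; the labels are hypothetical.
scores = [2.556894, 2.7441869, 1.4277446, -0.6233128, -2.2694526, -5.4062233,
          -2.8042, -3.6856973, -3.3688583, -6.5343776, -6.943048, -5.972501]
labels = [0.0, 1.0] + [0.0] * 10

rank_idx = argmax(stable_softmax(scores))  # same index as argmax(scores)
hit = top1_hit(scores, labels)
```

Averaging `top1_hit` over all queries (skipping `None`) would give a simple top-1 accuracy. It does not replace the ranking metrics the library computes during evaluation.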