human-eval

Task 145 makes no sense

Open jack-jjm opened this issue 1 year ago • 2 comments

The prompt, canonical solution and tests for task 145 are:

def order_by_points(nums):
    """
    Write a function which sorts the given list of integers
    in ascending order according to the sum of their digits.
    Note: if there are several items with similar sum of their digits,
    order them based on their index in original list.

    For example:
    >>> order_by_points([1, 11, -1, -11, -12]) == [-1, -11, 1, -12, 11]
    >>> order_by_points([]) == []
    """
    def digits_sum(n):
        neg = 1
        if n < 0: n, neg = -1 * n, -1 
        n = [int(i) for i in str(n)]
        n[0] = n[0] * neg
        return sum(n)
    
    return sorted(nums, key=digits_sum)


def check(candidate):
    # Check some simple cases
    assert candidate([1, 11, -1, -11, -12]) == [-1, -11, 1, -12, 11]
    assert candidate([1234,423,463,145,2,423,423,53,6,37,3457,3,56,0,46]) == [0, 2, 3, 6, 53, 423, 423, 423, 1234, 145, 37, 46, 56, 463, 3457]
    assert candidate([]) == []
    assert candidate([1, -11, -32, 43, 54, -98, 2, -3]) == [-3, -32, -98, -11, 1, 2, 43, 54]
    assert candidate([1,2,3,4,5,6,7,8,9,10,11]) == [1, 10, 2, 11, 3, 4, 5, 6, 7, 8, 9]
    assert candidate([0,6,6,-76,-21,23,4]) == [-76, -21, 0, 4, 23, 6, 6]

    # Check some edge cases that are easy to work out by hand.
    assert True, "This prints if this assert fails 2 (also good for debugging!)"

This makes no sense for negative inputs. For example, look at the first test:

assert candidate([1, 11, -1, -11, -12]) == [-1, -11, 1, -12, 11]

One reasonable interpretation of "digit sum" for a negative integer would be the digit sum of its absolute value. Another would be the negative of that digit sum. But neither of those rules is used here. Instead, judging from the canonical solution, the minus sign is treated as applying only to the first digit of the number, so a number like -186 breaks down as (-1, 8, 6) and has a digit sum of -1 + 8 + 6 = 13.
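To make the difference concrete, here is a minimal sketch of the three interpretations side by side. The function names are mine, not from the benchmark; only the third is meant to mirror what the canonical `digits_sum` computes.

```python
def abs_digit_sum(n: int) -> int:
    """Interpretation 1: digit sum of the absolute value."""
    return sum(int(d) for d in str(abs(n)))


def negated_digit_sum(n: int) -> int:
    """Interpretation 2: digit sum of the absolute value, negated when n < 0."""
    s = abs_digit_sum(n)
    return -s if n < 0 else s


def signed_first_digit_sum(n: int) -> int:
    """Interpretation 3 (what the canonical solution appears to do):
    only the leading digit carries the minus sign."""
    digits = [int(d) for d in str(abs(n))]
    if n < 0:
        digits[0] = -digits[0]
    return sum(digits)


for rule in (abs_digit_sum, negated_digit_sum, signed_first_digit_sum):
    print(rule.__name__, rule(-186))
# abs_digit_sum 15
# negated_digit_sum -15
# signed_first_digit_sum 13
```

For non-negative inputs all three agree, so the ambiguity only shows up once the input list contains negative numbers.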

In theory it is possible to infer this rule from the example given in the prompt, but the intended meaning is still extremely difficult to guess, even for a human. This may be a deliberate design choice, but I think it's worth flagging here so people are at least aware of it.
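As a quick check that the prompt's example really does single out one rule, the sketch below sorts the example list under each of three candidate rules and compares against the expected output from the docstring. The rule names (`"abs"`, `"negated"`, `"signed_first"`) and the helper `digit_sum` are hypothetical labels I made up for illustration.

```python
def digit_sum(n: int, rule: str) -> int:
    """Digit sum of n under one of three hypothetical rule names."""
    digits = [int(d) for d in str(abs(n))]
    if rule == "abs":
        return sum(digits)
    if rule == "negated":
        return -sum(digits) if n < 0 else sum(digits)
    if rule == "signed_first":
        if n < 0:
            digits[0] = -digits[0]  # minus sign applies to the leading digit only
        return sum(digits)
    raise ValueError(f"unknown rule: {rule}")


example = [1, 11, -1, -11, -12]
expected = [-1, -11, 1, -12, 11]  # taken from the prompt's docstring

results = {}
for rule in ("abs", "negated", "signed_first"):
    # sorted() is stable, so ties keep their original list order
    results[rule] = sorted(example, key=lambda n: digit_sum(n, rule))
    print(rule, results[rule], results[rule] == expected)
# abs [1, -1, 11, -11, -12] False
# negated [-12, -11, -1, 1, 11] False
# signed_first [-1, -11, 1, -12, 11] True
```

So the example does disambiguate, but only for someone who already thinks to try the signed-first-digit rule.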

jack-jjm, May 20 '24 14:05

I think I have run into the same problem. As a human, if I cannot arrive at the expected answer even with the example in the prompt, I can hardly tell what exactly I am supposed to do.

xiao-yao-er, Mar 07 '25 12:03

Did you check with the Hierarchical Prompting Taxonomy?

SmartManoj, Mar 11 '25 14:03