
MTurk `Answer` in `list_assignments_for_hit` are unparsed

Open • awatts opened this issue 8 years ago • 1 comment

First of all, thanks for finally bringing MTurk support into boto3/botocore!

One thing I've discovered that seems like a regression versus boto: when you call list_assignments_for_hit, you get a list of Assignments, each with an Answer key that contains unparsed QuestionFormAnswers XML. Here's a (somewhat anonymized) example:

{'Assignments': [{'AcceptTime': datetime.datetime(2017, 3, 9, 20, 41, 3, tzinfo=tzlocal()),
   'Answer': '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n<QuestionFormAnswers xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd">\n<Answer>\n<QuestionIdentifier>practiceResp</QuestionIdentifier>\n<FreeText/>\n</Answer>\n<Answer>\n<QuestionIdentifier>errors</QuestionIdentifier>\n<FreeText/>\n</Answer>\n<Answer>\n<QuestionIdentifier>userAgent</QuestionIdentifier>\n<FreeText>Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>audio_stall</QuestionIdentifier>\n<FreeText>no</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>audio_type</QuestionIdentifier>\n<FreeText>over-ear</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>audio_qual</QuestionIdentifier>\n<FreeText>excellent</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>SpeakerOdd</QuestionIdentifier>\n<FreeText>no</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>MattDescribe</QuestionIdentifier>\n<FreeText>normal|serious</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>comments</QuestionIdentifier>\n<FreeText/>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.age</QuestionIdentifier>\n<FreeText>52</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.sex</QuestionIdentifier>\n<FreeText>anonymized</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.ethnicity</QuestionIdentifier>\n<FreeText>anonymized</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.race</QuestionIdentifier>\n<FreeText>anonymized</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.raceother</QuestionIdentifier>\n<FreeText/>\n</Answer>\n<Answer>\n<QuestionIdentifier>rsrb.protocol</QuestionIdentifier>\n<FreeText>Replaced to anonymize</FreeText>\n</Answer>\n<Answer>\n<QuestionIdentifier>exposure4aResp</QuestionIdentifier>\n<FreeText>Cut to save space</FreeText>\n</Answer>\n</QuestionFormAnswers>\n',
   'ApprovalTime': datetime.datetime(2017, 3, 12, 16, 40, 10, tzinfo=tzlocal()),
   'AssignmentId': 'replaced to anonymize',
   'AssignmentStatus': 'Approved',
   'AutoApprovalTime': datetime.datetime(2017, 3, 24, 21, 57, 4, tzinfo=tzlocal()),
   'HITId': 'replaced to anonymize',
   'SubmitTime': datetime.datetime(2017, 3, 9, 20, 57, 4, tzinfo=tzlocal()),
   'WorkerId': 'replaced to anonymize'}],
 'NextToken': 'p1:+QIGhI093U4uzAjFUdVkZPREWJ5EkoHVjwK9kVZYsFQM91QeIYI/+N0zmjh/uQ==',
 'NumResults': 1,
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '21487',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Thu, 16 Mar 2017 14:40:44 GMT',
   'x-amzn-requestid': '890af077-0a56-11e7-871f-611f68d6492f'},
  'HTTPStatusCode': 200,
  'RequestId': '890af077-0a56-11e7-871f-611f68d6492f',
  'RetryAttempts': 0}}

In the original boto, get_assignment returned an (Assignment, HIT) tuple, where the Assignment had an answers property that was in turn a list of boto.mturk.connection.QuestionFormAnswer objects, each of which had qid (the QuestionIdentifier) and fields (the response) properties.

As it stands with the MTurk support in botocore, I would have to parse the XML for each Answer. Would it be possible for it to be automatically parsed into a dictionary? Something along these lines:

{ 'QuestionFormAnswers': [
    { 'QuestionIdentifier': ...,
       'FreeText': ...
    }, ...
  ]
}
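Until something like this exists in botocore, the parsing can be done client-side. Below is a minimal sketch using only the standard library; `parse_answer_xml` is a hypothetical helper name, not part of boto3 or botocore. It assumes the namespace shown in the example response above and only extracts `QuestionIdentifier` and `FreeText`, so any other answer element types would be ignored.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute in the example response above.
NS = {
    "mt": "http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/"
          "2005-10-01/QuestionFormAnswers.xsd"
}


def parse_answer_xml(answer_xml):
    """Hypothetical helper: turn an Answer XML string into a dict.

    Only QuestionIdentifier and FreeText are read; an empty <FreeText/>
    element yields an empty string.
    """
    # Encode to bytes so the XML declaration's encoding is honored.
    root = ET.fromstring(answer_xml.encode("utf-8"))
    answers = []
    for answer in root.findall("mt:Answer", NS):
        answers.append({
            "QuestionIdentifier": answer.findtext(
                "mt:QuestionIdentifier", default=None, namespaces=NS),
            "FreeText": answer.findtext(
                "mt:FreeText", default=None, namespaces=NS),
        })
    return {"QuestionFormAnswers": answers}
```

With a response from list_assignments_for_hit, this could be applied to each assignment's Answer value, for example `parse_answer_xml(assignment['Answer'])` while looping over `response['Assignments']`.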

awatts • Mar 16 '17 18:03

Adding this as a feature request, thanks.

dstufft • Mar 23 '17 14:03