Add annotation.reference information to the CSV exports
The problem
CSVs are much more approachable than JSON files for the average user, and instructors using annotation exports for various kinds of analysis want to see the relationship between annotations.
Example ticket: https://app.hubspot.com/contacts/6291320/record/0-5/18074066011
The solution
In the "export" option in the client, include the "reference" information so people using the exports for analysis can easily relate or reconstruct the annotation threads.
Example:

Current CSV export:

| Created at | Author | Page | URL | Group | Type | Quote | Comment | Tags |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2024-12-20 10:09 | mdiroberts | | https://example.com/ | abc internal testing? | Reply | | reply | |
| 2024-10-30 14:02 | mdiroberts | | https://example.com/ | abc internal testing? | Annotation | documents | anno | question |

Proposed CSV export:

| Created at | Author | Page | URL | Group | Type | ID | Reference | Quote | Comment | Tags |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2024-12-20 10:09 | mdiroberts | | https://example.com/ | abc internal testing? | Reply | "X72iLr7kEe-8vIsqCNlnHw" | "F5YqwJbpEe-kJWcL3BHQxQ" | | reply | |
| 2024-10-30 14:02 | mdiroberts | | https://example.com/ | abc internal testing? | Annotation | "F5YqwJbpEe-kJWcL3BHQxQ" | NULL | documents | anno | question |
The references field in the API is an array containing every ancestor of the annotation in the thread. Some ancestors may have been deleted, so you need the full list to be sure of being able to associate a reply with its top-level annotation. In JSON this is straightforward to encode as an array. In CSV we'd need to choose an encoding. The simplest solution is a comma-separated list, making sure that the field is properly escaped when exported.
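A minimal sketch of the comma-separated encoding, using Python's standard `csv` module rather than the client's actual export code. The column names, the helpers `encode_list` and `decode_list`, and the third ID are made up for illustration. It also assumes annotation IDs never contain a comma, which holds for the IDs shown in this issue.

```python
import csv
import io


def encode_list(values):
    """Join a list of IDs into one cell value (empty string for an empty list)."""
    return ",".join(values)


def decode_list(cell):
    """Split a cell value back into a list of IDs."""
    return cell.split(",") if cell else []


def rows_to_csv(rows):
    """Write rows to CSV text; csv.writer quotes any cell that contains a comma."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["ID", "References", "Comment"])  # hypothetical column names
    for row in rows:
        writer.writerow([row["id"], encode_list(row["references"]), row["text"]])
    return buf.getvalue()


rows = [
    {"id": "F5YqwJbpEe-kJWcL3BHQxQ", "references": [], "text": "anno"},
    {"id": "X72iLr7kEe-8vIsqCNlnHw",
     "references": ["F5YqwJbpEe-kJWcL3BHQxQ"], "text": "reply"},
    # Made-up ID for a nested reply, so the cell holds two ancestors.
    {"id": "hypothetical-id-3",
     "references": ["F5YqwJbpEe-kJWcL3BHQxQ", "X72iLr7kEe-8vIsqCNlnHw"],
     "text": "nested reply"},
]

csv_text = rows_to_csv(rows)
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Because the multi-ancestor cell is wrapped in double quotes, a standards-following CSV parser reads it back as a single value, and `decode_list` recovers the original list.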
For encoding: the list of tags on an annotation is currently handled correctly by Google Sheets when importing the CSV, though I've seen issues with Excel decoding it properly. Excel keeps assuming that each tag in the list is a new column value, steadily displacing all of the data for subsequent rows.
Some additional context from an instructor (to help with prioritization):
JSON files are not practical. I need the text of annotation and replies in a text format to use in a text-based program for qualitative research. It is important to know which reply attaches to what “original” annotation for the purpose of data analysis, since I will need to treat replied-to annotations differently to original annotations. Also, for an in-depth content analysis, I need to know which reply matches to what annotation. The replies are almost useless unless I know what they are replying to.
I'd forgotten we'd already had to solve encoding lists for handling the tags field. We should treat references in the same way. The request makes sense and is likely quite straightforward to implement.
JSON files are not practical. I need the text of annotation and replies in a text format to use in a text-based program for qualitative research.
For what it's worth, an interim solution may be to use AI to help with this:
- Go to ChatGPT
- Start a new chat and attach an exported JSON file
- Enter a prompt like: "Convert the records in this JSON file to CSV. Include only these fields: ID, username, text, tags, references."
This worked for me for a small file of 10-20 annotations. Not sure if it will work with a much larger one.
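For larger files, a short script may be a more predictable alternative to the prompt above. This is only a sketch: it assumes the exported JSON has a top-level `annotations` list and that each record carries `id`, `user`, `text`, `tags` and `references` keys, which should be checked against a real export. The sample data and the `annotations_to_csv` helper are made up for illustration.

```python
import csv
import io
import json

# Assumed record keys; adjust after inspecting a real export file.
FIELDS = ["id", "user", "text", "tags", "references"]


def annotations_to_csv(annotations):
    """Convert a list of annotation dicts to CSV text with list fields comma-joined."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for anno in annotations:
        writer.writerow({
            "id": anno.get("id", ""),
            "user": anno.get("user", ""),
            "text": anno.get("text", ""),
            # Lists become one comma-separated cell; the writer quotes it if needed.
            "tags": ",".join(anno.get("tags") or []),
            "references": ",".join(anno.get("references") or []),
        })
    return buf.getvalue()


# Made-up sample standing in for an exported file.
sample_json = """
{"annotations": [
  {"id": "F5YqwJbpEe-kJWcL3BHQxQ", "user": "mdiroberts", "text": "anno",
   "tags": ["question", "follow-up"], "references": []},
  {"id": "X72iLr7kEe-8vIsqCNlnHw", "user": "mdiroberts", "text": "reply",
   "tags": [], "references": ["F5YqwJbpEe-kJWcL3BHQxQ"]}
]}
"""

export = json.loads(sample_json)
csv_output = annotations_to_csv(export["annotations"])
```

To run it on a real export, replace `sample_json` with the contents of the downloaded file and write `csv_output` to a `.csv` file.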