
Editor unusable with large file upload

Open nschneid opened this issue 2 years ago • 7 comments

I uploaded a large .conllu file (1400 sentences), and while it loaded successfully and I can browse the trees, the editor is too laggy for me to make edits—e.g. if I try to change a deprel it gets stuck. If I just upload the first few sentences from the file, everything works fine.

nschneid Apr 01 '22 18:04

Yeah, this is a bit of an architectural problem. Currently, we just serialize the entire treebank and try to load it on the client here: https://github.com/jonorthwash/ud-annotatrix/blob/f6865f17487daea9b0d296035d32a6b85cc5ea41/client/server.js#L122 via https://github.com/jonorthwash/ud-annotatrix/blob/f6865f17487daea9b0d296035d32a6b85cc5ea41/server/routes.js#L125-L131

This is a bit sad, since we really only need to load the sentences one at a time, which we already sorta-support (but just don't use?): https://github.com/jonorthwash/ud-annotatrix/blob/f6865f17487daea9b0d296035d32a6b85cc5ea41/server/models/corpus.js#L33
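As a rough illustration of the "one sentence at a time" idea, here is a minimal sketch, not the project's actual code. `splitConllu` and `SentenceStore` are hypothetical names, and the sketch assumes sentences in the `.conllu` file are separated by blank lines. The store keeps the whole treebank on the server side and hands out a single sentence, plus the total count for navigation, per request.

```javascript
// Hypothetical sketch: none of these names come from the ud-annotatrix codebase.

// Split a CoNLL-U document into sentence blocks (assumed blank-line separated).
function splitConllu(text) {
  return text
    .replace(/\r\n/g, "\n") // normalize Windows line endings
    .split(/\n{2,}/)        // one or more blank lines end a sentence
    .map((block) => block.trim())
    .filter((block) => block.length > 0);
}

// Holds every sentence, but only returns one per call, so a client
// never has to receive or store the full treebank at once.
class SentenceStore {
  constructor(text) {
    this.sentences = splitConllu(text);
  }

  get length() {
    return this.sentences.length;
  }

  getSentence(index) {
    if (!Number.isInteger(index) || index < 0 || index >= this.sentences.length) {
      throw new RangeError(`sentence index out of range: ${index}`);
    }
    return {
      index,
      total: this.sentences.length,
      conllu: this.sentences[index],
    };
  }
}
```

A route handler could then call something like `store.getSentence(i)` for the sentence being edited instead of serializing the full corpus; how that would plug into the existing routes is left open here.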

keggsmurph21 Apr 02 '22 16:04

FWIW, I think we could improve here, but it's a bit tough to make these large changes with confidence. I think maybe if we added some static type information it would be more feasible :smile:

keggsmurph21 Apr 02 '22 16:04

I am currently investigating the issue. The problem might be caused by too much load on working memory or on local storage; I will have to look further to see which.

Zensho Jun 09 '22 19:06

> if I try to change a deprel it gets stuck

@nschneid, when you say "gets stuck", do you mean that Annotatrix freezes, or just a UI problem like the label staying where it is after updating the tree?

jonorthwash Jun 10 '22 14:06

As I recall, it refused to change the tree or allow further edits.

nschneid Jun 10 '22 15:06

The problem seems to be that localStorage in general allows no more than 5,200,000 characters in total, and since it only stores strings, any large corpus simply exceeds the quota. I will try using IndexedDB to persist the data as an alternative.
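To make the quota failure visible instead of silent, a write can be wrapped so the caller learns when it was rejected. This is only a sketch under assumptions: `trySave` and `FakeStorage` are hypothetical names, `storage` stands for any object with a `setItem(key, value)` method (such as `window.localStorage`), and the 5,200,000 figure is simply the one quoted above; actual limits vary by browser.

```javascript
// Hypothetical helper: `storage` is any object with setItem(key, value),
// e.g. window.localStorage in a browser.
function trySave(storage, key, value) {
  const serialized = JSON.stringify(value);
  try {
    storage.setItem(key, serialized);
    return { ok: true, length: serialized.length };
  } catch (err) {
    // Browsers typically throw a DOMException named "QuotaExceededError"
    // when a write would exceed the origin's storage quota.
    return { ok: false, length: serialized.length, error: err };
  }
}

// In-memory stand-in with a character budget, so the sketch runs outside
// a browser. The default limit is the figure quoted in this thread.
class FakeStorage {
  constructor(maxChars = 5200000) {
    this.maxChars = maxChars;
    this.items = new Map();
  }

  setItem(key, value) {
    const str = String(value);
    // Count the new entry plus every other entry already stored.
    let used = key.length + str.length;
    for (const [k, v] of this.items) {
      if (k !== key) used += k.length + v.length;
    }
    if (used > this.maxChars) {
      const err = new Error("quota exceeded");
      err.name = "QuotaExceededError";
      throw err; // the previous value for `key`, if any, is left untouched
    }
    this.items.set(key, str);
  }

  getItem(key) {
    return this.items.has(key) ? this.items.get(key) : null;
  }
}
```

With a check like this, the editor could warn the user or fall back to another persistence layer when `ok` is `false`, rather than appearing to get stuck.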

Zensho Jun 10 '22 15:06

> I think maybe if we added some static type information it would be more feasible 😄

Hi Kevin, I am currently working on this issue, and I am wondering what the above sentence means and how it would help with reading only one sentence at a time.

Zensho Jun 16 '22 18:06