javarosa icon indicating copy to clipboard operation
javarosa copied to clipboard

Investigate what XmlTextConsolidator.consolidateText does and consider removing

Open lognaturel opened this issue 5 years ago • 2 comments

In #192, @dcbriccetti pulled XmlTextConsolidator.consolidateText out of XFormParser along with its cryptic comment:

/**
  * According to Clayton Sims, in Feb 2012:
  * For escaped unicode strings we end up with a lot of cruft, so we really
  * want to go through and convert the kxml parsed text (which have lots of
  * characters each as their own string) into one single string
  */

From this comment, it's unclear whether the functionality was added to fix a bug or in an attempt to improve performance.

In #192, @dcbriccetti identified that the node removal was very expensive and improved that. @dcbriccetti, I know it's forever ago (hopefully good memories!), but any chance that you remember whether you identified cases where this was actually helpful?

@JohnTheBeloved has recently been giving performance another look and has seen that even with #192's performance improvements, consolidateText is time-consuming. He has commented it out with a few test forms and achieved higher performance with no apparent loss of functionality. He has not tried parsing a form that includes escaped unicode strings, though. @JohnTheBeloved did I summarize that correctly?

Writing and reading from cache is much more time-consuming than consolidateText so it represents a relatively small proportion of total form load time in Collect. This means priority is low but it would be good to understand why the functionality is there and whether it can be safely removed.

lognaturel avatar Jun 21 '19 04:06 lognaturel