mathdown icon indicating copy to clipboard operation
mathdown copied to clipboard

User-visible warning for disconnected/not saving state

Open cben opened this issue 10 years ago • 15 comments

Yesterday I pushed broken CM/firepad combo such that pages loaded, you could edit but edits weren't saved! There was no user-visible indication this is the case. But simply being offline gives a similar result.

  • [x] Visible indication that firebase is offline and not syncing.
  • [ ] Close the loop verifying edits really made it into firebase (REST api?), such that unsaved edits — for any reason — for several seconds give the warning. It's not enough to check firepad's idea of version made it to the server; I suspect in this case there was an exception early in firepad such that it's version. Ideal might be comparing full text (not too frequently)?

Local storage backup wouldn't hurt too, but that's another story.

cben avatar Oct 03 '14 11:10 cben

Asked for advice at firepad group: https://groups.google.com/forum/#!topic/firepad-io/s1aTZXZniAM

cben avatar Apr 29 '15 14:04 cben

This is partially implemented, and works very nicely when typing and when internet goes offline/online. But I haven't yet tested how it behaves in the face of bugs like #85 — I'm relying on Firepad knowing when it's synced but there might be a situation when Firepad itself isn't aware there was a change in the editor. I think I should use headless firepad API to periodically (e.g. once a minute) do a clean text reconstruction of what is currently saved in firebase and compare to the editor.
Also, cross-compare with Firebase's connected status for more useful reports e.g. "unsaved because offline" vs "online but not saved — BUG?"...

cben avatar May 22 '15 15:05 cben

Bah. What I pushed was somewhat broken (interleaved for extra karma with pushing a an extremely sloppy "redesign" breaking the [New document] button and the whole display on IE)

  • The status indication overlaps the upper part of the header line, interfering with clicking stuff like [bugs]. e0f1b36ccfc2287670c2e24a7da8318c7db6db5e fixed this for most of the width, but the transparent "saved" or "done" text is still there, making it very confusing why you sometimes can't click the underlying [bugs] link (if your window width is such that they overlap).
  • On IE8, not only the transitions don't animate (which I expected an am fine with) but opacity doesn't work at all, so "done" and "saved" texts are visible all the time.

I'm somewhat fond of the fading out to "done" / "saved" so looking at solutions that don't give them up. http://viget.com/inspire/css-pointer-events-and-a-pure-css3-animating-tooltip and http://www.greywyvern.com/?post=337 sound promising...

cben avatar May 25 '15 11:05 cben

BTW, pointer-events: none initially looked like a good direction, except it doesn't work on IE<11, and also didn't work on my Chrome until I set visibility: hidden on #status_container. Setting that also helped visibility transition on status to work, and visibility support is robust so I'm going with only that.

cben avatar May 25 '15 12:05 cben

Back to the real goal: would current indication based on firepad's synced event catch #85? => NO.

(Reproducing #85 took some effort. Going back to the buggy CM-MJ wasn't enough; it seems firebase-debug.js not only reports but also tolerates exceptions better, and I only managed to reproduce with the non-debug firepad.js)

cben avatar May 26 '15 14:05 cben

  • [ ] If we detect this kind of error, report it somehow in firebase (with full base version + doc text?) so future analysis can at least know how much this happened (and retroactively merge?).

cben avatar Jun 03 '15 04:06 cben

Aha moment! After sendOperation() called with invalid operation., firepad goes back to report that it's "synced". I can — and should — fix this upstream.

cben avatar Jun 03 '15 07:06 cben

Progress: Headless firepad works. (For some reason it didn't for ~ a day when I initially played in the JS console but now it works both from script and console) And it does seem to reliably warn when firepad gets borked and doesn't save. There are race conditions leading to false positives, should debounce it after document stops changing.

But the much bigger issue is performance. As commented on 0e8a2d41cd8082115a6164796758ebe6d5b886df , I'm seeing a memory leak and slowdown; after about a day of holding 2 tabs open (with a tiny document), memory grew to 1.5GB and CPU approached 100%. I think there was also high GC churn but didn't look at numbers. [This is with new Headless -> getText -> comparison every 1sec, for debugging.] Memory leak doesn't seem to happen in current prod version (stays ~50MB / tab, half of which is JS memory). CPU is explained by Headless getText() latency growing to several seconds, with the timer piling up another Headless -> getText every second.

Is memory leak root cause of slowness? I did experince whole computer slowdown. Is getText pileup root cause of leak? Easy to imagine - Headless objects would pile up - but unlikely, as leak is observable immediately in a fresh tab. Are they independent problems?

Ideas:

  • Dig deeper - heap profile, CPU profile on getText, ...
  • Lower checking interval to something like 1 minute. This is necessary in any case, as I won't have perfect understanding of performance ramifications. But first let's debug 1sec which amplifies problems.
  • Reuse one Headless instance instead of new one every time? new Firepad.Headless returns instantly but getText on new instance takes >200ms (>400ms after significant edits), while on existing one it's ~1ms. On my paper-length doc, getText is ~20ms but still 200~600ms after new. But I'd like to keep the new as more bullet-proof.
  • Prevent pileup by scheduling new check only once getText completes? Do I still care about pileup if it's done once per minute? Maybe, I'm worried getText might take unbounded time when offline. Can this lead to losing one check and never resuming? OTOH, if it's possible for getText to get lost, we might in theory get stuck in false testedSynced = true forever. But here is a concrete worrying scenario: data in firebase gets really corrupt (never happened AFAIK, just unsaved) causing getText logic to crash every time. Idea: assume the worst, set testedSynced = false every time until getText returns? [perhaps better to have separate less worrying "Veryfying save..." message.]
  • Debouncing should mitigate a lot. Do checks only when we've been online, firepad believes it's synced, and no incoming changes for some time (5sec?). But let's debug root causes first while it's bad.

cben avatar Jun 04 '15 09:06 cben

Oh, dev consoles leak a lot as well. Quickly reach hundreds of megabytes. This is easily explained by my very verbose logging, including JS objects and DOM elements. Chrome and FF do throw away old messages when console is closed for a long time, but is that true for all browsers? Perhaps it's time to default logging to off in prod (with function and/or URL param turn it on)?

cben avatar Jun 04 '15 09:06 cben

https://groups.google.com/forum/#!topic/firebase-talk/AEDekJ_B480 is interesting. It's likely that new Firepad.Headless() registers new firebase callbacks that never go away (and keep the Headless itself alive and updating...)

cben avatar Jun 04 '15 10:06 cben

Just reducing CPU usage to say once/minute is not good enough. Consider a person with 30 mathdown tabs — now one of the tabs wakes up (and largely swaps in) every 2 seconds, even if nothing changed. I should either avoid checks entirely when nothing changed, dynamically backoff frequency, or both.

I should also check whether this whole feature noticably increases a tab's memory usage. At best, I'll still be constructing 2 copies of the text (codemirror getValue, firepad getText). But that's negligible compared to the full usage I see.

cben avatar Jun 04 '15 11:06 cben

Using heap profiling on the ~100K document. Without headless checking it's 70MB on load, 170MB after renderingm math. With checking, after being open for 1000s, it's 360MB. During 50s heap allocations profile, the biggest allocation is the string with the full text (~200K due to UTF-16) => 7MB over 50s.

All these strings are retained! They're retained in TextOp.text, which is retained in TextOperation.ops[0] array, retained as FirebaseAdapter.document_, which is retained in fb.core.view.ChildEventRegistration under callbacks.child_added closure's self. Over many allocations, these accumulate in an array of callbacks (I think on the history/ ref, not 100% sure).

Which is exactly what I assumed above - new Firepad.Headless() retains everything via callbacks. Let's try with single Headless now. Then I'll mail Firepad folks (about fixing synced and about destroying a Headless — I also need that for #7 long lived server) and possibly proceed.

cben avatar Jun 04 '15 11:06 cben

Oh, and the accumulating callbacks also explain slowdown :-)

cben avatar Jun 04 '15 11:06 cben

Brilliant - 8d031b74 fixed the leak according to Chrome allocations profiler :-) Process RAM still grows to >400MB but is then garbage-collected back to 250MB, same as the mathdown.net non-checking tab stabilized. This happens every 30~40s but remember that's with getText every 1s! The stable heap dump usage mainly consists of 50~60MB of DOM and ~40MB of CodeMirror.Lines arranged in a b-tree. It's similar with checking and without. [P.S. One would expect the partial DOM to be much smaller than CodeMirror's full b-tree, but this is a math-heavy document.]

cben avatar Jun 04 '15 12:06 cben

Looked at firebase analytics and saw this strange bandwidth graph:

uploads/e2c919eb-c06b-4042-9402-27b2f0a4b270/bandwidth30d.png

It's never been this sharp! And why is it suddenly flat? These are exactly the days when I was testing this feature! (and yes, it's sad that I alone can dominate Mathdown's bandwidth.)

I really hope that the high bandwidth was specific to the fresh-Headless-every second version and the one-Headless is saner. Need to test before landing. (of course, it's not gonna be every second. Good that I'm stress-testing this way or I wouldn't notice it.)

cben avatar Jun 29 '15 12:06 cben