koreader icon indicating copy to clipboard operation
koreader copied to clipboard

Input: Fix a bunch of multi-slot corner cases

Open NiLuJe opened this issue 1 year ago • 1 comments

  • Avoid a double-free by clearing the right timerfd callback (fix #9376).
  • Avoid getting semi-stuck in random hold states when triggering multi-slot holds.
  • Random completely unrelated fix for #9285

This change is Reviewable

NiLuJe avatar Jul 26 '22 16:07 NiLuJe

(Rebase me ;o))

NiLuJe avatar Aug 11 '22 16:08 NiLuJe

Possibly that fancy gestures still behave, as I'm not a multiswipe user ;).

(Note that this shouldn't affect multiswipes at all, only holds ;p).

NiLuJe avatar Aug 12 '22 16:08 NiLuJe

Apparently not out of the woods yet... (c.f., #9376)

NiLuJe avatar Aug 12 '22 18:08 NiLuJe

Avoid getting semi-stuck in random hold states when triggering multi-slot holds.

"Getting stuck", "hold", would sound like it may solve my long-time issue with my GloHD - but I keep my hope low :/

Before, often after getting stuck and suspend/wait/resume, my first touch was getting handled as a long-press, highlighting the word I touched on. But since a few months ago (around the TimeVal revamp stuff), in that situation, the power button does not react anymore :/ (So, hyper long-press to force a hard poweroff).

But some days ago, it worked, and I had enabled some "emergency switch to debug mode" on suspend. I got this:

08/06/22-00:45:38 INFO  Restoring user input handling
[ read or tap, and at some point Touch did not work; hitting the power button to suspend ]
08/06/22-00:47:31 INFO  ENABLING EMERGENCY DEBUG MODE ON SUSPEND
08/06/22-00:47:31 DEBUG Rename ...metadata.epub.lua
08/06/22-00:47:31 DEBUG ReaderFooter: unschedule autoRefreshFooter
08/06/22-00:47:31 DEBUG ReaderCoptListener.headerRefresh unscheduled
08/06/22-00:47:31 INFO  Inhibiting user input
08/06/22-00:47:31 DEBUG Suspending...
08/06/22-00:47:31 DEBUG Sleep cover is open
08/06/22-00:47:31 DEBUG renderImageData: using MuPDF
08/06/22-00:47:31 DEBUG Mupdf.renderImage true cdata<struct BlitBufferRGB24>: 0x2ba12198
08/06/22-00:47:32 DEBUG setDirty via a func from widget ScreenSaver
08/06/22-00:47:32 DEBUG show widget: ScreenSaver
08/06/22-00:47:32 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-00:47:32 DEBUG setDirty full from widget ScreenSaver w/ NO region AND w/ HW dithering
08/06/22-00:47:32 DEBUG setDirty via a func from widget ScreenSaver
08/06/22-00:47:32 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:32 DEBUG painting widget: ScreenSaver
08/06/22-00:47:32 DEBUG ImageWidget: _render'ing data 1072 1448
08/06/22-00:47:32 DEBUG ImageWidget: scale to fit, setting scale_factor to 1.0936555891239
08/06/22-00:47:32 DEBUG ImageWidget: scaling by 1.0936555891239
08/06/22-00:47:32 DEBUG RenderImage:scaleBlitBuffer: scaling
08/06/22-00:47:32 DEBUG ImageWidget: initial offsets -99.5 0
08/06/22-00:47:32 DEBUG blitFrom 0 0 -99.5 0 1072 1448
08/06/22-00:47:32 DEBUG _repaint: it was dithered, infecting the refresh queue
08/06/22-00:47:32 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-00:47:32 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-00:47:33 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:33 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:42 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:42 DEBUG not painting 1 covered widget(s)
[Somehow, no ZzZ... logged - it should have happened around 00:47:33+15s = 00:47:47)
{ hitting the power button, so resuming somehow;]
08/06/22-00:47:53 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:53 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:53 DEBUG not painting 1 covered widget(s)
08/06/22-00:47:53 DEBUG key event => code: 116 (Power), value: 1, time: 3148.270041
08/06/22-00:47:53 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 3148.270044
08/06/22-00:47:53 DEBUG key event => code: 116 (Power), value: 0, time: 3148.500021
08/06/22-00:47:53 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 3148.500026
08/06/22-00:47:53 DEBUG Resuming...
08/06/22-00:47:53 INFO  Kobo resume: clean up after wakeup
08/06/22-00:47:53 INFO  Kobo resume: successfully asked the kernel to resume subsystems
08/06/22-00:47:53 DEBUG close screensaver
08/06/22-00:47:53 DEBUG close widget: ScreenSaver
08/06/22-00:47:53 DEBUG setDirty via a func from widget nil
08/06/22-00:47:53 DEBUG Lower widget ReaderUI covers the full screen
08/06/22-00:47:53 DEBUG setDirty nil from widget ReaderUI w/ NO region
08/06/22-00:47:53 INFO  Restoring user input handling
08/06/22-00:47:53 DEBUG painting widget: ReaderUI
08/06/22-00:47:53 DEBUG CreDocument: goto page 393 flow 0
08/06/22-00:47:53 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448
[out of resume, I see my book page]
{ trying to touch/tap/long-press, nothing logged - fed up and I hit the power button - which worked:]
08/06/22-00:48:18 DEBUG key event => code: 116 (Power), value: 1, time: 3158.980043
08/06/22-00:48:18 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 3158.980045
08/06/22-00:48:18 DEBUG key event => code: 116 (Power), value: 0, time: 3159.220020
08/06/22-00:48:18 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 3159.220021
[power button working, but nothing else, hard poweroff]

After the reboot, everything was good - and I did the same suspend/resume sequence, and I got this:

08/06/22-01:01:57 INFO    opening took 1.549 seconds
[ all ok - just to get this log: hitting the power button to suspend ]
08/06/22-01:02:01 INFO  ENABLING EMERGENCY DEBUG MODE ON SUSPEND
08/06/22-01:02:01 DEBUG Rename ...metadata.epub.lua
08/06/22-01:02:01 DEBUG ReaderFooter: unschedule autoRefreshFooter
08/06/22-01:02:01 DEBUG ReaderCoptListener.headerRefresh unscheduled
08/06/22-01:02:01 INFO  Inhibiting user input
08/06/22-01:02:01 DEBUG Suspending...
08/06/22-01:02:01 DEBUG Sleep cover is open
08/06/22-01:02:01 DEBUG renderImageData: using MuPDF
08/06/22-01:02:01 DEBUG Mupdf.renderImage true cdata<struct BlitBufferRGB24>: 0x2dd245a0
08/06/22-01:02:01 DEBUG setDirty via a func from widget ScreenSaver
08/06/22-01:02:01 DEBUG show widget: ScreenSaver
08/06/22-01:02:01 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-01:02:01 DEBUG setDirty full from widget ScreenSaver w/ NO region AND w/ HW dithering
08/06/22-01:02:01 DEBUG setDirty via a func from widget ScreenSaver
08/06/22-01:02:01 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:01 DEBUG painting widget: ScreenSaver
08/06/22-01:02:01 DEBUG ImageWidget: _render'ing data 1072 1448
08/06/22-01:02:01 DEBUG ImageWidget: scale to fit, setting scale_factor to 1.0936555891239
08/06/22-01:02:01 DEBUG ImageWidget: scaling by 1.0936555891239
08/06/22-01:02:01 DEBUG RenderImage:scaleBlitBuffer: scaling
08/06/22-01:02:02 DEBUG ImageWidget: initial offsets -99.5 0
08/06/22-01:02:02 DEBUG blitFrom 0 0 -99.5 0 1072 1448
08/06/22-01:02:02 DEBUG _repaint: it was dithered, infecting the refresh queue
08/06/22-01:02:02 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-01:02:02 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448 w/ HW dithering
08/06/22-01:02:02 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:02 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:11 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:11 DEBUG not painting 1 covered widget(s)
[Somehow, no ZzZ... logged, it should have happened around 01:02:02+15s = 01:02:17)]
{ hitting the power button I thing, so resuming somehow;]
08/06/22-01:02:24 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:24 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:24 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:24 DEBUG key event => code: 116 (Power), value: 1, time: 719.040059
08/06/22-01:02:24 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 719.040061
08/06/22-01:02:24 DEBUG not painting 1 covered widget(s)
08/06/22-01:02:24 DEBUG key event => code: 116 (Power), value: 0, time: 719.413059
08/06/22-01:02:24 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 719.413062
08/06/22-01:02:24 DEBUG Resuming...
08/06/22-01:02:24 INFO  Kobo resume: clean up after wakeup
08/06/22-01:02:24 INFO  Kobo resume: successfully asked the kernel to resume subsystems
08/06/22-01:02:24 DEBUG close screensaver
08/06/22-01:02:24 DEBUG close widget: ScreenSaver
08/06/22-01:02:24 DEBUG setDirty via a func from widget nil
08/06/22-01:02:24 DEBUG Lower widget ReaderUI covers the full screen
08/06/22-01:02:24 DEBUG setDirty nil from widget ReaderUI w/ NO region
08/06/22-01:02:24 INFO  Restoring user input handling
08/06/22-01:02:24 DEBUG painting widget: ReaderUI
08/06/22-01:02:24 DEBUG CreDocument: goto page 393 flow 0
08/06/22-01:02:24 DEBUG _refresh: Enqueued full update for region 0 0 1072 1448
[out of resume, I see my book page - and touch works:]
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 57 (ABS_MT_TRACKING_ID), value: 1, time: 724.250238
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 48 (ABS_MT_TOUCH_MAJOR), value: 1, time: 724.250265
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 50 (ABS_MT_WIDTH_MAJOR), value: 1, time: 724.250267
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 53 (ABS_MT_POSITION_X), value: 881, time: 724.250270
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 54 (ABS_MT_POSITION_Y), value: 162, time: 724.250273
08/06/22-01:02:28 DEBUG input event => type: 3 (EV_ABS), code: 24 (ABS_PRESSURE), value: 1024, time: 724.250281
08/06/22-01:02:28 DEBUG key event => code: 330 (BTN_TOUCH), value: 1, time: 724.250297
08/06/22-01:02:28 DEBUG input event => type: 0 (EV_SYN), code: 2 (SYN_MT_REPORT), value: 0, time: 724.250301
08/06/22-01:02:28 DEBUG input event => type: 0 (EV_SYN), code: 0 (SYN_REPORT), value: 0, time: 724.250316
08/06/22-01:02:28 DEBUG GestureDetector:probeClockSource: Touch event timestamps appear to use CLOCK_MONOTONIC
08/06/22-01:02:28 DEBUG in tap state...
08/06/22-01:02:28 DEBUG set up hold timer
08/06/22-01:02:28 DEBUG adjusted ges: touch
[ all good ]

@NiLuJe : do you see anything specific in the above logs ? And if not, how sure can I be that NOT seeing any input event => type: 3 (EV_ABS), code: 57 (ABS_MT_TRACKING_ID), value: 1, time: 724.250238 means that the kernel does not get any event? Or is it still possible that the kernel event is consumed/thrown away by our code at a lower level ? I'm still suspecting it might be something with the event's or koreader's time, and our code possibly throwing stuff because the events' timestamps are not what it expects (either a kernel bug, or some time reference messed up in our code. Can I definitely rule that out, and assume it's a hardware bug on my GloHD - and stop looking for answers ? :)

poire-z avatar Aug 12 '22 21:08 poire-z

Can I definitely rule that out, and assume it's a hardware bug on my GloHD - and stop looking for answers ? :)

Yup, no event logged means no event sent to userspace, we log everything, even unhandled/unknown events ;).

FWIW, this is (vaguely) similar to what happens on sunxi with the Elan driver: the kernel sometimes "eats" the event stream that caused a wakeup from suspend/standby (although, in this case, that only affects that first event stream, subsequent ones work as well as usual (which is not that great in the Elipsa's case ;p)).

There's an added quirk with IR grids in that the kernel sometimes fail to wakeup the hardware altogether on resume, which is why we always do it ourselves on top of that.

I think I already asked you about that once, and either there's no IR grid on the Glo HD, or you confirmed that it's the same sysfs path & command as on a H2O ;).

https://github.com/koreader/koreader/blob/55c0fc6190d1be330b4efb9a8447982b6c52daee/frontend/device/kobo/device.lua#L1073-L1077

NiLuJe avatar Aug 12 '22 21:08 NiLuJe

Note that the kernel itself may buffer an event stream until a SYN_REPORT, but that shouldn't be a concern here because the events are supposedly generated by the kernel itself (i.e., this is only a concern when you do funky stuff like injecting custom events in a real evdev device; and you get the whole thing replayed back on the "next" stream (or an overflow event in the worst case scenario)).

NiLuJe avatar Aug 12 '22 21:08 NiLuJe

Thanks :)

Yup, no event logged means no event sent to userspace, we log everything, even unhandled/unknown events ;).

The input event => type: 3 (EV_ABS), code: 57 (ABS_MT_TRACKING_ID), value: 1, time: 724.250238 is logged by frontend/device/input.lua. Sure there's no reason for things to be dropped in lower level input handling, stuff in base/input/* and around ?

I think I already asked you about that once, and either there's no IR grid on the Glo HD, or you confirmed that it's the same sysfs path & command as on a H2O ;).

Yes, it has a IR grid (which I prefer to capacitive stuff). I would have sweared it is that sysfs path /sys/devices/virtual/input/input1/neocmd - and thought I might just run writeToSys("a", "/sys/devices/virtual/input/input1/neocmd") every 10s just in case :) But no, no such file or directory :)

/sys/devices/virtual/input # find . -type f
./mice/uevent
./mice/dev
./mice/power/runtime_status
./mice/power/control
./mice/power/runtime_suspended_time
./mice/power/runtime_active_time
./mice/power/autosuspend_delay_ms
./input1/uevent
./input1/name
./input1/phys
./input1/uniq
./input1/modalias
./input1/properties
./input1/id/bustype
./input1/id/vendor
./input1/id/product
./input1/id/version
./input1/capabilities/ev
./input1/capabilities/key
./input1/capabilities/rel
./input1/capabilities/abs
./input1/capabilities/msc
./input1/capabilities/led
./input1/capabilities/snd
./input1/capabilities/ff
./input1/capabilities/sw
./input1/power/runtime_status
./input1/power/control
./input1/power/runtime_suspended_time
./input1/power/runtime_active_time
./input1/power/autosuspend_delay_ms
./input1/mouse0/uevent
./input1/mouse0/dev
./input1/mouse0/power/runtime_status
./input1/mouse0/power/control
./input1/mouse0/power/runtime_suspended_time
./input1/mouse0/power/runtime_active_time
./input1/mouse0/power/autosuspend_delay_ms
./input1/event1/uevent
./input1/event1/dev
./input1/event1/power/runtime_status
./input1/event1/power/control
./input1/event1/power/runtime_suspended_time
./input1/event1/power/runtime_active_time
./input1/event1/power/autosuspend_delay_ms
/sys/devices/virtual/input #

/sys/devices/virtual/input/input1 # cat name
zForce-ir-touch
/sys/devices/virtual/input/input1 # cat properties
0
/sys/devices/virtual/input/input1 # cat uniq

/sys/devices/virtual/input/input1 # cat phys


/sys/devices/virtual/input # find /sys/devices -name '*neo*'
/sys/devices/platform/imx-i2c.1/i2c-1/1-0050/neocmd
/sys/devices/virtual/input #

Doing this does not change a thing touch still works :)
/sys/devices/virtual/input # echo "a" > /sys/devices/platform/imx-i2c.1/i2c-1/1-0050/neocmd
/sys/devices/virtual/input #
/sys/devices/virtual/input # echo "b" > /sys/devices/platform/imx-i2c.1/i2c-1/1-0050/neocmd
/sys/devices/virtual/input # echo "s" > /sys/devices/platform/imx-i2c.1/i2c-1/1-0050/neocmd
/sys/devices/virtual/input #

poire-z avatar Aug 12 '22 21:08 poire-z

The input event => type: 3 (EV_ABS), code: 57 (ABS_MT_TRACKING_ID), value: 1, time: 724.250238 is logged by frontend/device/input.lua. Sure there's no reason for things to be dropped in lower level input handling, stuff in base/input/* and around ?

Nope, we'd be aborting on a poll/read error, and logging an evdev overflow ;).

NiLuJe avatar Aug 12 '22 21:08 NiLuJe

/sys/devices/virtual/input # find /sys/devices -name 'neo' /sys/devices/platform/imx-i2c.1/i2c-1/1-0050/neocmd /sys/devices/virtual/input #

I can't remember what the other commands are (and don't have access to the kernel sources right now), but "a" was definitely for wakeup, and, yeah, it's harmless when the grid is already on ;).

Try tweaking/expanding the code that handles that to support that path, it might help ;). (Because when the grid is off, nobody gets input events ;p).

NiLuJe avatar Aug 12 '22 21:08 NiLuJe

(IIRC, you're on an older than expected old kernel, which might (or not) play a part on the path front).

NiLuJe avatar Aug 12 '22 21:08 NiLuJe

From my notes from the time I did check newer firmwares (which did not help with the issue): En v3.16: Linux (none) 3.0.35+ # 4066 PREEMPT Wed May 27 15:40:28 CST 2015 armv7l GNU/Linux En v3.19.5761: Linux (none) 3.0.35+ # 4071 PREEMPT Fri Jul 17 16:56:30 CST 2015 armv7l GNU/Linux En v4.6.9995: Linux (none) 3.0.35+ # 4072 PREEMPT Mon Dec 19 09:25:37 CST 2016 armv7l GNU/Linux ^ toujours le même en 4.20.14601

I'm still with v3.19.5761.

And it was already that path in 2016: 85737e2b9d458a345ab4b71cd579dd003f59d745

poire-z avatar Aug 12 '22 22:08 poire-z

Dealt with another MT handling snafu that could lead to slot data being fed twice to GestureManager if that slot was seen multiple times in non-consecutive sequences in a single input frame (that's a behavior seen only on some drivers only, e.g., neonode).

NiLuJe avatar Aug 14 '22 19:08 NiLuJe

And dealt with some more MT shenanigans to make sure every slot actually goes through GestureDetector, which prevents missing gestures (most notably contact lifts, which could cause slots to stay stuck in the hold state).

NiLuJe avatar Aug 14 '22 23:08 NiLuJe

Whelp, nothing blew up on my various devices, so let's hit the "I Feel Lucky" button ;p.

NiLuJe avatar Aug 16 '22 16:08 NiLuJe