tuxonice-kernel-old

Seeking to free memory, then Cleaning up

Open ysalmon opened this issue 7 years ago • 7 comments

Hello, I currently seem unable to hibernate with 4.4.0-97-generic-tuxonice.

When I run systemctl hibernate, I get to the TOI progress bar, which states "Seeking to free XXX MB of memory", with XXX ranging from a few tens to 700. Sometimes there is a second attempt to free memory after the first, but so far the next step each time has been "Cleaning up", followed by a resume.

RAM usage is 7.7GB/15.6GB and swap usage is 150MB/31GB.

This may be related to #9 or #14; commits reference these issues, but they are still open, so I do not know whether they are old history or still current.

Here is an extract from journalctl

oct. 22 22:21:48 yann-Precision-Tower-3620 kernel: Console is 67x240.
oct. 22 22:21:48 yann-Precision-Tower-3620 kernel: Using configuration file /etc/splash/tuxonice/3840x2160.cfg.
oct. 22 22:21:48 yann-Precision-Tower-3620 kernel: No silent picture specified in the theme config.
oct. 22 22:21:48 yann-Precision-Tower-3620 kernel: Framebuffer support initialised successfully.
oct. 22 22:22:10 yann-Precision-Tower-3620 kernel: Starting other threads.
oct. 22 22:22:14 yann-Precision-Tower-3620 kernel: Freezing user space processes ... 
oct. 22 22:22:15 yann-Precision-Tower-3620 kernel: (elapsed 0.001 seconds) done.
oct. 22 22:22:15 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Asked shrink_memory_mask for 0 low pages & 103662 pages from anywhere, got 57821.
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... 
oct. 22 22:22:16 yann-Precision-Tower-3620 kernel: Freezing of tasks failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=1):
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel: jbd2/sda5-8     R  running task        0   887      2 0x00000000
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  0000000000000000 0000000000000020 0000000004420848 0000000000000040
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  0000000000000000 000000070000000c 0442084800000001 0000000000000000
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  0dd892ae082ff6c8 0000000004420848 0000000004420848 ffff8804093b3ad0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel: Call Trace:
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff81424aa5>] ? find_next_bit+0x15/0x20
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811acf9b>] ? __alloc_pages_slowpath.constprop.93+0x67b/0xad0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811ad676>] ? __alloc_pages_nodemask+0x286/0x2a0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff812b1f29>] ? ext4_map_blocks+0x289/0x5a0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811f744c>] ? alloc_pages_current+0x8c/0x110
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811a341b>] ? __page_cache_alloc+0xab/0xc0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811a3ff4>] ? pagecache_get_page+0x84/0x1c0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff8125c35c>] ? __getblk_slow+0xcc/0x2a0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff8125c57f>] ? __getblk_gfp+0x4f/0x60
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff8130c79d>] ? jbd2_journal_get_descriptor_buffer+0x4d/0xb0
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff813049e7>] ? jbd2_journal_commit_transaction+0x927/0x1870
oct. 22 22:22:17 yann-Precision-Tower-3620 kernel:  [<ffffffff811011ee>] ? try_to_del_timer_sync+0x5e/0x90
oct. 22 22:22:18 yann-Precision-Tower-3620 kernel:  [<ffffffff8130954a>] ? kjournald2+0xca/0x250
oct. 22 22:22:18 yann-Precision-Tower-3620 kernel:  [<ffffffff810c4410>] ? wake_atomic_t_function+0x60/0x60
oct. 22 22:22:18 yann-Precision-Tower-3620 kernel:  [<ffffffff81309480>] ? commit_timeout+0x10/0x10
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel:  [<ffffffff810a0c75>] ? kthread+0xe5/0x100
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel:  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel:  [<ffffffff81859b8f>] ? ret_from_fork+0x3f/0x70
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel:  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel: 
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:22:19 yann-Precision-Tower-3620 kernel: Restarting tasks ... done.
oct. 22 22:22:20 yann-Precision-Tower-3620 kernel: video LNXVIDEO:00: Restoring backlight state
oct. 22 22:22:20 yann-Precision-Tower-3620 systemd-sleep[31618]: Failed to write 'disk' to /sys/power/state: Invalid argument

Here is another one

oct. 22 22:33:39 yann-Precision-Tower-3620 kernel: Console is 67x240.
oct. 22 22:33:39 yann-Precision-Tower-3620 kernel: Using configuration file /etc/splash/tuxonice/3840x2160.cfg.
oct. 22 22:33:39 yann-Precision-Tower-3620 kernel: No silent picture specified in the theme config.
oct. 22 22:33:39 yann-Precision-Tower-3620 kernel: Framebuffer support initialised successfully.
oct. 22 22:33:43 yann-Precision-Tower-3620 kernel: Starting other threads.
oct. 22 22:33:48 yann-Precision-Tower-3620 kernel: Freezing user space processes ... 
oct. 22 22:33:49 yann-Precision-Tower-3620 kernel: (elapsed 0.001 seconds) done.
oct. 22 22:33:49 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
oct. 22 22:33:50 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:33:50 yann-Precision-Tower-3620 kernel: Asked shrink_memory_mask for 0 low pages & 163254 pages from anywhere, got 59557.
oct. 22 22:33:50 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.239 seconds) done.
oct. 22 22:33:50 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:33:50 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Asked shrink_memory_mask for 0 low pages & 103852 pages from anywhere, got 17730.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Restarting kernel threads ... done.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Free:23813(23813). Sets:2167353(2167353),0(0). Nosave:1994945-1976466=18479. Storage:2187779/8099354(2169353=>2167353). Needed:85244,0,0(100
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: Failed to prepare the image because...
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel: - The maximum number of iterations was reached without successfully preparing the image.
oct. 22 22:33:51 yann-Precision-Tower-3620 kernel:  - We need to free 85244 lowmem pageset 1 pages.
oct. 22 22:33:52 yann-Precision-Tower-3620 kernel: Restarting tasks ... done.
oct. 22 22:33:52 yann-Precision-Tower-3620 kernel: video LNXVIDEO:00: Restoring backlight state
oct. 22 22:33:52 yann-Precision-Tower-3620 systemd-sleep[32040]: Failed to write 'disk' to /sys/power/state: Invalid argument

ysalmon, Oct 22 '17 20:10
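For readers comparing the log with the "Seeking to free XXX MB" message, here is a minimal sketch that converts the page counts in the shrink_memory_mask lines into MiB. It assumes 4 KiB pages (the usual x86-64 default, not stated in the report); the function names and the regular expression are illustrative and are not part of TuxOnIce.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, not confirmed in the report

# Matches the kernel message format seen in the journal extracts above.
SHRINK_RE = re.compile(
    r"Asked shrink_memory_mask for (\d+) low pages & (\d+) pages "
    r"from anywhere, got (\d+)\."
)


def pages_to_mib(pages):
    """Convert a page count to MiB under the PAGE_SIZE assumption."""
    return pages * PAGE_SIZE / (1024 * 1024)


def summarize_shrink_attempts(log_text):
    """Return one (asked_mib, got_mib) tuple per shrink attempt in log_text."""
    results = []
    for match in SHRINK_RE.finditer(log_text):
        low, anywhere, got = (int(group) for group in match.groups())
        asked_mib = round(pages_to_mib(low + anywhere), 1)
        got_mib = round(pages_to_mib(got), 1)
        results.append((asked_mib, got_mib))
    return results


# The three shrink attempts quoted in the two journal extracts.
sample = """
kernel: Asked shrink_memory_mask for 0 low pages & 103662 pages from anywhere, got 57821.
kernel: Asked shrink_memory_mask for 0 low pages & 163254 pages from anywhere, got 59557.
kernel: Asked shrink_memory_mask for 0 low pages & 103852 pages from anywhere, got 17730.
"""

for asked, got in summarize_shrink_attempts(sample):
    print(f"asked {asked} MiB, got {got} MiB")
```

Under that assumption, each attempt asked for roughly 400 to 640 MiB and got back well under half of it, which is in line with the values shown on the progress bar.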

Thanks for the report. Could you provide a little more info, please?

  • What filesystems are mounted when this occurs?
  • Would you please provide the full output of /sys/power/tuxonice/debug_info?

Thanks!

NigelCunningham, Oct 22 '17 21:10
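One way to gather the two pieces of information requested above is sketched below. The debug_info path is the one named in the request; reading /proc/mounts for the mounted filesystems, the output file names, and the helper name collect_report are assumptions made for illustration. Reading debug_info may require root.

```python
from pathlib import Path


def collect_report(debug_info_path="/sys/power/tuxonice/debug_info",
                   mounts_path="/proc/mounts",
                   out_dir="."):
    """Copy the TuxOnIce debug info and the mount table into text files.

    Sources that do not exist (for example on a kernel without TuxOnIce)
    are skipped. Returns the list of files that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    sources = (
        (debug_info_path, "debug_info.txt"),
        (mounts_path, "mtab.txt"),
    )
    for src, name in sources:
        src_path = Path(src)
        if not src_path.exists():
            continue
        dest = out / name
        # errors="replace" keeps the copy going if the file has odd bytes.
        dest.write_text(src_path.read_text(errors="replace"))
        written.append(dest)
    return written
```

Calling collect_report(out_dir="toi-report") right after a failed hibernation attempt would leave both files in a toi-report directory, ready to attach to the issue.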

FYI: Ubuntu-4.4.0-96.119-tuxonice and later are based on Nigel's tuxonice-4.4 branch at Linux v4.4.83 with a cherry-pick of the "Increase maximum limit of extra pages" patch by @hjudt:

https://github.com/mschlaeffer/ubuntu-kernel-with-tuxonice/tree/tuxonice-4.4

Ubuntu-4.4.0-93.116-tuxonice and earlier are based on Nigel's tuxonice-4.4 github branch at Linux v4.4.10 and v4.4.30.

mschlaeffer, Oct 23 '17 05:10

This is happening again tonight.

RAM usage 8.8GB/15.6GB. Attached: debug_info.txt, mtab.txt

ysalmon, Oct 23 '17 21:10

In fact this happens almost every time unless I try to hibernate soon after a fresh boot. In one instance hibernation actually started but hung at "doing atomic copy/restore", and I had to power off.

ysalmon, Nov 12 '17 23:11

I just compared your mtab with mine (4.13.12 kernel, where TOI works). There are not many differences. Of course the individual hdd mount points differ, as do the following entries:

udev /dev devtmpfs rw,nosuid,relatime,size=8141968k,nr_inodes=2035492,mode=755 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0

I don't have cgroups fully enabled, so I can't check its impact here.

What made me wonder is that you have both

tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1632452k,mode=755 0 0

and

tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0

active. Could this kind of nested mount confuse TOI somehow? And is the second one needed at all?

Just an idea to try to help. Discard it if not useful.

Any chance that you try a more recent kernel? E.g. 4.13.xy?

BR, Manuel Krause

ManuelKrause, Nov 13 '17 17:11
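A comparison like the one described above can be reproduced with a short script. This is only a sketch of one possible approach: it keys entries by mount point and ignores mount options, which may or may not match how the comparison was actually done. The function names are illustrative.

```python
def parse_mounts(text):
    """Map mount point -> (device, fstype) for each well-formed mtab line."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        # mtab lines look like: device mount_point fstype options dump pass
        if len(fields) >= 3:
            device, mount_point, fstype = fields[:3]
            mounts[mount_point] = (device, fstype)
    return mounts


def only_in_first(first_text, second_text):
    """Return the sorted mount points present in the first mtab only."""
    first = parse_mounts(first_text)
    second = parse_mounts(second_text)
    return sorted(mp for mp in first if mp not in second)


# Tiny made-up example; real input would be the contents of two mtab files.
theirs = """\
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1632452k,mode=755 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
"""
mine = """\
tmpfs /run tmpfs rw,nosuid,noexec,relatime,mode=755 0 0
"""

for mount_point in only_in_first(theirs, mine):
    print(mount_point)
```

Swapping the arguments lists the mount points that exist only on the other machine.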

This is still happening with 4.4.0-101.

Having both /run and /run/lock as mount points seems to be a common setup; I do not think it is the culprit.

I do not know how to investigate this further; the progress bar showing that TOI is seeking to free memory gives no clue as to why it is trying to do so or why it is failing.

ysalmon, Dec 03 '17 23:12

I had time to try with 4.4.0-93, which is based on older TOI code; the problem still happens. The only thing I remember changing recently is my screen, which is now bigger and is connected through DisplayPort instead of DVI, but I cannot see how this would be related.

ysalmon, Dec 29 '17 23:12