
ZVOL creation takes a long time after importing the pool

lignumqt opened this issue on Nov 10 '21 · 1 comment

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  18.04
Kernel Version        4.19.107 and 5.4.149
Architecture          x86_64
OpenZFS Version       0.8.6

Describe the problem you're observing

I have a cluster of two nodes. A pool is created on the first node with two volumes on it; one is exported over FC and the other over iSCSI, and both are put under load with Iometer from a Windows client (the file under load is not large, ~5 GB). When the first node loses power, the second node imports the pool, and after the import the creation of the zvol device nodes hangs for up to 20 minutes. I also use SCST with ALUA. Sometimes one zvol is created after 5 minutes and the second after another 10 minutes. A minimal sketch of the setup is below.
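A minimal sketch of the setup, for reproduction purposes; the pool, volume, and disk names here are hypothetical, and the SCST FC/iSCSI export plus the Iometer load are configured out of band:

    zpool create tank mirror /dev/sda /dev/sdb   # hypothetical vdevs
    zfs create -V 100G tank/vol_fc               # exported over FC via SCST
    zfs create -V 100G tank/vol_iscsi            # exported over iSCSI via SCST
    # run Iometer from the Windows client against both LUNs (~5 GB file),
    # cut power to node 1, then on node 2:
    zpool import -f tank                         # zvol device nodes can take up to 20 minutes to appear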

I tried setting the property sync=disabled, and also the zfs module parameter zil_replay_disable=1. Either one helps: the zvols are created instantly after importing the pool. But both open a window for data loss, which is not acceptable for me. Both workarounds are sketched below.
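For reference, a sketch of those two workarounds (dataset and pool names are hypothetical). Both give up the crash-consistency guarantee the ZIL provides, so writes that had only reached the intent log are lost:

    # per dataset: stop honoring synchronous write semantics
    zfs set sync=disabled tank/vol_iscsi

    # module-wide: skip ZIL replay entirely; must be set before the import
    echo 1 > /sys/module/zfs/parameters/zil_replay_disable
    zpool import -f tank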

NOTE: with ZFS 0.7.13 and kernel 4.19.107 this problem did not occur under the same conditions. Let me know if you need anything else.
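To confirm which build each node is actually running when comparing against 0.7.13, the versions can be checked on each node (the zfs version subcommand exists from 0.8 onward):

    zfs version                     # userland and kernel module versions
    cat /sys/module/zfs/version     # kernel module version only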

Include any warning/errors/backtraces from the system logs

While it hangs, I run echo t > /proc/sysrq-trigger and see the following stack trace:

2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649611] Call Trace:
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649618]  __schedule+0x28f/0x770
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649628]  schedule+0x36/0xc0
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649634]  schedule_timeout+0x162/0x330
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649643]  ? __next_timer_interrupt+0xc0/0xc0
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649650]  io_schedule_timeout+0x19/0x40
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649659]  __cv_timedwait_common+0x123/0x160 [spl]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649667]  ? wait_woken+0xa0/0xa0
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649721]  zio_wait+0x123/0x270 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649763]  dmu_tx_count_write+0x155/0x1d0 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649801]  dmu_tx_hold_write+0x3c/0x50 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649851]  zvol_replay_write+0x51/0x110 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649903]  ? zil_replay_error.isra.18+0xb0/0xb0 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.649960]  zil_replay_log_record+0xdc/0x1a0 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650022]  zil_parse+0x64e/0x960 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650072]  ? zil_set_logbias+0x10/0x10 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650120]  ? zil_replay_error.isra.18+0xb0/0xb0 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650171]  zil_replay+0xb7/0x120 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650228]  zvol_create_minor_impl+0x7fb/0x820 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650283]  zvol_task_cb+0x2f1/0x5a0 [zfs]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650289]  ? __schedule+0x297/0x770
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650296]  taskq_thread+0x2a6/0x4c0 [spl]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650298]  ? __switch_to_asm+0x40/0x70
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650301]  ? wake_up_q+0x70/0x70
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650307]  ? task_done+0xa0/0xa0 [spl]
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650315]  kthread+0x12f/0x150
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650321]  ? kthread_create_worker_on_cpu+0x70/0x70
2021-11-09 18:37:52 0 4 localhost kernel:[ 3705.650327]  ret_from_fork+0x35/0x40
2021-11-09 18:37:52 0 6 localhost kernel:[ 3705.650333] z_prefetch      S    0 47471      2 0x80084000
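The trace shows the zvol taskq thread stuck in ZIL replay: zvol_create_minor_impl() calls zil_replay(), which walks the intent log (zil_parse -> zil_replay_log_record -> zvol_replay_write) and waits synchronously on each replayed write (zio_wait). The device node for each volume appears only once its replay finishes, which gives one way to watch progress from userspace (pool name is hypothetical):

    # device nodes appear under /dev/zvol/<pool>/ as each volume finishes replay
    watch -n 5 'ls -l /dev/zvol/tank/'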

My arc_summary output:


ARC status:                                             HEALTHY
        Memory throttle count:                                0

ARC size (current):                                   < 0.1 %    1.6 MiB
        Target size (adaptive):                       100.0 %  316.3 GiB
        Min size (hard limit):                          0.2 %  512.0 MiB
        Max size (high water):                          632:1  316.3 GiB
        Most Frequently Used (MFU) cache size:         62.3 %  931.5 KiB
        Most Recently Used (MRU) cache size:           37.7 %  563.0 KiB
        Metadata cache size (hard limit):              75.0 %  237.2 GiB
        Metadata cache size (current):                < 0.1 %    1.6 MiB
        Dnode cache size (hard limit):                 10.0 %   23.7 GiB
        Dnode cache size (current):                   < 0.1 %   71.2 KiB

ARC hash breakdown:
        Elements max:                                    190.3k
        Elements current:                     < 0.1 %        67
        Collisions:                                        1.4k
        Chain max:                                            1
        Chains:                                               0

ARC misc:
        Deleted:                                         199.8k
        Mutex misses:                                         0
        Eviction skips:                                      69

ARC total accesses (hits + misses):                       10.7M
        Cache hit ratio:                       97.4 %    10.4M
        Cache miss ratio:                       2.6 %   281.0k
        Actual hit ratio (MFU + MRU hits):     97.4 %    10.4M
        Data demand efficiency:                61.9 %   520.5k
        Data prefetch efficiency:               0.0 %        5

Cache hits by cache type:
        Most frequently used (MFU):            94.6 %     9.9M
        Most recently used (MRU):               5.4 %   560.8k
        Most frequently used (MFU) ghost:       0.0 %        0
        Most recently used (MRU) ghost:         0.0 %        0
        Anonymously used:                     < 0.1 %      104

Cache hits by data type:
        Demand data:                            3.1 %   322.2k
        Demand prefetch data:                   0.0 %        0
        Demand metadata:                       96.9 %    10.1M
        Demand prefetch metadata:             < 0.1 %      104

Cache misses by data type:
        Demand data:                           70.6 %   198.3k
        Demand prefetch data:                 < 0.1 %        5
        Demand metadata:                       29.4 %    82.6k
        Demand prefetch metadata:             < 0.1 %      131

DMU prefetch efficiency:                                   4.2M
        Hit ratio:                            < 0.1 %      831
        Miss ratio:                           100.0 %     4.2M

L2ARC not detected, skipping section

Solaris Porting Layer (SPL):
        spl_hostid                         0
        spl_hostid_path                    /etc/hostid
        spl_kmem_alloc_max                 1048576
        spl_kmem_alloc_warn                65536
        spl_kmem_cache_expire              2
        spl_kmem_cache_kmem_limit          2048
        spl_kmem_cache_kmem_threads        4
        spl_kmem_cache_magazine_size       0
        spl_kmem_cache_max_size            32
        spl_kmem_cache_obj_per_slab        8
        spl_kmem_cache_obj_per_slab_min    1
        spl_kmem_cache_reclaim             0
        spl_kmem_cache_slab_limit          16384
        spl_max_show_tasks                 512
        spl_panic_halt                     0
        spl_schedule_hrtimeout_slack_us    0
        spl_taskq_kick                     0
        spl_taskq_thread_bind              0
        spl_taskq_thread_dynamic           0
        spl_taskq_thread_priority          1
        spl_taskq_thread_sequential        4

Tunables: dbuf_cache_hiwater_pct 10 dbuf_cache_lowater_pct 10 dbuf_cache_max_bytes 10612026016 dbuf_cache_shift 5 dbuf_metadata_cache_max_bytes 5306013008 dbuf_metadata_cache_shift 6 dmu_object_alloc_chunk_shift 7 dmu_prefetch_max 134217728 ignore_hole_birth 1 l2arc_feed_again 1 l2arc_feed_min_ms 200 l2arc_feed_secs 1 l2arc_headroom 2 l2arc_headroom_boost 200 l2arc_noprefetch 1 l2arc_norw 0 l2arc_write_boost 8388608 l2arc_write_max 8388608 metaslab_aliquot 524288 metaslab_bias_enabled 1 metaslab_debug_load 0 metaslab_debug_unload 0 metaslab_df_max_search 16777216 metaslab_df_use_largest_segment 0 metaslab_force_ganging 16777217 metaslab_fragmentation_factor_enabled 1 metaslab_lba_weighting_enabled 1 metaslab_preload_enabled 1 send_holes_without_birth_time 1 spa_asize_inflation 24 spa_config_path /etc/zfs/zpool.cache spa_load_print_vdev_tree 0 spa_load_verify_data 1 spa_load_verify_metadata 0 spa_load_verify_shift 4 spa_slop_shift 5 vdev_removal_max_span 32768 vdev_validate_skip 0 zap_iterate_prefetch 1 zfetch_array_rd_sz 1048576 zfetch_max_distance 8388608 zfetch_max_streams 8 zfetch_min_sec_reap 2 zfs_abd_scatter_enabled 1 zfs_abd_scatter_max_order 10 zfs_abd_scatter_min_size 1536 zfs_admin_snapshot 0 zfs_arc_average_blocksize 8192 zfs_arc_dnode_limit 0 zfs_arc_dnode_limit_percent 10 zfs_arc_dnode_reduce_percent 10 zfs_arc_grow_retry 0 zfs_arc_lotsfree_percent 10 zfs_arc_max 339584832512 zfs_arc_meta_adjust_restarts 4096 zfs_arc_meta_limit 0 zfs_arc_meta_limit_percent 75 zfs_arc_meta_min 0 zfs_arc_meta_prune 10000 zfs_arc_meta_strategy 1 zfs_arc_min 536870912 zfs_arc_min_prefetch_ms 0 zfs_arc_min_prescient_prefetch_ms 0 zfs_arc_p_dampener_disable 1 zfs_arc_p_min_shift 0 zfs_arc_pc_percent 0 zfs_arc_shrink_shift 0 zfs_arc_sys_free 0 zfs_async_block_max_blocks 100000 zfs_autoimport_disable 1 zfs_checksum_events_per_second 20 zfs_commit_timeout_pct 5 zfs_compressed_arc_enabled 1 zfs_condense_indirect_commit_entry_delay_ms 0 zfs_condense_indirect_vdevs_enable 1 zfs_condense_max_obsolete_bytes 1073741824 zfs_condense_min_mapping_bytes 131072 zfs_dbgmsg_enable 1 zfs_dbgmsg_maxsize 4194304 zfs_dbuf_state_index 0 zfs_ddt_data_is_special 1 zfs_deadman_checktime_ms 60000 zfs_deadman_enabled 1 zfs_deadman_failmode wait zfs_deadman_synctime_ms 600000 zfs_deadman_ziotime_ms 300000 zfs_dedup_prefetch 0 zfs_delay_min_dirty_percent 60 zfs_delay_scale 500000 zfs_delete_blocks 20480 zfs_dirty_data_max 4294967296 zfs_dirty_data_max_max 4294967296 zfs_dirty_data_max_max_percent 25 zfs_dirty_data_max_percent 10 zfs_dirty_data_sync_percent 20 zfs_disable_ivset_guid_check 0 zfs_dmu_offset_next_sync 0 zfs_expire_snapshot 300 zfs_flags 0 zfs_free_bpobj_enabled 1 zfs_free_leak_on_eio 0 zfs_free_min_time_ms 100 zfs_immediate_write_sz 32768 zfs_initialize_value 16045690984833335022 zfs_key_max_salt_uses 400000000 zfs_lua_max_instrlimit 100000000 zfs_lua_max_memlimit 104857600 zfs_max_missing_tvds 0 zfs_max_recordsize 1048576 zfs_metaslab_fragmentation_threshold 70 zfs_metaslab_segment_weight_enabled 1 zfs_metaslab_switch_threshold 2 zfs_mg_fragmentation_threshold 95 zfs_mg_noalloc_threshold 0 zfs_multihost_fail_intervals 10 zfs_multihost_history 0 zfs_multihost_import_intervals 20 zfs_multihost_interval 1000 zfs_multilist_num_sublists 0 zfs_no_scrub_io 0 zfs_no_scrub_prefetch 0 zfs_nocacheflush 0 zfs_nopwrite_enabled 1 zfs_object_mutex_size 64 zfs_obsolete_min_time_ms 500 zfs_override_estimate_recordsize 0 zfs_pd_bytes_max 52428800 zfs_per_txg_dirty_frees_percent 5 zfs_prefetch_disable 0 zfs_read_chunk_size 1048576 
zfs_read_history 0 zfs_read_history_hits 0 zfs_reconstruct_indirect_combinations_max 4096 zfs_recover 0 zfs_recv_queue_length 16777216 zfs_removal_ignore_errors 0 zfs_removal_suspend_progress 0 zfs_remove_max_segment 16777216 zfs_resilver_disable_defer 0 zfs_resilver_min_time_ms 3000 zfs_scan_checkpoint_intval 7200 zfs_scan_fill_weight 3 zfs_scan_ignore_errors 0 zfs_scan_issue_strategy 0 zfs_scan_legacy 0 zfs_scan_max_ext_gap 2097152 zfs_scan_mem_lim_fact 20 zfs_scan_mem_lim_soft_fact 20 zfs_scan_strict_mem_lim 0 zfs_scan_suspend_progress 0 zfs_scan_vdev_limit 4194304 zfs_scrub_min_time_ms 1000 zfs_send_corrupt_data 0 zfs_send_queue_length 16777216 zfs_send_unmodified_spill_blocks 1 zfs_slow_io_events_per_second 20 zfs_spa_discard_memory_limit 16777216 zfs_special_class_metadata_reserve_pct 25 zfs_sync_pass_deferred_free 2 zfs_sync_pass_dont_compress 8 zfs_sync_pass_rewrite 2 zfs_sync_taskq_batch_pct 75 zfs_trim_extent_bytes_max 134217728 zfs_trim_extent_bytes_min 32768 zfs_trim_metaslab_skip 0 zfs_trim_queue_limit 10 zfs_trim_txg_batch 32 zfs_txg_history 100 zfs_txg_timeout 2 zfs_unlink_suspend_progress 0 zfs_user_indirect_is_special 1 zfs_vdev_aggregate_trim 0 zfs_vdev_aggregation_limit 1048576 zfs_vdev_aggregation_limit_non_rotating 131072 zfs_vdev_async_read_max_active 3 zfs_vdev_async_read_min_active 1 zfs_vdev_async_write_active_max_dirty_percent 60 zfs_vdev_async_write_active_min_dirty_percent 30 zfs_vdev_async_write_max_active 10 zfs_vdev_async_write_min_active 2 zfs_vdev_cache_bshift 16 zfs_vdev_cache_max 16384 zfs_vdev_cache_size 0 zfs_vdev_default_ms_count 200 zfs_vdev_initializing_max_active 1 zfs_vdev_initializing_min_active 1 zfs_vdev_max_active 1000 zfs_vdev_min_ms_count 16 zfs_vdev_mirror_non_rotating_inc 0 zfs_vdev_mirror_non_rotating_seek_inc 1 zfs_vdev_mirror_rotating_inc 0 zfs_vdev_mirror_rotating_seek_inc 5 zfs_vdev_mirror_rotating_seek_offset 1048576 zfs_vdev_ms_count_limit 131072 zfs_vdev_queue_depth_pct 1000 zfs_vdev_raidz_impl cycle [fastest] original scalar sse2 ssse3 avx2 zfs_vdev_read_gap_limit 32768 zfs_vdev_removal_max_active 2 zfs_vdev_removal_min_active 1 zfs_vdev_scheduler unused zfs_vdev_scrub_max_active 2 zfs_vdev_scrub_min_active 1 zfs_vdev_sync_read_max_active 10 zfs_vdev_sync_read_min_active 10 zfs_vdev_sync_write_max_active 10 zfs_vdev_sync_write_min_active 10 zfs_vdev_trim_max_active 2 zfs_vdev_trim_min_active 1 zfs_vdev_write_gap_limit 4096 zfs_zevent_cols 80 zfs_zevent_console 0 zfs_zevent_len_max 768 zfs_zil_clean_taskq_maxalloc 1048576 zfs_zil_clean_taskq_minalloc 1024 zfs_zil_clean_taskq_nthr_pct 100 zil_maxblocksize 131072 zil_nocacheflush 0 zil_replay_disable 0 zil_slog_bulk 786432 zio_deadman_log_all 0 zio_dva_throttle_enabled 1 zio_requeue_io_start_cut_in_line 1 zio_slow_io_ms 30000 zio_taskq_batch_pct 75 zvol_inhibit_dev 0 zvol_major 230 zvol_max_discard_blocks 16384 zvol_prefetch_bytes 131072 zvol_request_sync 0 zvol_threads 32 zvol_volmode 1

VDEV cache disabled, skipping section

ZIL committed transactions:                                   0
        Commit requests:                                      8
        Flushes to stable storage:                            8
        Transactions to SLOG storage pool:            0 Bytes 0
        Transactions to non-SLOG storage pool:        0 Bytes 0

lignumqt · Nov 10 '21 12:11

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale[bot] · Nov 26 '22 21:11