[zfs-discuss] Fwd: l2arc fully aggressive settings?

Håkan Johansson h_t_johansson at fastmail.fm
Tue Sep 15 09:37:14 EDT 2015


It was suggested off-list [zfs-devel] that

"I think ZFS on Linux has some code to do this automatically. It tries
to utilize faster devices more. Have you tested with a pool that just
has SSDs and HDDs in mirrored sets with no other tweaking?"

I have now tested with a plain mirror consisting of an SSD and a HDD,
without any tweaking.  No cache devices.

While I do see the SSD performing more reads than the HDD, the overall
speed is lower than when the SSD serves all reads by itself, which I
could also test by simply offlining the HDD.  I suspect that the reads
that do go to the HDD return so late that they hold up overall progress.
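
For what it's worth, one way to watch how the reads split between the
two sides of the mirror while such a test runs ("tank" below is just a
placeholder pool name):

  # per-vdev read/write operations and bandwidth, refreshed every second
  zpool iostat -v tank 1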

Cheers,
Håkan


-- 
  Håkan Johansson
  h_t_johansson at fastmail.fm

On Mon, Sep 14, 2015, at 11:28 PM, Håkan Johansson via zfs-discuss
wrote:
> (This was issue #3774 at https://github.com/zfsonlinux/zfs/issues,
> but I was requested to ask here instead.)
> 
> I am trying to use ZFS as a replacement for a low-latency file system
> built with MD mirror raid of SSD and (a few) HDDs for resiliency. By
> using the MD option write-mostly for the HDDs, all reads are served
> by the SSD with low latency and high bandwidth.
> (As long as the SSD is alive :) )
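> 
> For reference, a rough sketch of that MD setup (the device names here
> are placeholders):
> 
>   # the HDD is flagged write-mostly, so normal reads go to the SSD
>   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
>         /dev/ssd-part --write-mostly /dev/hdd-part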
> 
> I have not been able to find an option to tell a ZFS mirrored pool to
> serve all reads by some specific disks (the SSDs). Is there one?
> (Such an option would be very nice.)
> 
> I thought that by setting the SSD up as L2ARC, and since it is larger
> than the pool, it would serve all reads (after the first time), and
> thus give the read latency and bandwidth performance of the SSDs.
> 
> It seems, however, that the default options of the zfs module do not
> handle this situation well. I have therefore set the following module
> parameters:
> 
> l2arc_noprefetch=0          # also cache streaming (prefetched) reads
> l2arc_headroom=1000         # scan all of the ARC into the L2ARC
> l2arc_write_max=15000000    # the larger write limits, combined with
> l2arc_write_boost=15000000  # the smaller feed interval below, feed
> l2arc_feed_min_ms=50        # about 300 MB/s in total, which is faster
>                             # than the HDDs I have in now can read.
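> 
> A minimal sketch of how these can be made persistent on Debian,
> assuming the parameters are applied at module load time (the file
> name zfs.conf is just a convention):
> 
>   # /etc/modprobe.d/zfs.conf
>   options zfs l2arc_noprefetch=0 l2arc_headroom=1000 \
>       l2arc_write_max=15000000 l2arc_write_boost=15000000 \
>       l2arc_feed_min_ms=50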
> 
> It almost does the trick, but not fully:
> 
> I have created a number of test files; one of them is 8 GB, another
> 22 GB. After reading either of them once, a subsequent read is served
> entirely from the cache devices. Good. But when reading them after
> each other, and then again after each other, some reads go to the
> HDDs. I am confused why data apparently is evicted from the L2ARC,
> as there still is (lots of) free space in the L2ARC...
> 
> This also happens after a fresh boot of the machine, i.e. with a
> cleared L2ARC: reading one of the large files apparently evicts data
> of the other.
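> 
> One way to watch this, assuming the usual ZFS-on-Linux kstat location,
> is to follow the l2_* counters while re-reading the files:
> 
>   # l2_hits, l2_misses, l2_size, l2_evict_* etc.
>   grep '^l2_' /proc/spl/kstat/zfs/arcstats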
> 
> (The machine has 32 GiB of RAM, which is its limit, and I'd like to
> be able to use memory also for tasks other than ZFS. That's why the
> large SSDs (expensive, but way cheaper than memory. :-) ))
> 
> If it is of any help, below is a current dump from arc_summary.pl.
> The machine is running Debian with zfsonlinux 0.6.4-1.2-1.
> 
> ------------------------------------------------------------------------
> ZFS Subsystem Report                            Tue Sep 15 08:20:26 2015
> ARC Summary: (HEALTHY)
> 	Memory Throttle Count:                  0
> 
> ARC Misc:
> 	Deleted:                                5.28m
> 	Recycle Misses:                         225.26k
> 	Mutex Misses:                           253
> 	Evict Skips:                            253
> 
> ARC Size:                               49.11%  6.40    GiB
> 	Target Size: (Adaptive)         49.11%  6.40    GiB
> 	Min Size (Hard Limit):          28.57%  3.73    GiB
> 	Max Size (High Water):          3:1     13.04   GiB
> 
> ARC Size Breakdown:
> 	Recently Used Cache Size:       13.51%  885.89  MiB
> 	Frequently Used Cache Size:     86.49%  5.54    GiB
> 
> ARC Hash Breakdown:
> 	Elements Max:                           4.77m
> 	Elements Current:               100.00% 4.77m
> 	Collisions:                             2.18m
> 	Chain Max:                              8
> 	Chains:                                 1.32m
> 
> ARC Total accesses:                                     9.65m
> 	Cache Hit Ratio:                50.16%  4.84m
> 	Cache Miss Ratio:               49.84%  4.81m
> 	Actual Hit Ratio:               40.40%  3.90m
> 
> 	Data Demand Efficiency:         100.00% 1.46m
> 	Data Prefetch Efficiency:       0.02%   1.40m
> 
> 	CACHE HITS BY CACHE LIST:
> 	  Anonymously Used:               8.48%   410.67k
> 	  Most Recently Used:             25.66%  1.24m
> 	  Most Frequently Used:           54.89%  2.66m
> 	  Most Recently Used Ghost:       4.80%   232.18k
> 	  Most Frequently Used Ghost:     6.17%   298.83k
> 
> 	CACHE HITS BY DATA TYPE:
> 	  Demand Data:                    30.07%  1.46m
> 	  Prefetch Data:          0.01%   287
> 	  Demand Metadata:                50.48%  2.44m
> 	  Prefetch Metadata:              19.45%  941.53k
> 
> 	CACHE MISSES BY DATA TYPE:
> 	  Demand Data:                    0.00%   24
> 	  Prefetch Data:          29.07%  1.40m
> 	  Demand Metadata:                69.66%  3.35m
> 	  Prefetch Metadata:              1.26%   60.73k
> 
> L2 ARC Summary: (HEALTHY)
> 	Low Memory Aborts:                      16
> 	Free on Write:                          10.92k
> 	R/W Clashes:                            12
> 	Bad Checksums:                          0
> 	IO Errors:                              0
> 
> L2 ARC Size: (Adaptive)                         370.62  GiB
> 	Compressed:                     99.37%  368.27  GiB
> 	Header Size:                    0.39%   1.45    GiB
> 
> L2 ARC Evicts:
> 	Lock Retries:                           0
> 	Upon Reading:                           0
> 
> L2 ARC Breakdown:                               4.81m
> 	Hit Ratio:                      23.32%  1.12m
> 	Miss Ratio:                     76.68%  3.69m
> 	Feeds:                                  106.60k
> 
> L2 ARC Writes:
> 	Writes Sent:                    100.00% 33.74k
> 
> File-Level Prefetch: (HEALTHY)
> DMU Efficiency:                                 76.94m
> 	Hit Ratio:                      98.66%  75.91m
> 	Miss Ratio:                     1.34%   1.03m
> 
> 	Colinear:                               1.03m
> 	  Hit Ratio:                      0.01%   147
> 	  Miss Ratio:                     99.99%  1.03m
> 
> 	Stride:                                 74.24m
> 	  Hit Ratio:                      100.00% 74.24m
> 	  Miss Ratio:                     0.00%   681
> 
> DMU Misc: 
> 	Reclaim:                                1.03m
> 	  Successes:                      0.06%   654
> 	  Failures:                       99.94%  1.03m
> 
> 	Streams:                                1.67m
> 	  +Resets:                        0.00%   64
> 	  -Resets:                        100.00% 1.67m
> 	  Bogus:                          0
> 
> 
> ZFS Tunable:
> 	metaslab_debug_load                               0
> 	zfs_arc_min_prefetch_lifespan                     250
> 	zfetch_max_streams                                8
> 	zfs_nopwrite_enabled                              1
> 	zfetch_min_sec_reap                               2
> 	zfs_dirty_data_max_max_percent                    25
> 	zfs_arc_p_aggressive_disable                      1
> 	spa_load_verify_data                              1
> 	zfs_zevent_cols                                   80
> 	zfs_dirty_data_max_percent                        10
> 	zfs_sync_pass_dont_compress                       5
> 	l2arc_write_max                                   15000000
> 	zfs_vdev_scrub_max_active                         2
> 	zfs_vdev_sync_write_min_active                    10
> 	zfs_no_scrub_prefetch                             0
> 	zfs_arc_shrink_shift                              5
> 	zfetch_block_cap                                  256
> 	zfs_txg_history                                   0
> 	zfs_delay_scale                                   500000
> 	zfs_vdev_async_write_active_min_dirty_percent     30
> 	metaslab_debug_unload                             0
> 	zfs_read_history                                  0
> 	zvol_max_discard_blocks                           16384
> 	zfs_recover                                       0
> 	l2arc_headroom                                    1000
> 	zfs_deadman_synctime_ms                           1000000
> 	zfs_scan_idle                                     50
> 	zfs_free_min_time_ms                              1000
> 	zfs_dirty_data_max                                3377293721
> 	zfs_vdev_async_read_min_active                    1
> 	zfs_mg_noalloc_threshold                          0
> 	zfs_dedup_prefetch                                0
> 	zfs_vdev_max_active                               1000
> 	l2arc_write_boost                                 15000000
> 	zfs_resilver_min_time_ms                          3000
> 	zfs_vdev_async_write_max_active                   10
> 	zil_slog_limit                                    1048576
> 	zfs_prefetch_disable                              0
> 	zfs_resilver_delay                                2
> 	metaslab_lba_weighting_enabled                    1
> 	zfs_mg_fragmentation_threshold                    85
> 	l2arc_feed_again                                  1
> 	zfs_zevent_console                                0
> 	zfs_immediate_write_sz                            32768
> 	zfs_free_leak_on_eio                              0
> 	zfs_deadman_enabled                               1
> 	metaslab_bias_enabled                             1
> 	zfs_arc_p_dampener_disable                        1
> 	zfs_metaslab_fragmentation_threshold              70
> 	zfs_no_scrub_io                                   0
> 	metaslabs_per_vdev                                200
> 	zfs_dbuf_state_index                              0
> 	zfs_vdev_sync_read_min_active                     10
> 	metaslab_fragmentation_factor_enabled             1
> 	zvol_inhibit_dev                                  0
> 	zfs_vdev_async_write_active_max_dirty_percent     60
> 	zfs_vdev_cache_size                               0
> 	zfs_vdev_mirror_switch_us                         10000
> 	zfs_dirty_data_sync                               67108864
> 	spa_config_path                                   /etc/zfs/zpool.cache
> 	zfs_dirty_data_max_max                            8443234304
> 	zfs_zevent_len_max                                128
> 	zfs_scan_min_time_ms                              1000
> 	zfs_vdev_cache_bshift                             16
> 	zfs_arc_meta_adjust_restarts                      4096
> 	zfs_arc_memory_throttle_disable                   1
> 	zfs_vdev_scrub_min_active                         1
> 	zfs_vdev_read_gap_limit                           32768
> 	zfs_arc_meta_limit                                0
> 	zfs_vdev_sync_write_max_active                    10
> 	l2arc_norw                                        0
> 	zfs_arc_meta_prune                                10000
> 	metaslab_preload_enabled                          1
> 	l2arc_nocompress                                  0
> 	zvol_major                                        230
> 	zfs_vdev_aggregation_limit                        131072
> 	zfs_flags                                         0
> 	spa_asize_inflation                               24
> 	l2arc_feed_secs                                   1
> 	zfs_sync_pass_deferred_free                       2
> 	zfs_disable_dup_eviction                          0
> 	zvol_threads                                      32
> 	zfs_arc_grow_retry                                5
> 	zfs_read_history_hits                             0
> 	zfs_vdev_async_write_min_active                   1
> 	zfs_vdev_async_read_max_active                    3
> 	zfs_scrub_delay                                   4
> 	zfs_delay_min_dirty_percent                       60
> 	zfs_free_max_blocks                               100000
> 	zfs_vdev_cache_max                                16384
> 	zio_delay_max                                     30000
> 	zfs_top_maxinflight                               32
> 	zfs_vdev_write_gap_limit                          4096
> 	spa_load_verify_metadata                          1
> 	spa_load_verify_maxinflight                       10000
> 	l2arc_noprefetch                                  0
> 	zfs_vdev_scheduler                                noop
> 	zfs_expire_snapshot                               300
> 	zfs_sync_pass_rewrite                             2
> 	zil_replay_disable                                0
> 	zfs_nocacheflush                                  0
> 	zfs_arc_max                                       14000000000
> 	zfs_arc_min                                       4000000000
> 	zfs_read_chunk_size                               1048576
> 	zfs_txg_timeout                                   5
> 	zfs_pd_bytes_max                                  52428800
> 	l2arc_headroom_boost                              200
> 	zfs_send_corrupt_data                             0
> 	l2arc_feed_min_ms                                 50
> 	zfs_arc_average_blocksize                         8192
> 	zfetch_array_rd_sz                                1048576
> 	zfs_autoimport_disable                            1
> 	zio_requeue_io_start_cut_in_line                  1
> 	zfs_vdev_sync_read_max_active                     10
> 	zfs_mdcomp_disable                                0

