[zfs-discuss] zfs database optimization

Gordan Bobic gordan.bobic at gmail.com
Sat Dec 31 06:17:03 EST 2016

On Sat, Dec 31, 2016 at 11:00 AM, Fajar A. Nugraha <list at fajar.net> wrote:

> On Thu, Dec 29, 2016 at 11:37 PM, Gordan Bobic via zfs-discuss <
> zfs-discuss at list.zfsonlinux.org> wrote:
>> On Thu, Dec 29, 2016 at 4:03 PM, Chris via zfs-discuss <
>> zfs-discuss at list.zfsonlinux.org> wrote:
>>> Another case I've considered is exporting a zvol over iscsi from the zfs
>>> storage server to a database server. In that case, the default volblocksize
>>> is 8k, but again, if I'm planning to use lz4 compression on this workload,
>>> is it better to set a larger default volblocksize?
>> The above holds. Set it to the DB's page size for optimal results.
> There might be a mismatch in how zfs and ext4-on-iscsi-on-zvol
> allocate blocks in this case. ASCII illustration (each o/x is 4k):
> zfs (16k)          : o o o o x x x x
> ext4 on iscsi (4k) : o o x x x x o o
> ext4 can allocate contiguous blocks as extents (16k in this example, for
> innodb's case). But AFAIK there's no guarantee that this extent will start
> on a multiple of 16k, and a misaligned extent makes zfs process 2 blocks
> for each innodb page.
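
The misalignment in the illustration above can be checked with a little arithmetic. The following is only a sketch: the function name and the 8k starting offset are made up for illustration, and it assumes a 16k volblocksize and a 16k innodb page as in the quoted example.

```python
def zfs_blocks_touched(offset_bytes, length_bytes, volblocksize=16 * 1024):
    """Count how many volblocksize-sized blocks a byte range overlaps.

    Hypothetical helper for illustration; not part of any ZFS tooling.
    """
    first = offset_bytes // volblocksize
    last = (offset_bytes + length_bytes - 1) // volblocksize
    return last - first + 1


PAGE = 16 * 1024  # assumed innodb page size, as in the quoted example

# Extent starting on a multiple of 16k: one zfs block per page.
aligned = zfs_blocks_touched(0, PAGE)

# Extent starting 8k in (two 4k ext4 blocks, as drawn above): the page
# straddles a zfs block boundary.
misaligned = zfs_blocks_touched(8 * 1024, PAGE)

print(aligned, misaligned)
```

Any starting offset that is not a multiple of 16k gives the same result for a 16k page, so the cost is per page, not per extent.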

That is what the mkfs.ext4 -E stride and stripe-width options are for.
Something like the following should minimize misalignment when using ext4:
mkfs.ext4 -b 4096 -E stride=4,stripe-width=4
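
The value 4 in that command is just the volblocksize expressed in filesystem blocks (16k / 4k). A rough sketch of the calculation, assuming a 16k volblocksize and 4k ext4 blocks; the helper name is hypothetical. Note that stride and stripe-width are layout hints to the ext4 allocator, so this reduces misalignment rather than guaranteeing it never happens.

```python
def ext4_stride(volblocksize, fs_block_size=4096):
    """Return volblocksize as a count of ext4 blocks.

    Hypothetical helper; assumes volblocksize is a multiple of the
    filesystem block size.
    """
    if volblocksize % fs_block_size != 0:
        raise ValueError("volblocksize must be a multiple of the fs block size")
    return volblocksize // fs_block_size


stride = ext4_stride(16 * 1024)  # 16k zvol blocks, 4k ext4 blocks
extended_opts = f"stride={stride},stripe-width={stride}"
print(extended_opts)
```

For a different volblocksize, the same division gives the value to pass to -E.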

But why would anyone use ext4 when they have ZFS?

> The best way I can think of to work around this problem is to use a
> filesystem (on top of iscsi) which can handle 16k block size (for innodb).
> Btrfs uses 16k block size by default (perfect match for innodb), and can be
> increased up to 64k (this is what I currently use on my vm-on-zvol setup
> since I don't use innodb, with matching zfs volblocksize).

Please, don't use the B word until it's actually fit for purpose. Which I
expect to happen any decade now.