[zfs-devel] Ping: reasoning in module/zfs/zfs_ctldir.c
u-odpp at aetey.se
Thu Oct 13 04:27:17 EDT 2016
On Thu, Oct 13, 2016 at 03:56:51AM -0400, Chris Siebenmann wrote:
> I got curious enough to take a quick look in the code.
> The specific comment is in module/zfs/zfs_ctldir.c:
> * Because the dynamically created '.zfs' directory entries assume the use
> * of 64-bit inode numbers this support must be disabled on 32-bit systems.
> The Linux kernel generic inode structure has an i_ino member that
> is defined as an unsigned long. On 32-bit systems, unsigned longs
> are only 32 bits (you must go to unsigned long longs to get 64 bits).
> Various functions and data structures inside the ZFS ctldir stuff deal
> with 64-bit identifiers that must be put into unsigned long inode numbers,
> or passed to general Linux kernel functions that take inode numbers (as
> unsigned longs). One example function here is zfsctl_inode_lookup(),
> which takes a 'uint64_t id' and winds up calling 'ilookup(zsb->z_sb,
> (unsigned long)id)'.
> The ctldir code uses special inode numbers to represent various
> special things. Per the end of include/sys/zfs_ctldir.h:
> * These inodes numbers are reserved for the .zfs control directory.
> * It is important that they be no larger that 48-bits because only
> * 6 bytes are reserved in the NFS file handle for the object number.
> * However, they should be as large as possible to avoid conflicts
> * with the objects which are assigned monotonically by the dmu.
> #define ZFSCTL_INO_ROOT 0x0000FFFFFFFFFFFFULL
> #define ZFSCTL_INO_SHARES 0x0000FFFFFFFFFFFEULL
> #define ZFSCTL_INO_SNAPDIR 0x0000FFFFFFFFFFFDULL
> #define ZFSCTL_INO_SNAPDIRS 0x0000FFFFFFFFFFFCULL
> There is nothing in the code that actually stops these special
> inode values from colliding with inode numbers for real files.
Isn't the safeguard against collisions the one the comment above
refers to with "assigned monotonically by the dmu"?
As long as object numbers are assigned sparingly, starting from low
values, there is a gap of about 2^32 between zero and those reserved
numbers once they are truncated to 32 bits. This is of course much
less than 2^48, but still enough for many scenarios.
> It is simply extremely unlikely ... if they are full 48-bit
How likely a collision is seems to depend on how the inode numbers
are allocated. Linear sequential allocation with reuse would make it
safe until the number of objects grows to almost 2^32.
> If all of this is correct, I suspect that the odds of the ZFS
> developers ever enabling ctldir support on 32-bit ZOL is extremely
That was the question: whether the analysis (mine or yours) is correct.
Apparently this becomes moot on newer kernels, but it is still
interesting to learn.