path: root/fs/dcache.c
diff options
authorLinus Torvalds <torvalds@linux-foundation.org>2012-03-19 16:19:53 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2012-03-19 16:19:53 -0700
commit6d7d1a0dc735ea8412769edae7154885021107a9 (patch)
tree9ccf7b36be77b61cdb1491f262be95592c731858 /fs/dcache.c
parentc16fa4f2ad19908a47c63d8fa436a1178438c7e7 (diff)
vfs: get rid of batshit-insane pointless dentry hash calculations
For some odd historical reason, the final mixing round for the dentry cache hash table lookup had an insane "xor with big constant" logic. In two places. The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a fairly random-looking number that is designed to be *multiplied* with so that the bits get spread out over a whole long-word. But xor'ing with it is insane. It doesn't really even change the hash - it really only shifts the hash around in the hash table. To make matters worse, the insane big constant is different on 32-bit and 64-bit builds, even though the name hash bits we use are always 32-bit (and the bits from the pointer we mix in effectively are too). It's all total voodoo programming, in other words. Now, some testing and analysis of the hash chains shows that the rest of the hash function seems to be fairly good. It does pick the right bits of the parent dentry pointer, for example, and while it's generally a bad idea to use an xor to mix down the upper bits (because if there is a repeating pattern, the xor can cause "destructive interference"), it seems to not have been a disaster. For example, replacing the hash with the normal "hash_long()" code (that uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes the hash worse. The hand-picked hash knew which bits of the pointer had the highest entropy, and hash_long() ends up mixing bits less optimally at least in some trivial tests. So the hash function overall seems fine, it just has that really odd "shift result around by a constant xor". So get rid of the silly xor, and replace the down-mixing of the bits with an add instead of an xor that tends to not have the same kind of destructive interference issues. Some stats on the resulting hash chains shows that they look statistically identical before and after, but the code is simpler and no longer makes you go "WTF?". Also, the incoming hash really is just "unsigned int", not a long, and there's no real point to worry about the high 26 bits of the dentry pointer for the 64-bit case, because they are all going to be identical anyway. So also change the hashing to be done in the more natural 'unsigned int' that is the real size of the actual hashed data anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/dcache.c')
1 files changed, 3 insertions, 3 deletions
diff --git a/fs/dcache.c b/fs/dcache.c
index bcbdb33fcc2..5f00a6f63c9 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -105,10 +105,10 @@ static unsigned int d_hash_shift __read_mostly;
static struct hlist_bl_head *dentry_hashtable __read_mostly;
static inline struct hlist_bl_head *d_hash(const struct dentry *parent,
- unsigned long hash)
+ unsigned int hash)
- hash += ((unsigned long) parent ^ GOLDEN_RATIO_PRIME) / L1_CACHE_BYTES;
- hash = hash ^ ((hash ^ GOLDEN_RATIO_PRIME) >> D_HASHBITS);
+ hash += (unsigned long) parent / L1_CACHE_BYTES;
+ hash = hash + (hash >> D_HASHBITS);
return dentry_hashtable + (hash & D_HASHMASK);