According to https://cyan4973.github.io/xxHash/, xxHash seems much faster than md5 with comparable quality. There's a Haskell implementation.
I looked at xxHash recently. I can't seem to find benchmarks of it compared with other fast hashes like Blake2.
Let alone blake3, which is 5-6 times as fast as blake2 while still apparently being a cryptographically secure hash.
https://cyan4973.github.io/xxHash/ now includes blake2, and xxh3 is much faster, 28 times as fast.
This would need a Haskell library; http://hackage.haskell.org/package/xxhash is out of date. It would probably not be hard to make an xxh3 Haskell library, but I'm inclined to wait for someone who really wants it.
Debian already has it in libxxhash0.
Interestingly, the ghc RTS uses xxhash and recently updated to xxh3. https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4248 But I don't think that's exposed to haskell code.
I'm trying to create an external backend for xxHash, but ran into some weird behavior.
If only /bin/git-annex-backend-XXH3 is present in $PATH, and git config annex.backend XXH3 is set, then git annex complains "Cannot run git-annex-backend-XH3 -- It is not installed in PATH", which seems like a bug. And if /bin/git-annex-backend-XXH3 is moved to /bin/git-annex-backend-XH3 according to the error message, it will complain "Cannot run git-annex-backend-XXH3 -- It is not installed in PATH" (this is expected). Finally I have to link the same shell script to both /bin/git-annex-backend-XH3 and /bin/git-annex-backend-XXH3 to make the backend config XXH3 work.
This is a bug in your program. It is generating a key using the XH3 backend, rather than the XXH3 backend.
When git-annex later wants to do something with that key, it expects to find a git-annex-backend-XH3 program.
This change will fix it:
However, since the hash is named "XXHASH", and this is an external backend, I think the backend name you should really be using is "XXXHASH". This leaves the "XXHASH" backend name free for git-annex to use if it implemented it as a built-in backend.
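To make the name mismatch concrete, here is a minimal Python sketch of the relationship between the program name and the backend field of a key. It assumes the usual BACKEND-sSIZE--NAME key layout; the helper names (backend_from_program, make_key, backend_of_key, key_matches_program) are made up for illustration and are not part of git-annex or of the external backend protocol.

```python
import os

PREFIX = "git-annex-backend-"

def backend_from_program(path):
    """Backend name implied by an external backend program's file name."""
    name = os.path.basename(path)
    if not name.startswith(PREFIX):
        raise ValueError(f"not an external backend program: {path}")
    return name[len(PREFIX):]

def make_key(backend, size, digest):
    """Format a key as BACKEND-sSIZE--DIGEST (assumed layout)."""
    return f"{backend}-s{size}--{digest}"

def backend_of_key(key):
    """Backend field of a key: everything before the first '-'."""
    return key.split("-", 1)[0]

def key_matches_program(key, path):
    """True if the key's backend field names the given program."""
    return backend_of_key(key) == backend_from_program(path)
```

With this sketch, a key built as make_key("XH3", ...) does not match /bin/git-annex-backend-XXH3, which is the situation described above: the program for the key's backend is looked up by the name in the key, not by the configured backend.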
Once you have the program working, we can add it to the list of external backends.
I am inclined to keep this todo open despite external backend programs existing, because it would be nice to have xxHash in git-annex natively due to its speed.
I found this haskell library which includes xxh3 and which would be easy to add as a git-annex dependency, although it would need to be gated behind a build flag for now: https://hackage.haskell.org/package/xxhash-ffi
(Since that library uses Hashable, it generates an Int for the hash. This seems to limit it to 64 bit platforms, since a 64 bit hash would not fit in an Int elsewhere. https://github.com/haskell-haskey/xxhash-ffi/issues/6 The lower-level Data.Digest.XXHash.FFI.C uses CULLong so will work on 32 bit.)
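A rough Python illustration of why a machine-word Int is a problem for a 64 bit hash on a 32 bit platform. It assumes the value would be narrowed by keeping only the low 32 bits, which may not be exactly what the library does; the two example values are arbitrary, not real xxHash outputs.

```python
MASK32 = 0xFFFFFFFF

def truncate_to_32(h64):
    """Keep only the low 32 bits of a 64 bit value (assumed narrowing)."""
    return h64 & MASK32

# Two different 64 bit values that share their low 32 bits.
a = 0x0123456789ABCDEF
b = 0xFEDCBA9889ABCDEF

distinct_as_64 = a != b
collide_as_32 = truncate_to_32(a) == truncate_to_32(b)
```

Under that assumption, half of the hash is lost, so values that differ as 64 bit hashes can become indistinguishable.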