Please describe the problem.
When attempting to clone and use a git repository in a subdirectory several levels deep on Windows, I observe symptoms very similar to those described at http://git-annex.branchable.com/direct_mode/#comment-8feee726df4e287dd3751bc77fd1441f. By contrast, when I attempt the same operation in a subdirectory higher up, the operation is successful. Logs of both sessions are given below.
My suspicion is that this has to do with exceeding the maximum path length limitation (MAX_PATH) of 260 characters on Windows, as described here: http://msdn.microsoft.com/en-us/library/aa365247.aspx.
What steps will reproduce the problem?
See above.
What version of git-annex are you using? On what operating system?
git annex version
git-annex version: 5.20140517-gee56d21
build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV DNS Feeds Quvi TDFA CryptoHash
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL
remote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier ddar hook external
local repository version: 5
supported repository version: 5
upgrade supported from repository versions: 2 3 4
git version
git version 1.9.0.msysgit.0
Operating system: Windows 7 Professional (64-bit), Service Pack 1
Please provide any additional information below.
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
C:\Users\areeves\Documents\Work\MyDirectoryHere\git>git clone ssh://areeves@myserver:/home/work/git/sbv
Cloning into 'sbv'...
remote: Counting objects: 65, done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 65 (delta 26), reused 0 (delta 0)
Receiving objects: 100% (65/65), 9.25 KiB | 0 bytes/s, done.
Resolving deltas: 100% (26/26), done.
Checking connectivity... done.
C:\Users\areeves\Documents\Work\MyDirectoryHere\git>cd sbv
C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv>git annex get
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
git-annex: C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv\.git\annex\objects\3de\5f4\SHA256-s765223180--c9e2eebd915b4ade9429b00a7a893df928389b3fb4ab759ea9f00b0e05e18de6\: openTempFile: does not exist (No such file or directory)
C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv>git annex direct
commit
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
ok
git-annex: C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv\.git\annex\objects\3de\5f4\SHA256-s765223180--c9e2eebd915b4ade9429b00a7a893df928389b3fb4ab759ea9f00b0e05e18de6\: openTempFile: does not exist (No such file or directory)
failed
git-annex: direct: 1 failed
C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv>cd c:\temp
c:\temp>git clone ssh://areeves@myserver:/home/work/git/sbv
Cloning into 'sbv'...
remote: Counting objects: 65, done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 65 (delta 26), reused 0 (delta 0)
Receiving objects: 100% (65/65), 9.25 KiB | 0 bytes/s, done.
Resolving deltas: 100% (26/26), done.
Checking connectivity... done.
c:\temp>cd sbv
c:\temp\sbv>git annex direct
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
(Recording state in git...)
c:\temp\sbv>git annex get
get BigBinaryFile_Data_Package_2012-03-31.tar.bz2.gpg (merging origin/git-annex into git-annex...)
(Recording state in git...)
sent 30 bytes received 765316741 bytes 11011752.10 bytes/sec
total size is 765223180 speedup is 1.00
ok
(Recording state in git...)
c:\temp\sbv>
# End of transcript or log.
I'm having the same problem:
In my case the filename is slightly shorter, 154 characters; for Aaron the offending filename was 162 characters.
I think the full filename that git annex is trying to write is 270 characters:
On Linux and OSX, there is a maximum filename size, typically 255 bytes. git-annex always ensures that keys it generates are a maximum of 255 bytes long, no matter the platform. But, in dir/subdir/file, each of the 3 segments of the path is allowed to be that long. The limit on the total path size on Linux is a more reasonable 4096 bytes; OSX has only 1024 bytes.
I don't know what to do about Windows having such an absurdly small MAX_PATH compared to more modern systems.
The length of just a SHA512 checksum is 128 bytes; that means the SHA512 backend cannot be used on Windows at all, since the paths git-annex generates will be at least twice that long, and will easily overflow MAX_PATH. I've confirmed this; just adding a file with --backend=SHA512 fails with a "No such file or directory" error when it tries to use the path.
A SHA256 is a more manageable 64 bytes long. So a typical path to such an object will end with e.g. ".git\annex\objects\566\a33\SHA256E--d728a4c4727febe1c28509482ae1b7b2215798218e544eed7cb7b4dc988f838b\SHA256E--d728a4c4727febe1c28509482ae1b7b2215798218e544eed7cb7b4dc988f838b" -- 174 bytes long (or a bit longer when there are also extension and size in the key), leaving only 86 bytes or so for c:\path\to\repo.
Perhaps git-annex should reduce its maximum key size from 255 to 64 bytes, the same as SHA256. Then url keys would work on Windows, except for in deep paths, where git-annex cannot work at all. This would be an easy change.
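The arithmetic above can be checked with a few lines. This is only a sketch in Python, not git-annex's own Haskell code; the helper name object_path_suffix is made up for illustration, and it simply mirrors the path layout quoted above (two hash directories, then the key used as both directory name and file name).

```python
MAX_PATH = 260  # Windows limit cited in the bug report


def object_path_suffix(key, hash_dirs=("566", "a33")):
    """Build the part of the object path that sits below the repository.

    Hypothetical helper: it only reproduces the layout quoted above.
    """
    return "\\".join([".git", "annex", "objects", *hash_dirs, key, key])


sha256e_key = (
    "SHA256E--"
    "d728a4c4727febe1c28509482ae1b7b2215798218e544eed7cb7b4dc988f838b"
)
suffix = object_path_suffix(sha256e_key)
budget = MAX_PATH - len(suffix)

print(len(suffix))  # 174
print(budget)       # 86, what is left for c:\path\to\repo
```

Keys that carry a size and an extension make the suffix longer, so the real budget for the repository path is somewhat smaller than this.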
git-annex could also avoid using absolute paths, which it currently uses extensively for simplicity (and possibly robustness against renames of repositories and changes of working directory?), and use relative paths instead. This would probably solve the two examples given in the bug report, and it would make git-annex work better when in a deep path in Windows. It would not make SHA512 work though; with keys that long, the relative path is still too long. (And, it's still possible to get a relative path that has so many '../../' and subdirectories etc that it overflows MAX_PATH. It would probably take a really crazy repository directory structure though.)
The MSDN article has one very interesting bit:
(It seems that, when using that prefix, / is not converted to \. I think git-annex is quite good about getting the slashes the right way round these days.)
So it might be possible for git-annex to use that prefix and avoid this issue entirely. Haskell's FilePath library does understand that prefix (treats it as part of the drive). Since git-annex always uses the path to the top of the Repo when constructing the problematic FilePaths, I might be able to just change the Repo constructor to add that prefix, and everything would follow from that. I tried doing that; unfortunately this makes git fail, with "fatal: relative path syntax cannot be used outside working tree" when operating on such a repo, because git doesn't understand that prefix.
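For reference, the prefix that the MSDN article documents for extended-length paths is \\?\ (with \\?\UNC\ for network shares). Below is a rough sketch of what adding it involves, written in Python rather than Haskell and not taken from git-annex; to_extended_length is a hypothetical helper. Because Windows does no further normalization on prefixed paths, the sketch converts slashes and collapses ".." itself before adding the prefix.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\


def to_extended_length(path):
    """Return an absolute Windows path with the extended-length prefix.

    Hypothetical helper for illustration only.
    """
    if path.startswith(EXTENDED_PREFIX):
        return path  # already prefixed, leave untouched
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # Prefixed paths are passed through verbatim, so normalize first:
    # this turns / into \ and resolves "." and ".." components.
    normalized = ntpath.normpath(path)
    if normalized.startswith("\\\\"):
        # UNC share: \\server\share\x becomes \\?\UNC\server\share\x
        return EXTENDED_PREFIX + "UNC\\" + normalized[2:]
    return EXTENDED_PREFIX + normalized


print(to_extended_length("C:/temp/sbv"))
```

This only shows the path rewriting. As noted above, git itself rejects such paths, so a tool that shells out to git would have to strip the prefix again before handing paths over.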
I've started a relativepaths branch that uses all relative paths to the git repo. After working on it for several hours, there are still 16 test suite failures (update: 10) (update: 1). The potential for uncaught breakage is much higher than I am happy with. (Among other problems, git-annex does call setCurrentDirectory in several places, and this utterly breaks the relative paths.)
Using that branch on Windows, I am still unable to add files with --backend=SHA512; even relative paths don't make it short enough for such keys.
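To illustrate why relative paths help but cannot rescue SHA512, here is a small Python sketch (not code from the relativepaths branch). The repository path is the one from the transcript in the report; the key uses a placeholder digest of 128 zeros, and object_path is a hypothetical helper following the object directory layout seen in the error messages.

```python
import ntpath

MAX_PATH = 260


def object_path(repo, key, hash_dirs=("3de", "5f4")):
    """Hypothetical helper: absolute path of an annexed object."""
    return ntpath.join(repo, ".git", "annex", "objects", *hash_dirs, key, key)


repo = r"C:\Users\areeves\Documents\Work\MyDirectoryHere\git\sbv"
sha512_key = "SHA512--" + "0" * 128  # placeholder digest, not a real key

absolute = object_path(repo, sha512_key)
relative = ntpath.relpath(absolute, start=repo)

# The relative form saves exactly the repository prefix plus one separator,
# and it is only valid while the working directory is the repository top.
saved = len(absolute) - len(relative)
still_too_long = len(relative) > MAX_PATH

print(saved, len(relative), still_too_long)
```

The saving depends only on how deep the repository sits, while the key appears twice in the path regardless, which is why a long enough key overflows the limit by itself.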
Even with relative paths, Edward's example would use a path of 253 characters, so a slightly longer url would still break it.
So, I think reducing url key length needs to be done anyway, and I've done that. Which hardly closes this bug.
I've beat on the relativepaths branch some more and am probably as confident about it as I'm going to get. Will have to merge it and see what else it breaks.
Also, I've documented that SHA512 and other large hashes are not recommended if one wants to interop with Windows.
None of which completely fixes this bug, but short of teaching git about the magic filename prefix to make windows not be so broken, I don't see anything more I can do.
Workaround: Enable long paths in the windows registry. See https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247#maxpath
It would be good to make git-annex enable that automatically, perhaps by using the manifest file that is described on that page. I don't know how to make windows use such a manifest file. It seems to have to be embedded into the exe file. GHC has an open ticket to get it to do that: https://ghc.haskell.org/trac/ghc/ticket/13373
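When checking whether the workaround is in effect, it can help to read the registry value back. The following Python sketch assumes the value documented on the linked page, LongPathsEnabled under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem; the function name is made up here. Note that the registry setting alone is not sufficient: per that page, the program must also opt in through its manifest, which is the part discussed above.

```python
import sys


def long_paths_enabled():
    """Report the LongPathsEnabled registry setting.

    Returns True or False on Windows when the value can be read, and None
    when it is missing or when not running on Windows. Hypothetical helper.
    """
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            r"SYSTEM\CurrentControlSet\Control\FileSystem",
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None  # key or value not present, or not readable
    return value == 1


print(long_paths_enabled())
```

A result of True together with a continuing failure would point at the missing manifest opt-in rather than at the registry setting.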
I enabled long filename support (running Windows 10 1709 build 16299.192), did a reboot and I'm still getting this error:
That filename is only 216 characters too... is there any way to diagnose why the file creation failed?