Please describe the problem.
I decided to test git-annex on a file system that is new to me -- beegfs
$> mount | grep beegfs
beegfs_nodev on /mnt/beegfs type beegfs (rw,relatime,cfgFile=/etc/beegfs/beegfs-client.conf,_netdev)
$> modinfo beegfs
filename: /lib/modules/5.4.0-77-generic/updates/fs/beegfs_autobuild/beegfs.ko
version: 7.2.2
alias: fs-beegfs
author: Fraunhofer ITWM, CC-HPC
description: BeeGFS parallel file system client (http://www.beegfs.com)
license: GPL v2
srcversion: 533BB7E5866E52F63B9ACCB
depends: ib_core,rdma_cm
retpoline: Y
name: beegfs
vermagic: 5.4.0-77-generic SMP mod_unload modversions
What steps will reproduce the problem?
get beegfs
leviathan:/mnt/beegfs/yoh/tmp $> TMPDIR=$PWD/annex-tmp git annex test
What version of git-annex are you using? On what operating system?
leviathan:/mnt/beegfs/yoh/tmp
$> git annex version
git-annex version: 8.20210621-g91f9aac
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.1 ghc-8.8.4 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
Please provide any additional information below.
looking in detail -- it seems it is not init that fails, but addurl (the subject is set in stone now, I can't edit it) -- I guess I got misled by the interleaved stdout/stderr:
addurl: FAIL (2.79s)
Init Tests
init: ./Test/Framework.hs:57:
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo96/myurl failed (transcript follows)
(to _mnt_beegfs_yoh_tmp_.t_tmprepo96_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo96%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
...
addurl: FAIL (1.86s)
./Test/Framework.hs:57:
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo193/myurl failed (transcript follows)
(to _mnt_beegfs_yoh_tmp_.t_tmprepo193_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo193%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
Init Tests
...
addurl: FAIL (2.29s)
./Test/Framework.hs:57:
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo293/myurl failed (transcript follows)
(to _mnt_beegfs_yoh_tmp_.t_tmprepo293_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo293%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
3 out of 984 tests failed (1776.96s)
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
on days ending with y
it seems to work quite nicely.
fixed, I think, though have not installed beegfs to test. --Joey
".git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo193%myurl" is not a directory, it is a file. So, rename seems to have no business failing in this way. Probably the FS is buggy.
Thank you Joey! Indeed, this is most likely a case of a "too fancy" file system.
On https://www.beegfs.io/release/beegfs_6/Changelog.txt I found
Do you think EXDEV would be handled OK if that is the culprit? (Meanwhile I will let the beegfs users know as well; maybe they could try.)
I've checked with strace to see if the file was open while it was being renamed. Not that there is anything generally wrong with renaming an open file on a POSIX file system, but it could be a problem on Windows, where some forms of opening a file lock it in place. And apparently this filesystem is not trying to be very POSIX either.
So the file is left open across the rename. It should be possible to change that, and doing so would presumably fix the problem.
It's also a bit odd that the file gets read twice after being copied. One read for the checksum makes sense, but what's the other one? (Copying while checksumming should be able to avoid one of the reads, but there is an open todo tracking progress on that.)
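For anyone who wants to check whether a given filesystem objects to this pattern, here is a minimal Haskell sketch. It keeps a read handle open on a file while renaming it, and returns the exception instead of crashing. The name `renameWhileOpen` and any paths passed to it are made up for illustration; this is not code from git-annex.

```haskell
import Control.Exception (IOException, try)
import System.Directory (renameFile)
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | Rename src to dest while a read handle on src is still open.
-- Returns Left with the IOException if the rename is refused.
-- (Hypothetical helper, for probing filesystem behavior only.)
renameWhileOpen :: FilePath -> FilePath -> IO (Either IOException ())
renameWhileOpen src dest =
  -- The handle stays open for the whole body, so the rename
  -- happens while the file is open.
  withBinaryFile src ReadMode $ \_handle ->
    try (renameFile src dest)
```

On a POSIX filesystem this is expected to return `Right ()`. On a filesystem that behaves like the one in this report it would presumably return a `Left` carrying the "resource busy" error, though that has not been verified here.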
Aah, the other read is when it's probing if the file is html in case it ought to be passed off to youtube-dl. That is the read that lingers for a while, because it's done with a lazy readFile and probing if the file is html doesn't read to the end and close it, so the file handle lingers until the GC gets around to closing it. Of course youtube-dl won't be able to do anything with a file url, but git-annex doesn't know that. And anyway the failure on this filesystem would also happen when adding an http url.
Ok, fixed it to close the handle promptly. That should fix the test suite. It does not seem unlikely that something else will break due to this filesystem's unusual behavior though.
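To illustrate the general idea of closing the handle promptly (not the actual git-annex change, which is not shown in this report), here is a minimal Haskell sketch. It reads a bounded prefix strictly inside `withBinaryFile`, so the handle is closed before the rename instead of lingering until the GC runs. The names `readPrefix`, `looksLikeHtml`, and `probeThenRename`, the 512-byte limit, and the html check itself are all assumptions made for this example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Char (isSpace, toLower)
import System.Directory (renameFile)
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | Strictly read at most n bytes. The handle is closed as soon as
-- withBinaryFile returns, unlike with a partially consumed lazy readFile.
readPrefix :: Int -> FilePath -> IO B.ByteString
readPrefix n path = withBinaryFile path ReadMode (\h -> B.hGet h n)

-- | Crude, hypothetical html check: does the content start with an
-- html marker, ignoring leading whitespace and case?
looksLikeHtml :: B.ByteString -> Bool
looksLikeHtml bs = any (`B.isPrefixOf` start) markers
  where
    start = B8.map toLower (B8.dropWhile isSpace bs)
    markers = map B8.pack ["<!doctype html", "<html"]

-- | Probe the temp file, then rename it. No handle on tmp is open
-- by the time renameFile runs.
probeThenRename :: FilePath -> FilePath -> IO Bool
probeThenRename tmp dest = do
  isHtml <- looksLikeHtml <$> readPrefix 512 tmp
  renameFile tmp dest
  pure isHtml
```

The point of the design is that the read is both bounded and strict: `B.hGet` returns a fully evaluated `ByteString`, so nothing about the result depends on the handle staying open.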
Also looked over other uses of readFile. While there are a couple that don't read the whole file and so may have a lag closing, none of them are files that are used in ways that seem likely to trigger this kind of problem.