Please describe the problem.
After initializing a new, empty v7 repository on a remote NTFS drive mounted on Linux via CIFS, adding a new file fails with a SQLite3 error. A v5 repository in direct mode (created with git annex init --version=5) works.
What steps will reproduce the problem?
$ git init .
$ git annex init test
$ git annex add file1
add file1
git-annex: SQLite3 returned ErrorBusy while attempting to perform close: unable to close due to unfinalized statements or unfinished backups.
failed
git-annex: add: 1 failed
What version of git-annex are you using? On what operating system?
Ubuntu 19.04
git-annex version: 7.20190129
build flags: Assistant Webapp Pairing S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.0.0 ghc-8.4.4 http-client-0.5.13.1 persistent-sqlite-2.8.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar hook external
operating system: linux x86_64
supported repository versions: 5 7
upgrade supported from repository versions: 0 1 2 3 4 5 6
Please provide any additional information below.
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
$ mount
//192.168.0.1/myserver/myfolder on /media/mydisk type cifs (rw,nosuid,nodev,noexec,relatime,vers=1.0,cache=none,username=myname,uid=1000,forceuid,gid=1000,forcegid,addr=192.168.0.1,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=61440,wsize=65536,bsize=1048576,echo_interval=60,actimeo=1)
$ mkdir /media/mydisk/tmp
$ cd /media/mydisk/tmp
$ date > file1
$ echo "The following fails..."
$ git init .
Initialized empty Git repository in /media/wdtv/tmp/.git/
$ git annex init test
init test
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/master(unlocked)'
ok
(recording state in git...)
$ git annex add .
add file1
git-annex: SQLite3 returned ErrorBusy while attempting to perform close: unable to close due to unfinalized statements or unfinished backups
failed
git-annex: add: 1 failed
$ echo "The following works..."
$ rm -rf .git/
$ git init .
Initialized empty Git repository in /media/wdtv/tmp/.git/
$ git annex init --version=5 test
init test
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
ok
(recording state in git...)
$ git annex add .
add file1 ok
(recording state in git...)
# End of transcript or log.
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
When the repository is initialized as v5 (git annex init --version=5 test), the legacy direct mode is used and adding files works well.
I guess CIFS must be the cause of the problem; NTFS on Linux works in my tests.
Are you able to reliably reproduce this problem every time, or does it only fail that way some of the time?
It would be good to know if this also happens when using sqlite3 at the command line on the CIFS mount point. Can you do this:
And see if that also crashes.
According to https://www.sqlite.org/c3ref/close.html, a close can indeed fail with BUSY, and I guess the only thing to do then would be to keep retrying until sqlite hopefully gets around to finishing whatever it's doing.
Or, it looks like using sqlite3_close_v2 might be an option, since that leaves the db running in the background until it's able to close. Although if git-annex is exiting at the time, that might be problematic.
sqlite3_close_v2 is not currently available in the Haskell bindings.
Oddly, the docs say this should only happen when there are "unfinalized prepared statements or unfinished sqlite3_backup objects". I don't believe git-annex uses either. It seems likely something else in sqlite is failing that results in BUSY, and if that can be reproduced outside of git-annex it would be good to file a bug on sqlite about it.
Sorry for the long answer delay.
I tried the given commands, after fixing the "insert" one a little (syntax).
The diff between /tmp and /media/mydisk/tmp is the filesystem, /tmp being in RAM. It seems that sqlite does not like the user rights and owner of files mounted by CIFS, here "olivier:users", even though user "olivier" can read/write files there and belongs to group "users".
I've made git-annex catch the ErrorBusy at database close, and retry for up to 16 seconds. That's in 9628ae2e6758d1ec6e10df7f3540cd78ce333f1f
I don't know if that solves the problem. It would be good to try and find out.
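The retry-on-busy idea can be sketched roughly as follows. This is not the git-annex code (which is Haskell and lives in the commit named above); it is a minimal Python illustration. DatabaseBusy and close_with_retry are hypothetical names, and the doubling delay is an assumption; only the 16 second limit comes from the comment above.

```python
import time


class DatabaseBusy(Exception):
    """Hypothetical stand-in for SQLite reporting BUSY when closing."""


def close_with_retry(close, max_wait=16.0, first_delay=0.1, sleep=time.sleep):
    """Call close() until it stops raising DatabaseBusy.

    Gives up and re-raises once a total of max_wait seconds has been
    spent sleeping between attempts.
    """
    waited = 0.0
    delay = first_delay
    while True:
        try:
            return close()
        except DatabaseBusy:
            if waited >= max_wait:
                raise  # still busy after the allowed time
            # Never sleep past the overall limit.
            pause = min(delay, max_wait - waited)
            sleep(pause)
            waited += pause
            delay *= 2  # assumed backoff policy: double the delay each time
```

A retry like this only helps if the busy state is transient; if the filesystem refuses the lock every time, it just delays the failure.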
It looks more like a locking problem than a permission problem.
Anyway, you've certainly shown that sqlite is not able to work well on that filesystem.
What I forgot was that git-annex uses WAL mode for the database, and that changes how sqlite does locking. Can you please try this on a new database file:
I've found several discussions about Sqlite on CIFS.
https://stackoverflow.com/questions/42722855/sqlite-3-on-windows-share-cifs-access-from-ubuntu-nobrl-risk
That also had all write actions fail with "locked". It suggests mounting the CIFS filesystem with the "nobrl" mount option. Another thread suggested instead the "nolock" mount option.
Those may be worth trying, although I'd make sure that annex.pidlock is set to true in the repository's git config. As long as it is, disabling locking at the database level is not unsafe, since only one git-annex process will be able to use the database.
You only ever use git-annex on one computer at a time in a given repository on this CIFS drive.
The lock issue seems confirmed here. And the "nobrl" mount option solves the issue.
Maybe git-annex should inform the user about "nobrl" and "annex.pidlock" during the "git annex init" filesystem check, refuse to go on, and propose legacy direct mode instead if the user cannot accept having no filesystem locking?
Without changing my mount options:
After adding "nobrl" in the mount options:
It seems "nolock" is an alias for "nobrl", results are the same; when I put "nolock" in the mount options, I end up with "nobrl" in the "mount" command output.
Finally, with the "nobrl" mount option:
Hurray!
I notice that git annex init is not able to detect that posix locks are not working (I assume they are not), so it doesn't enable pid locking. You should run:
git config annex.pidlock true
That could be improved: The current probe only tries to set an exclusive lock, but does not verify that the exclusive lock has any effect. But I'm a bit wary about all the ways verifying the lock could go wrong given that the posix locks are violating posix. Trying to take the lock nonblocking might block forever, or trigger any other undefined behavior.
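One way such a verification could look, as a rough sketch and not the probe git-annex uses: hold an exclusive POSIX lock in one process and check that a second process is refused the same lock. posix_locks_effective is a hypothetical name. The timeout guards against the "nonblocking attempt blocks forever" case mentioned above, and any unexpected outcome is treated as "locks not effective".

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child process: try a non-blocking exclusive POSIX lock on the given file.
# Exit status 1 means the lock was refused, 0 means it was granted.
_CHILD = """
import fcntl, sys
with open(sys.argv[1], "r+b") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(1)
sys.exit(0)
"""


def posix_locks_effective(directory, timeout=5.0):
    """Return True only if a lock held here is refused to another process."""
    fd, path = tempfile.mkstemp(prefix="lock-probe-", dir=directory)
    try:
        with os.fdopen(fd, "r+b") as f:
            try:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                return False  # could not even take the lock
            try:
                child = subprocess.run(
                    [sys.executable, "-c", _CHILD, path], timeout=timeout
                )
            except subprocess.TimeoutExpired:
                return False  # the "non-blocking" attempt hung
            # Only an explicit refusal counts as a working lock.
            return child.returncode == 1
    finally:
        os.remove(path)
```

A second process is needed because POSIX locks are held per process, so a second attempt from the same process would always succeed.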
It does seem that init ought to test if sqlite can be used, and fail early otherwise. Better to learn about the problem before you start using the repository.
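A probe of this kind might look like the following sketch. It is an illustration only, not what git-annex does: sqlite_usable_in is a hypothetical name, and the choice of statements (switch to WAL mode, create a table, insert a row, close) is an assumption about what would exercise the locking that fails on CIFS.

```python
import os
import sqlite3
import tempfile


def sqlite_usable_in(directory, timeout=1.0):
    """Try to create, write and close a WAL-mode database in directory.

    Returns (True, None) on success, or (False, reason) if sqlite
    reports an error such as "database is locked".
    """
    with tempfile.TemporaryDirectory(prefix="sqlite-probe-", dir=directory) as tmp:
        conn = None
        try:
            conn = sqlite3.connect(os.path.join(tmp, "probe.db"), timeout=timeout)
            # The comments above note that git-annex uses WAL mode,
            # which changes how sqlite does locking.
            conn.execute("PRAGMA journal_mode=WAL").fetchone()
            conn.execute("CREATE TABLE probe (n INTEGER)")
            conn.execute("INSERT INTO probe VALUES (1)")
            conn.commit()
            conn.close()
            conn = None
            return (True, None)
        except sqlite3.Error as err:
            return (False, str(err))
        finally:
            if conn is not None:
                try:
                    conn.close()
                except sqlite3.Error:
                    pass  # already reporting failure; nothing more to do
```

Running it against the repository's filesystem before any real database is created would surface the problem at init time rather than on the first add.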
We have a user with a cifs filesystem mount reporting that setting pidlock didn't help when using git-annex 10.20220504-g35cafb7 and a cifs mount with options
Please advise on how to troubleshoot this situation.
NB I might later try similarish cifs mount of our local HPC which I believe might expose it similarly.
@yoh Mounting the cifs filesystem with nobrl should avoid the problem. To make that safe, you also need to set annex.pidlock.
Setting annex.pidlock on its own will not help.
I don't think that git-annex can be changed in any way that makes sqlite work on this filesystem in its default state. What I discussed in comment 7 seems like the only feasible improvement to git-annex, and it would only help the user learn about the mount option.
Another way this could be dealt with is to move the sqlite databases to a local filesystem.
Since git-annex only ever uses them for caching information from git, it can rebuild them, so even a tmpfs would work. And it would also be ok if two git-annex processes accessing the same repo used different sqlite databases, since they would both build up the same information. (Even if they were on different computers using a network filesystem.)
This could be done with a git config. And perhaps get git-annex init to somehow probe for this problem and set the git config. Although, what path would it set it to?
(Note that gitAnnexKeysDbIndexCache would also need to be moved since it contains information about the content of a sqlite database.)
I have mostly implemented an annex.dbdir to relocate the sqlite databases. It's in the dbdir branch until I get it fully working.
annex.dbdir is now implemented. Note that it can safely be set to the same path in several repositories. If all your repositories were on cifs, you could even set it globally.
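Setting the option could be scripted along these lines. This is a small sketch: the helper names are hypothetical and the directory is whatever local path you choose; only the annex.dbdir setting itself and the possibility of setting it globally come from the comment above.

```python
import os
import subprocess


def dbdir_config_command(dbdir, global_scope=False):
    """Build the git config command that points annex.dbdir at dbdir."""
    cmd = ["git", "config"]
    if global_scope:
        cmd.append("--global")  # apply to every repository of this user
    cmd += ["annex.dbdir", os.path.expanduser(dbdir)]
    return cmd


def set_annex_dbdir(repo, dbdir, global_scope=False):
    """Run the command inside repo; dbdir should be on a local filesystem."""
    subprocess.run(dbdir_config_command(dbdir, global_scope), cwd=repo, check=True)
```

The directory has to be on a filesystem where sqlite locking works, otherwise the problem just moves with the databases.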
This issue remains open because git-annex init ought to probe to determine when sqlite cannot be used on the repository's filesystem. I don't think it could itself set annex.dbdir to work around the problem, because what would it set it to? Maybe ~/.cache/git-annex/something? Seems better for it to explain the problem to the user and suggest that they set it.
I've implemented probing by git-annex init, and it will display this message to help the user get it configured:
Going to close the bug since I think this is the best way it can be handled.