Hi, I'm trying to convert a classic v5 symlink repository into a v6 always-unlocked repository.
I'm trying to follow along with: http://git-annex.branchable.com/tips/unlocked_files but something goes wrong in the process, so I'm sure I'm missing something. The repository is about 600G to start with, on an ext4 filesystem, and I don't have another 600G of free space on disk, so I'm going with thin mode:
- git annex upgrade
- git config annex.thin true
- git annex fix
- git annex unlock
It's all good to this point. Everything gets unlocked and is fine. Then I try to commit the changes with:
git annex sync
The process seems to take an incredibly long time (several hours) and then ends up running out of space. I check the repository with "du -sh" and it's almost double the size. Is there a reason for this? Is there a way to avoid this duplication of data? Shouldn't annex.thin do the trick?
This is also strange: with "git annex info" I get
local annex keys: 3572
local annex size: 933.19 gigabytes
annexed files in working tree: 0
size of annexed files in working tree: 0 bytes
The annex size should be about 540G, and why are there no annexed files in the working tree?
Is there a correct and faster way to migrate my repo to an always unlocked one which won't require hours of time and take all that disk space?
thanks a lot, daniel
Thank you for trying v6 unlocked mode; do note that it's still somewhat experimental.
"git annex info" doesn't report correctly on unlocked files, which is why it has 0 in two places in the output you showed. I have just committed a fix for that problem.
I don't know what would cause the unexpectedly small "local annex size". That should correspond to "du -hsc .git/annex/objects"; if it does then you seem to have fewer annexed objects than you expect for some reason.
The behavior on sync sounds kind of like git commit is checking the whole contents of files into git, bypassing the annex. I don't know how that could happen, barring a misconfiguration, but it's at least the first thing to check. Check for gigabytes of data in .git/objects/ to see if that is the case.
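A rough sketch of that check, assuming a du that accepts -s and -h (GNU coreutils does) and that you pass the top of the repository as the argument. The function name object_store_sizes is made up for illustration and is not part of git or git-annex:

```shell
# object_store_sizes REPO: print the size of git's own object store next to
# the git-annex object store, one line each.
# Usage (from the top of the repository): object_store_sizes .
object_store_sizes() {
    du -sh "$1/.git/objects" "$1/.git/annex/objects"
}
```

If the first number is in the gigabytes, file contents have probably been committed into git directly rather than into the annex.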
If the above isn't the problem, can you see if the files in the work tree have a link count of 2? For example:
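One possible way to look at the link count, as a minimal sketch: it assumes GNU coreutils stat (the -c '%h' format prints the number of hard links), and link_count is a hypothetical helper name, not a git-annex command.

```shell
# link_count FILE: print the number of hard links to FILE (GNU stat).
# Usage: link_count path/to/some/unlocked/file
link_count() {
    stat -c '%h' -- "$1"
}
```

With annex.thin in effect, an unlocked file whose content is present would be expected to print 2: one link in the work tree and one under .git/annex/objects.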
If you don't see a link count of 2, something has caused the annex.thin not to take effect. One possibility would be if the "git annex sync" merged in changes that moved a lot of files around. When that happens, git checks out the updated work tree, and git-annex currently is not able to preserve annex.thin in that case.
Hi, thanks for your quick feedback!
So, I've investigated a bit and this is what I found:
I have since deleted the annex and restored it from a backup, because I couldn't afford to keep it in an inconsistent state. It seems like this problem can be consistently reproduced. I had the same problem on a smaller annex (200G), which took a long time to commit and also inflated during the process. At the end of the process (it had enough space to finish) the size shrank again, although it was still bigger than the original. I also tried with a test annex and random data, and the problem still seemed to be there.