Hi all,
git-annex basically renders my repository unmanageable. What is the best and safest way to recover?
Here is my situation:
I have a fairly large repository with ~8000 managed files taking about 65GB of disk space.
git-annex worked well there. But some programs choke on the symlinks. So, I converted the repository to direct mode. The transition worked well.
Now git status reports a type change for the ~8000 files.
But as soon as I run
git commit -m "typechange" even-only-one-of-the-files
the process
git-annex pre-commit .
eats 3.5GB of RAM, at which point I usually kill it, as I only have 4GB of RAM.
-- Andreas
AFAIK, in direct mode you are not supposed to commit; you just run
git annex sync
and it will commit if files have changed. You only add new files; annexed files are handled by sync.
Hi Hamza,
thanks for that comment. I thought git annex sync was just a wrapper around git commit -a (among others).
Using git annex sync does not help, as it just means that now git annex sync eats all my memory until swapping starts.
git-annex sync is a wrapper around git commit. But not -a! git commit -a will stage every one of your large files directly into the git repository, wasting much memory and, worse, disk space.
It is ok to use git commit or git commit --staged in direct mode after e.g. git annex add. But not git commit -a or git commit even-only-one-of-the-files. It's best to just use git annex sync rather than git commit, as it avoids finger memory causing you to run the wrong type of commit command. Please see direct mode for the details.
I was able to make pre-commit take a lot of memory by committing a 1 GB file directly to git. git-annex was buffering the whole file content in memory because it did not first check whether the file was a symlink. I have fixed that bug.
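The commit rule described above (a plain git commit of what is already staged is fine; git commit -a, or git commit with file arguments, is not) can be sketched as a small argument check. This is only an illustration: commit_is_unsafe_in_direct_mode is a hypothetical helper name, not something git or git-annex provides, and the list of value-taking options it knows about is an incomplete subset chosen for the example.

```shell
#!/bin/sh
# Hypothetical helper, not part of git or git-annex.
# Returns 0 (true) when the given `git commit` arguments would commit
# working-tree files directly: -a/--all, or any pathspec argument.
# Assumption: only a few value-taking options are recognised
# (-m, -F, -C, -c and some long forms); everything else is best effort.
commit_is_unsafe_in_direct_mode() {
    skip_next=0
    only_paths=0
    for arg in "$@"; do
        if [ "$only_paths" -eq 1 ]; then
            return 0                 # any word after "--" is a pathspec
        fi
        if [ "$skip_next" -eq 1 ]; then
            skip_next=0              # this word is an option's value
            continue
        fi
        case "$arg" in
            --all) return 0 ;;
            --) only_paths=1 ;;
            --message|--file|--reuse-message|--reedit-message|--author|--date)
                skip_next=1 ;;
            --*) ;;                  # other long options are ignored
            -?*)
                flags=${arg#-}
                # Letters before the first value-taking flag are real flags;
                # anything after that flag is its attached value.
                before=${flags%%[mFCc]*}
                case "$before" in
                    *a*) return 0 ;;
                esac
                rest=${flags#"$before"}
                if [ "${#rest}" -eq 1 ]; then
                    skip_next=1      # e.g. "-m" followed by a separate message
                fi
                ;;
            *) return 0 ;;           # bare word: treated as a pathspec
        esac
    done
    return 1
}
```

For example, commit_is_unsafe_in_direct_mode -m "typechange" even-only-one-of-the-files reports the command as unsafe, while the same call without the file name does not.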
So I think you must have run that command you showed, and you now have a lot of data stored in your git repository that you had meant git-annex to handle. You might need to use git-filter-branch to remove it.
This kind of thing is why I need to write the direct mode guard.
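As a rough sketch of the git-filter-branch suggestion: the two functions below use plain git only. list_largest_blobs and purge_path_from_history are hypothetical helper names introduced for illustration. How filter-branch behaves inside a direct mode repository has not been verified here, so this assumes you try it on a separate clone with a clean working tree first. It also assumes the path contains no single quotes.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; run them inside a git repository.

# Print "<size> <path>" for the N largest blobs reachable from any ref,
# largest first. Helps to see which files ended up stored in git itself.
list_largest_blobs() {
    count=${1:-10}
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "$count"
}

# Rewrite all refs so that the given file no longer appears in any commit.
# Assumption: a single file path without single quotes in its name.
purge_path_from_history() {
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm --cached --ignore-unmatch --quiet -- '$1'" \
        --prune-empty -- --all
}
```

After a rewrite the old objects are still kept alive by refs/original and the reflog, so disk space is only reclaimed once those are gone and git gc has pruned the unreachable objects.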
Let me ask a simple question: All files are regular files (no symlinks). If I can live with losing the history, is it safe to just remove the .git directory and start over? Or do I risk anything?
PS: .git is about 7GB
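The premise in the question (no symlinks left in the working tree) can be checked before touching anything. A minimal sketch, where list_symlinks is a hypothetical helper name; it only verifies that premise and says nothing by itself about whether removing .git is safe:

```shell
#!/bin/sh
# Hypothetical helper: print every symlink below the given directory
# (default: current directory), skipping anything inside a .git directory.
# Empty output means all working-tree entries are regular files or directories.
list_symlinks() {
    find "${1:-.}" -path '*/.git' -prune -o -type l -print
}
```

If list_symlinks prints nothing when run at the top of the repository, no file content is reachable only through a symlink into .git/annex.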
Thanks a bunch! That is what I did now. Seems good.
May I additionally suggest a change in the man page of git annex? There I read
which made me think, I could do
git commit -a
manually as well.