So, I'm using git-annex to manage a group of external hard drives that I use to store a bunch of big files.
When syncing tonight, though, it decided to take a whole subdirectory and replace the symlinks with the content directly when merging, or something.
The remote side is not like that, and I didn't tell it to do anything like that. It just seems to feel like that's the thing to do, then it syncs and fails because suddenly git has many gigs of objects, which is what git-annex is supposed to solve.
I've reset master a bunch of times, and it keeps choosing that path, but I don't know why. I've also reset master and synced/master a few times. I'm not fully sure what that one actually encodes, so I didn't want to touch it too much, but setting it to the same thing as master didn't seem to fix things either.
Any idea why?
Are you using direct mode?
Before the direct mode guard was put in place, it was not uncommon for users to mistakenly run some command like "git commit -a" in a direct mode repository. This happily checks the full large files into git.
It sounds to me like you've done that. But, you have not provided enough information to make anything more than a vague guess.
I am not in direct mode. I don't believe I did anything in the repo. I reset back to a commit and look in the folder and it's all symlinks and nothing in git status says anything about it.
Then I run
git annex sync
and one of the things it does is "Checking out files", which takes a long time, since it seems to be copying the data into the working directory and committing it. I don't know why it's decided to do that.
What steps can I take to either get you more information or fix things?
Perhaps relatedly, if I make a mistake with git-annex, is running git reset on master and synced/master the right approach? Are there other things I should try?
"Checking out files" is a message printed by git (not by git-annex) when it is updating the work tree.
It still seems that you have somehow committed large files directly to git. Perhaps you accidentally ran "git add" on a large file.
git annex sync is probably merging this commit from one of your other repositories.
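One way to test this guess, sketched with plain git rather than any git-annex command, is to list the largest blobs in git's own object database. An annexed file is committed as a symlink, so its blob is only as long as the link target; a blob of many megabytes suggests content went into git directly. The throwaway repository, the file names and the largest_blobs helper below are assumptions for illustration only.

```shell
#!/bin/sh
set -e

# Print the n largest blobs reachable from any ref as "blob <bytes> <path>".
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        sort -k2,2nr |
        head -n "${1:-5}"
}

# Demo in a throwaway repository (hypothetical file names).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo "small" > notes.txt
head -c 200000 /dev/zero > big.bin   # stand-in for a large file added by mistake
git add notes.txt big.bin
git commit -q -m "add files"

top=$(largest_blobs 1)
echo "$top"
```

In a real repository, running largest_blobs from the top of the work tree should be enough; the path printed next to each blob shows which file was checked in as content.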
I know that "Checking out files" is a git message; I was just providing it for context.
I know that my other repos do not have these things checked in, because one of them is offline and not involved in this, and the other repo literally can't contain that much data, which is the biggest reason this is an issue.
Here are the commits I don't like:
Now, the most interesting part here is that commit 7403e3116eb is dated Sat Dec 21. That is not when I was attempting this. The 30th, shown in 1ae24, is.
That makes me think that I have failed to reset properly, and so when I'm telling it to sync it's not recomputing the thing, it's just grabbing the thing it did last time.
Now, I still don't know why it chose to do that, but perhaps it wouldn't choose to do that again if I could figure out how to get it to forget that path.
Thoughts?
Ok, so, looking into it a little more I think I've found that I screwed up updating a ref, so I didn't reset synced/master like I thought I did.
So, I don't know why it chose to make this choice in the first place, but I was able to reset and manually change my way around it.
That being said, I don't want to do this a lot, because I'm not fully clear on what the synced branches represent. Is it that, on repo1, repo2/synced/master is the last thing that was pushed over, that after a sync the local synced/master is always the same as master, and that before a sync it's the result of the last sync to anywhere?
How safe is it to screw around with these things?
The "T" in your git show --raw output indicates that what was a symlink has been replaced with a regular file, and this has been committed to git.
It's certainly possible to do that when using git-annex, but you have to go a bit out of your way to shoot yourself in the foot, because normally the pre-commit hook will detect that, and fix up your commit to not change the type of the file. Something like this:
The fact that you seem to have made 2 commits that did this, on the 21st and 30th, makes me wonder if your .git/hooks/pre-commit does not exist, or perhaps you are making some other mistake repeatedly.
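Both suspicions can be checked with plain git. The sketch below is not taken from git-annex itself: it builds a throwaway repository, reproduces a symlink-to-file typechange, lists the commits that contain one along with their dates, and reports whether a pre-commit hook is in place. The hook_status helper, the file name "bigfile" and the link target are hypothetical, and the hook check assumes the default hooks directory (core.hooksPath not set).

```shell
#!/bin/sh
set -e

# Report whether a repository has a usable pre-commit hook.
# Assumes the default hooks directory (core.hooksPath not set).
hook_status() {
    hook="$1/.git/hooks/pre-commit"
    if [ -x "$hook" ]; then
        echo "present"
    elif [ -e "$hook" ]; then
        echo "not executable"
    else
        echo "missing"
    fi
}

# Demo in a throwaway repository; names and link target are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

ln -s .git/annex/objects/placeholder bigfile   # stand-in for an annexed symlink
git add bigfile
git commit -q -m "add symlink"

rm bigfile
echo "file content committed by mistake" > bigfile
git add bigfile
git commit -q -m "replace symlink with content"

# --diff-filter=T keeps only commits in which some path changed type.
typechanges=$(git log --diff-filter=T --date=short --format='%h %ad %s' --name-status)
echo "$typechanges"

hook_state=$(hook_status "$repo")
echo "pre-commit hook: $hook_state"
```

Run against the real repository, the same git log invocation should show every commit that turned a symlink into a regular file, which helps tell a repeated mistake apart from one old commit being merged again.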
The synced/* branches never have any data that is not stored somewhere else (another branch, possibly in the remote repository), so it should always be ok to delete them.
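As a rough illustration of resetting or deleting such a branch and then confirming that the ref really moved, here is a sketch using only plain git in a throwaway repository. The commit contents are made up, and a real repository may also have remote-tracking copies of synced/master that this does not touch.

```shell
#!/bin/sh
set -e

# Throwaway repository with master at a good commit and
# synced/master left pointing at an unwanted one.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

echo one > a
git add a
git commit -q -m "good"
echo two > a
git commit -q -a -m "unwanted"
git branch synced/master        # synced/master now points at "unwanted"
git reset -q --hard HEAD~1      # master goes back to "good"

# Option 1: force synced/master to match master, then verify both refs.
git branch -f synced/master master
master_sha=$(git rev-parse refs/heads/master)
synced_sha=$(git rev-parse refs/heads/synced/master)
echo "master: $master_sha"
echo "synced/master: $synced_sha"

# Option 2: delete the branch outright, then confirm it is gone.
git branch -D synced/master >/dev/null
if git show-ref --verify --quiet refs/heads/synced/master; then
    still_exists=yes
else
    still_exists=no
fi
echo "synced/master still exists: $still_exists"
```

Comparing the two git rev-parse results after each change is a cheap way to catch a ref update that silently did not happen.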