So I am coming from the git world. All the documentation and posts seem to be about media files etc. I have a simple question.
Can I use git annex to manage large files? I think I can, but will it work with branching, tagging, etc.?
E.g. we currently use git and have some large files in the git repo. I am planning to move the large files to annex, but would like to keep branches, i.e. each branch might have a different version of the files, possibly tagged, so I can get an old version back from a branch.
Is this possible? I did not find any explicit answer or examples on how this works, or even whether it's supported.
Also, can I use the same folder for git and git-annex?
Yes, git-annex would not do very well at adding large file support to git if it did not handle tagging, branches, etc! So of course it does. It's in a sense too obvious a thing to get much mention. And so people sometimes get confused about it.
The only thing you need to be aware of coming from git is that not every repository will have every version of every file locally available. When you check out a branch, you may need to run
git annex get
to retrieve those versions from origin or elsewhere. And
git annex unused
can be used to find versions of files that no existing tag or branch refers to, and
git annex dropunused
can then delete those versions. If you want to ensure every revision in your git repo stays accessible, you should avoid using those two commands; left alone, git-annex will never delete old versions of files.
The unreleased git master adds a new feature, a --all switch that makes git annex commands operate on all versions of files. While normally
git annex get
will only do what it needs to get all files in the currently checked out branch,
git annex get --all
will pull down every version of every file in the whole history. Similarly,
git annex copy --all --to origin
will ensure that every locally available version of every file is sent to origin.
And yes, the same directory can be used for both git and git-annex. Files can be added to git with
git add file
and added to git annex with
git annex add file
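As a rough sketch of how the commands above fit together, the following shell helpers wrap the branch checkout and the --all variants. The function names are hypothetical and not part of git-annex, the branch name is passed in by the caller, and the --all helpers assume a git-annex build that already has that switch.

```shell
# Hypothetical helpers; the names are illustrative, not part of git-annex.

# Switch to a branch or tag, then fetch the annexed content it refers to.
annex_checkout() {
    git checkout "$1" && git annex get .
}

# Fetch every version of every annexed file in the whole history.
# Assumes a git-annex build that has the --all switch.
annex_get_all() {
    git annex get --all
}

# Send every locally available version to a remote (default: origin).
annex_push_all() {
    git annex copy --all --to "${1:-origin}"
}
```

For example, `annex_checkout staging` would switch to a branch named staging and then pull down whatever large files that branch needs.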
Awesome!! Thanks. I am kicking the tires now. Based on my current scenario, how can I deploy this in my company so that each developer has to do the bare minimum, i.e. just install git annex and it works?
In other words, is it possible to avoid the step git remote add backup? Then they could just do
$ yum install git-annex
$ cd repo
$ git checkout staging
$ git pull
$ git annex get large_file
And it then just works.
I don't know where you got the "git remote add backup" step from. Obviously this is not necessary unless you want to add a remote named "backup".
To use git-annex in a centralized git environment, which it sounds like you have here, you just need to install git-annex on the central git server, and arrange for all the developers to have ssh access to it. Then any git repository that is cloned from that server using ssh as the transport can support git-annex without the user needing to do anything special to set it up.
This assumes that users have shell accounts on the server. git-annex includes the git-annex-shell program, which is similar to the git-shell in git. User accounts can be locked down to use this restricted shell if giving them full shell access to the server is not desired.
If the server is using a git repository manager like gitolite or gitosis, those can also be adapted to use git-annex-shell. I got gitolite patched to support it earlier, see using gitolite with git-annex.
PS: I'm available for consulting on deploying git-annex in production environments.
Thank you very much!!!!! Yes, I am planning to use a centralized repo. What I had understood was that there would be a git server and a separate server for hosting the large files. It's great to know I can use the same server and do not need to point to it explicitly.
We are using Gerrit. What is the general best practice for it? (After I am done with this, I will write a blog post, as I am sure many people with centralized repos might need it.)
Thanks!
Yes, the default for a repository accessed using ssh is to also store the large files in that repository. Of course you can set up a second remote to hold the large files if that works better.
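If a second remote for the large files is the better fit, a minimal sketch could look like the following. The function names are hypothetical, and it assumes the remote is reachable over ssh, has git-annex installed, and already holds a repository set up for git-annex.

```shell
# Hypothetical helpers; the names are illustrative, not part of git-annex.

# Register a second remote and send the annexed content of the
# current directory to it.
#   $1 = remote name
#   $2 = ssh URL of a repository on a host with git-annex installed (assumed)
annex_add_store() {
    git remote add "$1" "$2" &&
    git annex copy . --to "$1"
}

# Fetch annexed content for the current directory from that remote.
annex_get_from() {
    git annex get . --from "$1"
}
```

Each developer would run something like `annex_add_store largefiles <ssh-url>` once per clone, with the remote name and URL chosen to suit the site.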
I don't have any personal experience with Gerrit. All I can say is that files managed by git-annex will appear as symlinks.
Thanks. I made some progress. On the Gerrit repo server (which has git-annex installed),
I ran: git annex init origin
I then cloned the repo, and it had the git-annex branch, so I assumed all was working.
I did a git annex add and a git push.
Then when I try git annex copy --to origin
I get an error, even though git-annex-shell is installed on the machine:
Any idea what I need to do to make it work?
Ahh... So I guess this might not be possible in the near future.
I'll go to plan B: set up another server with git-annex installed, which users will connect to for large files (so a central repository for git annex). They will need to add another remote to their git config.
Does git annex have a way to manage ssh keys? (Could you please point me to where I can read how git-annex-shell works?)
Yes, a separate repository is one way to go, although I do not think it would be very hard to get gerrit's internal ssh to run git-annex-shell.
git-annex-shell is a low-level restricted shell like git-shell, not a login manager, so it has nothing at all to do with ssh keys. Things like gitosis and gitolite can be built on top of git-shell and git-annex-shell and handle ssh key management.