I've been using git annex for the last few months and there are a lot of things I really like about it, but I keep running into one frustration after another and it is really hampering my experience with it. I'm not sure if the frustration is coming from my ignorance, my inexperience, the fact that I'm using it on a Windows machine, or whether I have unrealistic expectations for it. Let me explain how I'm currently using it, and maybe people have suggestions on how I can make my experience more fluid.
Currently I have several annex repos on my laptop and copies on two separate hard drives. Over the course of a week I'll add and refer to stuff in the annex repos on my laptop, and then once per week I'll sync the repos with my external hard drives and remove content from my laptop to free up space. Eventually, I'd like to write a script to handle the syncing process.
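For what it's worth, the weekly routine described above could be scripted along these lines. This is only a minimal sketch, not a tested workflow: the remote names drive1 and drive2 and the repo path are made up for illustration, and it assumes the external drives are already configured as git remotes of the repo. It relies on git annex drop refusing to remove content unless it can verify that enough other copies exist.

```python
import subprocess


def build_sync_plan(remotes):
    """Return the git-annex commands for one weekly sync, as argument lists."""
    # Exchange git metadata (branches, location tracking) with the remotes.
    plan = [["git", "annex", "sync"]]
    for remote in remotes:
        # Send file content to each external drive.
        plan.append(["git", "annex", "copy", ".", f"--to={remote}"])
    # Free local space; git-annex checks numcopies before dropping anything.
    plan.append(["git", "annex", "drop", "."])
    return plan


def run_sync(repo, remotes, dry_run=True):
    """Print each command and, unless dry_run is set, run it inside repo."""
    for cmd in build_sync_plan(remotes):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step, so nothing is
            # dropped if an earlier copy did not succeed.
            subprocess.run(cmd, cwd=repo, check=True)


if __name__ == "__main__":
    # Hypothetical repo path and remote names; dry run only prints the plan.
    run_sync(r"C:\annex\photos", ["drive1", "drive2"], dry_run=True)
```

Starting with dry_run=True makes it possible to review the commands before letting the script touch any content.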
I continue to have issues. A few weeks ago I accidentally added large files straight into git, and not understanding the documentation well enough to remove them, I eventually just deleted and recloned the repo. This week, I was unable to sync my repos because of filenames that are too long. I know neither of these examples is really an issue with annex itself, but they came up while using annex, which gives me bad mental associations with it.
I guess this is more venting frustration and looking for guidance than a specific question. I've read through much of the documentation, but do I need an understanding of the internals of git and git annex so that when something goes wrong I know how to fix it? Where is a good place to start learning these? There have now been a couple of issues around being on a Windows machine; is it even worth using git annex on Windows? Do people have any tips on how I'm using git annex which may make my experience better, like a different workflow or something?
(I'm still having the filenames issue, for which, from what I can tell, the only real solution is changing the name. If anyone knows a better solution it would be appreciated.)
Hi,
What may help with the too-long filenames is putting the repository directly under C:, like C:\repo. The reason is that Windows paths are limited to a maximum of roughly 260 characters, and if you have the repository under something like C:\Users\bla\loong\path\repo, then the whole path to the keys (C:\Users\bla\loong\path\repo\.git\annex\objects\xyz\xyz\SHA256E-e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) might be too long. You could also try a different backend like SHA1, since then the hash part (SHA256E-e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) is shorter.
There's a learning curve, especially around the areas you mention. I think the rules for which files go into annex vs git have evolved in a somewhat ad-hoc way as @joeyh did his best to reconcile people's different use cases, so they ended up being somewhat complex. It may be worth taking the time to read the main page top-to-bottom once, especially the config options, and also the internals page. You can tag git branches when things are in a good state so you can later reset them if anything goes wrong; note though that you'd want to tag the git-annex branch and any synced/* branches as well.
I've tried using git-annex on Windows and gave up, using it inside an Ubuntu guest VM via VirtualBox, but I think others have successfully used it on Windows. Overall I've found git-annex to be very reliable and adaptable to almost any scenario, and these forums helpful. The issues I've had with git-annex have been mostly around slowness on large repos, but recent releases have made strides to address that.
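The tagging suggestion above (tag every branch, including git-annex and synced/*, while things are in a good state) could be scripted roughly like this. It is a sketch under assumptions: the snapshot/<label>/<branch> tag naming scheme is invented here for illustration, and the helper names are hypothetical.

```python
import subprocess
from datetime import date


def snapshot_tag_commands(branches, label):
    """Build one `git tag` command per branch, named snapshot/<label>/<branch>."""
    return [["git", "tag", f"snapshot/{label}/{branch}", branch]
            for branch in branches]


def local_branches(repo):
    """List local branch names; this includes git-annex and synced/* if present."""
    out = subprocess.run(
        ["git", "for-each-ref", "--format=%(refname:short)", "refs/heads"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def tag_good_state(repo, label=None):
    """Tag every local branch of repo at its current commit."""
    label = label or date.today().isoformat()
    for cmd in snapshot_tag_commands(local_branches(repo), label):
        subprocess.run(cmd, cwd=repo, check=True)
```

If something goes wrong later, each branch can be pointed back at its matching snapshot tag by hand.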
I meant, I ended up using it inside an Ubuntu guest VM via VirtualBox instead, and that has worked very well. The only caveat is that the files have to be stored inside VirtualBox volumes; trying to access Windows files via a shared folder runs into VirtualBox filesystem issues.
Thanks for the support guys! I really appreciate knowing that others have struggled with similar issues and I'm not giving up yet.
@Lukey That's a great idea, except I'm already doing that. The issue really had nothing to do with annex, as it wasn't the hash file names. I was downloading torrent files directly into the annex so I could keep the torrent link active without keeping two copies of the files. But torrent filenames are packed with information and thus long, leading to my issue. I just changed the filename and broke the torrent link. Frustrating, but ultimately not really a big deal. Thanks for the suggestion though.
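To spot files like these before a sync fails, one option is to scan the working tree for full paths that reach the 260-character limit mentioned earlier. A minimal sketch, assuming the classic Windows limit applies (it can be lifted on newer Windows versions) and skipping the .git directory; the function name is hypothetical.

```python
import os

# Classic Windows MAX_PATH. It counts the terminating NUL, so the longest
# usable path is 259 characters and anything of length >= 260 is too long.
MAX_PATH = 260


def long_paths(root, limit=MAX_PATH):
    """Return (length, path) pairs for files whose absolute path reaches limit."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Only check the working tree; do not descend into .git.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                hits.append((len(full), full))
    # Longest paths first, since those need renaming most urgently.
    return sorted(hits, reverse=True)
```

Running it with a lower limit than 260 leaves some headroom for the deeper paths the same files may end up with on another drive.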
@Ilya_Shlyakhter Thanks for the support. I'll go through the Main Page again as well as that Internals page. I think I skipped the Internals page when I was going through the documentation a few months ago, but looking at it now it looks pretty illuminating. I'll also check out using a VM, but I may just decide to make the jump to Linux and use a Windows VM there.
Not sure if you're already using it, but there is a bittorrent special remote (haven't tried it myself).