Hi all,
over the last few days I wrote the first 1200 words of a git-annex book (in German), but I am unsure whether I should continue.
I've written two books about non-mainstream software, namely the documentation generator Sphinx and the document converter Pandoc. I use these programs myself on a daily basis, so I was quite sure I could cover most of their features in the books. I even narrowed the scope of the Pandoc book to "book production", as this is what I do with it, leaving out some use cases such as slides or HTML production.
Before I really start writing the book, I would like to get a clear and robust understanding of the program. As you can see, I don't write books about very popular software in order to sell a lot of copies; I write about software I really like. But I would not write about software that isn't used at all.
So to get a clear understanding of git-annex, I need a sparring partner who is willing to discuss use cases, viewpoints and the outline of the book. The following questions and problems are blocking my work:
Some time ago I used git-annex to sync my desktop and my laptop. I used the webapp. My experiences were very bad. git-annex performed very badly, slowing down my computer and making it nearly unusable. I had numerous Unicode errors which led to a duplication of files. So I stopped using git-annex and switched to Syncthing, which does the job without any problems. Syncing computers with the webapp is obviously not the use case I would like to cover in the book. Today I use git-annex as a feed reader.
So why do I want to write a book about software I do not really use? First of all, I like the concept. I like the idea of managing big files in one tree while the content of many files is stored in remote archives. So this could be the idea of the book: sustainable management of big files. And after reading https://git-annex.branchable.com/design/iabackup/ I think that this could be the main use case of git-annex.
When I wrote about Sphinx and Pandoc I talked to people who used the software to a great extent. Sphinx is the de facto standard for documenting Python code. So I was able to evaluate the software before I invested half a year in writing a book. I am not able to say whether git-annex is good for doing x or y, because my setup is way too small.
So I am looking for power users, for people who use git-annex to manage at least 1TB of data. Would you recommend git-annex to film professionals, to music professionals, to data miners, to universities, to your parents? Does it really make your life easier? What are your experiences with the webapp?
I would be happy if you are willing to discuss these questions here, privately by mail or by chat, e.g. via Firefox Hello.
TIA juh
hi!
I understand you may have had Unicode problems with git-annex in the past: I had my own share, but most bugs I filed were fixed (see anarcat for a full list).
I regularly use git-annex to manage my music, video and book collection. The total fileset is around 1-2TB, with thousands of files. Sometimes the music collection operations are a little slow because there are so many files, but in general I am very happy with the project. I don't use the webapp so much, because I feel the interface is too limited, and do a lot of things on the command line.
git-annex not only makes my life easier, it makes possible some things that were impossible before. I used to have a really hackish shell script to rsync part of my music collection to my laptop because it's too big to fit there completely. It wasn't working so well and there was no way to sync new music back into the main collection. Now I can regularly make changes to the music collection on the laptop without the files even being present. I can also import new files on the laptop easily when I meet people who want to share with me in my travels.
I also work on the Isuma project, for which I will eventually do a longer write-up here. For now, look at day 290. The project aims to manage over a terabyte of data across multiple devices spread over remote areas with limited-bandwidth connections. It's a distributed, two-way CDN. So far it works well, and issues we have found have been quickly addressed by joeyh.
I would definitely recommend git-annex to film, music and university enthusiasts. It is certainly worth the technical learning curve. For my parents, I am less sure. I mentioned the project, but they haven't found the use case for it yet, and they run Windows and OSX all over the place, which makes integration a little harder. With iPads, in particular, it seems that local storage is a thing of the past and everything goes "in the cloud", which is a little in contradiction with git-annex, as it tries to take back control of your files.
Anyways, happy to spar more if you have any more questions. Good luck with the book!
I'm happy to help if I can. I don't personally use the assistant (though people "at the other end" of annexes I use do) or special remotes, but I do have at least one unusual use case which entails several terabytes over millions of files.
In this case, I have a collection of hard drives and I am annexing all their contents into one large annex for quick backup/sorting later. This runs well into git's problems with scaling to lots of files (I've posted a few tips on the site to minimise this impact).
My other uses are more normal, such as managing photos, music and other digital assets such as games. I use the metadata quite a lot (e.g. I tag files I want to have available in Kodi, or tag game installers/archives to denote what operating system they are for).
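A minimal sketch of what such tagging could look like on the command line. The helper names and the `os` field name are hypothetical, chosen only for illustration; `git annex metadata` and `git annex find --metadata` are the underlying git-annex commands.

```shell
# Hypothetical helper names; only the git-annex subcommands and options are real.

# Attach the tag "kodi" to each annexed file given as an argument.
tag_for_kodi() {
    git annex metadata --tag kodi "$@"
}

# List annexed files under the current directory that carry the "kodi" tag.
list_kodi_files() {
    git annex find --metadata tag=kodi
}

# Record the target operating system of an installer/archive in a metadata
# field. The field name "os" is an assumption, not a git-annex convention.
set_installer_os() {
    os=$1
    shift
    git annex metadata --set "os=$os" "$@"
}
```

For example, `tag_for_kodi Films/example.mkv` (a made-up path) would tag one file, and `list_kodi_files` would then include it.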
Being really picky about commits, I have disabled auto-commits on the git-annex branch in several of my annexes so I can control them. I'm even picky about doing things in such a way that they look nice in gitk! For example, I avoid cloning, because it makes the new annex appear at the bottom of gitk with its merge commit really far away, instead of a close merge (I can screenshot it if that isn't making any sense).
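The post does not name the setting, but this presumably refers to git-annex's `annex.alwayscommit` option (an assumption). A rough sketch of how that would be used inside an existing annex:

```shell
# Assumption: annex.alwayscommit is the setting meant above.
# Stop git-annex from committing to the git-annex branch after every command
# (this is a per-repository setting).
git config annex.alwayscommit false

# ... run any number of git-annex commands here ...

# Commit the accumulated changes to the git-annex branch in one go.
git annex merge
```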
I'm not sure if I would recommend it to someone who doesn't have someone who understands it nearby to help (just in case), but I would say it is definitely worth learning and using.
In the book, I would say that you should mention that git-annex builds on git, and now people are building things on git-annex. For example, a project I am dabbling in is a VR environment which uses git and git-annex to manage its data: room definitions are stored as text files in git, with references to asset data (such as textures) which are 'git annex get'd as required. This is interesting as it means my co-developer can create/change rooms while I am in the VR environment and they change almost immediately. He changes the texture for the walls? It is downloaded and updated automatically. Imagine Second Life but with a (fully revertable) git-annex backend!
Thanks for your comments. I am very interested in things people build on top of git-annex. I am looking forward to reading more about these projects.
Interesting that neither of you uses the webapp or the assistant. It was the first thing I used and it was disappointing, so now I am trying out the command line.
I understand that managing a set of files way too big for one's notebook or desktop is one of the main use cases. And this is a use case I definitely will cover in the book, if I write it at all.
Still evaluating...