unlocked files

Normally, git-annex stores annexed files in the repository, locked down, which prevents the content of the file from being modified. That's a good thing: it might be the only copy, and you wouldn't want to lose it to a fumble-fingered mistake.

# git annex add some_file
add some_file
# echo oops > some_file
bash: some_file: Permission denied

Sometimes, though, you want to modify a file, maybe once, or maybe repeatedly. For this, git-annex also supports unlocked files. They are stored in the git repository differently, and they appear as regular files in the working tree, instead of the symbolic links used for locked files.

using unlocked files

You can unlock any annexed file:

# git annex unlock my_cool_big_file

That changes what's stored in git from a git-annex symlink (locked) to a git-annex pointer file (unlocked). You can commit the change, if you want that file to be unlocked in other clones of the repository. To lock the file again, use git annex lock.

The nice thing about an unlocked file is that you can modify it in place -- it's a regular file. And you can commit your changes.

# echo more stuff >> my_cool_big_file
# git commit -a -m "some changes"
[master 196c0e2] some changes
 1 files changed, 1 insertion(+), 1 deletion(-)

Notice that git commit -a added the new content of the file to the annex, and only committed a change to the pointer. That happened because git-annex knows this was an annexed file before. Git leaves the file unlocked, so you can continue to make modifications to it.

By default, using git to add a file that has not been annexed before will still add its contents to git, not to the annex. If you tell git-annex what files are large, it will arrange for the large files to be added to the annex, and the small ones to be added to git. This is done by configuring annex.largefiles. See largefiles for full documentation of that.
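
As a rough sketch of such a configuration (the 100kb threshold and the *.c/*.h patterns are only illustrative assumptions, not a recommendation; see largefiles for the full expression syntax):

```shell
# Illustrative only: treat files larger than 100 kB as large,
# except for C sources and headers, which stay in git.
git config annex.largefiles 'largerthan=100kb and not (include=*.c or include=*.h)'
```

With a setting along these lines, git add and git annex add should both send matching files to the annex and the rest to git.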

All the regular git-annex commands (find, get, drop, etc) can be used on unlocked files as well as locked files. When you drop the content of an unlocked file, it will be replaced by a pointer file, which looks like "/annex/objects/...". So if you open a file and see that, you'll need to use git annex get.
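
If you'd rather check for that from a script than by opening the file, something like this minimal sketch may help. The function name is_annex_pointer is made up for this example, and it only looks at the "/annex/objects/" prefix mentioned above; it does not ask git-annex whether the file is really annexed.

```shell
# Hypothetical helper: succeed (exit 0) when FILE looks like a git-annex
# pointer file, i.e. a regular file whose content begins with "/annex/objects/".
is_annex_pointer() {
    # Only regular files can be pointers; locked files are symlinks instead.
    if [ -L "$1" ] || [ ! -f "$1" ]; then
        return 1
    fi
    # Compare the first 15 bytes (the length of the prefix). NUL bytes are
    # dropped so that binary content does not trigger shell warnings.
    [ "$(head -c 15 < "$1" | tr -d '\000')" = "/annex/objects/" ]
}
```

For example, is_annex_pointer my_cool_big_file && git annex get my_cool_big_file would fetch the content only when the working tree holds a pointer.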

Under the hood, unlocked files use git's smudge/clean filter interface, and git-annex converts between the content of the big file and a pointer file, which is what gets committed to git.

By default, git-annex commands will add files in locked mode, unless used on a filesystem that does not support symlinks, in which case unlocked mode is used. To make them always use unlocked mode, run:

git config annex.addunlocked true

When git add adds a file to the annex, it always uses unlocked mode.

adjusted branches

If you want to mostly keep files locked, but be able to locally switch to having them all unlocked, you can do so using git annex adjust --unlock. See git-annex-adjust for details. This is particularly useful when using filesystems like FAT, and OSes like Windows, that don't support symlinks. Indeed, git-annex init detects such filesystems and automatically sets up a repository to use all unlocked files.
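
A round trip through an adjusted branch might look roughly like the sketch below. The two function names are invented for this example, and it assumes the original branch name is passed in by hand. Note that git annex sync also exchanges changes with any remotes.

```shell
# Hypothetical helpers around the commands described in git-annex-adjust.

# Switch to an adjusted branch in which every annexed file is unlocked.
enter_unlocked_view() {
    git annex adjust --unlock
}

# Propagate commits made on the adjusted branch back to the original
# branch, then check that branch out again. $1 is the original branch name.
leave_unlocked_view() {
    git annex sync && git checkout "$1"
}
```

So enter_unlocked_view, some edits and commits, and then leave_unlocked_view master would bring you back to the locked view.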

finding unlocked files

While it's easy to see when a file is a git-annex symlink, unlocked files look the same as files stored in git. To see what files are unlocked or locked, many git-annex commands support --unlocked and --locked options.

git annex find --unlocked

imperfections

Unlocked files mostly work very well, but there are a few imperfections which you should be aware of when using them.

git stash, git cherry-pick and git reset --hard don't update the working tree with the content of unlocked files. The files will contain pointers, the same as if the content was not in the repository. So after running these commands, you will need to manually run git annex smudge --update.
When git-annex is running a command that gets or drops the content of an unlocked file, git's index will briefly be locked, which might prevent you from running a git commit at the same time.
Conversely, if you have a git commit in progress, running git-annex may complain that the index is locked, though this will not prevent it from working.
When an operation such as a checkout or merge needs to update a large number of unlocked files, it can become slow. So can git add of a large number of files (git annex add is faster).

(The technical reasons behind these imperfections are explained in detail in git smudge clean interface suboptiomal.)
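
To avoid forgetting the manual update after git stash, git cherry-pick or git reset --hard, one option is a small wrapper like this sketch. The name git_then_smudge is hypothetical; it simply chains the two commands mentioned above.

```shell
# Hypothetical helper: run a git command that can leave pointer files in the
# working tree, then ask git-annex to refresh the unlocked files.
# The update is skipped if the git command itself fails.
git_then_smudge() {
    git "$@" && git annex smudge --update
}
```

It would be used as git_then_smudge reset --hard or git_then_smudge stash pop.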

using less disk space

Unlocked files are handy, but they have one significant disadvantage compared with locked files: On most filesystems, they use more disk space.

While only one copy of a locked file has to be stored, often two copies of an unlocked file are stored on disk. One copy is in the git work tree, where you can use and modify it, and the other is stashed away in .git/annex/objects (see internals).

The reason for that second copy is to preserve the old version of the file, when you modify the unlocked file in the work tree. Being able to access old versions of files is an important part of git after all!

(Some filesystems including btrfs and xfs support reflinks, and on those, the extra copy is a reflink, and takes up no additional space.)

So two copies is a good safe default. But there are ways to use git-annex that make the second copy not be worth keeping:

When you're using git-annex to sync the current version of files across devices, and don't care much about previous versions.
When you have set up a backup repository, and use git-annex to copy your files to the backup.

In situations like these, you may want to avoid the overhead of the second local copy of unlocked files. There's a config setting for that:

git config annex.thin true

Note that setting annex.thin only has any effect on systems that support hard links. It is supported on Windows, but not on FAT filesystems.

After changing annex.thin, you'll want to fix up the work tree to match the new setting:

git annex fix
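
To see whether the setting took effect for a given file, you can look at its hard link count. This is only a sketch: the function names are made up, it assumes a POSIX-like ls whose long listing has the link count in the second field, and a count above 1 only shows that some other hard link exists, not that the other link is the annex object.

```shell
# Print the hard link count of FILE, taken from the second field of `ls -ld`.
link_count() {
    ls -ld -- "$1" | awk '{ print $2 }'
}

# Succeed when FILE has more than one hard link, which is how an unlocked
# file that shares its content with another path would appear.
is_hard_linked() {
    count=$(link_count "$1")
    # An empty count (for example when FILE does not exist) is treated as 0.
    [ "${count:-0}" -gt 1 ]
}
```

If is_hard_linked my_cool_big_file fails after git annex fix, the file is most likely still a separate copy.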

When a direct mode repository is upgraded, annex.thin is automatically set, because direct mode made the same single-copy tradeoff.

Setting annex.thin can save a lot of disk space, but it's a tradeoff between disk usage and safety.

Keeping files locked is safer and also avoids using unnecessary disk space, but trades off easy modification of files.

Pick the tradeoff that's right for you.

usage changes

This sounds interesting. But OTOH I'm curious about upgrades from direct mode (which I assume will soon go away):

If currently I just use annex add; annex sync --content on a media repo, would that change to git add --all; git commit -m whatever; annex unlock *; annex sync --content? That is, will v6 require the manual commit step?

Also, when annex get or annex sync retrieve files from another repo, will there be an option to have the files unlocked by default, as in v5 direct mode?

(I'm kinda hoping for annex init --thin or something similar to the v5 annex direct, as manually setting config options is easy to forget.)

Comment by grawity — Fri Jan 15 13:31:37 2016

comment 2

Direct mode is not going away any time soon.

git add adds the file to the annex in unlocked mode, and git annex sync commits any such adds the same as any other changes, so all you need is git add --all; git annex sync --content

Whether a file is locked or unlocked is a property of the file, that gets committed to git, so when you commit some unlocked files, they'll be unlocked when they appear in other clones of the repository.

Comment by joey — Fri Jan 15 19:07:24 2016

comment 3

If you want to save a committed version of a file, is there a way to do that, other than syncing to a remote that does not have annex.thin set?
If you add and commit a file multiple times in a repo without syncing to a remote, what does the commit history look like on a remote when you do sync it? It just has several commits for which the file contents are not available?
If you want to preserve history with annex.thin set, do you just have to sync manually after each commit? I guess you might want to set up a git commit hook to do that in that case.

Comment by wsha.code+ga — Sat Jan 16 13:03:26 2016

comment 4

@wsha.code, if you opt to use annex.thin, then commit a file, and then edit the same file again and commit again, the older commit will be in git's history, but if you check it out, the old content of the file won't be available. This is very similar to what happens when not using annex.thin, but later running git-annex unused and dropping the "unused" intermediate version of the file.

Running git annex sync --content or just git annex copy --to remote will get the thin version of the file saved on a remote, and then editing it won't lose the content. But note that if you edited a file while it was being copied off to the remote, the previous version would still get lost.

If these seem like troublesome behaviors, well that's why annex.thin is not enabled by default.

Comment by joey — Wed Jan 20 18:49:07 2016

how to use normal rm to files directly?

My problem is the following: I delete files from a directory using the normal delete functionality. I expect these files to then be really deleted, at least in that repo, so that the disk space is freed.

I thought direct mode, or now v6 with the addunlocked setting, was the solution to that. But either, with thin mode, a hard link is still there, or, without it, there is a copy in the directory.

I would rather not have to use dropunused to get rid of that; it would be good if git annex sync or assist could just add these changes to the history. I don't care if that would be the last copy; that does not matter to me in this use case.

I want:

access to files (or hard links), not soft links
space used only once per file
deleted files free the full space without using dropunused
files should be added/deleted in that repo, not only transferred from somewhere else.

Do I therefore need a special repo like web/directory/rsync, or can I do that somehow with such a normal repo? As far as I understand, even if I used web with a directory as parameter, it would not save the files normally in that directory?

Comment by stefan.huchler — Fri Nov 4 21:04:56 2016

workaround to my request

I guess adding an hourly cron job that drops all unused files would be acceptable, maybe?

Or is there a better solution?

Comment by stefan.huchler — Sat Nov 5 14:53:36 2016

NTFS Make it clear that it'll not work with annex.thin

On the doc it's said that

"Note that setting annex.thin only has any effect on systems that support hard links. It is supported on Windows, but not on FAT filesystems."

Having read that, I was thinking that I'd be able to use annex.thin with NTFS but it doesn't work. I'd specify clearly that NTFS would also not work with annex.thin

Thanks

Comment by colin.brosseau — Thu Jan 3 18:04:58 2019

Best solution to save disk space on exFAT

I see that annex.thin doesn't support FAT. What's the best option to save disk space when you are using FAT? I'm currently trying to put files that are more than 50% of a drive's size on that drive, with a v7 repository. Is that possible?

Comment by tjbk123 — Tue Mar 5 18:23:03 2019

Re: Best solution to save disk space on exFAT

@tjbk123, I was wondering the same thing. I think just locking all files works.

Comment by meribold — Tue Mar 19 14:14:37 2019

Re: Best solution to save disk space on exFAT

Oh, I guess git annex sync unlocks everything again.

Comment by meribold — Tue Mar 19 14:22:57 2019

Re: Best solution to save disk space on exFAT

Maybe using a bare repository is the way to go.

Comment by meribold — Tue Mar 19 14:32:29 2019

comment 12

Same problem here! I would love to have an alternative to annex.thin on FAT.

Comment by gueux — Thu Mar 21 22:59:38 2019

comment 13

annex.thin without hardlinks is a tracking bug for annex.thin not working on FAT etc.

Comment by joey — Fri Mar 22 13:31:52 2019

annex.thin without hardlinks would be useful for non-crippled systems as well

One of the concerns with using git-annex on HPC and other heavily loaded systems is the >3x consumption of inodes due to all the symlinks etc. In some cases it could probably be avoided completely, if the repository is set up for consumption (data access) only. It would be nice if we could get the same "annex.thin without hardlinks", so that there would not even be any .git/annex/objects (superthin?) and files would just get installed in place.

Comment by yarikoptic — Mon Mar 25 18:57:36 2019

comment 15

Is there any way to make Git Annex use Windows 10 NTFS hardlinks in the working tree?

Looking to conserve disk space while still being able to browse and view file contents. Currently Git Annex is doubling the amount of disk space.

Comment by rshalaev — Thu Dec 3 01:43:00 2020

comment 16

Yes, as the page above explains, git config annex.thin true and then git annex fix

Comment by Lukey — Thu Dec 3 07:38:30 2020

Windows 10 NTFS hardlinks not working

Lukey - I tried git config annex.thin true and then git annex fix

Doing it on a Windows NTFS drive did not create hard links. I followed the instructions but could not get it to work. I always got copies of files instead of hard links.

From the changelog for git-annex (6.20160412):

* annex.thin and annex.hardlink are now supported on Windows.

-- Joey Hess id@joeyh.name Mon, 18 Apr 2016 18:33:52 -0400

Based on Joey's changelog, hard links should work on NTFS. According to my observation and a report from colin.brosseau above (titled "NTFS Make it clear that it'll not work with annex.thin"), it does not work.

Can anyone confirm whether git annex can create NTFS hardlinks? I can file a bug report if needed.

Thanks!

Comment by rshalaev — Thu Dec 3 11:43:07 2020

Permission fix

Hi,

Lots of gratitude for your work on git annex.

I have an annex repo with a default setting to unlock files. When I run git annex add myfile, I notice a change related to permission is added to my file in the working tree, which I need to further git add in order to get to a clean state. See below.

Is that expected? I'm wondering if it wouldn't make more sense / be a better experience if git annex add myfile would seamlessly handle that permission change and add it to git for unlocked files, so I don't have to run both git annex add and git add to get to a clean state?

Thanks.

$ git status
On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        05 Tapestry.mp3

nothing added to commit but untracked files present (use "git add" to track)

$ git annex add .
add 05 Tapestry.mp3
ok
(recording state in git...)

$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   05 Tapestry.mp3

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   05 Tapestry.mp3

$ git diff
05 Tapestry.mp3 changed file mode from 100644 to 100755

Comment by czard — Mon Mar 3 12:08:28 2025
