unlock --read-only

"annex unlock" in the thin mode of v6 hard-links the key into the file location and makes it read-write. This is obviously for the case where the file needs to be modified and the danger is understood. In my case, I need unlock to avoid having symlinks in the tree, since some software doesn't digest them well (it might copy them without dereferencing, or dereference them and then look for neighboring files in the directory). I want to use unlock to provide a "symlink-free" view of the tree, BUT with at least some protection. That protection could be provided by unlocking files read-only, so that no in-place modification could happen without an explicit change of permissions.

closing because this never got to a concrete proposal that didn't have fatal problems. --Joey

comment 1

The protection offered by a read-only mode is pretty minimal; any program that writes a file atomically using rename will bypass it. So, as programs are implemented better, they'll bypass this "protection" more -- not much of a protection!
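The rename bypass can be seen with plain files, without git-annex involved. This is only a sketch: the file names and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Sketch: a "write a temp file, then rename" update replaces a read-only
# file without ever opening that file for writing.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

printf 'original\n' > data.txt
chmod 400 data.txt                 # the read-only "protection"

# What an atomically-writing program does: write a new file, rename it into place.
printf 'modified\n' > data.txt.tmp
mv -f data.txt.tmp data.txt        # only needs write permission on the directory

result=$(cat data.txt)
echo "data.txt now contains: $result"
```

The mode of data.txt is never consulted, because the rename is an operation on the directory, not on the file.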

Also, it doesn't make much sense to call this operation "unlock" if it's intended to not let programs modify the files.

Comment by joey — Mon Oct 17 18:09:44 2016

comment 2

Actually, yoh is right: read-only would be sufficient protection here. Because, with annex.thin, the worktree file is a hard link to the annex object, and the annex object lives in a mode 400 directory. So, even if the file is deleted and a new version renamed into place, the annex object will still have captured the old version.
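A rough simulation of that argument with ordinary files, assuming a directory named objects and a file named KEY as stand-ins for the real annex object store (both names are hypothetical, and the directory is simply made non-writable rather than reproducing git-annex's exact permissions):

```shell
#!/bin/sh
# Sketch: a hard-linked "object" keeps the old content after the worktree
# file is replaced by rename.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir objects
printf 'version 1\n' > objects/KEY   # stand-in for the annex object
ln objects/KEY file                  # thin unlock: worktree file is a hard link
chmod 400 file                       # same inode, so objects/KEY is read-only too
chmod a-w objects                    # stand-in for the protected object directory

# A program replaces the worktree file via write-then-rename.
printf 'version 2\n' > file.tmp
mv -f file.tmp file

worktree=$(cat file)
object=$(cat objects/KEY)
echo "worktree: $worktree"
echo "object:   $object"
```

The rename only changes which inode the name "file" points at; the old inode stays reachable through objects/KEY.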

Still don't like the self-contradiction of "unlock read-only".

Of course, you can do this yourself:

git annex unlock file
chmod 400 file

So I wonder if there's any need for a git-annex command to do this.

Comment by joey — Mon Oct 17 21:07:31 2016

why option vs manual chmod

As far as I can see:

It is much easier to specify an option than to figure out which files were unlocked and need to be chmodded (some might be under git, etc.).
It would probably be notably faster when unlocking a large number of files, since it avoids a double file-tree traversal.

Comment by EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s- [me.yahoo.com/a] — Mon Oct 17 23:11:10 2016

comment 4

It sounds like you would want to unlock all files in the repo this way, is that right?

If so, it seems like a case for git-annex adjust, eg git annex adjust --hardlink. And it would perhaps make sense to do that on a crippled filesystem by default instead of the current default of --unlock.

Keeping it in adjust only avoids needing to make the unlock command do something that is not an unlocking, and it avoids needing to add a new command.

It also neatly avoids the problem that, while git annex unlock makes a change that can be committed to git (in v7 mode), this new operation is not something that can be committed to git (at least w/o some change to indicate it in the pointer file).

Comment by joey — Thu Jun 27 14:36:59 2019

read-only unlock of only some files

There are still use cases for read-only unlock of just some files. One issue with symlinks is that Docker doesn't follow them. So if I want to let Dockerized code read annexed files, I have to unlock them first. But, if these files are to be only used as input, unlocking them for modification is not really what I want to do. I don't want git status to list these files as modified.

Comment by Ilya_Shlyakhter — Fri Jun 28 15:57:04 2019

I wonder if "thin mode" could generalize beyond hardlinks

I was not sure whether I should file a new issue for this related discussion, but I thought it might align with the last comment from Ilya. Let me know if it is too off-topic.

One of the most common "consumer" use cases across platforms is just to get the dataset and files to be processed, and then possibly even wipe it all out. Not all file systems support hard linking or CoW. I wondered if "thin" mode could be something explicit like hardlink, or even a new mode: mv. In mv mode I would see annex just moving the file into its needed location upon unlock (and probably marking it in the git-annex branch as not present "here"), and probably keeping the file read-only for some level of protection (or there could be mv and mv-rw modes?).

And then maybe, if annex get gained an --unlocked option, in thin=mv mode annex could take a shortcut and place the file in its target location right away, without even bothering to change any availability information in the "git-annex" branch? That would also avoid stressing file systems by consuming all the inodes for the .git/annex/objects tree in such scenarios.

Comment by yarikoptic — Mon Jul 1 16:19:47 2019

comment 7

For the use case of "just to get the dataset and files to be processed, and then possibly even wipe it all out", maybe you could just use the directory special remote?

Which common file systems do not support hardlinking? It seems that Windows does.

To fix the problem that unlocking a file causes git status to report it as changed ("typechange"), maybe git-annex could tell git to locally ignore the change?
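One existing plain-git mechanism for hiding a local change is the assume-unchanged index bit. The sketch below uses only git, with a symlink replaced by a regular file standing in for an unlocked annexed file. Whether git-annex could or should rely on this is an open question and not something established in this thread. The repository, file names and committer identity are made up.

```shell
#!/bin/sh
# Sketch: hide a local "typechange" from git status using the
# assume-unchanged bit. No git-annex involved.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
git init -q repo
cd repo

ln -s some-target link               # stand-in for a locked (symlinked) file
git add link
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add link'

rm link
printf 'content\n' > link            # stand-in for unlocking: now a regular file
before=$(git status --porcelain)     # reports the typechange

git update-index --assume-unchanged link
after=$(git status --porcelain)      # the change is no longer reported

echo "before: [$before]"
echo "after:  [$after]"
```

The bit is local to this clone's index and can be cleared again with git update-index --no-assume-unchanged.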

Comment by Ilya_Shlyakhter — Tue Jul 2 18:36:15 2019

re: I wonder if "thin mode" could generalize beyond hardlinks

The ideas in that comment won't work, and here's why:

If git-annex does not maintain a hardlink in .git/annex/objects, then when you run git checkout and it replaces the working tree file with some other version, or deletes it, it's deleted the only copy of the annex object that is stored on your disk. So you lose data.
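A small simulation of that failure with ordinary files, where moving the object out of a stand-in objects directory plays the role of the proposed mv mode and a rename plays the role of git checkout replacing the file (all names are hypothetical):

```shell
#!/bin/sh
# Sketch: without a hard link left behind, replacing the worktree file
# destroys the only copy of the content.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir objects
printf 'only copy\n' > objects/KEY
mv objects/KEY file                  # hypothetical thin=mv: nothing stays in objects/

# Stand-in for git checkout replacing the worktree file with another version.
printf 'other version\n' > file.tmp
mv -f file.tmp file

# Look for any file that still holds the old content.
survivors=$(grep -rl 'only copy' . || true)
echo "files still holding the old content: [$survivors]"
```

With ln in place of the first mv, objects/KEY would still be listed.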

Comment by joey — Wed Jan 29 15:15:06 2020

comment 9

Much of this discussion seems irrelevant given that v7 is the default now and half of the discussion above is about v5 unlock.

In general, this todo suffers from far too many unrelated or only tangentially related suggestions.

Any concrete proposals, or shall I close this?

Comment by joey — Wed Jan 29 15:18:28 2020

comment 10

joeyh, could you please elaborate on what v7/v8 does differently from v5 when unlocking? I don't get it.

I need this feature (checked out real/hardlinked files while being immutable) as well. Even if it is only a thin layer of protection it may help. Where supported, git annex may use the file immutable attributes (as discussed in https://git-annex.branchable.com/internals/lockdown/) for better protection.

Imo it's lock/unlock which isn't clear about naming/semantics. We have two things here: 1. symlink vs. direct files 2. protection against mutation. These two things are mingled together in the current implementation, but they are different concepts. We cannot freely choose any combination of them (a writable symlink into the object store makes no sense), but a little finer control would be appreciated. No idea how to do this in a concrete way.

Maybe some 'git annex protect' command to set different protection modes on content (which could be abstract, with no need to comply with unix semantics; for example: appendable, writable, immutable, deletable, etc. git-annex could enforce the mode lazily if it is not supported directly)

or 'git annex rolock' (needs better name) .. which is like unlock but makes the file immutable/write protected somehow.

Comment by ct.git-annex — Sun Apr 26 20:18:48 2020

read-only view of repo clone with symlinks as normal files

I've used bindfs with -o ro and --resolve-symlinks to create a read-only view of a repo where symlinks look like regular files.

Comment by Ilya_Shlyakhter — Thu Jul 1 17:17:20 2021
