High level overview:
- This only concerns mtime (and its equivalents on other systems, if applicable). atime, ctime and permissions are out of scope.
- An object added to an annex and later retrieved from it via mv, cp -L, git annex unannex and such should always keep its mtime, even if retrieved on an entirely different machine and/or from a backend that doesn't support timestamps natively.
- If an added/reinjected object is already known to the annex, use the older mtime by default, since that's probably the version that's had its metadata preserved better.
- If that's too much of an assumption, provide a switch to use the older/newer/known/unknown mtime, or add a git-annex-touch command.
- Symlink and object file mtimes should reflect the mtime tracked by the annex.
- Ideally, directory mtimes would also be preserved, or failing that, git-annex-fix, git-annex-add and git-checkout should leave them untouched.
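
The "use the older mtime by default, with a switch" rule above could look roughly like the following. This is a minimal Python sketch, not git-annex code: MtimePolicy and reconcile_mtime are hypothetical names invented for illustration, and the four policy values simply mirror the older/newer/known/unknown switch suggested in the list.

```python
from enum import Enum
from typing import Optional


class MtimePolicy(Enum):
    """Hypothetical switch values, mirroring older/newer/known/unknown."""
    OLDER = "older"
    NEWER = "newer"
    KNOWN = "known"      # keep the mtime the annex already tracks
    UNKNOWN = "unknown"  # take the mtime of the file being added


def reconcile_mtime(known: Optional[float], incoming: float,
                    policy: MtimePolicy = MtimePolicy.OLDER) -> float:
    """Pick the mtime to track when an object is added or reinjected.

    known    -- mtime already recorded for this object, or None if it is new
    incoming -- mtime of the file being added or reinjected
    """
    if known is None:
        # Nothing recorded yet, so there is nothing to reconcile.
        return incoming
    if policy is MtimePolicy.OLDER:
        return min(known, incoming)
    if policy is MtimePolicy.NEWER:
        return max(known, incoming)
    if policy is MtimePolicy.KNOWN:
        return known
    return incoming
```

With the default policy, reinjecting a copy whose mtime is newer than the recorded one leaves the recorded mtime unchanged, which matches the assumption that the older copy has the better-preserved metadata.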
open questions/ideas:
- What if the user touches a file/symlink, bypassing git-annex? Should the data be reconciled or ignored?
- preserving directory mtimes looks tricky, but could it maybe be done from a hook or two?
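
As an illustration of the hook idea, a wrapper could record a directory's timestamps before an operation and put them back afterwards. The sketch below is Python and purely hypothetical (preserve_dir_mtime is not part of git-annex or git); it only shows that the save/restore step itself is simple, and says nothing about where such a hook would be called from.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserve_dir_mtime(directory):
    """Restore a directory's atime/mtime after the wrapped block runs.

    Hypothetical helper: creating, renaming or deleting entries inside
    the block bumps the directory mtime, and this resets it on exit.
    """
    st = os.stat(directory)
    try:
        yield
    finally:
        # Use nanosecond values to avoid float rounding of the timestamps.
        os.utime(directory, ns=(st.st_atime_ns, st.st_mtime_ns))
```

A caller would wrap the operation that touches the directory, for example `with preserve_dir_mtime("photos"): ...`. This only covers a single directory; an operation spanning a tree would need to record every directory it modifies.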
notabug --Joey
This is lacking a lot of detail about how this would be accomplished.
Assuming it's to be accomplished using git-annex metadata, it seems likely to significantly slow down some git-annex operations (which would need to do an expensive git-annex branch lookup).
There's also the complication of merging a git-annex branch that contains changes to the timestamp metadata. Would git-annex need to look over all the merged changes and go off and frob timestamps?
The "does not preserve timestamps" page already contains discussion of this topic. I'm not sure that it's productive to discuss it in two different places. (Nor does this really seem like a bug report.)
I have no opinion about what backend to use. If doing it via the metadata system significantly slows down things though and is generally awkward, why not build a separate subsystem?
I don't know what you mean by "look over all the merged changes and go off and frob timestamps", but as long as n is on the order of [number of files changed in the commit], updating n files' timestamps sounds reasonable? There's the question of which timestamp has preference in a merge, but that sounds solvable.
I made this a separate bug because it's a specific design proposal; I consider "does not preserve timestamps" a tracking bug/user story.
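
To make the "updating n files' timestamps" step concrete, here is a rough Python sketch. It assumes the merge has already produced a mapping from changed paths to the winning mtime (in nanoseconds); how that mapping is derived, and which side wins, is exactly the open question above and is not addressed here. apply_tracked_mtimes is a hypothetical name, not a git-annex function.

```python
import os
from typing import Dict, List


def apply_tracked_mtimes(tracked: Dict[str, int]) -> List[str]:
    """Set each path's mtime to its tracked value (ns since the epoch).

    Returns the paths that were actually updated. Where the platform
    allows it, symlinks themselves are touched rather than their targets.
    """
    can_nofollow = os.utime in os.supports_follow_symlinks
    updated = []
    for path, mtime_ns in tracked.items():
        st = os.lstat(path)
        if st.st_mtime_ns == mtime_ns:
            continue  # already matches the tracked value
        if os.path.islink(path) and not can_nofollow:
            continue  # cannot set a symlink's own mtime on this platform
        # Keep the existing atime; only the mtime is in scope.
        os.utime(path, ns=(st.st_atime_ns, mtime_ns),
                 follow_symlinks=not can_nofollow)
        updated.append(path)
    return updated
```

The work done is proportional to the number of paths in the mapping, one lstat plus at most one utime call each, so the cost depends on how cheaply the list of changed paths can be obtained rather than on the touching itself.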
Proposal re "file modification time should be stored in exactly one metadata field": File and symlink timestamps, after git-annex-get or git-checkout, are set to whatever's in the repo and then considered immutable. The user can of course change them with touch, but if the file is locked while that happens, that's considered a corruption like editing an object file and will be caught by git-annex-fsck.