Recent comments posted to this site:
I am confused by what you mean by "keep the overview over a git annex repository" and "are complete locally" (do you mean "are completely local"?).
It appears you are requesting an alternative representation of the working tree, with folders collapsed when the locations of all contained annexed files are the same. However, what that representation means is confusing: "this folder (not file) has copies in these locations". Folders are not synced across remotes: file content is. `annex list` is meant to show exactly what file content exists where, and whether that content is trusted (`X`) or untrusted (`x`). What if there are non-annexed files in that folder? The collapsed view almost seems to indicate that maybe those files exist in those locations, too.
This also does not appear to have much to do with `git annex info`.
If you are overwhelmed by the information density, time with git-annex will help you understand why what it reports is important. Also, if terminal history clutter adds to your information overload, you can pipe output through a terminal pager, as in `command | less`, to help parse longer-form information.
In my testing, I have found `git annex forget --drop-dead --force` problematic, because if the two repositories ever speak to one another again (e.g. through a fetch), the very-alive remote on one side that was marked dead on the other will be eradicated.
Luckily I've learned that you don't have to fetch from one remote to another to still issue "informed" annex commands, which is critical. In other words, I didn't appreciate how annex learns of file content in remotes dynamically; I thought it was fairly dependent on merging in the `git-annex` branch to learn about files. Instead you can confidently treat the `fetch`, `pull`, and `push` commands as being exclusively for merging two sibling repos (and their histories, settings, remotes, etc.).
For these kinds of ("friend"?) remotes (unrelated remotes), I think you'll want to remove the fetch refspec entirely and add `annex-sync=false` if you want to keep the relationship around; otherwise, never run `sync` until you remove the unrelated remotes.
One thing that I am unsure about is what should happen if `git-annex get foo` needs the content of file `bar`, which is not present. Should it get `bar` from a remote? Or should it fail to get `foo`?
Consider that, in the case of `git-annex get foo --from computeremote`, the user has asked it to get a file from that particular remote, not from whatever remote contains `bar`.
If the same compute remote can also compute `bar`, it seems quite reasonable for `git-annex get foo --from computeremote` to also compute `bar`. (This is similar to a single computation that generates two output files, in which case getting one of them will get both of them.)
And it seems reasonable for `git-annex get foo` with no specified remote to also get or compute `bar`, from wherever. But there is no way, at the level of a special remote, to tell the difference between those two commands.
Maybe the right answer is to define getting a file from a compute special remote as including getting its inputs from other remotes: preferring to get them from the same compute special remote when possible, and when not, using the lowest-cost remote that works, the same as `git-annex get` does.
Or this could be a configuration of the compute special remote. Maybe some would want to always get source files, and others would want to never get source files?
A related problem is that `foo` might be fairly small, but `bar` very large. So getting a small object can require getting or generating other large objects. Getting `bar` might fail because there is not enough space to meet annex.diskreserve. Or the user might just be surprised that so much disk space was eaten up. But dropping `bar` after computing `foo` also doesn't seem like a good idea; the user might want to hang onto their copy now that they have it, or perhaps move it to some faster remote.
Maybe preferred content is the solution? After computing `foo` with `bar`, keep the copy of `bar` if the local repository wants it, and drop it otherwise.
Progress display is also going to be complicated for this. There is no way in the special remote interface to display the progress for `bar` while getting `foo`. Probably the thing to do would be to add together the sizes of both files and display a combined progress meter. It would be ok not to say when it's getting the input file. This will need a way to set the size for a progress display to larger than the size of the key.
All three problems above go away if git-annex doesn't automatically get input files before computations, and the computations instead just fail with an error saying the input file is not present.
But then consider the case where you just want every file in the repository. `git-annex get .` failing to compute some files because their input files happen to come after them in the directory listing is not good.
I've started a `compute` branch which so far has documentation for the compute special remote, `git-annex addcomputed`, and `git-annex recompute`.
I am pretty happy with how this design is shaping up.
LFS uses http basic auth, so using it over http probably allows any man in the middle to take over your storage.
With that rationale, https://hackage.haskell.org/package/git-lfs hardcodes an https url at LFS server discovery time. And I don't think it would be secure for it to do anything else by default; people do clone git over http, and it would be a security hole if LFS then exposed their password.
In your case, you're using a nonstandard http port, and it's continuing to use that same port for https. That seems unlikely to work in almost any situation. Perhaps an http url should only be upgraded to https when it's using a standard port. Or perhaps the nonstandard port should be replaced with the standard https port. I felt that the latter was less likely to result in security issues, and was more consistent, so I've gone with that approach. That change is in version 1.2.4 of https://hackage.haskell.org/package/git-lfs.
git-lfs has git configs `lfs.url` and `remote.<name>.lfsurl` that allow the user to specify the API endpoint to use. The special remote's url= parameter is the git repository url, not the API endpoint. So I think that to handle your use case, it makes sense to add an optional apiurl= parameter to the special remote, which corresponds to those git configs.
Unfortunately, adding apiurl= needed a new version 1.2.5 of https://hackage.haskell.org/package/git-lfs, so it will only be available in builds of git-annex that use that version of the library. Which will take a while to reach all builds.
Found that `git annex lock .` fixes it.
I guess when the files were copied they were kept in the unlocked state, which may be nothing but the absence of a symlink (in layman's terms). So by this logic, when I tried `git annex lock .`, it fixed the problem.
I'm not sure if there is any option/config to control this, so that when a file is first copied it is directly put into the locked state.
Thanks