Recent comments posted to this site:
Hi Joey,
Sorry for the late response, and thanks for the feedback.
"that's fundamentally different than how git-annex works"
Hence the previous comment
"And I think you could put it in your special remote."
That's exactly what I was doing around a year ago. I was implementing a special remote to support writing data on BDXL disks.
"So that when git-annex sends a file to your remote, the file is actually stored in the remote, rather than in a temporary location."
Yep, roughly that's how I was implementing it - storing intermediate data in an sqlite database.
I'd put the project on hold because I started to ask myself the following questions:
- OK, I can store transactions in the special remote. That means storing what is where on which disk. Isn't that what git annex is supposed to do?
- If a BDXL disk gets corrupted or lost, how do I reflect that in the git annex repo and the special remote? I can mark it as "lost" in the remote, then run fsck against the remote in git annex.
- Because I have to track location data separately in the special remote, what if that data (the sqlite database) gets corrupted?
- What if I buy 50GB BDXL disks instead of the 100GB ones I'm using? Does it mean the special remote should also track free space on each disk?
- Burning a disk - what if it isn't successful? Git annex will think that it was successful, because it doesn't support bulk operations, and numcopies rules will be violated.
There were many more questions like this.
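To make the bookkeeping in the questions above concrete, here is a minimal sketch of the kind of sqlite catalog such a special remote would end up carrying. The table names, columns, and helper functions are all hypothetical; nothing here is taken from the actual project or from git-annex itself.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a hypothetical catalog of disks and key locations."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS disks (
            label          TEXT PRIMARY KEY,
            capacity_bytes INTEGER NOT NULL,
            status         TEXT NOT NULL DEFAULT 'ok'   -- 'ok' or 'lost'
        );
        CREATE TABLE IF NOT EXISTS locations (
            key        TEXT NOT NULL,
            disk       TEXT NOT NULL REFERENCES disks(label),
            size_bytes INTEGER NOT NULL,
            PRIMARY KEY (key, disk)
        );
    """)
    return db


def free_bytes(db, label):
    """Nominal capacity minus the recorded sizes; None for an unknown disk."""
    row = db.execute(
        "SELECT d.capacity_bytes - COALESCE(SUM(l.size_bytes), 0) "
        "FROM disks d LEFT JOIN locations l ON l.disk = d.label "
        "WHERE d.label = ? GROUP BY d.label",
        (label,),
    ).fetchone()
    return row[0] if row else None


def mark_lost(db, label):
    """Flag a disk as lost and return the keys whose copies went with it."""
    with db:
        db.execute("UPDATE disks SET status = 'lost' WHERE label = ?", (label,))
    rows = db.execute(
        "SELECT key FROM locations WHERE disk = ? ORDER BY key", (label,)
    ).fetchall()
    return [r[0] for r in rows]
```

The keys returned by `mark_lost` are the ones that would then need to be re-checked from the git annex side, which illustrates the overlap: the catalog duplicates location tracking that git annex already does.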
And at some point the design started to look more like a blown-up, feature-rich archival application/solution. The main point here is that it's definitely possible. I can limit the scope, but there are many, many issues, and nobody except me will be interested in it. Plus, many responsibilities would overlap with git annex.
It's not as simple as just plumbing that up though, because testremote has implicit dependencies in its test ordering. It has to do the storeKey test before it can do the present test, for example.
I already thought that this might be the case, so running the tests independently isn't really feasible.
To address my second point I might be able to just parse the output of testremote into "sub-tests" on the Forgejo-aneksajo side. Tasty doesn't seem to have a nice streaming output format for that though, right? There is a TAP formatter, but that looks unmaintained...
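As a rough illustration of the parsing idea, the sketch below splits console-style output into per-test results. It assumes result lines of the form `name: OK` or `name: FAIL` (optionally followed by a timing), which is worth verifying against real testremote output; the sample text is made up for the example and is not actual git-annex output.

```python
import re

# Assumed shape of a result line: optional indent, test name, colon, status.
RESULT_LINE = re.compile(r"^\s*(?P<name>.+?):\s+(?P<status>OK|FAIL)\b")


def parse_results(output):
    """Return (test name, passed) pairs for every recognised result line."""
    results = []
    for line in output.splitlines():
        m = RESULT_LINE.match(line)
        if m:
            results.append((m.group("name"), m.group("status") == "OK"))
    return results


# Illustrative input only; not captured from a real run.
SAMPLE = """\
Remote Tests
  storeKey: OK (0.12s)
  present:  OK
  removeKey: FAIL
    expected removal to be refused
"""

for name, passed in parse_results(SAMPLE):
    print(name, "passed" if passed else "failed")
```

Group headings and failure details are skipped because they do not match the pattern. A failure message that itself happens to contain `: OK` would be misread, which is one reason a structured output format would be preferable to scraping.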
There are actually only two write operations, storeKey and removeKey. Since removeKey is supposed to succeed when a key is not present, if storeKey fails, then removeKey will succeed. But removeKey should fail to remove a key that is stored on the remote. To test that, the --test-readonly=file option would need to be used to provide a file that is already stored on the remote.
Now that you are saying this, is a new option even necessary? --test-readonly already takes a filename that is expected to be present on the remote, so instead of adding a new option --test-readonly could ensure that this key can't be removed, and that a different key can't be stored (and that removeKey succeeds on this not-present key).
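The three checks described above can be sketched as follows. The `ReadOnlyRemote` class and its method names are hypothetical stand-ins for whatever interface the real test would use; this is not how testremote is implemented.

```python
class ReadOnlyRemote:
    """Toy stand-in for a remote that rejects every write."""

    def __init__(self, keys):
        self._keys = set(keys)

    def present(self, key):
        return key in self._keys

    def store_key(self, key, data):
        # A read-only remote refuses to store anything.
        return False

    def remove_key(self, key):
        # Removing an absent key counts as success; stored keys are protected.
        return key not in self._keys


def check_readonly(remote, present_key, absent_key):
    """Return a list of problems; an empty list means the remote behaved as read-only."""
    failures = []
    if not remote.present(present_key):
        failures.append("known key is not present")
    if remote.remove_key(present_key):
        failures.append("removeKey succeeded on a stored key")
    if remote.store_key(absent_key, b"test data"):
        failures.append("storeKey succeeded")
    if not remote.remove_key(absent_key):
        failures.append("removeKey failed on a key that is not present")
    return failures


print(check_readonly(ReadOnlyRemote({"known-key"}), "known-key", "other-key"))
```

Here `present_key` plays the role of the file passed via --test-readonly, and `absent_key` is a fresh key that the remote has never seen.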
I don't know about the "--write-only" name, but I see the value in having a way for testremote to check that a remote that is expected to only allow read access does not allow any writes, as well as otherwise behaving correctly.
There are actually only two write operations, storeKey and removeKey. Since removeKey is supposed to succeed when a key is not present, if storeKey fails, then removeKey will succeed. But removeKey should fail to remove a key that is stored on the remote. To test that, the --test-readonly=file option would need to be used to provide a file that is already stored on the remote.
I think it would make sense to require that option be present in order to use this new "--write-only" (or whatever name) option.
Also, git-annex does know internally that some remotes are readonly. For example, a regular http git remote that does not use p2phttp. Or any remote that has remote.<name>.annex-readonly set. Currently testremote only skips all the write tests for those, rather than confirming that writes fail. It would make sense for testremote of a known readonly remote to behave as if this new option were provided.
(But, setting remote.<name>.annex-readonly rather than using the "--write-only" option would not work for you, because that config causes git-annex to refuse to try to write to the remote. Which doesn't tell you if your server is configured to correctly reject writes.)
It would be possible to make git-annex testremote support the command-line options of the underlying test framework (tasty). git-annex test already does that, so has --list-test and --pattern.
It's not as simple as just plumbing that up though, because testremote has implicit dependencies in its test ordering. It has to do the storeKey test before it can do the present test, for example. Those dependencies would need to be made explicit, rather than implicit.
Explicit dependencies, though, would also make it not really possible to run most of the tests separately. Running testremote 5 times to run the listed tests, if each run does the necessary storeKey, would add a lot of overhead.
Not declaring dependencies and leaving it up to the user to run testremote repeatedly to run a sequence of tests in the necessary order would also run into problems with testremote using random test keys which change every time it's run, as well as it having an end cleanup stage where it removes any lingering test keys from the local repository and the remote.
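To illustrate what explicit dependencies would imply, here is a small sketch that expands a requested set of tests into the full ordered list that would have to run. Only the "present needs storeKey" edge comes from the discussion above; the removeKey edge and the `DEPENDS_ON` and `plan` names are assumptions made for the example.

```python
# Hypothetical dependency map; only present -> storeKey is from the discussion.
DEPENDS_ON = {
    "storeKey": [],
    "present": ["storeKey"],
    "removeKey": ["storeKey"],  # assumed for illustration
}


def plan(requested):
    """Return the requested tests plus their prerequisites, in a runnable order.

    Assumes the dependency map has no cycles.
    """
    ordered = []

    def visit(name):
        if name in ordered:
            return
        for dep in DEPENDS_ON[name]:
            visit(dep)
        ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


print(plan(["present"]))               # one run: storeKey, then present
print(plan(["present", "removeKey"]))  # combined run: storeKey happens once
```

Running `present` and `removeKey` as two separate invocations would each pull in `storeKey`, which is the overhead described above; a combined run pays for it only once.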
This seems to be a bit of an impasse...
I have .gitattributes:
* annex.largefiles=nothing filter=annex
*.pdf annex.largefiles=anything filter=annex
and git config:
[annex]
gitaddtoannex = true
Using git add now adds matching files (here, PDFs) to the annex. This can be confirmed with git annex info file.pdf. The output should show present = true at the end. If it wasn't added to annex, the output would show fatal: Not a valid object name file.pdf.
And it seems that, by default, the files are stored in the working tree in their unlocked state. So git add doesn't replace the file with a symlink, unlike git annex add.
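The two .gitattributes lines above work because, when several patterns match a path, a later line overrides an earlier one. The sketch below models just that rule for these two patterns. It is a simplification: real gitattributes matching has more rules than a basename glob, annex.largefiles is a full expression language rather than only "nothing" and "anything", and the `RULES`, `largefiles_setting`, and `goes_to_annex` names are made up for the example.

```python
from fnmatch import fnmatchcase

# The two rules from the .gitattributes above, in file order.
RULES = [
    ("*", "nothing"),
    ("*.pdf", "anything"),
]


def largefiles_setting(filename):
    """Return the annex.largefiles value of the last matching rule, or None."""
    setting = None
    for pattern, value in RULES:
        if fnmatchcase(filename, pattern):
            setting = value  # later matches override earlier ones
    return setting


def goes_to_annex(filename):
    """True when the effective setting sends the file to the annex."""
    return largefiles_setting(filename) == "anything"


for name in ["file.pdf", "notes.txt"]:
    print(name, "-> annex" if goes_to_annex(name) else "-> git")
```

Swapping the order of the two rules would make the catch-all `*` line win for every file, so nothing would be annexed.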
I think that "annex.assistant.allowlocked" would be just as confusing. Like you say, the user would then have to RTFM to realize that they need to use annex.addunlocked to configure it, and that it doesn't cause files to be locked by default.
To me, "treataddunlocked" is vague. Treat it as what? "allowaddunlocked" would be less vague since it does get the (full) name of the other config in there, so says it's allowing use of the other config.
I agree this is a confusing name, and I wouldn't mind changing it, but I don't think it warrants an entire release to do that. So there would be perhaps a month for people to start using the current name. If this had come up in the 2 weeks between implementation and release I would have changed it, but at this point it starts to need a backwards compatibility transition to change it, and I don't know if the minor improvement of "allowaddunlocked" is worth that.