todo/Configuring metadata view filenamesgit-annexhttp://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/git-annexikiwiki2023-03-24T18:01:44Zcomment 1http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_1_08f8f27e5a8dbd80a91ffd9fd6f64e6c/joey2015-02-04T20:10:42Z2015-02-04T19:38:58Z
<p>git-annex really doesn't care what filenames are used within a view.
It only needs to ensure that each file gets a unique filename, which
is why the directory is included in the filename: to avoid conflicts
if 2 files with the same name appear in different directories.</p>
<p>It would probably be better to make it avoid needing to include the
directory in the filename unless there is such a conflict, rather than
adding complexity configuring that.</p>
<p>However, since views are currently built by streaming the contents of the
branch to git update-index, git-annex can't just, e.g., examine the working
tree to see if a conflicting file exists. It seems it would need to keep
a map of the files it's added to the view branch so far, and check against
the map. But that would make memory use scale with the number of files in
the view, which I'd prefer to avoid..</p>
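<p>The map-based approach described above can be sketched roughly as follows. This is not git-annex's code (git-annex is written in Haskell); <code>mangled_name</code> and <code>view_names</code> are hypothetical helpers, and the <code>name_%dir%</code> layout is only loosely modelled on the view filenames shown on this page. The <code>seen</code> set is the part whose memory grows with the number of files in the view.</p>

```python
import posixpath


def mangled_name(path):
    """Fold the directory into the basename, e.g. 'a/b/f.txt' -> 'f_%a%b%.txt'.

    Hypothetical layout, loosely following the names shown in this thread.
    """
    directory, base = posixpath.split(path)
    if not directory:
        return base
    stem, ext = posixpath.splitext(base)
    folded = "%" + directory.replace("/", "%") + "%"
    return f"{stem}_{folded}{ext}"


def view_names(paths):
    """Yield (source path, view filename) pairs while streaming over paths.

    The plain basename is used unless it was already handed out, in which
    case the directory is folded in. A real implementation would also have
    to handle a mangled name colliding with an earlier one.
    """
    seen = set()  # grows with the number of files in the view
    for path in paths:
        name = posixpath.basename(path)
        if name in seen:
            name = mangled_name(path)
        seen.add(name)
        yield path, name


if __name__ == "__main__":
    for src, dst in view_names(["a/f.txt", "b/f.txt", "c/g.txt"]):
        print(src, "->", dst)
```

<p>With the three example paths, only <code>b/f.txt</code> gets a mangled name (<code>f_%b%.txt</code>); the other two keep their plain basenames.</p>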
<p>I'm going to move this from bugs to todo.</p>
Maybe provide an option to force without name changehttp://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_2_34efe1424a9b02dca706565517900bd6/Tafnzart2017-05-07T14:05:03Z2017-05-07T14:05:03Z
<p>This name change shouldn't be necessary in a view that keeps the directory structure from master:
<code>git annex view todo=* "/=*"</code></p>
comment 3http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_3_dd7aa7560412d4589d2fc28eb978a71e/CandyAngel2017-05-09T10:05:14Z2017-05-09T10:05:14Z
<blockquote><p> But that would make memory use scale with the number of files in the view, which I'd prefer to avoid..</p></blockquote>
<p>This sounds like another use case for bloom filters :)</p>
comment 4http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_4_1873c7cdd142dffc210ec0172bf29997/joey2017-05-09T17:47:25Z2017-05-09T17:41:33Z
<p>True, it could use a bloom filter.</p>
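<p>For illustration, a minimal bloom filter might look like the sketch below. The <code>BloomFilter</code> class, its default size, and its hash construction are all assumptions made for this example, not anything taken from git-annex. Memory use is fixed up front rather than growing with the number of filenames. The trade-off is that membership tests can give false positives, which here would only mean that a file sometimes gets the directory added to its name when it did not strictly need it; a false negative, which could cause a real conflict, cannot happen.</p>

```python
import hashlib


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, some false positives."""

    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    names_used = BloomFilter()
    names_used.add("f.txt")
    # An added name is always reported as present.
    print("f.txt" in names_used)
```

<p>The filter would stand in for the map of names added so far: check whether a plain filename is "possibly used", and include the directory in the name only when it is.</p>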
<p>I had not thought of <code>/=*</code> (or forgot about it). Views could, as a special
case, use the original paths in that case. That's getting very close to
adjusted branch territory, and I want to rewrite the view branch generation
code to use adjusted branches eventually (so changes made in the view
branch can be propagated back out to the source branch and so view branches
can be updated when the source branch changes).</p>
comment 5http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_5_b222634d9f97e2ef604b476df357d54b/Xyem2023-03-23T17:05:24Z2023-03-23T17:05:24Z
<p>Has the format been changed since this was previously asked? I am currently trying to leverage git-annex and its metadata views with AI tooling, but the format seems to be filename_%path%, resulting in the extension being in the middle of the path. I have set <code>annex.maxextensionlength</code> to <code>12</code> so the extensions are present on the files in the backend.</p>
<pre><code>$ git annex view type=model model/=*
$ ls -lr
.:
sd
./sd:
v1.4.safetensors_%model%sd% v1.5.safetensors_%model%sd%
</code></pre>
<p>whereas I would expect (or rather, I am trying to achieve):</p>
<pre><code>$ ls -lr
.:
sd
./sd:
v1.4.safetensors v1.5.safetensors
</code></pre>
comment 6http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_6_1eb8c2d70ff9e3e5b3dbebf88270eb93/joey2023-03-23T20:45:08Z2023-03-23T20:39:33Z
<p>@Xyem no, it's unchanged. But annex.maxextensionlength does not configure
the extension length here currently. I think it would be a good thing for
it to do, probably.</p>
comment 7http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_7_ae259f68ab5b366b6fd29e5df1a05469/Xyem2023-03-23T22:26:24Z2023-03-23T22:26:24Z
<p>So if my understanding is correct, the file paths generated for this view should be something like <code>sd/v1.5_%model%.safetensors</code> but as <code>annex.maxextensionlength</code> isn't being considered during this, it doesn't realise <code>safetensors</code> is the extension?</p>
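<p>To make the effect of the extension length limit concrete, here is a simplified sketch. <code>split_extension</code> and <code>view_filename</code> are hypothetical helpers, not git-annex functions. The sketch only looks at the last dot-separated suffix, and the default limit of 4 is an assumption based on the documented default of <code>annex.maxextensionlength</code>; the real view code may differ in details such as handling of multiple extensions.</p>

```python
def split_extension(name, max_ext_len=4):
    """Split 'stem.ext' only if ext is at most max_ext_len characters.

    The dot is not counted toward the length. Simplification: only the
    last suffix is considered.
    """
    stem, dot, ext = name.rpartition(".")
    if dot and stem and 0 < len(ext) <= max_ext_len:
        return stem, "." + ext
    return name, ""


def view_filename(name, dirs, max_ext_len=4):
    """Build 'stem_%dir1%dir2%ext', the layout seen in this thread."""
    stem, ext = split_extension(name, max_ext_len)
    folded = "%" + "%".join(dirs) + "%"
    return f"{stem}_{folded}{ext}"


if __name__ == "__main__":
    # Limit of 4: 'safetensors' is too long to count as an extension.
    print(view_filename("v1.5.safetensors", ["model", "sd"]))
    # Limit of 12: the extension is recognised and kept at the end.
    print(view_filename("v1.5.safetensors", ["model"], max_ext_len=12))
```

<p>With the limit at 4 the first call gives <code>v1.5.safetensors_%model%sd%</code>, matching the listing in comment 5; with the limit at 12 the second call gives <code>v1.5_%model%.safetensors</code>.</p>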
<p>Unfortunately, the software will only regard certain extensions as being usable files, so I will be unable to use metadata views for now. I've set up separate branches and will copy symlinks between branches in the meantime.</p>
comment 8http://git-annex.branchable.com/todo/Configuring_metadata_view_filenames/comment_8_2bcfc677da72637f34904b84fdd95c10/joey2023-03-24T18:01:44Z2023-03-24T17:48:43Z
<p>I've made git-annex view use <code>annex.maxextensionlength</code>. Note that refining
an existing view will reuse the extension length that was configured when
initially constructing the view.</p>