Subject: RE: [sarif] Question about specifying file locations
You're right: the spec defines a nested file to be one that is physically embedded in its container:
which is contained within another file
The old nested file URI syntax (which is no longer needed now that run.files is an array) was always intended to express that physical containment.
Yes, your idea of using the .jar file as a uriBaseId is a good one.
All (but especially Larry):
We're collaborating with a partner whose tool produces SARIF. Their tool analyzes .jar files, but reports results in terms of source code locations. The question is how to produce valid SARIF in that situation. Right now they specify "file:///c:/.../foo.jar#/org/.../bar.java" (this is pre-11/14 SARIF), which implies to me that the bar.java file is expected to be explicitly textually embedded within the foo.jar file.
Although one can put source files in .jar files in this fashion, I don't believe that's normal practice, and that's actually not where the .java files are in this particular case. The tool does not actually care where the source files are; I believe it is retrieving their names from debug information embedded within the .class files.
I don't believe that this was the intended use of the mechanism for specifying nested files, although I do see how the spec could have been interpreted in this way.
In my opinion, it would be more correct to have the file location be specified in a relative fashion. For the above example, that would be "org/.../bar.java", and then use the uriBaseId mechanism, one entry per .jar file, so that the fileLocation object for the above example could look something like this:
That is, each .jar file that is analyzed would give rise to a uriBaseId specifically for that .jar file.
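As a rough illustration of that conversion, here is a minimal Python sketch that splits an old-style nested URI into a relative uri plus a per-.jar uriBaseId. The function names, the FOO_JAR_ROOT naming scheme, and the example path are all hypothetical and chosen only for illustration; the base_ids dictionary is assumed to play the role of the run-level table that maps each uriBaseId to its actual location (originalUriBaseIds in SARIF, if I have the property name right).

```python
import re


def split_nested_uri(nested_uri):
    """Split 'container#/inner/path' into (container_uri, relative_path)."""
    container, sep, inner = nested_uri.partition("#")
    if not sep or not inner.strip("/"):
        raise ValueError("not a nested file URI: %r" % nested_uri)
    # Drop the leading slash so the path is relative to the base.
    return container, inner.lstrip("/")


def base_id_for(container_uri):
    """Derive a uriBaseId from the .jar file name (hypothetical scheme)."""
    name = container_uri.rsplit("/", 1)[-1]
    return re.sub(r"[^A-Za-z0-9]", "_", name).upper() + "_ROOT"


def to_file_location(nested_uri, base_ids):
    """Build a fileLocation-like dict and record the base in base_ids."""
    container, relative = split_nested_uri(nested_uri)
    base_id = base_id_for(container)
    # Two different .jar files with the same name would collide under
    # this simple naming scheme, so refuse rather than silently overwrite.
    if base_ids.get(base_id, container) != container:
        raise ValueError("uriBaseId collision for %s" % base_id)
    base_ids[base_id] = container
    return {"uri": relative, "uriBaseId": base_id}


if __name__ == "__main__":
    base_ids = {}
    # Hypothetical path, not the one from the partner's tool.
    print(to_file_location(
        "file:///c:/build/foo.jar#/org/example/bar.java", base_ids))
    print(base_ids)
```

Keying the base on the .jar keeps the relative source path stable, and a consumer that knows where the sources really live can rebind the uriBaseId without touching the results.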
Does my alternative make better sense?
Paul Anderson, VP of Engineering, GrammaTech, Inc.
531 Esty St., Ithaca, NY 14850
Tel: +1 607 273-7340 x118; http://www.grammatech.com