Subject: Re: Please comment on #125

This scenario was mentioned by Paul at the in-person meeting. Maybe the path to the file as captured should be kept separate from the path to a copy of the file's contents.


On 04/10/2018 11:41 AM, Larry Golding (Comcast) wrote:
I see!  A SARIF producer enables consumers to access previous versions of an overwritten file not just by /mentioning/ each version in the run.files dictionary, but by /persisting their contents/ there. It seems so obvious now 😊 I can write the text for this now.

Editorial consideration: Explaining this, including an example, will take up a medium amount of space. And it’s not obvious where it goes in the spec (in the run.files section? In the uriBaseId section?). So I propose to add a new non-normative Appendix to explain this corner case.

Example below. Note the interplay between originalUriBaseIds, result.location, and the property names in run.files. It’s actually kind of elegant. It gives me faith in our format that it can represent this corner case in such a natural way.


{                                      # A run object
   "originalUriBaseIds": {
     "generated-1": "file:///dev-machine/c:/project/out/obj",
     "generated-2": "file:///dev-machine/c:/project/out/obj"
   },
   "results": [
     {
       "ruleId": "CA4567",
       "location": {
         "physicalLocation": {
           "fileLocation": {
             "uri": "MainWindow.xaml.g.cs",
             "uriBaseId": "generated-1"
           },
           "region": {
             "startLine": 42
           }
         }
       }
     }
   ],
   "files": {
     "#generated-1#MainWindow.xaml.g.cs": {
       "fileContent": {                 # THIS IS WHAT MAKES IT WORK
         "text": "..."
       }
     },
     "#generated-2#MainWindow.xaml.g.cs": {
       "fileContent": {
         "text": "..."
       }
     }
   }
}

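To make the interplay concrete, here is a rough Python sketch of how a consumer might find the persisted contents for the version of the file that a result points at. It assumes the key convention visible in the example above ("#<uriBaseId>#<uri>" as the property name in run.files). The function names files_key and persisted_text, and the sample file contents, are hypothetical and only for illustration; nothing here is taken from the spec text.

```python
from typing import Optional


def files_key(file_location: dict) -> str:
    """Build the run.files property name for a fileLocation object.

    Assumption: the key is "#<uriBaseId>#<uri>" when a uriBaseId is
    present (as in the example above), otherwise just the uri.
    """
    uri = file_location["uri"]
    base_id = file_location.get("uriBaseId")
    return f"#{base_id}#{uri}" if base_id else uri


def persisted_text(run: dict, result: dict) -> Optional[str]:
    """Return the persisted text of the file version a result refers to."""
    file_location = result["location"]["physicalLocation"]["fileLocation"]
    entry = run.get("files", {}).get(files_key(file_location))
    if entry is None:
        return None  # The producer did not persist this version.
    return entry.get("fileContent", {}).get("text")


# A run shaped like the example; the "text" values are made-up placeholders.
run = {
    "originalUriBaseIds": {
        "generated-1": "file:///dev-machine/c:/project/out/obj",
        "generated-2": "file:///dev-machine/c:/project/out/obj",
    },
    "results": [
        {
            "ruleId": "CA4567",
            "location": {
                "physicalLocation": {
                    "fileLocation": {
                        "uri": "MainWindow.xaml.g.cs",
                        "uriBaseId": "generated-1",
                    },
                    "region": {"startLine": 42},
                }
            },
        }
    ],
    "files": {
        "#generated-1#MainWindow.xaml.g.cs": {
            "fileContent": {"text": "contents of version 1"}
        },
        "#generated-2#MainWindow.xaml.g.cs": {
            "fileContent": {"text": "contents of version 2"}
        },
    },
}

text = persisted_text(run, run["results"][0])
```

Because both uriBaseIds resolve to the same directory, the resolved URI alone cannot tell the two versions apart. The uriBaseId in the result is what selects the right entry, and the persisted fileContent is what lets a viewer show it after the file on disk has been overwritten.
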
I’ve thought about this issue a bit. We should be thinking about an analysis that provides a hit in any generated file that isn’t under source control. For example, a generated XAML code-behind file. The corner case covers something even more problematic: a single analysis run where generated files are, for example, overwritten on a per-project basis (to a common location in some build intermediates folder). To answer your questions:

 1. This isn’t tool specific, it relates to scan targets which are
    themselves generated content not under source control (and which are
    fluid/overwritten even while some larger build analysis is taking place)
 2. The file is a valid scan target, whatever that means. A PCH file or
    other intermediate. A header file that is generated by some perl
    script. Etc.
 3. Producers SHOULD persist all files to run.files that aren’t managed
    by a version control system. This is just good general guidance.
 4. It may be necessary to represent multiple versions of this
    re-written file in the run.files dictionary, if multiple result
    instances exist that point to different versions of the generated
    file.
 5. Ditto, a viewer will need to access any version of the file
    referenced by any result.
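
As a rough illustration of points 3 to 5, a producer could persist each distinct version of an overwritten file at the time it is analyzed. The Python sketch below is one possible approach, not something defined by the format: the helper name record_version and the "generated-N" naming scheme for uriBaseIds are assumptions chosen for illustration, as are the sample contents.

```python
def record_version(run: dict, base_uri: str, uri: str, text: str) -> dict:
    """Persist one version of a generated file into run["files"].

    Returns a fileLocation dict that results can use to refer to exactly
    this version. Hypothetical helper; the "generated-N" names are an
    assumed convention.
    """
    base_ids = run.setdefault("originalUriBaseIds", {})
    files = run.setdefault("files", {})

    # Reuse an existing entry if identical content was already persisted
    # for the same location.
    for base_id, recorded_base in base_ids.items():
        entry = files.get(f"#{base_id}#{uri}")
        if (
            recorded_base == base_uri
            and entry is not None
            and entry.get("fileContent", {}).get("text") == text
        ):
            return {"uri": uri, "uriBaseId": base_id}

    # Otherwise allocate the first unused "generated-N" uriBaseId. It maps
    # to the same base URI, so only the persisted text differs per version.
    n = 1
    while f"generated-{n}" in base_ids:
        n += 1
    base_id = f"generated-{n}"
    base_ids[base_id] = base_uri
    files[f"#{base_id}#{uri}"] = {"fileContent": {"text": text}}
    return {"uri": uri, "uriBaseId": base_id}


# Usage: the same path is written twice during one analysis run.
run = {}
base = "file:///dev-machine/c:/project/out/obj"
first = record_version(run, base, "MainWindow.xaml.g.cs", "contents of version 1")
second = record_version(run, base, "MainWindow.xaml.g.cs", "contents of version 2")
first_again = record_version(run, base, "MainWindow.xaml.g.cs", "contents of version 1")
```

Each result then carries the fileLocation returned for the version it was computed against, so a viewer can open any version, not only the last one written.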

#125 <https://github.com/oasis-tcs/sarif-spec/issues/125>: Address corner case for generated files in run.files dictionary

This is the scenario where the same physical file is re-written in the course of an analysis. Please see my comments in the issue. What is the scenario here? – that is:

  * What tool is involved?
  * What is the nature of the file that’s being re-written?
  * Is it necessary to represent this file in the run.files dictionary?
  * Is it necessary to represent /multiple versions/ of this re-written
    file in the run.files dictionary?
  * Would a viewer need access to /any version of this file except the
    last one written/?



