feat(rome_cli): add `--json` argument to format command

Open ematipico opened this issue 3 years ago • 8 comments

Summary

Closes #3049

This PR does not reflect the final design.

There are still things missing that I would like to implement in subsequent PRs:

  • report errors on stderr
  • review how messages and reports can live together
  • expand --json to the check command

These were intentionally left out because I had already spent a lot of time experimenting with other crates.

What I would like to review is:

  • the shape of the information reported after a traversal of the format command on a file/folder
  • whether the usage of three threads makes sense and is not flawed

The usage of three threads was necessary because we need to collect information from two sources:

  • the thread of the traversal
  • the thread to process the messages

And we need a third thread to receive these messages.
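
The arrangement described above can be sketched roughly as follows. This is only an illustration using `std::sync::mpsc`, not the code from this PR: the `Report` type, its variants, and `collect_reports` are hypothetical names. Two producer threads (standing in for the traversal and the message processor) each hold a sender, and a third thread drains the receiver until both senders are gone.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Hypothetical report type; the real types used by the PR are not shown here.
#[derive(Debug, PartialEq)]
enum Report {
    FileFormatted(String),
    Diagnostic(String),
}

/// Spawns two producers and one collector, then returns everything collected.
fn collect_reports() -> Vec<Report> {
    let (report_tx, report_rx) = channel::<Report>();
    // Each source gets its own sender for the same channel.
    let traversal_tx = report_tx.clone();
    let messages_tx = report_tx;

    // Source 1: stands in for the traversal thread.
    let traversal = thread::spawn(move || {
        traversal_tx
            .send(Report::FileFormatted("example.js".to_string()))
            .unwrap();
    });

    // Source 2: stands in for the thread that processes the messages.
    let messages = thread::spawn(move || {
        messages_tx
            .send(Report::Diagnostic("example.js".to_string()))
            .unwrap();
    });

    // Third thread: receives reports until every sender has been dropped.
    let collector = thread::spawn(move || report_rx.iter().collect::<Vec<Report>>());

    traversal.join().unwrap();
    messages.join().unwrap();
    collector.join().unwrap()
}
```

The order in which the two reports arrive is not deterministic, so a caller should not rely on it.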

Examples

Given the file example.js

const a = "hey"

statement(

);

If we run the following command rome format example.js --json, we get the following JSON:

{
  "formatter": {
    "summary": {
      "duration": { "secs": 0, "nanos": 3975001 },
      "filesCompared": 1,
      "filesWritten": null
    },
    "details": {}
  },
  "errors": {
    "example.js": {
      "diff": {
        "severity": "Error",
        "before": "const a = \"hey\"\n\nstatement(\n\n);\n",
        "after": "const a = \"hey\";\n\nstatement();\n"
      }
    }
  }
}

If we run the command rome format example.js --write --json, we get the following JSON:

{
  "formatter": {
    "summary": {
      "duration": { "secs": 0, "nanos": 6510659 },
      "filesCompared": null,
      "filesWritten": 1
    },
    "details": {
      "example.js": {
        "newContent": "const a = \"hey\";\n\nstatement();\n"
      }
    }
  },
  "errors": {}
}

The paths will be relative to the path where the command was run.
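
To make the shape of the `summary` object concrete, here is a small dependency-free sketch that produces the same fields as the examples above. It is an assumption-laden illustration: the `Summary` struct, its Rust field names, and the hand-written `to_json` are hypothetical, and the thread does not say how the PR actually serializes its output. An absent counter is written as `null`, matching the examples.

```rust
use std::time::Duration;

/// Hypothetical mirror of the "summary" object shown in the examples.
struct Summary {
    duration: Duration,
    files_compared: Option<usize>,
    files_written: Option<usize>,
}

/// Renders an optional counter as a JSON number or `null`.
fn optional_count(value: Option<usize>) -> String {
    match value {
        Some(n) => n.to_string(),
        None => "null".to_string(),
    }
}

impl Summary {
    /// Hand-rolled JSON rendering, kept minimal to avoid external crates.
    fn to_json(&self) -> String {
        format!(
            r#"{{"duration":{{"secs":{},"nanos":{}}},"filesCompared":{},"filesWritten":{}}}"#,
            self.duration.as_secs(),
            self.duration.subsec_nanos(),
            optional_count(self.files_compared),
            optional_count(self.files_written),
        )
    }
}
```

With `duration: Duration::new(0, 3975001)`, `files_compared: Some(1)` and `files_written: None`, `to_json` yields the same summary as the first example, minus the whitespace.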

Test Plan

Added a test that covers a simple case.

ematipico avatar Aug 16 '22 10:08 ematipico

Deploying with Cloudflare Pages

Latest commit: 45faf07
Status: ✅  Deploy successful!
Preview URL: https://cc661d19.tools-8rn.pages.dev
Branch Preview URL: https://feature-out-argument.tools-8rn.pages.dev

Would you mind adding a short outline to the PR summary of how the architecture changed? You mention three threads. What's the responsibility of each thread, and why are multiple threads necessary? Are there specific reasons why you doubt that three threads are a good approach?

Edit: Can you add an example of a formatter and check JSON output to the test section?

MichaReiser avatar Aug 16 '22 11:08 MichaReiser

> Would you mind adding a short outline to the PR summary of how the architecture changed? You mention three threads. What's the responsibility of each thread, and why are multiple threads necessary? Are there specific reasons why you doubt that three threads are a good approach?
>
> Edit: Can you add an example of a formatter and check JSON output to the test section?

Done.

ematipico avatar Aug 16 '22 12:08 ematipico

Thanks for updating the description.

Some feedback on the JSON format

  • What use cases do you see for the duration? In my view, this is something that scripts can easily measure for themselves if they're interested in the duration.
  • Is the reason that errors are reported outside of the formatter that these are diagnostics?
  • Do you think it's necessary for us to report the content of the written files? A script could read the file if it is interested in the updated content.

MichaReiser avatar Aug 16 '22 13:08 MichaReiser

As an alternative to having a dedicated thread for receiving the stats: the crossbeam library we're using for MPSC channels has a select! macro. It would allow the existing console thread to wait on both the messages and stats channels at the same time, and wake up whenever either channel receives a message.

leops avatar Aug 16 '22 13:08 leops

> Thanks for updating the description.
>
> Some feedback on the JSON format
>
> * What use cases do you see for the `duration`? In my view, this is something that scripts can easily measure for themselves if they're interested in the duration.

This is true. But I thought, since we have the information at hand (and it's really detailed), why not expose it? Also, considering that our APIs can be used by other developers to make other plugins, they can use it for internal measurements.

> * Is the reason that `errors` are reported outside of the `formatter` that these are diagnostics?

Yes, but I am not sure whether these errors should be sent together with this JSON on stdout, or whether we should use stderr instead. That's why I left it out.

Also, I am not sure if errors should stay at the top level or at the "feature" level. If we run the check command, we might have diagnostics related to the linter, the formatter, etc.

> * Do you think it's necessary for us to report the `content` of the written files? A script could read the file if it is interested in the updated content.

I think it's valuable information, since we already have it. If a script wants the information, it would have to perform another I/O operation. If we scale this up to folders, there would be a lot of I/O operations... It's a trade-off. The JSON would be slim, but scripts that require that information would be penalized. What do you think? Should we remove it?

ematipico avatar Aug 16 '22 13:08 ematipico

> I think it's valuable information, since we already have it. If a script wants the information, it would have to perform another I/O operation. If we scale this up to folders, there would be a lot of I/O operations... It's a trade-off. The JSON would be slim, but scripts that require that information would be penalized. What do you think? Should we remove it?

That depends on the use cases that we want to support. Overall, I'm leaning towards removing any information that we don't have an explicit use case for because it will be difficult to remove fields in the future.

I would currently focus on only adding fields/information for which we have a direct use in the Node JS API, because that's the ultimate goal we're pursuing. So the question is, what's the result of formatFiles?

Should errors be called diagnostics?

MichaReiser avatar Aug 16 '22 16:08 MichaReiser

@leops I would like to defer your suggestion to another PR, mostly because select won't work with the current architecture. The current architecture has two receivers for the reports, one inside the traversal and one inside the processor of the messages. If Select wakes up the receiver of the reports first, we get a deadlock: the processor of the messages is never woken up, so the sender that was passed to it never goes out of scope. I plan to refactor the current architecture, but that work is out of scope for this PR and I would like to tackle it in a later PR.

@MichaReiser Have your concerns been covered? Is there anything else pending? I would like to merge this PR and continue to apply new changes from here.
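
The deadlock concern above rests on a general channel property: a receiver only observes the channel as closed once every sender has been dropped. Here is a minimal sketch of that property. It uses `std::sync::mpsc` instead of crossbeam to stay dependency-free (my understanding is that crossbeam channels behave the same way in this respect), and `channel_states` is a hypothetical helper, not code from this PR.

```rust
use std::sync::mpsc::{channel, TryRecvError};

/// Returns what a receiver observes while one sender is still alive,
/// and what it observes after the last sender has been dropped.
fn channel_states() -> (TryRecvError, TryRecvError) {
    let (tx, rx) = channel::<u32>();
    let tx_clone = tx.clone();

    // Dropping one of two senders does not close the channel.
    drop(tx);
    let while_alive = rx.try_recv().unwrap_err();

    // Only once the last sender is gone does the receiver see a disconnect.
    drop(tx_clone);
    let after_drop = rx.try_recv().unwrap_err();

    (while_alive, after_drop)
}
```

A blocking `recv()` in the first state would wait indefinitely, which is the situation described when the sender held by the message processor is never released.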

ematipico avatar Aug 19 '22 08:08 ematipico

> Do you plan to create an issue for the follow up on select or is it something we don't plan to pursue at the moment?

I plan to revisit this. This was more of an optimization, and I think it's not needed for our case.

ematipico avatar Aug 23 '22 07:08 ematipico