emerge icon indicating copy to clipboard operation
emerge copied to clipboard

.tsx support

Open gustaff-weldon opened this issue 4 years ago • 12 comments

Is your feature request related to a problem? Please describe.

I'm trying to run emerge against a big React app codebase. python emerge.py -c frontend.yml

---
project_name: frontend
loglevel: info
analyses:
- analysis_name: check_tsx_files
  source_directory: <some-project-path>/frontend/src
  only_permit_languages:
  - javascript
  only_permit_file_extensions:
  - .tsx
  - .ts
  ignore_dependencies_containing:
  - node_modules
  ignore_files_containing:
  - spec.ts
  - story.ts
  - mock.ts
  - .snap
  - .gql
  file_scan:
  - number_of_methods
  - source_lines_of_code
  - dependency_graph
  - louvain_modularity
  - fan_in_out
  export:
  - directory: <some-project-path>/frontend
  - graphml
  - dot
  - json
  - tabular_file
  - tabular_console_overall
  - d3

I'm getting tons and tons of messages:

.tsx is an unknown extension, ignoring `/<edited>/src/Issues/Issues/Issues.tsx`

Pretty much the whole app gets skipped and not mapped.

Describe the solution you'd like I'd like to be able to analyze a codebase that contains tsx which are essentially jsx with Typescript.

Given the size of our app, the ability to visualise it and check in an interactive way, would be an immense help.

gustaff-weldon avatar Oct 19 '21 21:10 gustaff-weldon

@gustaff-weldon sounds like an interesting enhancement. Could you point me maybe to some example project on GitHub that has a similar mixture of tsx/jsx files, which I could use for an analysis? Then I'll see if I can do something about this on the following weekend 😉.

glato avatar Oct 20 '21 18:10 glato

@glato that would be awesome!

I have found a couple of projects that might be handy:

Typescript only: tsx, ts https://github.com/cypress-io/cypress-realworld-app https://github.com/HospitalRun/hospitalrun-frontend https://github.com/excalidraw/excalidraw/

More known, mixed language types: https://github.com/grafana/grafana/tree/main/packages https://github.com/getsentry/sentry

Javascript js, jsx:

  • https://github.com/bbc/simorgh (check src folder)
  • small: https://github.com/oldboyxx/jira_clone/

Of course, I'm happy to help with testing on our codebase, we have a fairly big app (~5000 ts/tsx files), I cannot share as it's a private repo.

Let me know if that's enough :)

gustaff-weldon avatar Oct 20 '21 20:10 gustaff-weldon

@gustaff-weldon I've pushed an update in the current version 0.19.1 to be ready for parsing .tsx and .jsx files also. I've tested this on https://github.com/cypress-io/cypress-realworld-app and https://github.com/oldboyxx/jira_clone. This already looked promising. I setup the following config as an example to be able to include both filetypes:

---
project_name: tsx example
loglevel: info
analyses:
- analysis_name: simple check
  source_directory: <some-project-path>/github/cypress-realworld-app
  only_permit_languages:
  - javascript
  - typescript
  only_permit_file_extensions:
  - .js
  - .jsx
  - .ts
  - .tsx
  file_scan:
  - number_of_methods
  - source_lines_of_code
  - dependency_graph
  - fan_in_out
  - louvain_modularity
  export:
  - directory: <some-export-path>/tmp/emerge/export/tsx
  - graphml
  - dot
  - json
  - tabular_file
  - tabular_console_overall
  - d3

I'd appreciate any feedback if this helps and solves the issue? Or if there are some details which does not work out as expected?

glato avatar Oct 23 '21 19:10 glato

Hi @glato,

First of all, thank you so mych for your effort, much appreciated. I will try to help with testing.

For starters, I have run this against our codebase, and while it worked (ie. looks better than before :)) I'm a little bit suspicious about the number of files traversed.

Are you following imports or just going through all the matching files in the directories?

The reason I'm asking is because we are using alias in the import, think of @myco/utils/date to avoid ../../../../../utils/date We are usually providing those aliases in tsconfig and webpack config files. This is quite a common practice in big JS projects.

Alias-related part of config for Typescript:

  "compilerOptions": {
    "paths": {
      "@myco/*": [
        "./src/*"
      ]
    },

Alias-related part of config for Webpack:

  resolve: {
    alias: {
      '@myco': resolve('src')
    },

If you do not resolve aliases, we might need to add this. I'm happy to provide those as a part of the yml config, or if you prefer, you can parse provided tsconfig.json file.

I will get back to you as soon as I do more testing. But I wanted to clarify this alias thing.

gustaff-weldon avatar Oct 26 '21 07:10 gustaff-weldon

@gustaff-weldon Hi there, emerge currently tries to do a hybrid approach here: First it traverses recursively through all files from a starting directory and collecting every import statement within each file that it can find. For any recognized import statement, it tries to 'resolve' it to a correct POSIX-kind of filesystem path (e.g. an import of the kind 'foo/bar/../baz.ts' should be resolved to 'foo/baz.ts'). import aliases of the kind @myco/utils/date are currently not part of the resolver. If you say that this kind of import aliases are often used in js/ts project, this could indeed be an useful update. I'll see if I can find some further examples to analyze and test such import aliases and see about an update on this issue.

glato avatar Oct 31 '21 09:10 glato

@glato using aliases is quite a common thing in bigger JS projects. I will see if I can find some examples to help with testing

gustaff-weldon avatar Nov 02 '21 11:11 gustaff-weldon

@gustaff-weldon I've just pushed a small update in 0.19.2 that might solve your problems:

There is (currently) an undocumented yaml config section replace_dependency_substrings where you could easily define dependency name substring replacements like the following:

replace_dependency_substrings:
- "@foo": src/foo
- "@bar": src/bar

so that import aliases/prefixes like @foo would be replaced by src/foo in any dependency. The only restriction here is that the import alias keyword is wrapped in "..." since this seems to be a YAML/PyYAML restriction also for some characters (including @).

This means that an example TS-configuration might look like this:

---
project_name: tsx example
loglevel: info
analyses:
- analysis_name: simple check
  source_directory: /Users/user/github/cypress-realworld-app
  only_permit_languages:
  - javascript
  - typescript
  only_permit_file_extensions:
  - .js
  - .jsx
  - .ts
  - .tsx
  replace_dependency_substrings:
  - "@foo": src/foo
  - "@bar": src/bar
  file_scan:
  - number_of_methods
  - source_lines_of_code
  - dependency_graph
  - fan_in_out
  - louvain_modularity
  export:
  - directory: /Users/user/tmp/emerge/export/cypress-realworld-app
  - graphml
  - dot
  - json
  - tabular_file
  - tabular_console_overall
  - d3

I would appreciate any feedback if this might solve your problem (by defining a custom configuration in replace_dependency_substrings), then I would consider to also implement this for the other parsers.

glato avatar Nov 06 '21 21:11 glato

@gustaff-weldon I've release this import alias feature with version 1.0.0. You can also install this tool very quickly with pip now (pip install emerge-viz). Will be closing this issue in a couple of days if this solves the problem.

glato avatar Dec 04 '21 22:12 glato

@glato thanks, I will give it a go. ~~I assume I should use the replace_dependency_substrings: workaround you have suggested.~~ Edit: just noticed in new README import_aliases has been added

gustaff-weldon avatar Dec 06 '21 22:12 gustaff-weldon

@glato I wonder if this is how the scan should behave given the following settings:

---
project_name: frontend
loglevel: debug
analyses:
- analysis_name: check_tsx_files
  source_directory: /Users/myuser/frontend/src
  only_permit_languages:
  - javascript
  - typescript
  only_permit_file_extensions:
  - .tsx
  - .ts
  - .js
  import_aliases:
  - "@zen/": src/
  ignore_dependencies_containing:
  - node_modules
  ignore_files_containing:
  - .spec.ts
  - .spec.tsx
  - .story.ts
  - .story.tsx
  - .mock.ts
  - .mock.tsx
  - .snap
  - .gql
  - .generated.ts
  file_scan:
  - number_of_methods
  - source_lines_of_code
  - dependency_graph
  - fan_in_out
  - louvain_modularity
  export:
  - directory: ./report/
  - graphml
  - dot
  - json
  - tabular_file
  - tabular_console_overall
  - d3

Notice that I exclude all spec files, yet I see in console:

2021-12-06 23:56:20     parser D ⏩ extracting imports from base result SettingsRepository.spec.ts...
2021-12-06 23:59:44     parser D ⏩ extracting imports from base result CaptureProfileRolesPage.spec.tsx...

I would expect those to be excluded? It seems to be a problem with double extension ones, as eg. .css works fine:

2021-12-07 00:09:40   analysis I ⏩ .css is not allowed in the scan, ignoring /Users/myuser//frontend/src/App/styles/fonts.css

gustaff-weldon avatar Dec 06 '21 23:12 gustaff-weldon

  1. Also, even without imports, I'm not sure if imports resolution actually works. I have run some tests on a very limited part of our codebase and I'm getting some dubious results.

Eg. given such simple folder:

.
├── OperationsSettingsSearch.spec.tsx
├── OperationsSettingsSearch.tsx
└── index.ts

where index.ts looks like this:

export { default } from './OperationsSettingsSearch';

The resulting graph does not have a node index.ts which depends on OperationsSettingsSearch. index.ts is shown with no dependencies.

  1. I have a feeling the way aliases are done do not solve the problem. For example:
2021-12-07 00:30:30     parser D ⏩ adding import: react to OperationsSettingsSearch/OperationsSettingsSearch.tsx
2021-12-07 00:30:30     parser I ⏩ renamed dependency: @zen/Components/Search -> src/Components/Search

given the files structure:

src_min
└── Components/Search
    ├── Search.spec.tsx
    ├── Search.tsx
    ├── __snapshots__
    │   └── Search.spec.tsx.snap
    └── index.ts
    OperationsSettings/OperationsSettingsSearch
    ├── OperationsSettingsSearch.spec.tsx
    ├── OperationsSettingsSearch.tsx
    └── index.ts

with alias @zen: src and

src_min/OperationsSettings/OperationsSettingsSearch/OperationsSettingsSearch.tsx relying on src_min/Components/Search.tsx like this:

import Search from '@zen/Components/Search';

and src_min/Components/Search/index.ts relying on the same src/Components/Search.tsx like this:

import Search from './Search';

I'm getting two separate dependency chains. Screenshot 2021-12-07 at 00 50 46

I feel like, the tool does not include JS/TS import resolution logic, as technically those two dependencies below, are the same:

Screenshot 2021-12-07 at 00 52 22 Screenshot 2021-12-07 at 00 52 16

I'm not sure what would be the best way, to share with you, even that limited example. I'm happy to connect on Discord eg.

gustaff-weldon avatar Dec 06 '21 23:12 gustaff-weldon

Hello, any updates on this? I'm also trying to run emerge on a large Typescript+React codebase and I'm getting a bunch of

NotImplementedError: currently not implemented in TYPESCRIPT_PARSER

Thanks so much for this project. I think it will be very helpful to an initiative I have at work.

rafaelrozon avatar Feb 15 '24 17:02 rafaelrozon