
Added onclick for organization icons and limit in GitHub data fetcher

Open KasRoudra opened this pull request 2 years ago • 20 comments

Summary of changes

  • Formatted queries (fixed indentation)
  • Added argument support to the fetcher file so it can fetch more data (pull requests and issues beyond 100, up to a given limit)
  • Added onclick for organization icons

Details on fetcher

git_data_fetcher.mjs now accepts four arguments: --pr-limit (-pl) for pull requests, --issue-limit (-il) for issues, --org-limit (-ol) for organizations, and --limit (-l) for all three. Example: node git_data_fetcher.mjs -l 200. Until the limit is reached, we keep requesting the GitHub API and append the results together, so organizations the user has contributed to will now appear properly.
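
As a rough illustration of that flow, here is a minimal sketch; it is not the actual git_data_fetcher.mjs source. numify and the fallback chain match the snippet quoted later in this thread, while parseFlags() and fetchPage() are hypothetical stand-ins:

const numify = (v) => Number(v) || 100; // unset flags fall back to 100

function parseFlags(argv) {
  // turns ["-pl", "30", "--limit", "200"] into { pl: "30", limit: "200" }
  // (assumes every flag is followed by a value)
  const flags = {};
  for (let i = 0; i < argv.length; i += 2) {
    flags[argv[i].replace(/^-+/, "")] = argv[i + 1];
  }
  return flags;
}

const args = parseFlags(process.argv.slice(2));
const limit = numify(args.limit || args.l);
const pr_limit = numify(args["pr-limit"] || args.pl || limit);

// GitHub's GraphQL API returns at most 100 nodes per call, so we keep
// requesting the next page and appending until we have enough.
async function fetchUpTo(fetchPage, max) {
  let nodes = [];
  let cursor = null;
  while (nodes.length < max) {
    const page = await fetchPage(cursor); // hypothetical: one API call
    nodes = nodes.concat(page.nodes);
    if (!page.pageInfo.hasNextPage) break;
    cursor = page.pageInfo.endCursor;
  }
  return nodes;
}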

KasRoudra avatar Jun 20 '22 07:06 KasRoudra

@saiteja13427 I think this one should be blocked until the one you raised is merged.

ashutosh1919 avatar Jun 21 '22 00:06 ashutosh1919

@ashutosh1919 I think this is a better approach than mine, though I will review it properly once and then we can see which one we go with.

saiteja13427 avatar Jun 21 '22 03:06 saiteja13427

@KasRoudra I tried testing the git script: I ran node git_data_fetcher.mjs -l 30, but I still got all 86 of my PRs.

Did you test the script properly, or am I doing something wrong?

saiteja13427 avatar Jun 24 '22 14:06 saiteja13427

@saiteja13427 You aren't doing anything wrong; that's the current behaviour of the code. The limit isn't sharp yet: if you pass a number from 101 to 199 you get 100 results, and for 201 to 299 you get 200 results, because data is requested in whole pages of 100. I hope that makes the point clear. So far I've only implemented the core data-fetching logic, but I will try to fix the limit issue in the next commit. Thanks for your review.
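
To picture why the limit rounds like that, here is a toy model of the whole-page behaviour being described, plus one hypothetical way to sharpen it (not necessarily what the later commit actually did):

// Toy model of the current behaviour: only whole pages of 100 are
// requested, so the count always lands on a page boundary.
const limit = 150; // example: any value from 101-199 behaves like 100
let fetched = 0;
while (fetched + 100 <= limit) {
  fetched += 100; // one API page of 100 results
}
console.log(fetched); // 100

// A hypothetical sharp version: shrink the final page to the remainder.
async function fetchExactly(fetchPage, max) {
  let nodes = [];
  let cursor = null;
  while (nodes.length < max) {
    const pageSize = Math.min(100, max - nodes.length); // e.g. 30 for -pl 30
    const page = await fetchPage(cursor, pageSize); // hypothetical helper
    nodes = nodes.concat(page.nodes);
    if (!page.pageInfo.hasNextPage) break;
    cursor = page.pageInfo.endCursor;
  }
  return nodes;
}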

KasRoudra avatar Jun 24 '22 14:06 KasRoudra

@KasRoudra Yeah sure, do fix this and push a commit!

Let's make it flexible enough to accept any number as the limit.

saiteja13427 avatar Jun 24 '22 14:06 saiteja13427

@saiteja13427 I've pushed the commit. Give it a test!

KasRoudra avatar Jun 24 '22 15:06 KasRoudra

@KasRoudra Tested it.

I ran node git_data_fetcher.mjs -pl 30 and it did fetch only 30 PRs, but it skips issues and organisations; it fetches those only when I give them their own limits.

Also, for orgs and issues, if I give a limit larger than the number of issues or orgs I actually have, it doesn't fetch anything at all.

Both of these need to be corrected. I hope both points are clear!

saiteja13427 avatar Jun 25 '22 04:06 saiteja13427

@saiteja13427 The first issue you mentioned isn't reproducible for me. You can check the code:

// each specific limit falls back to the global --limit/-l value;
// numify turns the raw flag into a number, defaulting to 100 when unset
const limit = numify(args.limit || args.l);
const pr_limit = numify(args["pr-limit"] || args.pl || limit);
const issue_limit = numify(args["issue-limit"] || args.il || limit);
const org_limit = numify(args["org-limit"] || args.ol || limit);

Even if you skip a value, it will automatically fall back to 100. Maybe that issue was a temporary GitHub bug. However, as of the last commit, limits larger than the available data length are ignored. Let me know if you find any other issues.

KasRoudra avatar Jun 25 '22 07:06 KasRoudra

Yup, now it seems to work well. The two issues were linked, which is why the first one wasn't reproducible for you.

@ashutosh1919 I have checked this PR. Please do have a look and then we can merge it.

Thanks, @KasRoudra for the changes.

saiteja13427 avatar Jun 25 '22 09:06 saiteja13427

@KasRoudra @saiteja13427 I think it should depend upon the already fetched data.

So, what I originally expected was this: we first need to see whether we have any PRs that are already fetched. If we do, then we need to find the PRs that are not yet available. For example, if you run git_data_fetcher.mjs -pl 120, it should fetch up to 120 PRs that are not already in the website's local data. We can distinguish what is available from what is not by the PR id. It is okay to make multiple API calls internally to fetch more than 100 PRs.
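
A minimal sketch of that id-based filtering; the file name and data shapes here are assumptions, not the project's actual layout:

import { readFileSync } from "node:fs";

// hypothetical local store kept by the website
const localPRs = JSON.parse(readFileSync("./git_data/prs.json", "utf8"));
const knownIds = new Set(localPRs.map((pr) => pr.id));

// keep only PRs we have not stored yet, capped at the requested limit
function selectNewPRs(fetchedPRs, limit) {
  return fetchedPRs.filter((pr) => !knownIds.has(pr.id)).slice(0, limit);
}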

Same thing should happen to issues and orgs.

What do you both think about this expected behaviour?

ashutosh1919 avatar Jun 25 '22 15:06 ashutosh1919

@ashutosh1919 That would be a nice feature, but I don't get the point of using the local data. The local data is also in GitHub's database, and the API calls will fetch it automatically. So why should we read and append the local data?

KasRoudra avatar Jun 25 '22 16:06 KasRoudra

We are storing data locally in order to improve performance; fetching only the missing data would avoid the latency issues.

ashutosh1919 avatar Jun 25 '22 16:06 ashutosh1919

I think the behaviour should be like this.

By default, the script should fetch all PRs, issues, and orgs the first time it runs; from the second run onwards, it should just append whatever is missing to the local data.

On top of that, we can keep the limit feature: do the above, but cut the data to the given number of PRs, orgs, or issues, so users can keep a specific number of their latest contributions if they want.
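
In code, that proposal might look roughly like this sketch (fetchAllPages is a hypothetical paging helper, and newest-first ordering is assumed):

async function fetchWithLimit(fetchAllPages, limit) {
  const all = await fetchAllPages(); // every PR/issue/org, paging internally
  return limit ? all.slice(0, limit) : all; // keep only the latest `limit`
}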

What do you think @KasRoudra @ashutosh1919

saiteja13427 avatar Jun 26 '22 03:06 saiteja13427

@saiteja13427 If we fetch "all" PRs, the fetched data will automatically contain the local data, so we don't need to read or append the local copy. @ashutosh1919 suggests fetching only the PRs that aren't available locally, and currently I have no idea how to fetch those accurately. If the user doesn't use a limit, we could save startCursor and endCursor locally and use them in the next API call for pagination. But if the user does use a limit, the data will be inaccurate, because by default a cursor covers 100 entries.
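
For reference, the cursor idea could look something like this sketch; the file name and shapes are assumptions, while the GraphQL after: argument is GitHub's standard pagination mechanism:

import { readFileSync, writeFileSync, existsSync } from "node:fs";

const CURSOR_FILE = "./git_data/cursors.json"; // hypothetical location
const saved = existsSync(CURSOR_FILE)
  ? JSON.parse(readFileSync(CURSOR_FILE, "utf8"))
  : {};

// ...after a full fetch, remember where we stopped, e.g.
// saved.pr = lastPage.pageInfo.endCursor;
// and pass `after: saved.pr` in the next run's query. Then persist:
writeFileSync(CURSOR_FILE, JSON.stringify(saved, null, 2));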

KasRoudra avatar Jun 26 '22 04:06 KasRoudra

@KasRoudra no need to get confused; you don't need to worry about cursors. On each script run we need to fetch all PRs/issues/orgs regardless of the arguments, and all means all, even if the number of PRs is well over 100. Once we have that data, we can compare it against the local data by PR id.

Also, the GitHub API doesn't show data that is more than one year old, IMO. That is why we append the new data instead of replacing it.

ashutosh1919 avatar Jun 26 '22 15:06 ashutosh1919

Alright, so let us make this clear.

We are going with the appending approach.

When a user runs the script and has no old data, we just fetch everything.

If the user does have data, we fetch everything from the newest and keep appending until an id matches the latest id in the old data.
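
A sketch of that append flow (fetchPage is hypothetical and assumed to return results newest first):

async function fetchSinceLastKnown(fetchPage, lastKnownId) {
  const fresh = [];
  let cursor = null;
  while (true) {
    const page = await fetchPage(cursor); // one API call, newest first
    for (const node of page.nodes) {
      if (node.id === lastKnownId) return fresh; // caught up with old data
      fresh.push(node);
    }
    if (!page.pageInfo.hasNextPage) return fresh; // no old data to match
    cursor = page.pageInfo.endCursor;
  }
}
// the new local data is then fresh.concat(oldLocalData)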

saiteja13427 avatar Jun 26 '22 16:06 saiteja13427

So, our next goal is now clear. We can already fetch all PRs, issues, and orgs, and we also have the limiting feature. The only remaining piece is appending to the local data. I think @saiteja13427 can do it best, as he has already worked on it. I suggest merging this PR for now; @saiteja13427 can then raise a new PR for that feature.

KasRoudra avatar Jun 26 '22 17:06 KasRoudra

@ashutosh1919 Any decision on the pending pull requests?

KasRoudra avatar Aug 25 '22 14:08 KasRoudra

@saiteja13427 @KasRoudra what is the status of this PR? I think it is a much-needed feature and we should get it done with this one.

ashutosh1919 avatar May 26 '23 14:05 ashutosh1919

@ashutosh1919 this one actually got left out. Let me test it once more, and then we will merge this PR first.

Then, on top of it, I will add the appending feature we discussed!

saiteja13427 avatar May 27 '23 04:05 saiteja13427