omp icon indicating copy to clipboard operation
omp copied to clipboard

update csv importexport tool. ##Godoy0722/stable-3_3_0

Open Godoy0722 opened this issue 1 year ago • 16 comments

Updates:

  • fix some bugs where the code wasn't able to execute this CLI command;
  • adds support for additional fields: abstract, keywords, subjects, book cover image, book cover image alt text, and categories.
  • create a document with guidance on how to use this tool in CLI mode.

Issues

This PR is intended to solve the following issues:

  1. https://github.com/pkp/pkp-lib/issues/10116
  2. https://github.com/pkp/pkp-lib/issues/10117
  3. https://github.com/pkp/pkp-lib/issues/10118
  4. https://github.com/pkp/pkp-lib/issues/10119

Godoy0722 avatar Jun 24 '24 07:06 Godoy0722

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Jun 24 '24 07:06 CLAassistant

One extra comment... I've just noticed the number of modified lines is large, and looks like a lot of files are having the line break replaced from \n to \r\n, we've got to revert them to \n.

I'll create an issue to decrease the chances of it happening again.

jonasraoni avatar Jun 25 '24 20:06 jonasraoni

Hello folks, and sorry for the delay about this PR! @kaitlinnewson I read your comments and I'll make other commits to make everything in the same pattern as it should be. For @asmecher most comments and @jonasraoni too. My next commit will be a kind of undo for my auto-format so the changes will be more clear.

Godoy0722 avatar Jul 02 '24 13:07 Godoy0722

@Henrique523, don't spend too much time undoing the formatting; I don't mind if it gets included here. Just a note for the future, though, so that it's easier for reviewers next time.

asmecher avatar Jul 02 '24 14:07 asmecher

Hello again! After making some updates, I have some news about this PR which I'd like to share with you.

Undos:

Patterns and Structure:

  • Updated the array syntax (#307fd9f);
  • Remove unnecessary comments (#f14fc52);
  • Reason for array_shift explained (#4c0ec5b);
  • Update quotes for single quotes every time I could (#ce10c3a);
  • Spelling for the localized message (#6738cc9);

Bug fixes

  • Fix submission and remove unnecessary title (#ab1c367);
  • Add an author as the primary contact (#5e65efa);
  • Dynamic submission PDF mime-type (#5eee836);
  • Stop script if the file is not present (#f64f4ea);
  • Filename for submission PDF (#c1721ed);
  • Path for assets updated as the same as the CSV file (#e798a7a)(#e63561d);

Docs and Version

I believe I covered all requests from you in these commits. Despite that, I have one more comment, about the documentation. I'll suggest the main PKP documentation (https://docs.pkp.sfu.ca/admin-guide/3.3/en/) about this tool. This way the docs will be available in both code and the docs. If you need anything else from me, or if you find more things to solve before merging this PR, please let me know!

Godoy0722 avatar Jul 04 '24 13:07 Godoy0722

I didn't review the PR, but just to get you prepared, updates will need to be forwarded to the stable-3_4_0 and main branches. But better to wait the initial review to be completed, or you'll have to update too many branches.

jonasraoni avatar Jul 04 '24 14:07 jonasraoni

Hello folks. I reviewed all @asmecher , @kaitlinnewson, and @jonasraoni comments and I think now all your requests and suggestions are covered. Could you please take one last look at the code? Thank you so much for your patience!

Godoy0722 avatar Jul 16 '24 13:07 Godoy0722

Thanks, @Godoy0722, I've taken a last skim over it and added just a couple of comments -- we could easily go back and forth over small details like this forever, so take what you like from those and ignore the rest. But @kaitlinnewson, a last test and look over it from you would be much appreciated!

asmecher avatar Jul 16 '24 18:07 asmecher

@Godoy0722 I've added a few additional comments for things that came up when I re-tested. Looking good to me otherwise!

kaitlinnewson avatar Jul 17 '24 18:07 kaitlinnewson

Hello again!

I just made more updates here. I made a rebase until the stable-3_3_0 to maintain all updates in one commit, removed the permission stuff @asmecher told me about and fix the last issues you guys sent me. Because of the rebase, I had to make a force push in my fork branch.

If you need anything from me because of it please let me know! And thank you again.

Godoy0722 avatar Jul 19 '24 17:07 Godoy0722

Hello folks!

Any news about my updates from here? @asmecher @kaitlinnewson

Godoy0722 avatar Jul 31 '24 11:07 Godoy0722

Hello again,

here are all my updates for my last commit:

  • I rolled back the plugin version;
  • Updated the located file and created new keys for the new usage;
  • Removed some imports and static variables from the CSV loop;
  • Cached some values as @jonasraoni suggested;
  • For the CSV sample file, I removed the "https://doi.org/" as suggested by @kaitlinnewson; updated the author's example, and updated the readme file to fit with the new author's information pattern.

New Tool Structure/Behavior

The tool now is working the way as follows below:

  1. For each row, all error and inconsistency validations are made before the process starts. If any of those are found, the tool will try the next row. The process has the same behavior as I already developed.
  2. All wrong/inconsistent rows will be shown on the terminal, followed by its reason after all CSV file process ends.
  3. The tool will generate a CSV file with only the inconsistent rows as explained in the readme.md file. I thought it'd be easier for the user to finish his work.
  4. I changed the author's information pattern on CSV file. You can have a better look on the readme file about this pattern, which I also believe it's easier for the user to manage all authors information on csv file.
  5. Finaly, I'm using the HTML purifier on abstract text to avoid suspicious content there.

Hope this explain helps the Code Review! If you have any questions for me, please let me know! Thanks folks.

Godoy0722 avatar Aug 06 '24 17:08 Godoy0722

Ah, a general comment that I forgot to leave... If you have free time, it would be interesting to break that huge method into pieces.

jonasraoni avatar Aug 07 '24 18:08 jonasraoni

Hello @jonasraoni, @kaitlinnewson, and @asmecher,

I wanted to inform you of some significant updates to the codebase:

I've refactored the code, implementing several improvements including private attributes for caching and static variables. The tool is now thoroughly documented with comments, and each submission process has been organized into small sections for clarity. Additionally, I've updated some deprecated methods to their current equivalents.

To enhance reliability, all verifications now occur before processing begins, ensuring field correctness and consistency.

As for error handling, any rows that fail validation are now recorded in a separate CSV file. This file is generated in the same directory as the client's CSV and includes an additional column detailing the reason for each row's failure.

I would greatly appreciate it if you could review these changes when you have the opportunity. :smile:

Godoy0722 avatar Aug 10 '24 11:08 Godoy0722

@Godoy0722 I'll aim to give this another test run later today

kaitlinnewson avatar Aug 12 '24 13:08 kaitlinnewson

@Godoy0722 Is this ready for another test or are there more changes planned? I wasn't able to get to it before my vacation but can do another test when it's ready!

kaitlinnewson avatar Aug 26 '24 13:08 kaitlinnewson

Hi there. Just pinging @kaitlinnewson that an additional review is still needed for this tool. I really appreciate if you could take a look and see if everything is in place now.

Best, Guilherme Godoy

Godoy0722 avatar Sep 16 '24 16:09 Godoy0722

@Godoy0722 I've made a few additional comments - looking good otherwise!

kaitlinnewson avatar Sep 20 '24 14:09 kaitlinnewson

Hello, @kaitlinnewson!

Yes, you were right. I forgot to pass this variable as a parameter to the respective method that processes this info. I updated and tested again the tool and it seems it's working as it should be. If you could please take a look again, I appreciate that!

Godoy0722 avatar Oct 12 '24 17:10 Godoy0722

P.S.: I just updated the Docs and the "tab delimited" occurrences. The "submissionPdfs" and "coverImages" parts of the path were removed since the behavior of the tool changed. Now it's working in a way that all the assets (both PDFs and cover images) must be put inside the same path as the CSV file. It makes the tool management easier and clearer for the user.

A final matter, the tool is ready in my idea, there's nothing to add for now unless the tool is with more bugs. So, if you find everything correct, it's ready to be merged on my idea. Thanks for the review @kaitlinnewson !

Godoy0722 avatar Oct 12 '24 17:10 Godoy0722

Looking good to me @Godoy0722! I think @asmecher will need to do the final merge due to the previous requested changes.

kaitlinnewson avatar Oct 16 '24 17:10 kaitlinnewson

Merged -- thanks, all!

asmecher avatar Oct 21 '24 14:10 asmecher