omp
omp copied to clipboard
update csv importexport tool. ##Godoy0722/stable-3_3_0
Updates:
- fix some bugs where the code wasn't able to execute this CLI command;
- adds support for additional fields: abstract, keywords, subjects, book cover image, book cover image alt text, and categories.
- create a document with guidance on how to use this tool in CLI mode.
Issues
This PR is intended to solve the following issues:
- https://github.com/pkp/pkp-lib/issues/10116
- https://github.com/pkp/pkp-lib/issues/10117
- https://github.com/pkp/pkp-lib/issues/10118
- https://github.com/pkp/pkp-lib/issues/10119
One extra comment... I've just noticed the number of modified lines is large, and looks like a lot of files are having the line break replaced from \n to \r\n, we've got to revert them to \n.
I'll create an issue to decrease the chances of it happening again.
Hello folks, and sorry for the delay about this PR! @kaitlinnewson I read your comments and I'll make other commits to make everything in the same pattern as it should be. For @asmecher most comments and @jonasraoni too. My next commit will be a kind of undo for my auto-format so the changes will be more clear.
@Henrique523, don't spend too much time undoing the formatting; I don't mind if it gets included here. Just a note for the future, though, so that it's easier for reviewers next time.
Hello again! After making some updates, I have some news about this PR which I'd like to share with you.
Undos:
Patterns and Structure:
- Updated the array syntax (#307fd9f);
- Remove unnecessary comments (#f14fc52);
- Reason for array_shift explained (#4c0ec5b);
- Update quotes for single quotes every time I could (#ce10c3a);
- Spelling for the localized message (#6738cc9);
Bug fixes
- Fix submission and remove unnecessary title (#ab1c367);
- Add an author as the primary contact (#5e65efa);
- Dynamic submission PDF mime-type (#5eee836);
- Stop script if the file is not present (#f64f4ea);
- Filename for submission PDF (#c1721ed);
- Path for assets updated as the same as the CSV file (#e798a7a)(#e63561d);
Docs and Version
I believe I covered all requests from you in these commits. Despite that, I have one more comment, about the documentation. I'll suggest the main PKP documentation (https://docs.pkp.sfu.ca/admin-guide/3.3/en/) about this tool. This way the docs will be available in both code and the docs. If you need anything else from me, or if you find more things to solve before merging this PR, please let me know!
I didn't review the PR, but just to get you prepared, updates will need to be forwarded to the stable-3_4_0 and main branches. But better to wait the initial review to be completed, or you'll have to update too many branches.
Hello folks. I reviewed all @asmecher , @kaitlinnewson, and @jonasraoni comments and I think now all your requests and suggestions are covered. Could you please take one last look at the code? Thank you so much for your patience!
Thanks, @Godoy0722, I've taken a last skim over it and added just a couple of comments -- we could easily go back and forth over small details like this forever, so take what you like from those and ignore the rest. But @kaitlinnewson, a last test and look over it from you would be much appreciated!
@Godoy0722 I've added a few additional comments for things that came up when I re-tested. Looking good to me otherwise!
Hello again!
I just made more updates here. I made a rebase until the stable-3_3_0 to maintain all updates in one commit, removed the permission stuff @asmecher told me about and fix the last issues you guys sent me. Because of the rebase, I had to make a force push in my fork branch.
If you need anything from me because of it please let me know! And thank you again.
Hello folks!
Any news about my updates from here? @asmecher @kaitlinnewson
Hello again,
here are all my updates for my last commit:
- I rolled back the plugin version;
- Updated the located file and created new keys for the new usage;
- Removed some imports and static variables from the CSV loop;
- Cached some values as @jonasraoni suggested;
- For the CSV sample file, I removed the "https://doi.org/" as suggested by @kaitlinnewson; updated the author's example, and updated the readme file to fit with the new author's information pattern.
New Tool Structure/Behavior
The tool now is working the way as follows below:
- For each row, all error and inconsistency validations are made before the process starts. If any of those are found, the tool will try the next row. The process has the same behavior as I already developed.
- All wrong/inconsistent rows will be shown on the terminal, followed by its reason after all CSV file process ends.
- The tool will generate a CSV file with only the inconsistent rows as explained in the readme.md file. I thought it'd be easier for the user to finish his work.
- I changed the author's information pattern on CSV file. You can have a better look on the readme file about this pattern, which I also believe it's easier for the user to manage all authors information on csv file.
- Finaly, I'm using the HTML purifier on abstract text to avoid suspicious content there.
Hope this explain helps the Code Review! If you have any questions for me, please let me know! Thanks folks.
Ah, a general comment that I forgot to leave... If you have free time, it would be interesting to break that huge method into pieces.
Hello @jonasraoni, @kaitlinnewson, and @asmecher,
I wanted to inform you of some significant updates to the codebase:
I've refactored the code, implementing several improvements including private attributes for caching and static variables. The tool is now thoroughly documented with comments, and each submission process has been organized into small sections for clarity. Additionally, I've updated some deprecated methods to their current equivalents.
To enhance reliability, all verifications now occur before processing begins, ensuring field correctness and consistency.
As for error handling, any rows that fail validation are now recorded in a separate CSV file. This file is generated in the same directory as the client's CSV and includes an additional column detailing the reason for each row's failure.
I would greatly appreciate it if you could review these changes when you have the opportunity. :smile:
@Godoy0722 I'll aim to give this another test run later today
@Godoy0722 Is this ready for another test or are there more changes planned? I wasn't able to get to it before my vacation but can do another test when it's ready!
Hi there. Just pinging @kaitlinnewson that an additional review is still needed for this tool. I really appreciate if you could take a look and see if everything is in place now.
Best, Guilherme Godoy
@Godoy0722 I've made a few additional comments - looking good otherwise!
Hello, @kaitlinnewson!
Yes, you were right. I forgot to pass this variable as a parameter to the respective method that processes this info. I updated and tested again the tool and it seems it's working as it should be. If you could please take a look again, I appreciate that!
P.S.: I just updated the Docs and the "tab delimited" occurrences. The "submissionPdfs" and "coverImages" parts of the path were removed since the behavior of the tool changed. Now it's working in a way that all the assets (both PDFs and cover images) must be put inside the same path as the CSV file. It makes the tool management easier and clearer for the user.
A final matter, the tool is ready in my idea, there's nothing to add for now unless the tool is with more bugs. So, if you find everything correct, it's ready to be merged on my idea. Thanks for the review @kaitlinnewson !
Looking good to me @Godoy0722! I think @asmecher will need to do the final merge due to the previous requested changes.
Merged -- thanks, all!