magento2 icon indicating copy to clipboard operation
magento2 copied to clipboard

Improve cron error logging

Open fredden opened this issue 3 years ago • 70 comments

Description (*)

This pull request improves logging of errors within cron processes.

Current scenario Currently any error messages are lost / not recorded. For the main group (and any groups not set to use a separate process), STDERR is left for the scheduler to capture. This is most easily caught with the MAILTO= directive on unix, however experience suggests this is rarely configured usefully. Any cron group currently configured to run in a separate process will have its STDOUT and STDERR lost forever.

Proposed scenario With the changes here, both STDERR and STDOUT are caught and logged. This is very valuable in scenarios where one cannot configure the scheduler, or when a job running in a separate process is suffering from errors. A good example of this is Magento Cloud hosting where cron processes often fail without any useful diagnostic information available. These changes will mean that STDERR output is readily available, aiding in diagnostics.

Related Pull Requests

Fixed Issues (if relevant)

None

Manual testing scenarios (*)

  1. Artificially introduce a memory leak into a cron job, eg indexer_reindex_all_invalid
  2. Run php bin/magento cron:run without these changes
  3. Observe lack of information about the failure
  4. Run php bin/magento cron:run with these changes
  5. Observe information available in var/log (in this example, var/log/magento.cron.index.log)

Questions or comments

I can see that using posix_isatty() is discouraged, but no alternative is provided / suggested. There are functions in packages installed via composer as development dependencies which provide the required functionality. (eg here and here) What's the recommended approach in this case? Do we simply remove the output line altogether?

Contribution checklist (*)

  • [x] Pull request has a meaningful description of its purpose
  • [x] All commits are accompanied by meaningful commit messages
  • [x] All new or changed code is covered with unit/integration tests (if applicable)
  • [ ] All automated tests passed successfully (all builds are green)

Resolved issues:

  1. [x] resolves magento/magento2#37453: Improve cron error logging

fredden avatar Apr 07 '21 22:04 fredden

Hi @fredden. Thank you for your contribution Here are some useful tips how you can test your changes using Magento test environment. Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

:exclamation: Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s) For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names. Allowed build names are:

  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE,
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests
  13. Semantic Version Checker

You can find more information about the builds here

:information_source: Please run only needed test builds instead of all when developing. Please run all test builds before sending your PR for review.

For more details, please, review the Magento Contributor Guide documentation.

:warning: According to the Magento Contribution requirements, all Pull Requests must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

:clock10: You can find the schedule on the Magento Community Calendar page.

:telephone_receiver: The triage of Pull Requests happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

:movie_camera: You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

m2-assistant[bot] avatar Apr 07 '21 22:04 m2-assistant[bot]

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

I can confirm this issue has caused a problem with monitoring for one of our clients.

evktalo avatar Feb 14 '23 11:02 evktalo

I can confirm this issue has caused a problem with monitoring for one of our clients.

@evktalo I'm having a little trouble interpreting your statement. Are you saying that (1) you have applied the changes in this pull request to a website and these have caused a problem somehow? Or, are you saying that (2) you have witnessed the issue that this pull request aims to solve?

fredden avatar Feb 14 '23 13:02 fredden

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@magento run all tests

engcom-Hotel avatar Apr 21 '23 11:04 engcom-Hotel

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

Hello @fredden,

Thanks for the contribution!

We have tried to reproduce the issue in the latest 2.4-develop branch, but it seems the issue is no more reproducible. As per the steps mentioned here https://github.com/magento/magento2/pull/32690#issue-852871342, we have tried to artificially introduce a memory leak into a cron job in the below file:

https://github.com/magento/magento2/blob/9c0e1236743f5cccb13cfbf72cf1473aaebd2d9c/app/code/Magento/Indexer/Cron/ReindexAllInvalid.php#L31

And tried to add the below code to introduce a memory leak:

$leak = [];
        $ticks = 0;
        while(true) {
            $leak[$ticks] =  "{A VERY LONG STRING}";
            $ticks++;
            unset($leak[$ticks]);
        }

But the issue is not reproducible for us. Please let us know if we have missed anything.

Thanks

engcom-Hotel avatar Apr 24 '23 07:04 engcom-Hotel

@engcom-Hotel when you run that code as a stand-alone script, do you get an out-of-memory error from PHP? You may need to adjust the error_reporting and memory_limit settings in order to ensure that the memory leak will actually produce a PHP Fatal error. Another way to trigger a PHP Fatal error may be to throw an uncaught Exception. You could also produce output on both STDERR and STDOUT to check if this gets logged or lost.

fredden avatar Apr 24 '23 09:04 fredden

Hello @fredden,

I think you are talking about the below error:

image

I am able to generate the error after setting up the lower memory_limit in Magento branch 2.4-develop.

Thanks

engcom-Hotel avatar Apr 25 '23 12:04 engcom-Hotel

Hello @fredden,

We have tried this PR to check the error logs with the same memory leak codebase. This time with this PR changes, we have not gotten any error as mentioned in this https://github.com/magento/magento2/pull/32690#issuecomment-1521680879.

But either not getting any error log in the log file as well:

image

let us know if we have missed anything.

Thanks

engcom-Hotel avatar Apr 25 '23 14:04 engcom-Hotel

@engcom-Hotel what log files are you looking in? The snippet you posted seems to be from var/log/cron.log. Have you checked other log files, such as var/log/magento.cron.index.log or var/log/magento.cron.consumers.log?

I've tested this just now by adding the following to bin/magento:

diff --git a/bin/magento b/bin/magento
index 0a5f498dd7e..d4ffb28aa6c 100755
--- a/bin/magento
+++ b/bin/magento
@@ -10,6 +10,11 @@ if (PHP_SAPI !== 'cli') {
     exit(1);
 }
 
+fwrite(STDOUT, "This string has been written to STDOUT\n");
+fwrite(STDERR, "This string has been written to STDERR\n");
+
+print_r($argv);
+
 try {
     require __DIR__ . '/../app/bootstrap.php';
 } catch (\Exception $e) {

When I run php bin/magento cron:run, I get expected output:

Screenshot_2023-04-25_15-54-14

I also get the PHP Fatal error logged as expected when I introduce a memory leak as suggested:

Screenshot_2023-04-25_15-57-58

fredden avatar Apr 25 '23 15:04 fredden

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

Hello @fredden,

I have just rechecked the issue and found the logs in var/log/magento.cron.index.log:

image

Meanwhile can you please look into the failed tests?

Thanks

engcom-Hotel avatar Apr 26 '23 07:04 engcom-Hotel

The test failures from "Functional Tests B2B", "Functional Tests CE", "Functional Tests EE", and "Integration Tests" are all out of scope for this pull request. I have not done any research to determine if these tests are unreliable or broken elsewhere. They are not failing due to any changes introduced in this pull request.

There are two errors showing in the "Static Test" suite. One error (reported under a heading of "Php (B2B)") is already covered in the pull request description above. The other error (reported for both "Integrity (B2B)" and "Integrity (EE)") will need to be resolved by someone who has access to the now-private repositories for "enterprise edition" (ie, Adobe Commerce) and "business to business" (which isn't an edition).

I have fixed the Unit Test failure in 365236a125d7f9fc35aa6f683dca3b8954a0fe3a.

fredden avatar Apr 26 '23 12:04 fredden

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

Please fix failing static tests, all other failures look not related

Would you like me to add a 'phpcs:ignore' entry for the use of posix_isatty()? See note in pull request description. I am unable to influence the other static test failures as I do not have access to the relevant repositories on GitHub. (This has already been stated in https://github.com/magento/magento2/pull/32690#issuecomment-1523330685.)

fredden avatar May 02 '23 10:05 fredden

@fredden I think ignoring failure for posix_isatty would be enough

ihor-sviziev avatar May 02 '23 12:05 ihor-sviziev

I have added a1d3859b6c2146b6fd24a818ae3f8b8ac1756a44 to ignore the use of posix_isatty().

fredden avatar May 02 '23 12:05 fredden

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@fredden @ihor-sviziev we have created a guild issue to discuss on the usage of posix function in this PR. You can refer to the same here https://github.com/magento-commerce/development-guild/issues/202.

I will update you the discussion outcome here soon.

Thanks

engcom-Hotel avatar May 03 '23 07:05 engcom-Hotel

@engcom-Hotel that URL does not work for me (because the repository is private). Do you expect us to join the conversation, or are you telling us that there is a private-to-Adobe discussion taking place? Why not have the discussion here?

fredden avatar May 03 '23 07:05 fredden

:heavy_check_mark: QA Passed

Preconditions:

  • Install fresh Magento 2.4-develop and PHP 8.1

Manual testing scenario:

  • Reduce the memory limit in the Php.ini.
  • Add code in /bin/magento file
fwrite(STDOUT, "This string has been written to STDOUT\n");
fwrite(STDERR, "This string has been written to STDERR\n");

print_r($argv);

  • Run php bin/magento cron:run
  • Observe the log in cat magento.cron.index.log

Before: :heavy_multiplication_x: No logs were recorded

After: :heavy_check_mark: Now logs is recorded

Screenshot 2023-05-03 at 12 52 54 PM Screenshot 2023-05-03 at 11 56 12 AM

Builds are failed. Hence, moving this PR to Extended Testing

engcom-Lima avatar May 03 '23 07:05 engcom-Lima

@magento create issue

engcom-Lima avatar May 03 '23 10:05 engcom-Lima

Hello @fredden,

We are in discussion with the internal team through Guild for using the POSIX extension, hence we are moving this PR on hold status.

We will proceed further once we get the conclusion from the internal team.

Thanks

engcom-Hotel avatar May 15 '23 07:05 engcom-Hotel