[🐈 Task]: Putting Log Files from Workflow Errors into Artifacts

Open miquelduranfrigola opened this issue 2 years ago • 17 comments

If an Actions workflow fails, save the log files as artifacts so a contributor can export them. Explore ways to add a link to the log file artifact so a contributor or maintainer can resolve the issue.

miquelduranfrigola avatar Jan 17 '23 16:01 miquelduranfrigola
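The request in the issue description could be approached roughly as follows. This is a minimal sketch, not the project's actual workflow: the step names, the artifact name `workflow-logs`, and the `logs/` folder are assumptions made for illustration. It relies on the `failure()` status check and on the `artifact-url` output exposed by `actions/upload-artifact@v4`.

```yaml
# Sketch only: names and paths are placeholders, not taken from the Ersilia workflows.
steps:
  - name: Upload log files if an earlier step failed
    id: upload-logs
    if: failure()
    uses: actions/upload-artifact@v4
    with:
      name: workflow-logs        # hypothetical artifact name
      path: logs/                # hypothetical folder holding the collected logs
      if-no-files-found: warn    # only warn if no log files were produced

  - name: Link the log artifact in the job summary
    # failure() keeps this step running after a failed step; the second
    # condition skips it when nothing was uploaded.
    if: failure() && steps.upload-logs.outputs.artifact-url != ''
    env:
      ARTIFACT_URL: ${{ steps.upload-logs.outputs.artifact-url }}
    run: |
      echo "[Download the log files](${ARTIFACT_URL})" >> "$GITHUB_STEP_SUMMARY"
```

Writing the link to the job summary puts it on the run page itself, so whoever investigates the failure does not have to search the artifact list.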

Hi,

There are several log files that are created at fetch time. At the moment, they are stored in temporary folders, which makes them difficult to trace. I will collect these log files in a pre-defined folder, which should make it easier to gather the artifacts.

At serve time, I think there is only one log file, so it will be easy.

I have assigned this issue to myself. If I face problems with including artifacts, I will ask @megamanics or @honeyankit. Thanks!

miquelduranfrigola avatar Jan 17 '23 16:01 miquelduranfrigola
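Until the logs are written to a pre-defined folder by Ersilia itself, a workflow-level stopgap is conceivable. The sketch below is an assumption, not something proposed in the thread: it supposes the log files end in `.log`, that they sit under the runner's temporary directory, and that a Linux runner with GNU `cp` is used. The folder name `collected-logs` is hypothetical.

```yaml
# Sketch only: the search root, file pattern, and folder name are assumptions.
steps:
  - name: Gather scattered log files into one folder
    if: always()                 # run even when an earlier step failed
    run: |
      mkdir -p "$GITHUB_WORKSPACE/collected-logs"
      cd "${TMPDIR:-/tmp}"
      # Copy every *.log file, keeping its relative path so that files
      # with the same name in different folders do not overwrite each other.
      find . -type f -name '*.log' \
        -exec cp --parents {} "$GITHUB_WORKSPACE/collected-logs/" \; 2>/dev/null || true
```

The resulting folder can then be passed as a single `path` to an upload step.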

@miquelduranfrigola multiple paths with wildcards can be passed to collate the artifacts:

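      - name: Archive multiple artifacts
        # v2 of this action has been deprecated; v4 accepts the same inputs
        uses: actions/upload-artifact@v4
        with:
          name: fetch-logs
          # One pattern per line; a leading "!" excludes matching files
          path: |
            dist
            !dist/**/*.md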

megamanics avatar Jan 17 '23 18:01 megamanics

Thanks @megamanics, this is helpful!

miquelduranfrigola avatar Jan 17 '23 21:01 miquelduranfrigola

This is related to: https://github.com/ersilia-os/ersilia/issues/537. We will need to collect the log files in the same directory first.

GemmaTuron avatar Jan 26 '23 17:01 GemmaTuron

This issue is not a priority; we will leave it open and tackle it when possible.

GemmaTuron avatar Feb 02 '23 17:02 GemmaTuron

@miquelduranfrigola maybe this is a good time to pick this up again? Could we assign it to one of the new interns?

GemmaTuron avatar Dec 03 '23 10:12 GemmaTuron

Good idea. Are you still interested, @GemmaTuron ?

miquelduranfrigola avatar Jan 03 '24 21:01 miquelduranfrigola

Yes. Who do you want to assign it to, @miquelduranfrigola?

GemmaTuron avatar Jan 04 '24 06:01 GemmaTuron

Shall we ask in the internships channel? Whoever has some experience with GitHub Actions could take it.

miquelduranfrigola avatar Jan 04 '24 07:01 miquelduranfrigola

@miquelduranfrigola this isn't currently happening, right?

DhanshreeA avatar May 28 '24 14:05 DhanshreeA

No, this has not been tackled yet, so it might be a good moment to do so!

GemmaTuron avatar May 29 '24 02:05 GemmaTuron

@dzumii would you like to take this up?

DhanshreeA avatar Jul 01 '24 08:07 DhanshreeA

OK! I will check it out and let you know if I have questions or run into any blockers.

dzumii avatar Jul 01 '24 08:07 dzumii

Thanks all. If any feedback is needed, please let me know.

miquelduranfrigola avatar Jul 01 '24 10:07 miquelduranfrigola

@GemmaTuron @DhanshreeA @miquelduranfrigola To get the log files, I have been focusing on the eos/tmp directory and found that nothing is really written there in terms of run logs. @DhanshreeA just let me know on a call that these files are created in the /tmp directory at the filesystem root, where a couple of folders are created. I am still a bit confused about which of these files we actually want to upload as artifacts.

dzumii avatar Jul 04 '24 07:07 dzumii

OK, thanks @dzumii. I am a little bit confused - I am sure I am missing something, so apologies beforehand. In my opinion, the logs that we should be keeping are in the files eos/current.log and/or eos/console.log. Also, when models are being tracked, we could consider improving or enlarging the eos/ersilia_runs/logs/... files, but I would deprioritize this for now. In any case, I think that keeping info from the tmp directory may be difficult, since a very large number of files is generated there. @DhanshreeA - am I missing something? Is there a reason why we want to keep more than what is found in the .log files in the eos folder? If so, should we maybe dynamically channel the missing information into those logs?

miquelduranfrigola avatar Jul 04 '24 08:07 miquelduranfrigola

@miquelduranfrigola I've generally found the debugging cycle for model issues a bit long, because the BentoML server logs and the subprocess logs from run.sh, which are generated in the /tmp directory of the host, are not fully captured in either current.log or console.log. I think we're both talking about the same thing from slightly different angles: it would indeed be helpful to widen the scope of the logs we collect in something like eos/ersilia_runs/logs/..., especially to make model troubleshooting easier for maintainers and contributors. We can deprioritise it for now, in which case this issue can be closed, since both console.log and current.log will be uploaded as artifacts for each model with a 14-day retention period (subject to the model workflows being updated).

DhanshreeA avatar Jul 09 '24 14:07 DhanshreeA
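The upload described in the comment above might look something like the following step. It is a sketch under stated assumptions rather than the step actually added to the model workflows: the thread only names the files as eos/console.log and eos/current.log, so the `~/eos` location and the artifact name `ersilia-logs` are guesses. `retention-days` is the `actions/upload-artifact` input that sets the retention period.

```yaml
# Sketch only: the eos location and the artifact name are assumptions.
steps:
  - name: Upload Ersilia log files
    if: always()                 # upload whether the model run passed or failed
    uses: actions/upload-artifact@v4
    with:
      name: ersilia-logs         # hypothetical artifact name
      path: |
        ~/eos/console.log
        ~/eos/current.log
      retention-days: 14         # matches the 14-day retention mentioned above
      if-no-files-found: warn    # only warn if a log file is missing
```

With `actions/upload-artifact@v4`, artifact names must be unique within a workflow run, so a workflow that tests several models or runs a matrix would need a distinct name per job.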

OK, I see and I agree. @DhanshreeA, perhaps a good moment to think about this is actually now, while we are trying to run multiple sessions at the same time. As part of that work, we can redirect logs into appropriate folders as well. Let me know what you think.

miquelduranfrigola avatar Jul 15 '24 14:07 miquelduranfrigola

I absolutely agree @miquelduranfrigola - this would be super useful to have now.

DhanshreeA avatar Jul 15 '24 14:07 DhanshreeA

OK then, let's go for it this week. @DhanshreeA and @miquelduranfrigola can give it a first go, and then we can share progress with @dzumii on Wednesday or Thursday. Sounds good?

miquelduranfrigola avatar Jul 15 '24 16:07 miquelduranfrigola