cloud-native-setup icon indicating copy to clipboard operation
cloud-native-setup copied to clipboard

makereport.sh fails when having more than one result file within subdirectory

Open marcemq opened this issue 6 years ago • 2 comments

What makereport.sh fails when having more than one result file within subdirectory, the system details table gets generated and only the final PDF report which reports some errors, see the generated attached (metrics_report.pdf)

Steps to reproduce

  1. Have at list two result json file within directory, i.e.:
$ tree results/
results/
├── Env.R
└── scaling
    ├── k8s-scaling-many-nodes.json
    └── k8s-scaling.json

1 directory, 3 files
  1. Launch report execution:
$ ./report/makereport.sh
...
[1] "html.md"
Executing: pandoc -t html -o 'html.html' 'html.md'
[1] "html.html"
PNGs of graphs and tables can be found in the output directory.
The report, named metrics_report.pdf, can be found in the output directory
-rw-r--r-- 1 root root   7886 Oct 17 14:44 /<user>/cloud-native-setup/metrics/report/output/dut-1.png
-rw-r--r-- 1 root root 163725 Oct 17 14:44 /<user>/cloud-native-setup/metrics/report/output/metrics_report.pdf

marcemq avatar Oct 17 '19 19:10 marcemq

Debugging the code the error occurs in tidy_scaling.R, a file that gets called from makereport.sh and it turns out that it fails even having one output file because there is a mismatch in the naming of the result file and the test name inside it at this point, i.e.:

$ tree results
results
├── Env.R
└── scaling
    └── k8s-scaling-many.json

when calling makereport.sh with debug mode:

> source('/scripts/tidy_scaling.R')
> source('/scripts/tidy_scaling.R')
[1] "Output json file:"
[1] "/inputdir/scaling/k8s-scaling-many.json"
[1] "Content of fdata:"
NULL
Error in do.call("rbind", br$launched_pods) :
  second argument must be a list

From the print log, tidy_scaling.R can process any json output file, but the test name into the results is k8s-scaling, so the questions to be answered to fix this issue:

  • should we enable the functionality to ever-ride TEST_NAME (k8s scaling) when calling k8s_scale.sh? Currently it's hard-coded. By enabling this feature, then the name of the result file will be used for the test name used.
  • even if we add the TEST_NAME functionality, what would be the expected behavior when having more than one output file in subdirectory? should the results be kind of merged in the images generated? Or should we force to have only one file per subdirectory?

marcemq avatar Oct 18 '19 16:10 marcemq

Hi @marcemq . First, thanks for reporting the pdf generation failure :-) I think the failure comes from two things:

  1. I think you have copied a .json scaling file to a different name (k8s-scaling-many.json) into the same subdir as the original results file - that is not how the data is meant to be 'structured'. Each set of results should go in a seprate subdir - so, you would have the subdirs named maybe onenode and manynodes, each with a single k8s-scaling.json file within it.
  2. but, there does indeed look to be a bug in the tidy_scaling.R script, which then fell over this directory/file structure... ideally it should have ignored the 'extra' file in the subdir.

Looking at the tidy_scaling.R code, we can see the file match line at the top:

testnames=c(
        "k8s-scaling.*"
)

and then further down the file regexp matching code of:

                matchfile=paste(testname, '\\.json', sep="")
                files=list.files(matchdir, pattern=matchfile)

That produces the matchfile pattern of k8s-scaling.*\\.json, which I think is incorrect. That first .* is a greedy match, and thus matches the file k8s-scaling-many.json, when that was not the intention. It should only match files called k8s-scaling.json. I suspect this was a copy/inherit issue from another R file that did do many-file matching....

Sooo, I think the simple fix is to:

diff --git a/metrics/report/report_dockerfile/tidy_scaling.R b/metrics/report/report_dockerfile/tidy_scaling.R
index 3e56d99..27842a9 100755
--- a/metrics/report/report_dockerfile/tidy_scaling.R
+++ b/metrics/report/report_dockerfile/tidy_scaling.R
@@ -14,7 +14,7 @@ suppressMessages(library(scales))                     # For de-science notation of axis
 library(tibble)                                                # tibbles for tidy data

 testnames=c(
-       "k8s-scaling.*"
+       "k8s-scaling"
 )

which results in a matchfile pattern of k8s-scaling\\.json, which then does the correct thing.

Let me know what you think - I think the above fix is correct, but bolstering the robustness of the code is never a bad thing, so please assess if we still want to add your other patches. Feel free to add the above to your commits - and again, thanks for spotting this - there was a real bug there ;-)

grahamwhaley avatar Oct 31 '19 10:10 grahamwhaley