Query runtime errors are not being surfaced
The problem occurs when searching across several pools that were imported with different formats (csv and line). Because the line-format data does not have a CSV structure, the exported results file is 0 KB.
I'm attaching a sample file.
I think you should add line option to export
@AFgh24: Thanks for reporting. It looks like you found some bugs/limitations in areas that have not been heavily exercised.
The root cause is actually at the Zed layer: Zed requires top-level values to be of record type in order to output them as CSV. This is because CSV has a header row containing the field names, but top-level primitive values have no names. This is apparent from the Zed CLI tooling:
$ zq -version
Version: v1.7.0-26-g2286e6ab
$ zq -i line -f csv imdb.txt
CSV output encountered non-record value: "15 avg_score_16 full_time_teachers percent_black percent_white percent_asian percent_hispanic"
...but as I described with your test data at https://github.com/brimdata/zed/issues/4215#issuecomment-1518255665, there's a bug where that error is not currently being surfaced in Zui, so you had no way of knowing that you needed to try a different approach. We'll look to get that fixed.
The attached video shows one way to get around the problem: You can use the conditional operator to find top-level primitive values (string type in this case) and make them into named fields in a record.
from imdb*
| yield typeof(this)==<string> ? {name: this} : this
Once you've done that, the CSV output will succeed.
https://user-images.githubusercontent.com/5934157/233724556-07932040-1009-4d60-bd96-775106bf568d.mp4
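For readers more familiar with TypeScript than Zed, the wrapping step above can be sketched on plain JSON-like values. This is only an illustration of why a header-based format needs named fields: `wrapPrimitive` and `toCsv` are hypothetical helpers, not part of Zed, zq, or Zui, and the CSV writer is deliberately minimal (no quoting or escaping, header taken from the first row only).

```typescript
// Hypothetical helpers for illustration only; not part of Zed, zq, or Zui.
type Row = Record<string, unknown>;

// Mirrors `typeof(this)==<string> ? {name: this} : this`:
// a top-level string becomes a record with a single named field.
function wrapPrimitive(value: unknown): unknown {
  return typeof value === "string" ? { name: value } : value;
}

// Minimal CSV writer (assumption: no quoting/escaping, header from first row).
// It needs field names for the header row, so it rejects non-record values.
function toCsv(values: unknown[]): string {
  const rows: Row[] = values.map((v) => {
    if (typeof v !== "object" || v === null || Array.isArray(v)) {
      throw new Error(
        `CSV output encountered non-record value: ${JSON.stringify(v)}`
      );
    }
    return v as Row;
  });
  const header = Object.keys(rows[0] ?? {});
  const lines = rows.map((r) =>
    header.map((h) => String(r[h] ?? "")).join(",")
  );
  return [header.join(","), ...lines].join("\n");
}
```

Calling `toCsv(["abc"])` throws because the string has no field name, while `toCsv(["abc"].map(wrapPrimitive))` succeeds with a `name` header.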
Regarding the suggestion:
I think you should add line option to export
On the input side with line format, treating every newline-separated string as a separate primitive value seems intuitive. But given the diversity in Zed's type system, it's not clear how each value should be treated when output as line. For instance, if a pool contained only primitive string values, indeed, this would be easy, but what if a pool contained complex values like records, as most pools do? Given your specific use case I'm guessing you might have imagined each line being a string representation as CSV, but a case could just as easily be made for JSON or ZSON strings. Perhaps you can say a bit more about what you'd have expected if line had already existed as an export option?
(BTW, your test data revealed another bug that I filed at https://github.com/brimdata/zed/issues/4525. As you may have noticed, the fuse operator is automatically applied by Zui when doing CSV export in order to ensure the data all fits into a single type and hence a single CSV header of field names can be printed. Your test data seemed to trigger a problem unique to fusing primitive & complex types. Yet another bug https://github.com/brimdata/zed/issues/4522 came up when I was tinkering with trying to find a more general solution to checking for the top-level primitive types. Thanks for helping find all these!)
Per https://github.com/brimdata/zed/issues/4215, the Zed lake API now has a new Query Status endpoint from which runtime errors can be retrieved. In addition to those docs, the verification notes in https://github.com/brimdata/zed/issues/4215#issuecomment-1711815826 show an example of using the endpoint in the specific case of what would have been a 0-length failed Export like the one described in this issue.
Since the Zed layer now has the foundations, it's time to enhance Zui to take advantage of it. I suspect the zed-js client could start wrapping its Zed query requests to the lake API such that it checks that "status" endpoint after a query connection has closed and surfaces any response other than those that look like:
{"type":"Error","kind":"invalid operation","error":"query not found"}
...since that's what's returned when a query has no runtime errors.
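A rough sketch of that check, assuming the status response body has already been fetched as a JSON string after the query connection closed. The function names and the `StatusBody` type are hypothetical and not part of zed-js; the only response shape taken from this thread is the "query not found" body quoted above, and anything else is treated as something to surface.

```typescript
// Hypothetical names for illustration; not the actual zed-js API.
interface StatusBody {
  type?: string;
  kind?: string;
  error?: string;
  [key: string]: unknown;
}

// True when the body matches the response returned for a query
// that had no runtime errors.
function isNoErrorStatus(body: StatusBody): boolean {
  return (
    body.type === "Error" &&
    body.kind === "invalid operation" &&
    body.error === "query not found"
  );
}

// Returns null when there is nothing to report, otherwise the parsed
// body so the caller can show it to the user.
function runtimeErrorToSurface(rawStatusResponse: string): StatusBody | null {
  const body = JSON.parse(rawStatusResponse) as StatusBody;
  return isNoErrorStatus(body) ? null : body;
}
```

Keeping the check as a pure function over the response body leaves the request itself (URL, request ID handling, timing) to whatever the client already does for queries.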
@mattnibs confirmed that the query status endpoint is currently only wired up to return errors related to the query endpoint, so there's currently no sense in checking for errors on other lake API operations, e.g., data loads.
Verified in Zui commit 09b7a36.
As shown in the attached video, when we now attempt a CSV export of primitive text values, an error message pops up for the user.
https://github.com/brimdata/zui/assets/5934157/071a06d6-3655-476a-adf1-401c26471ca2
This also unblocks #2591. I've been holding off on adding Parquet as an export format because its limited data type support means that I expect a high number of failed export attempts, so it seemed reckless to add the feature without being able to inform the user why it may fail.
Thanks @jameskerr!