[QUESTION] Identify TIME Column Type?

Open • cgivre opened this issue 4 years ago • 1 comment

Thank you very much for writing (and open sourcing) this really great library! I am working on incorporating this into Apache Drill which will enable users to use Drill to query SAS files with SQL. I ran into a small issue which I can't seem to figure out. I need to get the schema of the data. Right now, I have a for loop that iterates over the columns and maps them to the correct type of Drill vector.

List<Column> columns = sasFileReader.getColumns();

The issue I'm running into is with TIME formats. With the code above, parso only seems to return times as longs. Is there any way to programmatically identify whether a column is a time and not a long?

Here's a link to my code: https://github.com/cgivre/drill/tree/format-sas (the relevant code is in the contrib/format-sas folder).

Thanks!

cgivre, Nov 22 '21 00:11

Apologies, I might have missed your question in 2021! Parso tries to mimic the behaviour of the SAS7BDAT format, and a time (we are talking about time per se, not date or date+time, right?) is stored as a number of seconds after midnight. To tell whether a value was a time or just an arbitrary number, you would need to know what the column type in the original file was (for example, it could be TIME or just NUMERIC).

Please take a look at https://github.com/epam/parso/pull/86 to better understand the challenges with datetime formats.
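
As a rough illustration of the point above, the check could be done on the caller's side by looking at the SAS format name attached to each column. The sketch below is not part of parso: the class `SasTimeColumns`, its method names, and the list of time format names are all assumptions made for illustration, and the list is certainly not exhaustive. It also assumes the format name is available as a plain string from the column metadata (I believe recent parso versions expose it through `Column.getFormat()`, but please verify that against the version you use).

```java
import java.time.LocalTime;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of parso.
final class SasTimeColumns {

    // Assumed, non-exhaustive list of SAS time-only format names.
    private static final Set<String> TIME_FORMATS =
            Set.of("TIME", "TIMEAMPM", "HHMM", "HOUR", "MMSS", "TOD", "E8601TM");

    // Returns true if the given SAS format name looks like a time-only format.
    // Accepts names with or without a trailing width/precision, e.g. "TIME8." or "TIME8.2".
    static boolean isTimeFormat(String formatName) {
        if (formatName == null) {
            return false;
        }
        String base = formatName.trim().toUpperCase(Locale.ROOT);
        // Strip a trailing width/precision suffix such as "8." or "8.2".
        base = base.replaceAll("[0-9.]+$", "");
        return TIME_FORMATS.contains(base);
    }

    // Converts a value stored as seconds after midnight into a LocalTime.
    // Throws DateTimeException for values outside 0..86399, so durations
    // longer than a day (or negative values) need separate handling.
    static LocalTime toLocalTime(long secondsAfterMidnight) {
        return LocalTime.ofSecondOfDay(secondsAfterMidnight);
    }
}
```

In a schema-mapping loop, a column whose format name passes `isTimeFormat` could be mapped to a time vector and its values converted with `toLocalTime`, while every other numeric column stays a plain number.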

printsev, Jan 17 '22 05:01