python-swat
python-swat copied to clipboard
swat.functions.concat truncates char columns with different length
I have for example two CAS Tables T1 T2, which have the exact same columns. I do this:
df1 = CASTable('T1')
df1["Name"] = "AB"
df1.partition(casout=dict(name='T1', replace=True))
df1 = CASTable('T1')
df2 = CASTable('T2')
df2["Name"] = "CDE"
df2.partition(casout=dict(name='T2', replace=True))
df2 = CASTable('T2')
swat.functions.concat([df1,df2], casout=dict(name='T3'))
The T3 table has then a column "Name" which contains "AB" and "CD" i.e. it has a length of 2 and "CDE" is truncated to "CD".
How can I specify an a-priori length for the "Name" column? Like you do in sas code:
length Name $3.;
For something like this, you'll have to use the append_computed_columns method of CASTable. It allows you to specify whatever code you want and the variables that get exported. In your case, it would look like:
df1.append_computed_columns('Name', 'length Name $3.; Name = "AB";')
df2.append_computed_columns('Name', 'length Name $3.; Name = "CDE";')
SWAT probably should have made character columns like this be varchars instead of fixed with character columns.
Thank you, very interesting the append_computed_columns method (is it missing in the docs? sometimes I feel like there are some things undocumented).
However I see that if I assing a string constant in that way, the code generated is this:
computedvars=['Name']
computedvarsprogram='Name= "AB"; '
What will be the columns type on CAS in this case?
That might not be in the doc, now that you mention it. It was used as an internal-only method for a while, but then was made public later.
As far as the data type goes, character columns will be fixed-width columns by default. Varchar and varbinary have to be explicitly specified.