cmapPy
Bug: .GCT files written by cmapPy on Windows have inconsistent line endings
Hi Lev,
Bug: .GCT files written with cmapPy on Windows show alternating blank lines after the top 3 lines when opened in Excel, though they look fine in the Spyder v5.12.3 code editor.
Fix: The line below writes the first 2 lines of a .GCT file and would otherwise default to the OS line_terminator of \r\n, which conflicts with all the other lines, which are terminated by \n.
Inconsistent line endings probably trick Excel's automatic line-ending recognition.
C:\ProgramData\Anaconda3\Lib\site-packages\cmapPy\pandasGEXpress\write_gct.py #line 102
# Write top_half_df to file
#top_half_df.to_csv(f, header=False, index=False, sep="\t")
top_half_df.to_csv(f, header=False, index=False, sep="\t", line_terminator='\n')
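To check whether a given file really mixes line endings, something like the following could help. It is only a minimal sketch using the standard library: the helper name `count_line_endings` and the sample bytes are made up for illustration and are not part of cmapPy.

```python
import re
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count each line-ending style found in raw bytes (hypothetical helper)."""
    # \r\n is listed first so a CRLF pair is counted once,
    # not as a separate \r followed by a separate \n.
    return Counter(m.group(0) for m in re.finditer(rb"\r\n|\r|\n", data))


# Made-up sample: two lines ending in \n followed by two lines ending in \r\n.
sample = b"#1.3\n2\t2\t0\t0\nid\tA\tB\r\nr1\t1\t2\r\n"
counts = count_line_endings(sample)

# More than one key in the Counter means the endings are inconsistent.
is_mixed = len(counts) > 1
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.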
Please incorporate into next version. Screenshots attached.
Thanks,
Good catch. I think the better change would be to replace \n with os.linesep in write_version_and_dims:
https://github.com/cmap/cmapPy/blob/d1652c3223e49e68e3a71634909342b4a6dbf361/cmapPy/pandasGEXpress/write_gct.py#L64-L65
Won't that lead to \r\n line endings on Windows? We should be striving to get the entire file to use \n line endings. I want a file that is identical no matter whether it is written on Linux or Windows. That is how I encountered this bug.
--Karl
I understand your point, but I feel that it would be wise to follow the convention chosen by pandas to use system-specific line terminators. I confirmed (on my Mac) that the file looks the same when opened in Excel if the terminators are either all \n or all \r\n.
# newline="" turns off newline translation, so the terminators are written exactly as given on any OS
with open("A.txt", "w", newline="") as f:
    f.write("A\n")
    f.write("B\n")
with open("B.txt", "w", newline="") as g:
    g.write("A\r\n")
    g.write("B\r\n")
Hi Lev,
I'm a Windows guy, and nowadays Windows programs can routinely handle '\n' line terminators. The people upstream and downstream of me use Macs to read and write .GCT files. Consequently, it is important to be able to read and write files and get the same result. If you force Windows-generated files to use '\r\n', then I'm going to have to fix every file I produce with cmapPy to get the desired '\n'.
Would you please at least provide an option to specify the line terminator to be written? So long as I have a means to get '\n', I don't care which you choose as the default.
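In the meantime, a possible workaround is to normalize each file after it is written. The following is a rough sketch, not anything cmapPy provides: `normalize_to_lf` is a hypothetical helper name, and the demo file contents are invented.

```python
import os
import re
import tempfile


def normalize_to_lf(path):
    """Rewrite the file at `path` so every \\r\\n (or \\r\\r\\n) becomes \\n."""
    # Binary mode keeps Python from translating line endings on read or write.
    with open(path, "rb") as f:
        data = f.read()
    # Collapse one or more \r directly before a \n; a lone \r is left untouched.
    data = re.sub(rb"\r+\n", b"\n", data)
    with open(path, "wb") as f:
        f.write(data)


# Demo on a throwaway file with mixed endings (contents are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.gct")
    with open(demo_path, "wb") as f:
        f.write(b"#1.3\r\n1\t1\t0\t0\r\nid\tA\nr1\t1\n")
    normalize_to_lf(demo_path)
    with open(demo_path, "rb") as f:
        result = f.read()
```

This reads the whole file into memory, which should be fine for typical files but may not suit very large ones.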
Thanks,
--Karl