CPT4 UMLS API process causing insertion of carriage returns in "CONCEPT.csv"

Open odikia opened this issue 1 year ago • 2 comments

I currently have to clean the final concept.csv before loading it into a PostgreSQL database. The problem appears after the CPT4 codes are added via the cpt.bat process described in the instructions that come with the vocabulary download from Athena.

PostgreSQL (run in the psql CLI):

\copy omop.concept FROM '\path\to\modified\concept.csv' WITH (FORMAT CSV, DELIMITER E'\t', QUOTE E'\b', ENCODING 'UTF8', HEADER TRUE)

Query returns:

ERROR: unquoted carriage return found in data
HINT: Use quoted CSV field to represent carriage return.
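
The cleanup step mentioned above could look roughly like the following. This is a minimal sketch, not the fix used in this thread: the function name `strip_carriage_returns` and the file names in the usage note are hypothetical, and it assumes that no carriage return in the file is meaningful data.

```python
def strip_carriage_returns(src, dst, chunk_size=1 << 20):
    """Copy src to dst with every CR byte (0x0D) removed.

    Works on raw bytes, so it is safe for UTF-8 input: 0x0D never
    occurs inside a multi-byte UTF-8 sequence. Returns the number
    of CR bytes that were dropped.
    """
    removed = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            cleaned = chunk.replace(b"\r", b"")
            removed += len(chunk) - len(cleaned)
            fout.write(cleaned)
    return removed
```

Usage would be something like `strip_carriage_returns("CONCEPT.csv", "CONCEPT_clean.csv")` (both paths are placeholders), followed by the same `\copy` command pointed at the cleaned file. Reading in chunks keeps memory use flat even though concept.csv is large.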

System and File information

Included data file with 4 error examples: see attached. Note that downloading the UMLS CPT4 codes requires a license. I provide 4 error examples with the concept name and concept code redacted, to make sure I am not infringing the license by sharing this document. The OMOP information provided by Odysseus, including the concept_ids, remains.

OMOP Vocabulary version: v5.0 23-JAN-23

Java info: Version 8 Update 361 (build 1.8.0_361-b09)

Target Database version: PostgreSQL 14.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit

System info: Processor: 12th Gen Intel(R) Core(TM) i7-1270P 2.20 GHz; Installed RAM: 32.0 GB (31.4 GB usable); System type: 64-bit operating system, x64-based processor

Windows info: Edition: Windows 10 Enterprise; Version: 21H2; Installed on: 7/20/2022; OS build: 19044.2846; Experience: Windows Feature Experience Pack 120.2212.4190.0

CONCEPT_first_4_cpt4_errors.csv

odikia, May 05 '23 17:05

@odikia - Daniel, this is odd... let me double-check whether we have changed anything recently in cpt4.jar.

mik-ohdsi, May 10 '23 08:05

@odikia - it looks as if we have been doing this for a while now. I can confirm that, after reconstitution, all CPT4 rows in concept.csv appear to end with CRLF instead of only LF. Have you always updated your vocabularies the same way, and if so, when was the last time you were able to do so without an error?
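
One way to check the CRLF observation on a local file is to count line endings per vocabulary. The sketch below is only an illustration under stated assumptions: the function name `count_line_endings` is hypothetical, and it assumes a tab-delimited file with a header row that contains a `vocabulary_id` column and no quoted fields.

```python
from collections import Counter


def count_line_endings(path):
    """Count data rows per (vocabulary_id, line ending) pair.

    Assumes a tab-delimited file whose header row names a
    vocabulary_id column. Only the row terminator is inspected;
    a stray CR in the middle of a row is not counted here.
    """
    counts = Counter()
    with open(path, "rb") as f:
        header = f.readline().rstrip(b"\r\n").split(b"\t")
        idx = [h.strip().lower() for h in header].index(b"vocabulary_id")
        for line in f:  # binary mode splits on LF only
            if line.endswith(b"\r\n"):
                ending = "CRLF"
            elif line.endswith(b"\n"):
                ending = "LF"
            else:
                ending = "none"  # final row without a terminator
            fields = line.rstrip(b"\r\n").split(b"\t")
            vocab = fields[idx].decode("utf-8") if idx < len(fields) else "?"
            counts[(vocab, ending)] += 1
    return counts
```

If the observation holds, the result for a reconstituted file would show CPT4 rows only under `"CRLF"` and all other vocabularies only under `"LF"`.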

mik-ohdsi, May 10 '23 15:05