Add ICD-9 / ICD-10 crosswalk

Open dpritchLibre opened this issue 4 years ago • 11 comments

This PR addresses issues #186 and #189. In summary, it adds a generic S3 function icd_gem and associated methods that take a vector of input ICD-9 or ICD-10 codes and return a data frame of the corresponding mappings.

The raw 2018 GEMs as found at https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html are stored in the project in the following files:

  • data-raw/icd-gem-2018-convert-10-to-9.txt
  • data-raw/icd-gem-2018-convert-9-to-10.txt

These raw GEMs are translated into R data frames using the script in tools/icd-gem-import-routines.R. The GEMs are translated to an equivalent form that has one scenario (using the terminology from the GEMs) per row that we find to be more useful for code lookup. The object documentation for these converted GEMs is added to the R/datadocs.R file. Testing for the data import / conversion is included at the bottom of icd-gem-import-routines.R. The data frames are stored in the following files:

  • data/icd_gem_9_to_10.rda
  • data/icd_gem_10_to_9.rda
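
As a rough illustration of the one-scenario-per-row translation described above, here is a minimal base R sketch. It is not the code in tools/icd-gem-import-routines.R: the function names parse_gem_lines and one_row_per_scenario are hypothetical, and the sample rows and their flag digits (assumed to be approximate, no map, combination, scenario, choice list, in that order) are assumptions chosen to be consistent with the 24951 example later in this thread. The "no map" flag is ignored here.

```r
# Illustrative GEM-style rows: source code, target code, five flag digits.
# The flag values are ASSUMED, not copied from the CMS files.
gem_lines <- c(
  "24951 E0839  10000",
  "24951 E0939  10000",
  "24951 E08311 10111",
  "24951 E08319 10111",
  "24951 E0836  10111",
  "24951 E09311 10111",
  "24951 E09319 10111",
  "24951 E0936  10111",
  "24951 E0865  10112"
)

# Hypothetical helper: split each line into codes and individual flags.
parse_gem_lines <- function(lines) {
  raw <- read.table(text = lines, colClasses = "character",
                    col.names = c("source", "target", "flags"))
  data.frame(
    source      = raw$source,
    target      = raw$target,
    approx      = substr(raw$flags, 1, 1) == "1",
    combination = substr(raw$flags, 3, 3) == "1",
    scenario    = substr(raw$flags, 4, 4),
    choice_list = substr(raw$flags, 5, 5),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: collapse to one row per (source, scenario), with the
# target codes stored as a list of choice lists.
one_row_per_scenario <- function(gem) {
  groups <- split(gem, list(gem$source, gem$scenario), drop = TRUE)
  first <- function(col, type) {
    vapply(groups, function(g) g[[col]][1], type, USE.NAMES = FALSE)
  }
  out <- data.frame(
    source   = first("source", character(1)),
    scenario = first("scenario", character(1)),
    type     = ifelse(first("combination", logical(1)), "combination", "simple"),
    approx   = first("approx", logical(1)),
    stringsAsFactors = FALSE
  )
  out$codes <- unname(lapply(groups, function(g) split(g$target, g$choice_list)))
  out
}

gem <- one_row_per_scenario(parse_gem_lines(gem_lines))
```

Keeping the codes in a list column means each row carries its full set of choice lists, so a lookup only needs to filter rows by source code.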

The icd_gem generic function and associated methods are found in the R/icd-gem.R file. These routines are essentially convenience functions that subset the GEM data frames (and add rows for input codes that aren't found in the GEMs).
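
To make the "subset, then add rows for codes that aren't found" idea concrete, here is a small sketch of an S3 generic with two methods. It is an assumption-laden illustration, not the contents of R/icd-gem.R: the generic gem_lookup, its methods, and the flat toy table are all hypothetical, and the real icd_gem may dispatch on different classes and return a different layout.

```r
# Toy flat lookup table standing in for a converted GEM (hypothetical layout).
toy_gem <- data.frame(
  source   = c("24951", "24951", "24951"),
  scenario = c("0", "0", "1"),
  target   = c("E0839", "E0939", "E0865"),
  stringsAsFactors = FALSE
)

# Hypothetical generic; dispatches on the class of the input codes.
gem_lookup <- function(x, gem, ...) UseMethod("gem_lookup")

gem_lookup.character <- function(x, gem, ...) {
  x <- unique(x)
  # Rows whose source code was requested
  hits <- gem[gem$source %in% x, , drop = FALSE]
  # Requested codes absent from the table get one all-NA row each
  misses <- setdiff(x, gem$source)
  filler <- data.frame(
    source   = misses,
    scenario = rep(NA_character_, length(misses)),
    target   = rep(NA_character_, length(misses)),
    stringsAsFactors = FALSE
  )
  out <- rbind(hits, filler)
  rownames(out) <- NULL
  out
}

# Factors are converted to character and re-dispatched.
gem_lookup.factor <- function(x, gem, ...) {
  gem_lookup(as.character(x), gem, ...)
}

res <- gem_lookup(c("24951", "99999"), toy_gem)
```

Padding with NA rows keeps every input code visible in the result, so a caller can tell "no mapping found" apart from "code was never looked up".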

dpritchLibre avatar Nov 06 '19 14:11 dpritchLibre

The ICD-9 / ICD-10 crosswalk is also implemented in the touch R package: https://hub.wenjie-stat.me/touch/reference/icd_map.html. It will be interesting to compare the implementation in this PR with touch::icd_map().

wenjie2wang avatar Nov 16 '19 16:11 wenjie2wang

Hi @wenjie2wang, thanks for mentioning the touch package. Unfortunately, I did not find your package when I was originally searching for this functionality. It sounds like we are both working with similar data, and I would be happy to collaborate on this or future projects.

At first glance it appears that there is some difference between this PR and the touch package in how combination codes are handled. For example, consider the ICD-9 code 24951. For touch, we have the following.

> icd_map(c("24951"), output = "list")
[[1]]
[1] "E08311" "E08319" "E0836"  "E0839"  "E0865"  "E09311" "E09319" "E0936"  "E0939" 

And for the icd PR we have the following, which is interpreted as either a single code from c("E0839", "E0939"), or a pair of codes made up of one code from c("E08311", "E08319", "E0836", "E09311", "E09319", "E0936") together with "E0865" (see the function documentation for more details on the meaning).

> conv <- as_tibble(icd_gem("24951"))
> conv
# A tibble: 2 x 5
  source scenario type        approx codes           
  <chr>  <chr>    <chr>       <lgl>  <list>          
1 24951  0        simple      TRUE   <named list [1]>
2 24951  1        combination TRUE   <named list [2]>

> conv$codes
[[1]]
[[1]]$`0`
[1] "E0839" "E0939"


[[2]]
[[2]]$`1`
[1] "E08311" "E08319" "E0836"  "E09311" "E09319" "E0936" 

[[2]]$`2`
[1] "E0865"

dpritchLibre avatar Nov 16 '19 19:11 dpritchLibre

Hi David, thanks for looking into the difference and providing the example!

touch::icd_map() was motivated by the need for fast conversion of hundreds of millions of ICD codes in one project. The conversion follows the MapIT toolkit from AHRQ, where the conversion can be done using the GEM together with its reverse mappings (see the documentation of the mapping tool for details). I think the combination flags are no longer informative in the reverse mappings, so for simplicity I ignored them completely when implementing touch::icd_map(). The implementation in this PR does provide more information when converting codes with positive combination flags using the one-step GEM.
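
One plausible reading of "GEM with its reverse mappings" can be sketched as follows: take the targets listed for a code in the forward table, and add every code in the backward table that maps back to it. This is only an illustration under that assumption and is not the touch or MapIT implementation. The function map_with_reverse is hypothetical, and the codes are deliberately fake placeholders (OLD1, NEW1, ...) rather than real ICD codes.

```r
# Placeholder mapping tables; the codes are NOT real ICD codes.
forward <- data.frame(              # old -> new
  source = c("OLD1", "OLD2"),
  target = c("NEW1", "NEW2"),
  stringsAsFactors = FALSE
)
backward <- data.frame(             # new -> old
  source = c("NEW1", "NEW3", "NEW2"),
  target = c("OLD1", "OLD1", "OLD2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: union of forward targets and reversed backward sources.
map_with_reverse <- function(code, forward, backward) {
  direct   <- forward$target[forward$source == code]
  reversed <- backward$source[backward$target == code]
  sort(unique(c(direct, reversed)))
}

old1_targets <- map_with_reverse("OLD1", forward, backward)
```

In this toy data, NEW3 is reachable from OLD1 only through the backward table, which is why the combined lookup gives a wider set than the forward table alone.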

wenjie2wang avatar Nov 17 '19 03:11 wenjie2wang

Just a follow-up: I made some quick updates to touch::icd_map() for codes with positive combination flags, such as the ICD-9 code 24951.

icd_map("24951", output = "list")
#> [[1]]
#> [1] "E0839"        "E0939"        "E08311+E0865" "E08319+E0865" "E0836+E0865" 
#> [6] "E09311+E0865" "E09319+E0865" "E0936+E0865" 
#> 

where the + indicates the code combination.
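
For comparison, the nested choice-list structure shown earlier in the thread can be flattened into this "+" notation by taking one code from each choice list within a scenario. The sketch below is a hedged illustration in base R, not code from either package; flatten_scenario is a hypothetical helper, and the codes list is typed in by hand from the output above.

```r
# Choice lists per scenario, copied from the icd_gem("24951") output above.
codes <- list(
  list(`0` = c("E0839", "E0939")),
  list(`1` = c("E08311", "E08319", "E0836", "E09311", "E09319", "E0936"),
       `2` = "E0865")
)

# Hypothetical helper: every way of picking one code per choice list,
# joined with "+".
flatten_scenario <- function(choice_lists) {
  grid <- expand.grid(choice_lists, stringsAsFactors = FALSE,
                      KEEP.OUT.ATTRS = FALSE)
  do.call(paste, c(unname(as.list(grid)), sep = "+"))
}

flat <- unlist(lapply(codes, flatten_scenario))
```

A scenario with a single choice list passes through unchanged, so simple mappings and combinations end up in one flat character vector.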

wenjie2wang avatar Nov 17 '19 20:11 wenjie2wang

This is great -- thanks for sharing. I hadn't heard of the reverse mappings strategy before. It seems like this is useful for getting a wider definition of the codes. I think we'll want to consider using these mappings for our research projects at some point.

dpritchLibre avatar Nov 22 '19 21:11 dpritchLibre

Hi all, sorry to have been absent from discussion. I haven't had time to think this through yet. From the work of the NIH hackathon group, it seemed that once the GEM mappings were converted (which is a one-off process at the package maintainer level), the ICD comorbidity engine could be used. See the PDF vignette on 'efficiency' for how this is so fast.

My philosophy is not to import other packages ( http://www.tinyverse.org/ ), but I'm glad this note is here to show users another way. I will be merging the hackathon work.

All the best, Jack

jackwasey avatar Nov 22 '19 23:11 jackwasey

Thanks, Jack! Could you please provide the link to the work of the NIH hackathon group?

I am also a fan of tinyverse: the current version of the touch package depends only on R itself and imports Rcpp for integrating C++ with R.

wenjie2wang avatar Nov 23 '19 02:11 wenjie2wang

Hi Jack, thanks for this fantastic package! It wasn't clear to me from your message -- is there any interest in merging in this PR after whatever modifications you see fit, or would it be better to continue this work as a standalone package using icd as a dependency?

dpritchLibre avatar Nov 24 '19 02:11 dpritchLibre

Absolutely. Thanks for your attention to this. I will definitely get the cross-walk code in. The NIH hackathon came up with various approaches: the simplest was using data.table. The surrounding testing will need elaboration, and this is a complicated area. Having a solid set of tests will be important for this to move forward. I'm working on a small CRAN-required update, and will spend some time on this after that is done.

jackwasey avatar Mar 03 '20 21:03 jackwasey

@jackwasey I've added a crosswalk for the ICD-9 / ICD-10 procedure codes. Please let me know if you are interested in including these crosswalks in your package; otherwise I would be happy to extract them into a small add-on package. Many thanks for your work providing this resource to us all.

dapritchard avatar Apr 14 '20 19:04 dapritchard

Hi all, have there been any thoughts on whether this PR would be considered for inclusion in the project?

dpritchLibre avatar Aug 18 '20 14:08 dpritchLibre