Enforce unique assay names

Open hpages opened this issue 2 years ago • 2 comments

This started as a more general discussion about empty strings in List names but the real concern seems to be more specifically about the names of the assays. It comes down to these basic questions:

  1. Should we enforce names on the assays? Right now assay names are optional:

    library(SummarizedExperiment)
    
    m1 <- matrix(1:12, ncol=3)
    m2 <- m1 + 100.5
    se <- SummarizedExperiment(list(m1, m2))
    
    assayNames(se)
    # NULL
    
    ## Note that the show() method is misleading here, suggesting that the names are empty strings:
    se
    # class: SummarizedExperiment 
    # dim: 4 3 
    # metadata(0):
    # assays(2): '' ''
    # rownames: NULL
    # rowData names(0):
    # colnames: NULL
    # colData names(0):
    
  2. If the user does not supply assay names, should we make automatic names? (the other option would be to complain in an error message)

  3. Should we enforce their uniqueness? Right now they can have duplicates:

    se <- SummarizedExperiment(list(A=m1, A=m2))
    assayNames(se)
    # [1] "A" "A"
    
  4. Should we also forbid empty or NA names? Right now they are allowed:

    se <- SummarizedExperiment(setNames(list(m1, m2), c("", NA)))
    assayNames(se)
    # [1] "" NA
    

My answer would be "yes" to all 4 questions.
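
To make the four conditions concrete, here is a minimal sketch of a check that flags each of them. `check_assay_names()` is a hypothetical helper written for illustration only, not something SummarizedExperiment provides, and it works on a plain character vector (or NULL) rather than on an actual SummarizedExperiment object:

```r
## Hypothetical helper (not part of SummarizedExperiment): returns a character
## vector describing which of the conditions discussed above are violated.
check_assay_names <- function(nms) {
    if (is.null(nms))
        return("names are NULL")
    problems <- character(0)
    if (anyNA(nms))
        problems <- c(problems, "names contain NA")
    ## Guard with !is.na() so that NA names are not compared to ""
    if (any(!is.na(nms) & nms == ""))
        problems <- c(problems, "names contain empty strings")
    ## Only look for duplicates among the non-NA, non-empty names
    if (anyDuplicated(nms[!is.na(nms) & nms != ""]) > 0L)
        problems <- c(problems, "names contain duplicates")
    problems
}

check_assay_names(NULL)                       # no names supplied
check_assay_names(c("A", "A"))                # duplicated names
check_assay_names(c("", NA))                  # empty and NA names
check_assay_names(c("counts", "logcounts"))   # nothing to report
```

A validity method could raise an error whenever the returned vector is non-empty. Whether to error out or to repair the names silently is exactly what question 2 asks.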

Note that the situation is very similar to what data.frame() and DataFrame() do with column names (when check.names=TRUE). So the last question is:

  5. Should we just use make.names(., unique=TRUE) like data.frame() and DataFrame() do to fix the user-supplied names?
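
For reference, here is a rough sketch of what repairing the names with base R's make.names() could look like. `fix_assay_names()` is a hypothetical wrapper, and treating missing (NULL) names as if they were empty strings is an assumption made for this sketch, not current package behavior:

```r
## Hypothetical wrapper around base::make.names(); not part of
## SummarizedExperiment.
fix_assay_names <- function(nms, n_assays) {
    ## Assumption: when no names were supplied, start from empty strings
    ## and let make.names() generate something usable.
    if (is.null(nms))
        nms <- rep("", n_assays)
    make.names(nms, unique=TRUE)
}

fix_assay_names(c("A", "A"), 2L)
# [1] "A"   "A.1"

fixed <- fix_assay_names(c("", NA), 2L)
## The repaired names are unique, non-empty, and free of NAs:
stopifnot(!anyNA(fixed), all(nzchar(fixed)), anyDuplicated(fixed) == 0L)
```

One thing to keep in mind is that make.names() also rewrites names that are not syntactically valid R names (for example, a space becomes a dot), so it is stricter than what the four questions above require.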

@LTLA @vjcitn @lawremi Comments? Suggestions?

hpages, Jan 18 '22 19:01