Understanding assignment of most specific cell type

Open LustigePerson opened this issue 2 years ago • 0 comments

Hello, and thanks for your amazing tool. I am trying it out right now, but either I don't quite understand how the most specific cell type is called and how the binarization is done, or, in my case, it is not doing what it should.

See the example graph below. Please note that I modified the graph:

  • Grey outlines mean there is no binary information for this cell
  • Red outlines mean the binary assignment for this cell is False for this cluster
  • Green outlines mean the binary assignment for this cell is True for this cluster
  • The octagonal shape marks the cell type that was actually called as "most specific".

cluster_0_graph_fig_1

As you can see, the CD14-positive, CD16-positive monocyte with a score of 0.27 is selected. The CD14-positive, CD16-negative classical monocyte is not considered, which I don't understand. It has a score of 0.45, as you can also see from this table for the respective cluster:

| Cell type | Probability | Binary |
|---|---|---|
| CD14-positive monocyte | 0.9955782079351584 | True |
| CD14-positive, CD16-negative classical monocyte | 0.4540293475444127 | False |
| CD14-positive, CD16-positive monocyte | 0.2678592195788922 | True |

I expected the CD14-positive, CD16-negative classical monocyte to cross the binary threshold as well, given the thresholds listed in the ir.10x_genes_thresholds.tsv file:

| label | label_name | threshold | empirical_threshold | precision | F1-score |
|---|---|---|---|---|---|
| CL:0001054 | CD14-positive monocyte | 0.5 | 0.9421197743088574 | 0.8941176470588236 | 0.8186714542190305 |
| CL:0002057 | CD14-positive, CD16-negative classical monocyte | 0.20572466450006424 | 0.20572466450006424 | 0.047619047619047616 | 0.0904977375565611 |
| CL:0002397 | CD14-positive, CD16-positive monocyte | 0.0021930058655731683 | 0.0021930058655731683 | 0.018518518518518517 | 0.03619047619047619 |
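To make the discrepancy concrete, here is a minimal sketch of the binarization I had in mind. It assumes that each label is binarized independently as `probability >= empirical_threshold`; that rule and the helper name `naive_binarize` are my assumptions, not something confirmed from the CellO code. The numbers are copied from the tables in this issue.

```python
# Values copied from the tables in this issue.
probabilities = {
    "CD14-positive monocyte": 0.9955782079351584,
    "CD14-positive, CD16-negative classical monocyte": 0.4540293475444127,
    "CD14-positive, CD16-positive monocyte": 0.2678592195788922,
}
empirical_thresholds = {
    "CD14-positive monocyte": 0.9421197743088574,
    "CD14-positive, CD16-negative classical monocyte": 0.20572466450006424,
    "CD14-positive, CD16-positive monocyte": 0.0021930058655731683,
}
# Binary calls as reported in the output for this cluster.
reported_binary = {
    "CD14-positive monocyte": True,
    "CD14-positive, CD16-negative classical monocyte": False,
    "CD14-positive, CD16-positive monocyte": True,
}


def naive_binarize(probs, thresholds):
    """Hypothetical rule: compare each label to its own threshold only."""
    return {label: probs[label] >= thresholds[label] for label in probs}


naive = naive_binarize(probabilities, empirical_thresholds)

# Labels where the naive rule disagrees with the reported binary call.
mismatches = [label for label in naive if naive[label] != reported_binary[label]]
for label in mismatches:
    print(f"{label}: naive={naive[label]}, reported={reported_binary[label]}")
```

Under this assumption all three labels would pass, so the only disagreement with the reported output is the CD14-positive, CD16-negative classical monocyte. Using the `threshold` column instead of `empirical_threshold` changes only the first label's cutoff (0.5), which it also passes.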

Is this because some of its predecessors (such as the classical monocyte) don't cross their own thresholds? In other words, is predecessor information taken into account here? That would make perfect sense; I was just unaware of it.
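The predecessor hypothesis can be sketched as follows. This is only an illustration of the idea, not CellO's implementation: the function `binarize_with_ancestors`, the simplified parent graph, and in particular the probability and threshold for "classical monocyte" are hypothetical placeholders, since that label's values are not shown in this issue.

```python
def binarize_with_ancestors(probs, thresholds, parents):
    """Hypothetical rule: a label is True only if it passes its own
    threshold AND every ancestor in the graph is also True."""
    passes = {label: probs[label] >= thresholds[label] for label in probs}
    memo = {}

    def consistent(label):
        # Recursively require all parents (and thus all ancestors) to be True.
        if label not in memo:
            memo[label] = passes[label] and all(
                consistent(parent) for parent in parents.get(label, [])
            )
        return memo[label]

    return {label: consistent(label) for label in probs}


probs = {
    "CD14-positive monocyte": 0.9955782079351584,
    "classical monocyte": 0.10,  # HYPOTHETICAL value, not from the output
    "CD14-positive, CD16-negative classical monocyte": 0.4540293475444127,
    "CD14-positive, CD16-positive monocyte": 0.2678592195788922,
}
thresholds = {
    "CD14-positive monocyte": 0.9421197743088574,
    "classical monocyte": 0.50,  # HYPOTHETICAL value, not from the thresholds file
    "CD14-positive, CD16-negative classical monocyte": 0.20572466450006424,
    "CD14-positive, CD16-positive monocyte": 0.0021930058655731683,
}
# Simplified, assumed parent relations for illustration only.
parents = {
    "CD14-positive, CD16-negative classical monocyte": [
        "CD14-positive monocyte",
        "classical monocyte",
    ],
    "CD14-positive, CD16-positive monocyte": ["CD14-positive monocyte"],
}

result = binarize_with_ancestors(probs, thresholds, parents)
for label, value in result.items():
    print(f"{label}: {value}")
```

If the classical monocyte really does fall below its threshold, this rule would reproduce the reported pattern: the CD14-positive, CD16-negative classical monocyte becomes False even though its own probability exceeds its threshold, while the CD14-positive, CD16-positive monocyte stays True.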

One additional question: there is no direct way to set a minimum threshold for the assignment, right? If I remember correctly, I saw such a parameter somewhere in the code, but it is not available directly from the exposed function.

LustigePerson, May 31 '22 10:05