leafcutter
Potentially incorrect use of minclureads and minreads
Hi,
I was trying to use LeafCutter to process some junction reads and noticed a potential bug in leafcutter_cluster.py and leafcutter_cluster_regtools.py:
- When the clusters are refined using the refine_clusters function, leafcutter_cluster.py passes the minreads parameter to refine_cluster. I think this leads to some clusters with fewer than minclureads reads being included in further steps. There should be a check that filters out clusters with fewer than minclureads reads, possibly by providing minclureads as another argument to refine_cluster.
- In leafcutter_cluster_regtools.py, there is no minreads parameter, and the minclureads parameter is passed to refine_cluster. This leads to a stricter threshold being applied to the number of reads for each intron. I am not sure why the minreads parameter was removed, but fixes similar to those for leafcutter_cluster.py could be applied here too.
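To make the distinction concrete, here is a minimal sketch of the two thresholds as I understand them. This is not the actual LeafCutter code: the function names (filter_introns, cluster_passes, refine), the tuple layout, the use of >= rather than >, and the order of the two checks are all assumptions for illustration only.

```python
from typing import List, Tuple

# Hypothetical, simplified representation: an intron is (start, end, read_count).
Intron = Tuple[int, int, int]


def filter_introns(cluster: List[Intron], min_reads: int) -> List[Intron]:
    """Per-intron filter: keep introns supported by at least min_reads reads."""
    return [intron for intron in cluster if intron[2] >= min_reads]


def cluster_passes(cluster: List[Intron], min_clu_reads: int) -> bool:
    """Per-cluster filter: the summed read count must reach min_clu_reads."""
    return sum(count for _, _, count in cluster) >= min_clu_reads


def refine(cluster: List[Intron], min_reads: int, min_clu_reads: int) -> List[Intron]:
    """Apply the per-intron threshold first, then the per-cluster threshold."""
    kept = filter_introns(cluster, min_reads)
    return kept if kept and cluster_passes(kept, min_clu_reads) else []


cluster = [(100, 200, 60), (100, 300, 30), (150, 300, 20)]

# Separate thresholds: every intron has >= 10 reads and the cluster total is 110.
print(refine(cluster, min_reads=10, min_clu_reads=50))

# Cluster-level value reused as the per-intron threshold: only the 60-read intron remains.
print(filter_introns(cluster, min_reads=50))
```

With separate thresholds the whole cluster is kept, whereas reusing the cluster-level value per intron discards two of its three introns, which is the stricter behaviour described in the second point.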
Please let me know if I am missing something!
Thank you, Aniketh
Thanks for pointing this out. We are working on a newer version of LeafCutter with these fixes.
Best, Yang
(In reply to Aniketh Janardhan Reddy's post of Sun, Dec 18, 2022, 18:43: https://github.com/davidaknowles/leafcutter/issues/222)
Great, thank you! Looking forward to the new release.
Hi,
Sorry to disturb! I may have run into a problem similar to the one you described. When I run leafcutter_cluster_regtools.py with --minclureads set to 50, it requires every intron in a cluster to have more than 50 reads, instead of summing all reads for the cluster.
My Python and English are not good enough for me to understand the source code well, so is this because "there is no minreads parameter and the minclureads parameter is passed to refine_cluster", which makes the threshold stricter for each intron?
But why 🧐? Isn't the --minclureads parameter set for the cluster, so how does it influence individual introns? Or is it because leafcutter_cluster_regtools.py was changed recently and the earlier version doesn't have this issue?
Any reply is appreciated!
Thank you, Jeep
Thank you, this has been noted and we will fix it, along with other requests, in an upcoming LeafCutter release.
Best, Yang
(In reply to CuteGold0407's comment of Wed, Apr 12, 2023, 11:56 AM: https://github.com/davidaknowles/leafcutter/issues/222#issuecomment-1505611972)