Feature: df_bgcs tables with all metadata

Open OmkarSaMo opened this issue 2 years ago • 3 comments

Proposal:

Create df_bgcs and df_gcfs tables in the processed/{project_name}/tables directory, populated with metadata taken directly from the antiSMASH results.

I think a general table holding BGC metadata would be more valuable in the main tables directory than in the current for_cytoscape directory.

List of extra metadata:

I think some extra columns would be beneficial. I'm listing a few below and am open to more recommendations:

  • antiSMASH-based prediction of MIBIG-similar BGCs from the knownclusterblast results, with % similarity (sometimes antiSMASH finds known clusters that BiGSCAPE misses)
  • whether a BGC is on contig edge (very useful)
  • number of genes in BGC and size of the BGC in KB
  • number of A-domains, when present
  • number of core biosynthetic genes
  • path to BGC gbk file
  • assigned GCF with 0.3 cutoff
  • Number of genomes where the GCF is present
  • Whether BGC is known based on BiGSCAPE
  • Whether BGC is known based on antiSMASH
  • BiGSLICE family assignment
  • BGCs in BiGSLICE model
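To make the proposal a bit more concrete, here is a minimal sketch of how one row of such a table and two of the derived columns (number of genomes per GCF, "known based on antiSMASH") could be assembled. Everything in it is an assumption for illustration: the column names, the `BgcRow` class, the helper functions, the 80% similarity threshold, and the toy rows are hypothetical and are not bgcflow's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BgcRow:
    """One row of a hypothetical df_bgcs table (column names are assumptions)."""
    bgc_id: str
    genome_id: str
    on_contig_edge: bool
    n_genes: int
    length_kb: float
    n_core_genes: int
    gbk_path: str
    gcf_0_30: Optional[str] = None            # family assigned at the 0.3 cutoff
    knowncluster_hit: Optional[str] = None    # best knownclusterblast hit, if any
    knowncluster_similarity: Optional[float] = None  # percent similarity


def genomes_per_gcf(rows):
    """Count the distinct genomes in which each GCF occurs."""
    genomes = defaultdict(set)
    for row in rows:
        if row.gcf_0_30 is not None:
            genomes[row.gcf_0_30].add(row.genome_id)
    return {gcf: len(ids) for gcf, ids in genomes.items()}


def build_table(rows, min_similarity=80.0):
    """Return one dict per BGC with the derived columns added.

    min_similarity is an assumed threshold for calling a BGC "known"
    from its knownclusterblast similarity; pick whatever fits the project.
    """
    counts = genomes_per_gcf(rows)
    table = []
    for row in rows:
        record = asdict(row)
        record["gcf_n_genomes"] = counts.get(row.gcf_0_30, 0)
        record["known_antismash"] = (
            row.knowncluster_similarity is not None
            and row.knowncluster_similarity >= min_similarity
        )
        table.append(record)
    return table


# Toy placeholder rows, not real antiSMASH output.
EXAMPLE_ROWS = [
    BgcRow("bgc_a", "genome_1", False, 20, 25.0, 2, "path/a.gbk", "gcf_1", "hit_x", 95.0),
    BgcRow("bgc_b", "genome_2", True, 12, 14.5, 1, "path/b.gbk", "gcf_1", "hit_x", 40.0),
    BgcRow("bgc_c", "genome_2", False, 8, 9.0, 1, "path/c.gbk"),
]

if __name__ == "__main__":
    for record in build_table(EXAMPLE_ROWS):
        print(record["bgc_id"], record["gcf_n_genomes"], record["known_antismash"])
```

Keeping the per-BGC fields separate from the derived columns means the GCF-level counts can be recomputed whenever the clustering cutoff changes, without re-parsing the antiSMASH results.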

Is anything else needed, @matinnuhamunada?

OmkarSaMo avatar Sep 14 '22 11:09 OmkarSaMo

This looks perfect! Some of the data in this table can answer questions that @EVBAST and @tilmweber discussed this morning. Adding URLs to the MIBIG hits has proven useful for end users too.

matinnuhamunada avatar Sep 16 '22 08:09 matinnuhamunada

THANKS @matinnuhamunada and @OmkarSaMo

EVBAST avatar Sep 19 '22 07:09 EVBAST

Hi, this issue will be addressed in the 0.6.1 release. As the table is huge, it will be stored in .parquet format and loaded into duckdb (instead of sqlite): https://github.com/NBChub/bgcflow/tree/dev-0.5.1

matinnuhamunada avatar Nov 15 '22 15:11 matinnuhamunada