Trying to generate the projection directory without success
Hello,
I'm using PPanGGOLiN v2.2.3 and I would like to produce a projection folder as in v1. I've tried the following command:
ppanggolin projection -p pangenome.h5 --fasta ../Bacillus_A_ombysepticus_ICSA_fasta_29072025_modified.list --anno ../Bacillus_A_ombysepticus_ICSA_gff_29072025_modified.list --verbose 2
2025-07-31 08:34:24 utils.py:l239 INFO Command: /SD5/people/s1060627/miniconda3/bin/ppanggolin projection -p pangenome.h5 --fasta ../Bacillus_A_ombysepticus_ICSA_fasta_29072025_modified.list --anno ../Bacillus_A_ombysepticus_ICSA_gff_29072025_modified.list --verbose 2
2025-07-31 08:34:24 utils.py:l242 INFO PPanGGOLiN version: 2.2.3
2025-07-31 08:34:24 utils.py:l710 DEBUG The parameter "--anno: ../Bacillus_A_ombysepticus_ICSA_gff_29072025_modified.list" has been specified in the command line with a non-default value. Its value overwrites the default value (None).
2025-07-31 08:34:24 utils.py:l710 DEBUG The parameter "--fasta: ../Bacillus_A_ombysepticus_ICSA_fasta_29072025_modified.list" has been specified in the command line with a non-default value. Its value overwrites the default value (None).
2025-07-31 08:34:24 utils.py:l710 DEBUG The parameter "--pangenome: pangenome.h5" has been specified in the command line with a non-default value. Its value overwrites the default value (None).
2025-07-31 08:34:24 utils.py:l710 DEBUG The parameter "--verbose: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2025-07-31 08:34:24 utils.py:l891 DEBUG 1 projection parameters have non-default value: verbose=2
2025-07-31 08:34:24 utils.py:l977 INFO 1 parameters have a non-default value.
2025-07-31 08:34:24 projection.py:l1473 DEBUG The provided file (../Bacillus_A_ombysepticus_ICSA_fasta_29072025_modified.list) is detected as a TSV file.
2025-07-31 08:34:24 projection.py:l1473 DEBUG The provided file (../Bacillus_A_ombysepticus_ICSA_gff_29072025_modified.list) is detected as a TSV file.
2025-07-31 08:34:24 projection.py:l1521 DEBUG
2025-07-31 08:34:26 utils.py:l387 DEBUG Create output directory /biostress/pangenomics/ICSA/Bacillus_A_bombysepticus_ICSA/Bacillus_A_bombysepticus_ICSA_PPGGv2/ppanggolin_projection_DATE2025-07-31_HOUR08.34.24_PID1716937
/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/tables/attributeset.py:363: FiltersWarning:
Failed parsing FILTERS key
2025-07-31 08:34:26 readBinaries.py:l123 INFO Getting the current pangenome status
2025-07-31 08:34:26 readBinaries.py:l1503 INFO Reading pangenome annotations...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2325/2325 [00:00<00:00, 79107.00genome/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 397286/397286 [00:06<00:00, 65175.65contig/s]
Traceback (most recent call last):
File "/SD5/people/s1060627/miniconda3/bin/ppanggolin", line 8, in <module>
sys.exit(main())
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/main.py", line 269, in main
ppanggolin.projection.projection.launch(args)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/projection/projection.py", line 1572, in launch
check_pangenome_info(
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/formats/readBinaries.py", line 1799, in check_pangenome_info
read_pangenome(pangenome, disable_bar=disable_bar, **need_info)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/formats/readBinaries.py", line 1504, in read_pangenome
read_annotation(pangenome, h5f, disable_bar=disable_bar)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/formats/readBinaries.py", line 1244, in read_annotation
genedata_dict = read_genedata(h5f)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/formats/readBinaries.py", line 200, in read_genedata
for row in read_chunks(table, chunk=20000):
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/ppanggolin/formats/readBinaries.py", line 182, in read_chunks
yield from table.read(start=i, stop=i + chunk, field=column)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/tables/table.py", line 1900, in read
arr = self._read(start, stop, step, field, out)
File "/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/tables/table.py", line 1814, in _read
self._read_records(start, stop - start, result)
File "tables/tableextension.pyx", line 645, in tables.tableextension.Table._read_records
tables.exceptions.HDF5ExtError: HDF5 error back trace
File "H5D.c", line 1061, in H5Dread
can't synchronously read data
File "H5D.c", line 1008, in H5D__read_api_common
can't read data
File "H5VLcallback.c", line 2092, in H5VL_dataset_read_direct
dataset read failed
File "H5VLcallback.c", line 2048, in H5VL__dataset_read
dataset read failed
File "H5VLnative_dataset.c", line 363, in H5VL__native_dataset_read
can't read data
File "H5Dio.c", line 383, in H5D__read
can't read data
File "H5Dchunk.c", line 2856, in H5D__chunk_read
unable to read raw data chunk
File "H5Dchunk.c", line 4468, in H5D__chunk_lock
data pipeline read failed
File "H5Z.c", line 1391, in H5Z_pipeline
filter returned failure during read
File "hdf5-blosc2/src/blosc2_filter.c", line 458, in blosc2_filter
Cannot get super-chunk from buffer
End of HDF5 error back trace
Problems reading records.
/SD5/people/s1060627/miniconda3/lib/python3.9/site-packages/tables/file.py:113: UnclosedFileWarning:
Closing remaining open file: /biostress/pangenomics/ICSA/Bacillus_A_bombysepticus_ICSA/Bacillus_A_bombysepticus_ICSA_PPGGv2/pangenome.h5
Can you help me solve this? Thanks for your help, and have a great day. C.
Hi Cécile,
Was the pangenome.h5 file generated with a version 2 of PPanGGOLiN?
Best,
David
Hi David,
Yes, it was produced with v2.2.3.
Hello,
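From the error message, the issue seems to come from reading the pangenome.h5 file: the traceback ends in the Blosc2 HDF5 filter ("Cannot get super-chunk from buffer") while PyTables is reading the gene data. The file may be corrupted, or your current environment may not have the correct libraries installed to read it.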
Could you please check if the file can be read using another PPanGGOLiN command, such as:
ppanggolin write_pangenome -p pangenome.h5 --stats -o output
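To help tell a corrupted file apart from an environment problem, it can also be useful to report which versions of the relevant Python packages are installed. Here is a minimal sketch using only the standard library. The package names listed ("tables" for PyTables, "blosc2", "numpy") are my assumption of what is relevant, based on the traceback; adjust the list as needed.

```python
from importlib import metadata


def package_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of packages involved in reading pangenome.h5 (not exhaustive).
for name, version in package_versions(["ppanggolin", "tables", "blosc2", "numpy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run this with the same Python interpreter that runs ppanggolin (here, the miniconda3 one) and paste the output in your reply.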
Also, note that in version 2 of PPanGGOLiN, the meaning of "projection" has changed. In version 1, it referred to exporting pangenome annotations for the input genomes in a tabular format. In version 2, however, a command called projection has been added to annotate new genomes (not used to build the original pangenome) with an existing pangenome. This projects information such as partitions, spots, modules, and RGPs onto the new genomes. You can find more details in the documentation: https://ppanggolin.readthedocs.io/en/latest/user/projection.html
If you want to retrieve pangenome-based annotations for the original genomes (like in version 1), you can now use the write_genomes command (doc: https://ppanggolin.readthedocs.io/en/latest/user/writeGenomes.html):
The annotation can be exported in several formats:
--table → TSV format
--gff → GFF format
--proksee → JSON format compatible with Proksee
For example:
ppanggolin write_genomes -p pangenome.h5 --table -o output_folder
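If you prefer to drive this from a script, the command can be assembled in Python. This is only a sketch: the helper build_write_genomes_cmd is hypothetical (it is not part of PPanGGOLiN), and it assumes that several format flags can be passed in a single write_genomes call, which you should check against the documentation linked above.

```python
import subprocess

# Output formats mentioned above; each maps to a "--<name>" flag.
ALLOWED_FORMATS = ("table", "gff", "proksee")


def build_write_genomes_cmd(pangenome, output_dir, formats):
    """Build the argument list for `ppanggolin write_genomes` (hypothetical helper)."""
    unknown = [fmt for fmt in formats if fmt not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"Unknown output format(s): {unknown}")
    cmd = ["ppanggolin", "write_genomes", "-p", str(pangenome), "-o", str(output_dir)]
    # Assumption: multiple format flags may be combined in one call.
    cmd += [f"--{fmt}" for fmt in formats]
    return cmd


cmd = build_write_genomes_cmd("pangenome.h5", "output_folder", ["table", "gff"])
print(" ".join(cmd))
# Uncomment to actually run the command (requires ppanggolin on the PATH):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if the paths contain spaces.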
Best regards, Jean