Groups length shows gradient distribution

Open yplee614 opened this issue 1 year ago • 1 comments

I am assembling an autohexaploid plant genome with a genome size of ~2.7 Gb, as estimated by flow cytometry. The one published closely related species has a 2n=6X=90 karyotype and a haploid genome size of 440 Mb.

I used the HapHiC clustering step to cluster the contigs (3,213 contigs, N50 = 6,639,019 bp) into groups, but the group lengths show a gradient distribution, and the contigs were not grouped together as expected. As a result, I got 354 groups. The group lengths are listed below:

group100_7186686bp.txt group101_7159186bp.txt group10_20407479bp.txt group102_7155213bp.txt group103_7145455bp.txt group104_7135638bp.txt group105_7026823bp.txt group106_6852057bp.txt group107_6686659bp.txt group108_6639019bp.txt group109_6580999bp.txt group110_6428437bp.txt group111_6427677bp.txt group11_19902768bp.txt group112_6291399bp.txt group113_6285053bp.txt group114_6190659bp.txt group115_6187999bp.txt group116_6172906bp.txt group117_6141735bp.txt group118_6128549bp.txt group119_6125382bp.txt group120_6013418bp.txt group121_5924620bp.txt group12_19557259bp.txt group122_5874612bp.txt group123_5818126bp.txt group124_5811366bp.txt group125_5744960bp.txt group126_5744730bp.txt group1_27223611bp.txt group127_5672859bp.txt group128_5636760bp.txt group129_5593658bp.txt group130_5549640bp.txt group131_5526988bp.txt group13_19311041bp.txt group132_5487525bp.txt group133_5484683bp.txt group134_5436921bp.txt group135_5410556bp.txt group136_5399992bp.txt group137_5372189bp.txt group138_5361883bp.txt group139_5334807bp.txt group140_5296389bp.txt group141_5289137bp.txt group14_18547765bp.txt group142_5283331bp.txt group143_5267077bp.txt group144_5181430bp.txt group145_5167677bp.txt group146_5159276bp.txt group147_5143313bp.txt group148_5137580bp.txt group149_5118094bp.txt group150_4965186bp.txt group151_4917290bp.txt group15_18482670bp.txt group152_4896147bp.txt group153_4842815bp.txt group154_4786642bp.txt group155_4766777bp.txt group156_4759693bp.txt group157_4757006bp.txt group158_4669444bp.txt group159_4641357bp.txt group160_4641064bp.txt group161_4633905bp.txt group16_17718692bp.txt group162_4554138bp.txt group163_4532771bp.txt group164_4522270bp.txt group165_4477309bp.txt group166_4436397bp.txt group167_4324735bp.txt group168_4308274bp.txt group169_4293925bp.txt group170_4292715bp.txt group171_4283839bp.txt group17_17600983bp.txt group172_4277923bp.txt group173_4255564bp.txt group174_4234340bp.txt group175_4177805bp.txt group176_4176712bp.txt group177_4101442bp.txt 
group178_4100440bp.txt group179_3992981bp.txt group180_3956939bp.txt group181_3865792bp.txt group18_17578105bp.txt group182_3801601bp.txt group183_3781363bp.txt group184_3772069bp.txt group185_3750342bp.txt group186_3686059bp.txt group187_3649779bp.txt group188_3647162bp.txt group189_3605804bp.txt group190_3542082bp.txt group191_3513370bp.txt group19_17141379bp.txt group192_3482920bp.txt group193_3480926bp.txt group194_3468639bp.txt group195_3416359bp.txt group196_3411821bp.txt group197_3403141bp.txt group198_3368169bp.txt group199_3358795bp.txt group200_3337206bp.txt group201_3334284bp.txt group20_16855502bp.txt group202_3317409bp.txt group203_3306292bp.txt group204_3297089bp.txt group205_3275821bp.txt group206_3273617bp.txt group207_3262028bp.txt group208_3255136bp.txt group209_3242261bp.txt group210_3229716bp.txt group211_3229011bp.txt group21_16577387bp.txt group212_3197744bp.txt group213_3197065bp.txt group214_3155836bp.txt group215_3154564bp.txt group216_3153251bp.txt group217_3141747bp.txt group218_3134627bp.txt group219_3085908bp.txt group220_3039043bp.txt group221_3003820bp.txt group22_16542594bp.txt group222_2991254bp.txt group2_23038324bp.txt group223_2963460bp.txt group224_2925810bp.txt group225_2913276bp.txt group226_2883208bp.txt group227_2827760bp.txt group228_2823559bp.txt group229_2803487bp.txt group230_2798930bp.txt group231_2793094bp.txt group23_16394557bp.txt group232_2786826bp.txt group233_2775543bp.txt group234_2758120bp.txt group235_2731217bp.txt group236_2708130bp.txt group237_2695701bp.txt group238_2688134bp.txt group239_2686455bp.txt group240_2670518bp.txt group241_2670424bp.txt group24_16176119bp.txt group242_2663238bp.txt group243_2654892bp.txt group244_2643656bp.txt group245_2636287bp.txt group246_2633854bp.txt group247_2592924bp.txt group248_2589972bp.txt group249_2563117bp.txt group250_2545267bp.txt group251_2533688bp.txt group25_16034444bp.txt group252_2525268bp.txt group253_2517160bp.txt group254_2515411bp.txt group255_2511229bp.txt 
group256_2480673bp.txt group257_2476940bp.txt group258_2467231bp.txt group259_2453440bp.txt group260_2450386bp.txt group261_2428853bp.txt group26_15952192bp.txt group262_2381638bp.txt group263_2367241bp.txt group264_2353738bp.txt group265_2333665bp.txt group266_2316028bp.txt group267_2314616bp.txt group268_2292317bp.txt group269_2292271bp.txt group270_2277152bp.txt group271_2264565bp.txt group27_15527612bp.txt group272_2262197bp.txt group273_2260830bp.txt group274_2253715bp.txt group275_2251127bp.txt group276_2235974bp.txt group277_2198477bp.txt group278_2182545bp.txt group279_2137279bp.txt group280_2134442bp.txt group281_2121999bp.txt group28_15487599bp.txt group282_2098427bp.txt group283_2090521bp.txt group284_2089013bp.txt group285_2081106bp.txt group286_2072448bp.txt group287_2059021bp.txt group288_2058192bp.txt group289_2043588bp.txt group290_2037315bp.txt group291_2009329bp.txt group29_15430726bp.txt group292_2004691bp.txt group293_1969770bp.txt group294_1959946bp.txt group295_1958238bp.txt group296_1952981bp.txt group297_1928768bp.txt group298_1928758bp.txt group299_1917635bp.txt group300_1910871bp.txt group301_1887201bp.txt group30_15245801bp.txt group302_1883731bp.txt group303_1871064bp.txt group304_1867822bp.txt group305_1855604bp.txt group306_1853936bp.txt group307_1795602bp.txt group308_1785186bp.txt group309_1772818bp.txt group310_1761124bp.txt group311_1745534bp.txt group31_15237405bp.txt group312_1729250bp.txt group313_1719992bp.txt group314_1716994bp.txt group315_1714641bp.txt group316_1702778bp.txt group317_1692387bp.txt group318_1690676bp.txt group319_1690505bp.txt group320_1663421bp.txt group321_1637806bp.txt group32_15034650bp.txt group322_1633930bp.txt group3_22896452bp.txt group323_1584337bp.txt group324_1580747bp.txt group325_1544010bp.txt group326_1455631bp.txt group327_1403362bp.txt group328_1401128bp.txt group329_1382500bp.txt group330_1356118bp.txt group331_1345363bp.txt group33_14847358bp.txt group332_1325967bp.txt group333_1303129bp.txt 
group334_1291858bp.txt group335_1290374bp.txt group336_1287107bp.txt group337_1275937bp.txt group338_1274099bp.txt group339_1270066bp.txt group340_1227823bp.txt group341_1198692bp.txt group34_14630131bp.txt group342_1196991bp.txt group343_1187456bp.txt group344_1140458bp.txt group345_1127416bp.txt group346_1123709bp.txt group347_1081873bp.txt group348_1078765bp.txt group349_1076052bp.txt group350_1066342bp.txt group351_1057554bp.txt group35_14390218bp.txt group352_1049359bp.txt group353_1040759bp.txt group354_1037372bp.txt group36_14316780bp.txt group37_14313029bp.txt group38_13583722bp.txt group39_13492278bp.txt group40_13489476bp.txt group41_13344672bp.txt group42_12944416bp.txt group4_22334609bp.txt group43_12825347bp.txt group44_12753099bp.txt group45_12526599bp.txt group46_12501681bp.txt group47_12482280bp.txt group48_12406955bp.txt group49_12218706bp.txt group50_12200164bp.txt group51_12189474bp.txt group52_11995067bp.txt group5_21518842bp.txt group53_11625954bp.txt group54_11607728bp.txt group55_11423141bp.txt group56_10813431bp.txt group57_10304968bp.txt group58_9998414bp.txt group59_9958430bp.txt group60_9812528bp.txt group61_9772984bp.txt group6_21222668bp.txt group62_9770625bp.txt group63_9734723bp.txt group64_9598647bp.txt group65_9559486bp.txt group66_9467174bp.txt group67_9459827bp.txt group68_9391501bp.txt group69_9374958bp.txt group70_9330067bp.txt group71_9304462bp.txt group7_21059450bp.txt group72_9264083bp.txt group73_9194160bp.txt group74_9138828bp.txt group75_9126118bp.txt group76_9123143bp.txt group77_8992036bp.txt group78_8828217bp.txt group79_8615784bp.txt group80_8603584bp.txt group81_8503679bp.txt group8_20616906bp.txt group82_8431643bp.txt group83_8351575bp.txt group84_8344359bp.txt group85_8334063bp.txt group86_8333938bp.txt group87_8317060bp.txt group88_8293020bp.txt group89_8254624bp.txt group90_8160708bp.txt group91_7822992bp.txt group9_20524713bp.txt group92_7753761bp.txt group93_7730299bp.txt group94_7691508bp.txt 
group95_7621242bp.txt group96_7559192bp.txt group97_7504921bp.txt group98_7486754bp.txt group99_7448074bp.txt

yplee614, Oct 03 '24 17:10

but the group lengths show a gradient distribution

This is normal. The groups output by HapHiC are sorted by group length by default.
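To check this on a listing like the one above, here is a minimal sketch that parses the group index and length from each file name and confirms that the lengths never increase as the index goes up. It assumes file names of the form `group<index>_<length>bp.txt`, as in the listing; the helper names `parse_group_name` and `is_sorted_by_length` are made up for this illustration and are not part of HapHiC.

```python
import re

# Assumed naming pattern, taken from the listing in this thread:
# group<index>_<length>bp.txt
GROUP_NAME = re.compile(r"^group(\d+)_(\d+)bp\.txt$")


def parse_group_name(name):
    """Return (group_index, length_in_bp) parsed from a group file name."""
    match = GROUP_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected group file name: {name}")
    return int(match.group(1)), int(match.group(2))


def is_sorted_by_length(names):
    """True if lengths never increase when groups are ordered by index."""
    parsed = sorted(parse_group_name(n) for n in names)  # numeric index order
    lengths = [length for _, length in parsed]
    return all(a >= b for a, b in zip(lengths, lengths[1:]))


# A few names copied from the listing; `ls` orders them lexicographically,
# which is why group100 appears before group10 there.
names = [
    "group100_7186686bp.txt",
    "group10_20407479bp.txt",
    "group1_27223611bp.txt",
    "group2_23038324bp.txt",
]
print(is_sorted_by_length(names))
```

If this prints `True` for the full listing, the "gradient" is simply the descending sort order, not a sign of a clustering problem in itself.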

As a result, I got 354 groups.

I'm not sure whether the group lengths you listed are the results of 01.cluster or 02.reassign. If they are the results of 01.cluster, this could be normal. In 01.cluster, HapHiC only performs a preliminary clustering using the Markov cluster algorithm (MCL), so the results may not be at the chromosome level. When the number of groups (354) exceeds the expected number of chromosomes (90), HapHiC runs an additional agglomerative hierarchical clustering (AHC) step in 02.reassign to merge these groups into chromosomes. You can check the results in 02.reassign/final_groups. If the results you listed are already the final groups, could you please upload the full logs and show me the commands you used for Hi-C read mapping and filtering?
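A quick way to tell which case applies is to count the group files in the output directory and compare that number with the expected chromosome count. The sketch below is only an illustration under stated assumptions: the directory path is a placeholder to be replaced with the `final_groups` directory your run produced, the `group*.txt` pattern is taken from the file names in this thread, and the helper names are hypothetical rather than part of HapHiC.

```python
from pathlib import Path


def count_groups(directory, pattern="group*.txt"):
    """Count group files in a directory (file pattern is an assumption)."""
    return sum(1 for _ in Path(directory).glob(pattern))


def needs_further_clustering(n_groups, expected_chromosomes):
    """True if there are more groups than expected chromosomes."""
    return n_groups > expected_chromosomes


# Placeholder path: point this at the final_groups directory of your own run.
final_groups = Path("path/to/final_groups")
expected_chromosomes = 90  # 2n=6X=90, as stated in the question

if final_groups.is_dir():
    n_groups = count_groups(final_groups)
    print(f"{n_groups} groups, {expected_chromosomes} expected")
    if needs_further_clustering(n_groups, expected_chromosomes):
        print("More groups than chromosomes: likely not the final result.")
else:
    print(f"Directory not found: {final_groups}")
```

With the numbers from this thread, `needs_further_clustering(354, 90)` is true, which is consistent with the listing being a preliminary result rather than chromosome-level groups.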

zengxiaofei, Oct 08 '24 01:10

Closing this issue as there has been no response for two weeks.

zengxiaofei, Oct 21 '24 01:10