MetaMorpheus

Should we update common biological modifications?

Open trishorts opened this issue 2 years ago • 2 comments

While reading Katie's TD paper, it occurred to me that we may want to update the common biological mods. She did so in her work, after looking at the UniProt XML and taking enough mods to cover 99% of the total. I repeated the analysis myself. We can argue about where to draw the cutoff. Maybe 99% of the total is good, or maybe you don't want to take any mod that makes up less than 1% of the total. Anyway, here is the ranked list with numbers:

Columns: running fraction, running sum, PTM count, PTM fraction of total, PTM name. Total PTMs: 54687.
0.600855779 32859 32859 0.600855779 Phosphoserine
0.708541335 38748 5889 0.107685556 Phosphothreonine    
0.786786622 43027 4279 0.078245287 N6-acetyllysine    
0.824876113 45110 2083 0.038089491 Phosphotyrosine    
0.848062611 46378 1268 0.023186498 N6-succinyllysine    
0.866842211 47405 1027 0.0187796 Omega-N-methylarginine    
0.884177227 48353 948 0.017335016 N-acetylalanine    
0.899665368 49200 847 0.015488142 N-acetylmethionine    
0.909484887 49737 537 0.009819518 Asymmetric dimethylarginine    
0.917713533 50187 450 0.008228647 4-hydroxyproline    
0.925393604 50607 420 0.00768007 N6-(2-hydroxyisobutyryl)lysine    
0.932872529 51016 409 0.007478926 N-acetylserine    
0.937937718 51293 277 0.005065189 N6-(beta-hydroxybutyryl)lysine    
0.942783477 51558 265 0.004845759 N6-lactoyllysine    
0.94761095 51822 264 0.004827473 N6-glutaryllysine    
0.952200706 52073 251 0.004589756 N6-methyllysine    
0.956260172 52295 222 0.004059466 N6-crotonyllysine    
0.95902134 52446 151 0.002761168 Sulfotyrosine    
0.961398504 52576 130 0.002377165 Cysteine methyl ester    
0.963574524 52695 119 0.00217602 N6,N6,N6-trimethyllysine    
0.965713972 52812 117 0.002139448 Citrulline    
0.967780277 52925 113 0.002066305 N6,N6-dimethyllysine    
0.969645437 53027 102 0.00186516 4-carboxyglutamate    
0.971400881 53123 96 0.001755445 Hydroxyproline    
0.973064897 53214 91 0.001664015 Pyrrolidone carboxylic acid    
0.974546053 53295 81 0.001481156 N-acetylthreonine    
0.975972352 53373 78 0.001426299 5-hydroxylysine    
0.977270649 53444 71 0.001298298 Dimethylated arginine    
0.978568947 53515 71 0.001298298 S-nitrosocysteine    
0.979702672 53577 62 0.001133725 N6-butyryllysine    
0.98081811 53638 61 0.001115439 3'-nitrotyrosine    
0.981805548 53692 54 0.000987438 Symmetric dimethylarginine    
0.9827747 53745 53 0.000969152 N6-(pyridoxal phosphate)lysine    
0.983597564 53790 45 0.000822865 N6-malonyllysine    
0.984347285 53831 41 0.000749721 Allysine    
0.985023863 53868 37 0.000676578 ADP-ribosylserine    
0.985627297 53901 33 0.000603434 Deamidated asparagine    
0.986175874 53931 30 0.000548576 N-acetylglycine    
0.986687878 53959 28 0.000512005 N6-methylated lysine    
0.987181597 53986 27 0.000493719 (3S)-3-hydroxyasparagine    
0.98765703 54012 26 0.000475433 N5-methylglutamine    
0.988095891 54036 24 0.000438861 3-hydroxyproline    
0.988516466 54059 23 0.000420575 S-(2-succinyl)cysteine    
0.988918756 54081 22 0.000402289 N-acetylproline    
0.989284473 54101 20 0.000365718 ADP-ribosylarginine    
0.989631905 54120 19 0.000347432 5-glutamyl polyglutamate    
0.989979337 54139 19 0.000347432 ADP-ribosylcysteine    
0.990290197 54156 17 0.00031086 3-oxoalanine (Cys)    
0.990601057 54173 17 0.00031086 PolyADP-ribosyl glutamic acid    
0.990911917 54190 17 0.00031086 Iodotyrosine    
0.991204491 54206 16 0.000292574 Methionine (R)-sulfoxide    
0.991497065 54222 16 0.000292574 ADP-ribosyl glutamic acid    
0.991753067 54236 14 0.000256002 N-acetylvaline    
0.991990784 54249 13 0.000237716 (Microbial infection) O-acetylthreonine    
0.9922285 54262 13 0.000237716 Deamidated glutamine    
0.992466217 54275 13 0.000237716 Phenylalanine amide    
0.992667362 54286 11 0.000201145 N6-propionyllysine    
0.992868506 54297 11 0.000201145 N-acetylcysteine    
0.993069651 54308 11 0.000201145 Pros-methylhistidine    
0.993270796 54319 11 0.000201145 N6-(retinylidene)lysine    
0.99347194 54330 11 0.000201145 5-glutamyl serotonin    
0.993654799 54340 10 0.000182859 (3R)-3-hydroxyasparagine    
0.993837658 54350 10 0.000182859 Cysteine sulfenic acid (-SOH)    
0.994020517 54360 10 0.000182859 S-glutathionyl cysteine    
0.994203376 54370 10 0.000182859 (Microbial infection) O-acetylserine    
0.994367949 54379 9 0.000164573 Cysteine sulfonic acid (-SO3H)    
0.994532521 54388 9 0.000164573 Diiodotyrosine    
0.994678808 54396 8 0.000146287 N,N,N-trimethylalanine    
0.994825096 54404 8 0.000146287 Methionine amide    
0.994971383 54412 8 0.000146287 5-glutamyl dopamine    
0.99511767 54420 8 0.000146287 Methionine sulfoxide    
0.995263957 54428 8 0.000146287 2',4',5'-topaquinone    
0.995410244 54436 8 0.000146287 Tele-methylhistidine    
0.995556531 54444 8 0.000146287 (3R)-3-hydroxyaspartate    
0.995702818 54452 8 0.000146287 Proline amide    
0.995830819 54459 7 0.000128001 ADP-ribosyl aspartic acid    
0.99595882 54466 7 0.000128001 N6-(ADP-ribosyl)lysine    
0.996068535 54472 6 0.000109715 Tyrosine amide    
0.996178251 54478 6 0.000109715 S-methylcysteine    
0.996287966 54484 6 0.000109715 N6-lipoyllysine    
0.996397681 54490 6 0.000109715 5-glutamyl glycerylphosphorylethanolamine    
0.996489111 54495 5 9.14294E-05 ADP-ribosylasparagine    
0.99658054 54500 5 9.14294E-05 Omega-N-methylated arginine    
0.99667197 54505 5 9.14294E-05 Glycine amide    
0.996763399 54510 5 9.14294E-05 Leucine amide    
0.996854828 54515 5 9.14294E-05 O-(pantetheine 4'-phosphoryl)serine    
0.996946258 54520 5 9.14294E-05 N,N,N-trimethylglycine    
0.997037687 54525 5 9.14294E-05 N6-biotinyllysine    
0.997129117 54530 5 9.14294E-05 5-glutamyl glycine    
0.997220546 54535 5 9.14294E-05 Thyroxine    
0.99729369 54539 4 7.31435E-05 (Microbial infection) O-AMP-tyrosine    
0.997366833 54543 4 7.31435E-05 O-AMP-tyrosine    
0.997439977 54547 4 7.31435E-05 ADP-ribosylglycine    
0.99751312 54551 4 7.31435E-05 (3S)-3-hydroxyhistidine    
0.997586264 54555 4 7.31435E-05 (Microbial infection) Deamidated asparagine    
0.997659407 54559 4 7.31435E-05 Phosphohistidine    
0.997732551 54563 4 7.31435E-05 Arginine amide    
0.997805694 54567 4 7.31435E-05 Triiodothyronine    
0.997878838 54571 4 7.31435E-05 S-(2,3-dicarboxypropyl)cysteine    
0.997933695 54574 3 5.48576E-05 (Microbial infection) O-(2-cholinephosphoryl)serine    
0.997988553 54577 3 5.48576E-05 (Microbial infection) O-AMP-threonine    
0.998043411 54580 3 5.48576E-05 (Microbial infection) S-methylcysteine    
0.998098268 54583 3 5.48576E-05 Cysteine persulfide    
0.998153126 54586 3 5.48576E-05 Tele-8alpha-FAD histidine    
0.998207984 54589 3 5.48576E-05 S-8alpha-FAD cysteine    
0.998262841 54592 3 5.48576E-05 Isoleucine amide    
0.998317699 54595 3 5.48576E-05 (Microbial infection) Deamidated glutamine    
0.998372557 54598 3 5.48576E-05 O-AMP-threonine    
0.998427414 54601 3 5.48576E-05 3-hydroxyasparagine    
0.998482272 54604 3 5.48576E-05 Valine amide    
0.998537129 54607 3 5.48576E-05 Aspartate 1-(chondroitin 4-sulfate)-ester    
0.998591987 54610 3 5.48576E-05 Asparagine amide    
0.998646845 54613 3 5.48576E-05 N6-carboxylysine    
0.998701702 54616 3 5.48576E-05 Leucine methyl ester    
0.99875656 54619 3 5.48576E-05 S-cysteinyl cysteine    
0.998811418 54622 3 5.48576E-05 (Microbial infection) Phosphothreonine    
0.998866275 54625 3 5.48576E-05 N-acetylglutamate    
0.998921133 54628 3 5.48576E-05 Hypusine    
0.998957705 54630 2 3.65718E-05 1-thioglycine    
0.998994277 54632 2 3.65718E-05 Cysteine sulfinic acid (-SO2H)    
0.999030848 54634 2 3.65718E-05 Sulfoserine    
0.99906742 54636 2 3.65718E-05 Lysine amide    
0.999103992 54638 2 3.65718E-05 Alanine amide    
0.999140564 54640 2 3.65718E-05 Diphosphoserine    
0.999177135 54642 2 3.65718E-05 5-glutamyl histamine    
0.999213707 54644 2 3.65718E-05 (3R)-3-hydroxyarginine    
0.999250279 54646 2 3.65718E-05 Blocked amino end (Ser)    
0.999286851 54648 2 3.65718E-05 ADP-ribosyltyrosine    
0.999323422 54650 2 3.65718E-05 Pyruvic acid (Ser)    
0.999359994 54652 2 3.65718E-05 (3S)-3-hydroxylysine    
0.99937828 54653 1 1.82859E-05 N,N-dimethylproline    
0.999396566 54654 1 1.82859E-05 (Microbial infection) ADP-ribosylasparagine    
0.999414852 54655 1 1.82859E-05 5-glutamyl glutamate    
0.999433138 54656 1 1.82859E-05 2,3-didehydroalanine (Ser)    
0.999451424 54657 1 1.82859E-05 Pyrrolidone carboxylic acid (Glu)    
0.999469709 54658 1 1.82859E-05 O-AMP-serine    
0.999487995 54659 1 1.82859E-05 Glutamic acid 1-amide    
0.999506281 54660 1 1.82859E-05 S-(dipyrrolylmethanemethyl)cysteine    
0.999524567 54661 1 1.82859E-05 (Microbial infection) ADP-riboxanated arginine    
0.999542853 54662 1 1.82859E-05 4-hydroxylysine    
0.999561139 54663 1 1.82859E-05 (Microbial infection) N6-acetyllysine    
0.999579425 54664 1 1.82859E-05 Beta-decarboxylated aspartate    
0.999597711 54665 1 1.82859E-05 N,N-dimethylglycine    
0.999615996 54666 1 1.82859E-05 N-methylglycine    
0.999634282 54667 1 1.82859E-05 N6-1-carboxyethyl lysine    
0.999652568 54668 1 1.82859E-05 N-pyruvate 2-iminyl-valine    
0.999670854 54669 1 1.82859E-05 Thiazolidine linkage to a ring-opened DNA abasic site    
0.99968914 54670 1 1.82859E-05 Hydroxyarginine    
0.999707426 54671 1 1.82859E-05 (3S)-3-hydroxyaspartate    
0.999725712 54672 1 1.82859E-05 Glutamine amide    
0.999743998 54673 1 1.82859E-05 Blocked amino end (Thr)    
0.999762284 54674 1 1.82859E-05 PolyADP-ribosyl aspartic acid    
0.999780569 54675 1 1.82859E-05 N4,N4-dimethylasparagine    
0.999798855 54676 1 1.82859E-05 O-acetylserine    
0.999817141 54677 1 1.82859E-05 O-(2-cholinephosphoryl)serine    
0.999835427 54678 1 1.82859E-05 N,N,N-trimethylserine    
0.999853713 54679 1 1.82859E-05 N,N-dimethylserine    
0.999871999 54680 1 1.82859E-05 N-methylserine    
0.999890285 54681 1 1.82859E-05 S-cGMP-cysteine    
0.999908571 54682 1 1.82859E-05 Glycyl adenylate    
0.999926856 54683 1 1.82859E-05 (4R)-5-hydroxyleucine    
0.999945142 54684 1 1.82859E-05 (4R)-5-oxoleucine    
0.999963428 54685 1 1.82859E-05 (Microbial infection) ADP-ribosyldiphthamide    
0.999981714 54686 1 1.82859E-05 Diphthamide    
1 54687 1 1.82859E-05 N-acetylaspartate    
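
For anyone who wants to try different cutoffs, here is a minimal Python sketch of the two rules discussed above: keep mods until a target share of the total is covered, or keep only mods above a minimum individual share. The function names and the `SAMPLE_COUNTS` subset are illustrative assumptions, not MetaMorpheus code. The counts are the first nine rows of the list, and the total is the 54687 reported there.

```python
from itertools import accumulate

# First nine rows of the ranked list above; TOTAL_PTMS is the reported total.
SAMPLE_COUNTS = {
    "Phosphoserine": 32859,
    "Phosphothreonine": 5889,
    "N6-acetyllysine": 4279,
    "Phosphotyrosine": 2083,
    "N6-succinyllysine": 1268,
    "Omega-N-methylarginine": 1027,
    "N-acetylalanine": 948,
    "N-acetylmethionine": 847,
    "Asymmetric dimethylarginine": 537,
}
TOTAL_PTMS = 54687


def rank_mods(counts, total=None):
    """Return rows (name, count, fraction, running_sum, running_fraction),
    most frequent first. If `total` is omitted, the sum of `counts` is used."""
    if total is None:
        total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    running = accumulate(count for _, count in ranked)
    return [
        (name, count, count / total, run, run / total)
        for (name, count), run in zip(ranked, running)
    ]


def select_by_coverage(rows, coverage=0.99):
    """Keep the shortest ranked prefix whose running fraction reaches `coverage`.
    Returns every name if the target is never reached."""
    selected = []
    for name, _count, _frac, _run, running_fraction in rows:
        selected.append(name)
        if running_fraction >= coverage:
            break
    return selected


def select_by_min_fraction(rows, min_fraction=0.01):
    """Keep only mods whose individual share is at least `min_fraction`."""
    return [name for name, _count, frac, _run, _rf in rows if frac >= min_fraction]


if __name__ == "__main__":
    rows = rank_mods(SAMPLE_COUNTS, total=TOTAL_PTMS)
    print(select_by_min_fraction(rows, 0.01))
    print(select_by_coverage(rows, 0.85))
```

With these nine rows, the 1% rule keeps the first eight mods and stops at N-acetylmethionine, because Asymmetric dimethylarginine falls just under 1% of the total.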

trishorts, May 04 '22 15:05

Hi, what is the data source?

Apirog9, Jun 20 '22 12:06

It's from the human UniProt XML. I mined the information from it using text manipulation.
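
As a rough illustration of one way such counts could be collected (not necessarily how the numbers above were produced), the Python sketch below parses the XML instead of using text manipulation. It assumes that modifications appear as `feature` elements with `type="modified residue"` and a `description` attribute, and that any "; by ..." suffix in the description should be dropped. The function name and the tiny `SAMPLE_XML` are made up for the example.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hand-written sample in the assumed UniProt XML layout (not real data).
SAMPLE_XML = b"""<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <feature type="modified residue" description="Phosphoserine"/>
    <feature type="modified residue" description="Phosphoserine; by PKA"/>
    <feature type="modified residue" description="N6-acetyllysine"/>
    <feature type="disulfide bond"/>
  </entry>
</uniprot>"""


def count_modified_residues(xml_source):
    """Count 'modified residue' features by name.

    `xml_source` is a file path or a binary file object. The file is
    streamed so that a large proteome XML does not need to fit in memory.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        # Drop the XML namespace, if any, and keep only the local tag name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "feature" and elem.get("type") == "modified residue":
            # Assumption: text after ';' (e.g. the responsible kinase) is ignored.
            name = elem.get("description", "").split(";", 1)[0].strip()
            if name:
                counts[name] += 1
        elif local_name == "entry":
            # Free the finished entry's children to keep memory use low.
            elem.clear()
    return counts


if __name__ == "__main__":
    counts = count_modified_residues(io.BytesIO(SAMPLE_XML))
    for name, count in counts.most_common():
        print(name, count)
```

Sorting the resulting counter with `most_common()` gives a ranked list like the one in the first post.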

trishorts, Jun 21 '22 14:06