Adjusted homophily and label informativeness
Hi! The recent paper "Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond" proposes several characteristics of labeled graphs. First, it shows that adjusted homophily, a measure rarely used in the graph ML literature, satisfies more desirable properties than other homophily measures. This makes it appropriate for comparing homophily levels across datasets with different numbers of classes, class sizes, and degree distributions across classes. Second, it proposes a new characteristic, label informativeness, which measures how much information a neighbor's label gives about a node's label. The paper shows that label informativeness correlates much more strongly with GNN performance than homophily does.
In this PR I've added adjusted homophily and label informativeness to PyG.
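For readers who want to see what the two measures compute, here is a minimal pure-Python sketch. It is not the code added in this PR: the function names `adjusted_homophily` and `label_informativeness`, the edge-list input format, and the helper `_both_directions` are all illustrative assumptions. It follows the edge-based definitions as I understand them from the paper: adjusted homophily is `(h_edge - s) / (1 - s)` with `s = sum_k (D_k / 2|E|)^2`, where `D_k` is the total degree of class `k`, and label informativeness is the mutual information between the labels at the two ends of a random edge, divided by the entropy of the degree-weighted label distribution. Both functions are undefined (division by zero) when only one class has nonzero degree.

```python
import math
from collections import Counter


def _both_directions(edges):
    # Hypothetical helper: treat the graph as undirected by counting
    # every edge once in each direction, giving 2|E| ordered pairs.
    return [(u, v) for u, v in edges] + [(v, u) for u, v in edges]


def adjusted_homophily(edges, labels):
    pairs = _both_directions(edges)
    total = len(pairs)  # 2|E|
    # Edge homophily: fraction of edges joining same-class nodes.
    h_edge = sum(labels[u] == labels[v] for u, v in pairs) / total
    # D_k: sum of degrees of the nodes in class k.
    class_degree = Counter(labels[u] for u, _ in pairs)
    # Expected edge homophily if edge endpoints were matched at random.
    expected = sum((d / total) ** 2 for d in class_degree.values())
    return (h_edge - expected) / (1 - expected)


def label_informativeness(edges, labels):
    pairs = _both_directions(edges)
    total = len(pairs)
    # Joint distribution of (label, label) over ordered edge endpoints.
    joint = Counter((labels[u], labels[v]) for u, v in pairs)
    # Degree-weighted marginal label distribution.
    marginal = Counter(labels[u] for u, _ in pairs)
    entropy = -sum(
        (c / total) * math.log(c / total) for c in marginal.values()
    )
    mutual_info = sum(
        (c / total)
        * math.log((c / total) / ((marginal[a] / total) * (marginal[b] / total)))
        for (a, b), c in joint.items()
    )
    return mutual_info / entropy


# A 4-cycle with alternating labels: every edge joins different classes.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_labels = [0, 1, 0, 1]

# Two disjoint edges, each joining nodes of the same class.
pair_edges = [(0, 1), (2, 3)]
pair_labels = [0, 0, 1, 1]
```

On these two toy graphs adjusted homophily takes opposite extreme values, while label informativeness is maximal for both, since in each case a neighbor's label fully determines the node's label. This illustrates why the two characteristics capture different things.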
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (d71bab2) 88.73% compared to head (877cd4c) 88.76%. Report is 1 commit behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #8308 +/- ##
==========================================
+ Coverage 88.73% 88.76% +0.02%
==========================================
Files 479 480 +1
Lines 29820 29903 +83
==========================================
+ Hits 26462 26543 +81
- Misses 3358 3360 +2