GraphEmbedding

report the results on all datasets

Open dawnranger opened this issue 6 years ago • 7 comments

Results of node2vec, deepwalk, line, sdne, and struc2vec on all datasets. Hope this helps anyone who is interested in this project.

wiki

| Alg | micro | macro | samples | weighted | acc | NMI |
|-----------|--------|--------|--------|--------|--------|--------|
| node2vec  | 0.7447 | 0.6771 | 0.7193 | 0.7450 | 0.6279 | 0.3536 |
| deepwalk  | 0.7307 | 0.6579 | 0.7058 | 0.7296 | 0.6091 | 0.3416 |
| line      | 0.5059 | 0.2461 | 0.4536 | 0.4523 | 0.3160 | 0.0798 |
| sdne      | 0.6916 | 0.5119 | 0.6528 | 0.6718 | 0.5530 | 0.1801 |
| struc2vec | 0.4512 | 0.1249 | 0.3933 | 0.3383 | 0.2308 | 0.0516 |

brazil

| Alg | micro | macro | samples | weighted | acc | NMI |
|-----------|--------|--------|--------|--------|--------|--------|
| node2vec  | 0.1481 | 0.1579 | 0.1481 | 0.1648 | 0.1481 | 0.0442 |
| deepwalk  | 0.1852 | 0.1694 | 0.1852 | 0.2004 | 0.1852 | 0.0471 |
| line      | 0.4444 | 0.4167 | 0.4444 | 0.4753 | 0.4444 | 0.2822 |
| sdne      | 0.5926 | 0.5814 | 0.5926 | 0.5928 | 0.5926 | 0.4041 |
| struc2vec | 0.7778 | 0.7739 | 0.7778 | 0.7762 | 0.7778 | 0.3906 |

europe

| Alg | micro | macro | samples | weighted | acc | NMI |
|-----------|--------|--------|--------|--------|--------|--------|
| node2vec  | 0.4125 | 0.4156 | 0.4125 | 0.4209 | 0.4125 | 0.0155 |
| deepwalk  | 0.4375 | 0.4358 | 0.4375 | 0.4347 | 0.4375 | 0.0180 |
| line      | 0.5000 | 0.4983 | 0.5000 | 0.5016 | 0.5000 | 0.1186 |
| sdne      | 0.5000 | 0.4818 | 0.5000 | 0.4916 | 0.5000 | 0.1714 |
| struc2vec | 0.5375 | 0.5247 | 0.5375 | 0.5294 | 0.5375 | 0.0783 |

usa

| Alg | micro | macro | samples | weighted | acc | NMI |
|-----------|--------|--------|--------|--------|--------|--------|
| node2vec  | 0.5420 | 0.5278 | 0.5420 | 0.5351 | 0.5420 | 0.0822 |
| deepwalk  | 0.5504 | 0.5394 | 0.5504 | 0.5472 | 0.5504 | 0.0910 |
| line      | 0.4160 | 0.4032 | 0.4160 | 0.4175 | 0.4160 | 0.1660 |
| sdne      | 0.6092 | 0.5819 | 0.6092 | 0.5971 | 0.6092 | 0.2028 |
| struc2vec | 0.5210 | 0.5040 | 0.5210 | 0.5211 | 0.5210 | 0.0702 |

dawnranger commented Apr 22 '19 12:04

For the wiki dataset given by the author, it is single-label, so what I got is micro = samples = acc. Or do you have a more complete version of the wiki data?

Volcano-plus commented Apr 22 '19 14:04

> For the wiki dataset given by the author, it is single-label, so what I got is micro = samples = acc. Or do you have a more complete version of the wiki data?

Here is the documentation of the average parameter of sklearn.metrics.f1_score:

average : string, [None, ‘binary’ (default), ‘micro’, ‘macro’, ‘samples’, ‘weighted’] This parameter is required for multiclass/multilabel targets.

  • 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
  • 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
  • 'weighted': Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
  • 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).

So I think it can give different results in a multiclass case.
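
For example, a quick sketch (toy labels, not from this repo) shows macro and weighted differing from micro; note that for plain multiclass labels, micro-F1 coincides with accuracy, and average='samples' is only defined for multilabel-indicator targets:

```python
from sklearn.metrics import accuracy_score, f1_score

# toy, imbalanced multiclass labels (made up for illustration)
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 0, 2, 1, 2, 2, 2]

for average in ["micro", "macro", "weighted"]:
    print(average, round(f1_score(y_true, y_pred, average=average), 4))
print("acc", accuracy_score(y_true, y_pred))

# f1_score(y_true, y_pred, average="samples") raises a ValueError here,
# because samplewise F1 is only defined for multilabel-indicator targets.
```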

dawnranger commented Apr 23 '19 02:04

@dawnranger That's good. I think you could open a pull request adding the results on these datasets and the code to reproduce them in a new folder.

shenweichen commented Apr 23 '19 02:04

@dawnranger

> 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).

Wiki is multiclass rather than multilabel, isn't it? So why is there a difference between samples and acc? In addition, for the flight datasets (brazil, europe, usa) in your results, micro = samples = acc.

Volcano-plus commented Apr 23 '19 05:04

> Wiki is multiclass rather than multilabel, isn't it? So why is there a difference between samples and acc? In addition, for the flight datasets (brazil, europe, usa) in your results, micro = samples = acc.

I think you are right. I used shenweichen's code:

```python
from sklearn.metrics import accuracy_score, f1_score

# Y: ground-truth labels, Y_: predicted labels
averages = ["micro", "macro", "samples", "weighted"]
results = {}
for average in averages:
    results[average] = f1_score(Y, Y_, average=average)
results['acc'] = accuracy_score(Y, Y_)
```

and I got this warning with the wiki dataset:

```
python3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
```

As discussed on Stack Overflow, a bad train/test split (some labels ending up with no predicted samples) might be to blame for this warning.
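
One mitigation, shown in a minimal sketch with toy stand-in data (not the repo's pipeline), is to use a stratified split so that every class appears in both the train and test sets, which makes labels with no predicted samples less likely; recent scikit-learn versions also let f1_score take a zero_division argument to control this case explicitly.

```python
# Minimal sketch with toy stand-in data (assumed shapes, not the repo's code):
# stratify=y keeps every class represented in both the train and test splits.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(100, 16)              # stand-in for node embeddings
y = rng.randint(0, 5, size=100)    # stand-in for single-label node classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0, stratify=y
)
```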

dawnranger commented Apr 23 '19 06:04

@dawnranger Yes. I found that classify.py is similar to scoring.py in deepwalk, which is provided by the author: https://github.com/phanein/deepwalk/blob/master/example_graphs/scoring.py

What confused me is that the author did not provide the results or the origin of the wiki dataset. In addition, I tried the BlogCatalog dataset (multi-label), as mentioned in the node2vec paper, and set the parameters as the paper did (d=128, r=10, l=80, k=10, training percent=50%, p=q=0.25), but I only got a Macro-F1 of 0.12, far from the result the author reported (0.2581). So depressed...
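
For reference, here is a rough sketch (toy data and assumed shapes, not the real BlogCatalog pipeline) of the top-k evaluation style used in deepwalk's scoring.py: train a one-vs-rest logistic regression on the embeddings, then for each test node keep the k highest-scoring labels, where k is that node's true number of labels, before computing F1. Differences in this protocol or in the train/test split could explain part of a Macro-F1 gap.

```python
# Rough sketch with toy data (swap in real embeddings/labels for BlogCatalog).
# Mirrors the top-k protocol of deepwalk's scoring.py: each test node is
# assigned exactly as many labels as it truly has.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

rng = np.random.RandomState(0)
X = rng.rand(200, 32)                                   # stand-in for node embeddings
labels = [list(rng.choice(10, size=rng.randint(1, 4), replace=False))
          for _ in range(200)]                          # 1-3 labels per node
Y = MultiLabelBinarizer(classes=list(range(10))).fit_transform(labels)

n_train = 100                                           # 50% training split
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X[:n_train], Y[:n_train])

probs = clf.predict_proba(X[n_train:])                  # shape (n_test, n_labels)
Y_pred = np.zeros_like(probs, dtype=int)
for i, row in enumerate(probs):
    k = int(Y[n_train + i].sum())                       # true label count of this node
    Y_pred[i, row.argsort()[-k:]] = 1                   # keep the k highest-scoring labels

print("macro-F1:", f1_score(Y[n_train:], Y_pred, average="macro"))
print("micro-F1:", f1_score(Y[n_train:], Y_pred, average="micro"))
```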

Volcano-plus commented Apr 23 '19 11:04

Hello, from these results the accuracy does not seem very high. What is the cause? Is it a data problem?

960924 commented Oct 27 '19 08:10