data_hacking
Data Hacking Project
Hi, may I ask how to generate the '*.results' data used in PE_clustering.ipynb, or where to download it? Thank you!
The simple stats module is a bit of a mess and needs to have both the internals cleaned up and the outputs changed. Suggested outputs: - contingency table (raw counts)...
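As a rough illustration of the first suggested output, a raw-count contingency table can be built with the standard library alone. This is only a sketch: the function name `contingency_table` and the sample data are hypothetical and are not taken from the simple stats module.

```python
from collections import Counter


def contingency_table(rows, cols):
    """Count how often each (row value, column value) pair occurs.

    Hypothetical helper, not part of the existing simple stats module.
    Returns the sorted row labels, the sorted column labels, and a
    list-of-lists of raw counts in that order.
    """
    pair_counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Counter returns 0 for pairs that never occur together.
    table = [[pair_counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


# Made-up sample data, for illustration only.
browsers = ["chrome", "firefox", "chrome", "ie", "chrome", "firefox"]
verdicts = ["benign", "benign", "malicious", "malicious", "benign", "benign"]

row_labels, col_labels, table = contingency_table(browsers, verdicts)
for label, counts in zip(row_labels, table):
    print(label, dict(zip(col_labels, counts)))
```

Keeping the raw counts separate from any derived statistics would let later outputs be computed from the same table.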
The Browser Fingerprinting notebook is our most experimental notebook, and there are several issues with it. The RegExp generation is somewhat broken, so the Validation section at the end validates that...
Right now the HClustering often generates long 'strings' of graphs where the depth of the graph is very deep and each level is only a binary split with just one/few...
In some cases the agg_sim parameter that should control the number of items before splitting off a new subtree doesn't seem to work properly.
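One way to quantify the 'string' problem while debugging is to compare a tree's depth with its number of leaves. The sketch below assumes a hypothetical nested-dict tree with a `children` list. It does not use the actual HClustering output format or the agg_sim parameter, so the tree would need to be converted first.

```python
def tree_depth(node):
    """Number of edges on the longest root-to-leaf path."""
    children = node.get("children", [])
    if not children:
        return 0
    return 1 + max(tree_depth(child) for child in children)


def leaf_count(node):
    """Number of leaf nodes under (and including) this node."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(leaf_count(child) for child in children)


def chain_ratio(node):
    """Depth divided by (leaves - 1).

    1.0 means a fully chain-like tree where each level splits off a
    single leaf; values near 0 mean a shallow, bushy tree.
    """
    leaves = leaf_count(node)
    if leaves < 2:
        return 0.0
    return tree_depth(node) / (leaves - 1)


def leaf(name):
    # Hypothetical node layout, for illustration only.
    return {"name": name, "children": []}


# A degenerate tree: every level is a binary split with one leaf.
chain = {"name": "root", "children": [
    leaf("a"),
    {"name": "n1", "children": [
        leaf("b"),
        {"name": "n2", "children": [leaf("c"), leaf("d")]},
    ]},
]}

# The same four leaves grouped under a single parent.
flat = {"name": "root", "children": [leaf("a"), leaf("b"), leaf("c"), leaf("d")]}

print(chain_ratio(chain), chain_ratio(flat))
```

Tracking this ratio across different parameter values could help show whether a setting actually changes the shape of the tree.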