pyclustering
pyclustering is a Python, C++ data mining library.
I did not find any documentation on how to use CURE with large (2M+) datasets. Simply using the CURE algorithm as defined in cure.py is not feasible since the...
Deprecation warnings are raised due to invalid escape sequences. This can be fixed by using raw strings or by escaping the backslashes. pyupgrade can also help with automatic conversion: https://github.com/asottile/pyupgrade/ This...
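A minimal sketch of the two fixes mentioned above, using only the standard library. The regular-expression snippets and the helper name `triggers_escape_warning` are illustrative assumptions and are not taken from pyclustering. Depending on the Python version, the warning category is `DeprecationWarning` or `SyntaxWarning`, so the helper accepts both.

```python
import warnings


def triggers_escape_warning(source: str) -> bool:
    """Compile `source` and report whether an invalid-escape warning was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not suppressed
        compile(source, "<demo>", "exec")
    return any(
        issubclass(w.category, (DeprecationWarning, SyntaxWarning)) for w in caught
    )


# Each constant holds the text of a one-line Python program.
BAD = r'pattern = "\d+\.\d+"'        # "\d" and "\." are invalid escapes
RAW = r'pattern = r"\d+\.\d+"'       # fix 1: raw string literal
ESCAPED = r'pattern = "\\d+\\.\\d+"'  # fix 2: doubled backslashes

for name, source in (("bad", BAD), ("raw", RAW), ("escaped", ESCAPED)):
    print(name, triggers_escape_warning(source))
```

Both fixes produce the same string value at run time, so the choice between them is a matter of readability.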
Often the GA gets stuck on a given solution and does not optimize any further. In these cases it is very useful to have an *Exit Switch* parameter that...
**Introduction** STING (a STatistical INformation Grid approach) clustering algorithm. The general idea is to divide the spatial area into rectangular cells at different levels of resolution, which form a tree structure. Statistical...
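The grid hierarchy described above can be sketched roughly as follows. This is only an illustration of multi-resolution cells for 2D points, not an implementation of STING and not pyclustering code: the function name `grid_levels`, the choice of halving each axis per level, and the use of a plain point count as the only per-cell statistic are all assumptions made for the example.

```python
from collections import Counter


def grid_levels(points, bounds, depth):
    """Return one Counter per level; level k splits each axis into 2**k parts.

    Each Counter maps a (row, col) cell index to the number of points in it.
    """
    (xmin, ymin), (xmax, ymax) = bounds
    levels = []
    for level in range(depth):
        n = 2 ** level  # cells per axis at this resolution
        counts = Counter()
        for x, y in points:
            # Clamp so that points on the upper boundary fall into the last cell.
            col = min(int((x - xmin) / (xmax - xmin) * n), n - 1)
            row = min(int((y - ymin) / (ymax - ymin) * n), n - 1)
            counts[(row, col)] += 1
        levels.append(counts)
    return levels


points = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)]
levels = grid_levels(points, bounds=((0.0, 0.0), (1.0, 1.0)), depth=2)
for level, counts in enumerate(levels):
    print(level, dict(counts))
```

In this layout the cell `(row, col)` at level `k` covers the four cells `(2*row + i, 2*col + j)` with `i, j` in `{0, 1}` at level `k + 1`, which gives the parent-child relation of the tree.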
**Introduction** Almost all objects are returned from C++ pyclustering to Python pyclustering using `pyclustering_package`. See files `ccore/include/interface/pyclustering_package.hpp` and `ccore/src/interface/pyclustering_package.cpp`. **Description** In order to return an error message, a new type of data...
**Introduction** The DBSCAN algorithm should be able to accept a distance metric in the same way as K-Means, K-Medians, etc. **Description** - [ ] Introduce a new optional argument `metric` (`distance_metric` type) under...
**Introduction** Support a custom distance metric for the KD-tree in order to provide a way to use Euclidean, squared Euclidean, Manhattan, Chebyshev and other metrics. **Description** - [ ] Introduce an optional parameter `metric`...
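As a rough illustration of what a pluggable `metric` parameter could look like, the sketch below defines the four metrics named above as plain functions and passes one of them to a nearest-neighbour search. The search is deliberately brute force, not a KD-tree, and the function names and signature are assumptions for the example rather than the pyclustering API.

```python
from typing import Callable, Sequence

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean_square(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a: Point, b: Point) -> float:
    return euclidean_square(a, b) ** 0.5


def manhattan(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Point, b: Point) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def nearest(points: Sequence[Point], query: Point, metric: Metric = euclidean) -> Point:
    """Brute-force nearest neighbour; `metric` is the optional, pluggable part."""
    return min(points, key=lambda p: metric(p, query))


points = [(3.0, 0.0), (2.0, 2.0)]
print(nearest(points, (0.0, 0.0), metric=manhattan))
print(nearest(points, (0.0, 0.0), metric=chebyshev))
```

The two calls return different points for the same query, which is why the choice of metric has to reach the search structure itself rather than being applied afterwards.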
**Introduction** A minimum energy method for the agglomerative clustering algorithm is required. **Description** - [ ] Minimum energy method for the Python implementation. - [ ] Minimum energy method for the C++ implementation. Resources:...
**Introduction** There is a request to support categorical data for Gower distance, as in the following article: - https://towardsdatascience.com/clustering-on-mixed-type-data-8bbd0a2569c3 **Description** It is not a big deal to support in the Python part of...
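For reference, a minimal sketch of Gower distance on mixed-type records, in its common unweighted form: a numeric feature contributes the absolute difference divided by that feature's range, a categorical feature contributes 0 on a match and 1 otherwise, and the result is the mean over all features. The function name and the convention of marking categorical features with `None` in `ranges` are assumptions for the example and do not reflect the pyclustering interface.

```python
from typing import Optional, Sequence


def gower_distance(a: Sequence, b: Sequence, ranges: Sequence[Optional[float]]) -> float:
    """Unweighted Gower distance between two records.

    ranges[i] is the value range (max - min) of numeric feature i,
    or None when feature i is categorical.
    """
    total = 0.0
    for x, y, value_range in zip(a, b, ranges):
        if value_range is None:
            total += 0.0 if x == y else 1.0      # categorical: simple mismatch
        elif value_range > 0:
            total += abs(x - y) / value_range    # numeric: range-normalised difference
        # A constant numeric feature (range 0) contributes nothing.
    return total / len(ranges)


first = (10.0, "red")
second = (20.0, "blue")
ranges = (40.0, None)  # feature 0 is numeric with range 40, feature 1 is categorical
print(gower_distance(first, second, ranges))
```

Because the numeric part needs the range of each feature, the ranges have to be computed over the whole dataset before any pairwise distance is evaluated.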
Hi @annoviko , Does the CURE algorithm implementation sample the dataset irrespective of the dataset size? If yes, what is the sampling size? I tried various parameters including different number...