No Description
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Titouan Labourdette e3009c62af 1er commit 2 years ago
..
src 1er commit 2 years ago
.gitignore 1er commit 2 years ago
README-old.asc 1er commit 2 years ago
README.md 1er commit 2 years ago
consensus 1er commit 2 years ago
evolve-sc 1er commit 2 years ago
nb-configuration.xml 1er commit 2 years ago
pom.xml 1er commit 2 years ago
run 1er commit 2 years ago
updreadme.rb 1er commit 2 years ago

README.md

Clustering benchmarks

Datasets

This project contains collection of labeled clustering problems that can be found in the literature. Most of datasets were artificially created.

The benchmark includes:

Artificial data

2d-10c 2d-20c-no0 2d-3c-no123 2d-4c-no4 2d-4c-no9 2d-4c 2sp2glob 3-spiral 3MC D31 DS577 DS850 R15 aggregation atom banana birch-rg1 birch-rg2 birch-rg3 chainlink cluto-t4.8k cluto-t5.8k cluto-t7.10k cluto-t8.8k complex8 complex9 compound cure-t0-2000n-2D cure-t1-2000n-2D cure-t2-4k curves1 curves2 dartboard1 dartboard2 dense-disk-3000 dense-disk-5000 diamond9 disk-1000n disk-3000n disk-4000n disk-4500n disk-4600n disk-5000n disk-6000n donut1 donut2 donut3 donutcurves ds2c2sc13 ds3c3sc6 ds4c2sc8 elliptical_10_2 elly-2d10c13s engytime flame fourty golfball hepta insect jain long1 long2 long3 longsquare lsun mopsi-finland mopsi-joensuu pathbased rings s-set1 s-set2 s-set3 s-set4 sizes1 sizes2 sizes3 sizes4 sizes5 smile1 smile2 smile3 spherical_4_3 spherical_5_2 spherical_6_2 spiral spiralsquare square1 square2 square3 square4 square5 st900 target tetra triangle1 triangle2 twenty twodiamonds wingnut xclara zelnik1 zelnik2 zelnik3 zelnik4 zelnik5 zelnik6

Experiments

This project contains set of clustering methods benchmarks on various dataset. The project is dependent on Clueminer project.

in order to run benchmark compile dependencies into a single JAR file:

mvn assembly:assembly

Consensus experiment

allows running repeated runs of the same algorithm:

./run consensus --dataset "triangle1" --repeat 10

by default k-means algorithm is used.

For available datasets see resources folder.