Java COMPSs K-means clustering example (executed at Marenostrum IV supercomputer, inputs generated by the code)
Version 1

Workflow Type: COMPSs
Stable

Name: K-means
Contact Person: support-compss@bsc.es
Access Level: Public
License Agreement: Apache2
Platform: COMPSs

Description

K-means clustering is a method of cluster analysis that aims to partition ''n'' points into ''k'' clusters in which each point belongs to the cluster with the nearest mean. It follows an iterative refinement strategy to find the centers of natural clusters in the data.
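The "nearest mean" rule can be illustrated with a minimal Java sketch. It assumes squared Euclidean distance, which the description does not state explicitly, and the class and method names (`NearestCenter`, `squaredDistance`, `nearest`) are hypothetical and not taken from the application sources.

```java
final class NearestCenter {
    /** Squared Euclidean distance between two points of equal dimension (assumed metric). */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the center closest to the point; ties go to the lowest index. */
    static int nearest(double[] point, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centers.length; k++) {
            double dist = squaredDistance(point, centers[k]);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        System.out.println(nearest(new double[] {1.0, 2.0}, centers)); // prints 0
        System.out.println(nearest(new double[] {9.0, 8.0}, centers)); // prints 1
    }
}
```

Comparing squared distances avoids a square root without changing which center is nearest.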

When executed with COMPSs, K-means first generates the input points by means of initialization tasks. To enable parallelism, the points are split into a number of fragments given as a parameter; each fragment is created by one initialization task and filled with random points.
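The fragment initialization might look roughly like the following sketch. Several details are assumptions made for illustration only: the way leftover points are spread over the first fragments, the per-fragment seed, and the coordinate range [0, 1). The names `FragmentInit`, `fragmentSizes` and `generateFragment` are hypothetical; in the COMPSs application each call to the fragment generator would correspond to one initialization task.

```java
import java.util.Random;

final class FragmentInit {
    /** Sizes of numFragments fragments that together hold numPoints points. */
    static int[] fragmentSizes(int numPoints, int numFragments) {
        int[] sizes = new int[numFragments];
        int base = numPoints / numFragments;
        int remainder = numPoints % numFragments;
        for (int f = 0; f < numFragments; f++) {
            // Assumption: the first `remainder` fragments take one extra point.
            sizes[f] = base + (f < remainder ? 1 : 0);
        }
        return sizes;
    }

    /** One fragment: `size` random points with `dims` coordinates in [0, 1). */
    static double[][] generateFragment(int size, int dims, long seed) {
        Random random = new Random(seed);
        double[][] fragment = new double[size][dims];
        for (double[] point : fragment) {
            for (int d = 0; d < dims; d++) {
                point[d] = random.nextDouble();
            }
        }
        return fragment;
    }

    public static void main(String[] args) {
        int[] sizes = fragmentSizes(2000, 2);
        for (int f = 0; f < sizes.length; f++) {
            double[][] fragment = generateFragment(sizes[f], 2, f);
            System.out.println("fragment " + f + ": " + fragment.length + " points");
        }
    }
}
```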

After initialization, the algorithm runs a set of iterations. In every iteration, one computation task is created for each fragment. A reduction phase follows, in which the results of the computations are accumulated two at a time by merge tasks. Finally, at the end of the iteration, the main program post-processes the merged result to generate the current clusters, which are used in the next iteration. Consequently, if ''F'' is the total number of fragments, K-means generates ''F'' computation tasks and ''F-1'' merge tasks per iteration.
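One iteration of this scheme can be sketched sequentially in plain Java, without the COMPSs runtime. The sketch assumes that each computation returns per-cluster coordinate sums and point counts, that distance is squared Euclidean, and that an empty cluster keeps its previous center; none of these details is confirmed by the description. All names (`PairwiseMerge`, `Partial`, `compute`, `merge`, `reduce`, `newCenters`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PairwiseMerge {
    /** Per-fragment result: coordinate sums and point counts for each cluster. */
    static final class Partial {
        final double[][] sums;
        final int[] counts;
        Partial(int k, int dims) {
            sums = new double[k][dims];
            counts = new int[k];
        }
    }

    /** Counts calls to merge, to check the F-1 figure. */
    static int mergeCount = 0;

    /** Computation step: assign each point of one fragment to its nearest center. */
    static Partial compute(double[][] fragment, double[][] centers) {
        int dims = centers[0].length;
        Partial p = new Partial(centers.length, dims);
        for (double[] point : fragment) {
            int best = 0;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int k = 0; k < centers.length; k++) {
                double dist = 0.0;
                for (int d = 0; d < dims; d++) {
                    double diff = point[d] - centers[k][d];
                    dist += diff * diff;
                }
                if (dist < bestDist) { bestDist = dist; best = k; }
            }
            for (int d = 0; d < dims; d++) p.sums[best][d] += point[d];
            p.counts[best]++;
        }
        return p;
    }

    /** Merge step: accumulate b into a and return a. */
    static Partial merge(Partial a, Partial b) {
        mergeCount++;
        for (int k = 0; k < a.counts.length; k++) {
            a.counts[k] += b.counts[k];
            for (int d = 0; d < a.sums[k].length; d++) a.sums[k][d] += b.sums[k][d];
        }
        return a;
    }

    /** Reduction: merge partial results two at a time until one remains. */
    static Partial reduce(List<Partial> partials) {
        List<Partial> level = new ArrayList<>(partials);
        while (level.size() > 1) {
            List<Partial> next = new ArrayList<>();
            for (int i = 0; i + 1 < level.size(); i += 2) {
                next.add(merge(level.get(i), level.get(i + 1)));
            }
            if (level.size() % 2 == 1) next.add(level.get(level.size() - 1));
            level = next;
        }
        return level.get(0);
    }

    /** Post-processing: divide sums by counts; empty clusters keep the old center. */
    static double[][] newCenters(Partial total, double[][] oldCenters) {
        double[][] centers = new double[oldCenters.length][];
        for (int k = 0; k < centers.length; k++) {
            centers[k] = oldCenters[k].clone();
            if (total.counts[k] > 0) {
                for (int d = 0; d < centers[k].length; d++) {
                    centers[k][d] = total.sums[k][d] / total.counts[k];
                }
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        double[][][] fragments = {
            { {0.0, 0.0}, {2.0, 2.0} }, { {10.0, 10.0} }, { {8.0, 8.0} }, { {1.0, 1.0} }
        };
        List<Partial> partials = new ArrayList<>();
        for (double[][] fragment : fragments) partials.add(compute(fragment, centers));
        double[][] updated = newCenters(reduce(partials), centers);
        System.out.println("merges: " + mergeCount); // 4 fragments, so 3 merges
        System.out.println("center 0: " + updated[0][0] + ", " + updated[0][1]);
    }
}
```

Because every merge replaces two partial results with one, reducing ''F'' results to a single one always takes ''F-1'' merges, whatever the pairing order.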

Execution instructions

Usage:

runcompss --classpath=application_sources/jar/kmeans.jar kmeans.KMeans <...>

where ''<...>'' stands for the following options:

  • -c Number of clusters
  • -i Number of iterations
  • -n Number of points
  • -d Number of dimensions
  • -f Number of fragments
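As an illustration of how flag/value pairs like these could be read, here is a small Java sketch. It is not the parsing code of `kmeans.KMeans`; the class name `KMeansArgs` and its error handling are hypothetical, and no default values are assumed because the description does not list them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KMeansArgs {
    /** Reads "-flag value" pairs for the five documented options into a map. */
    static Map<String, Integer> parse(String[] args) {
        Map<String, Integer> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            String flag = args[i];
            if (!flag.matches("-[cindf]")) {
                throw new IllegalArgumentException("Unknown option: " + flag);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            options.put(flag, Integer.parseInt(args[i + 1]));
        }
        return options;
    }

    public static void main(String[] args) {
        String[] example = { "-c", "4", "-i", "10", "-n", "2000", "-d", "2", "-f", "2" };
        System.out.println(parse(example));
    }
}
```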

Execution Examples

runcompss --classpath=application_sources/jar/kmeans.jar kmeans.KMeans
runcompss --classpath=application_sources/jar/kmeans.jar kmeans.KMeans -c 4 -i 10 -n 2000 -d 2 -f 2
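For the second example (`-f 2 -i 10`), the task counts given in the description can be worked out with a short Java sketch. It assumes that all requested iterations run and that only the initialization, computation and merge tasks named in the description are counted; the class `TaskCount` and its methods are hypothetical.

```java
final class TaskCount {
    /** One initialization task per fragment. */
    static long initializationTasks(int fragments) {
        return fragments;
    }

    /** F computation tasks plus F-1 merge tasks per iteration. */
    static long tasksPerIteration(int fragments) {
        return fragments + (fragments - 1);
    }

    /** Initialization tasks plus the tasks of every iteration. */
    static long totalTasks(int fragments, int iterations) {
        return initializationTasks(fragments) + (long) iterations * tasksPerIteration(fragments);
    }

    public static void main(String[] args) {
        // -f 2 -i 10: 2 initialization tasks + 10 * (2 + 1) iteration tasks
        System.out.println(totalTasks(2, 10)); // prints 32
    }
}
```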

Build

Option 1: Native Java

cd application_sources/; javac src/main/java/kmeans/*.java
cd src/main/java/; jar cf kmeans.jar kmeans/
cd ../../../; mv src/main/java/kmeans.jar jar/

Option 2: Maven

cd application_sources/
mvn clean package


Version History

Ran with COMPSs 3.3 (earliest): created 10th Nov 2023 at 15:14 by Raül Sirvent

Executed with COMPSs 3.3 at Marenostrum IV


Frozen Ran-with-COMPSs-3.3 35e5c74
Creators and Submitter
Creator
Additional credit

The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

Submitter
Citation
Ejarque, J. (2023). Java COMPSs K-means clustering example (executed at Marenostrum IV supercomputer, inputs generated by the code). WorkflowHub. https://doi.org/10.48546/WORKFLOWHUB.WORKFLOW.663.1

Total size: 120 KB