**data_persistence**'.

**Name:** GridSearchCV
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: MareNostrum5

Grid search over the kNN algorithm on the iris.csv dataset (https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv). This application used dislib-0.9.0.

A demonstration workflow for Reduced Order Modeling (ROM) within the eFlows4HPC project, implemented using Kratos Multiphysics, EZyRB, COMPSs, and dislib.

**Type**: COMPSs

**Creators:** Jose Raul Bravo Martinez, Sebastian Ares de Parga Regalado, Riccardo Rossi Bernecoli, Jorge Ejarque

**Submitter**: Raül Sirvent

**Name:** Matrix multiplication with Files
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Matrix multiplication is a binary operation that takes a pair of matrices and produces another matrix.

If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying A and B, the elements ...
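As a minimal sketch of the definition above (plain Python, not the COMPSs implementation; the function name `matmul` and the sample matrices are illustrative assumptions), each element of AB is the dot product of a row of A with a column of B:

```python
def matmul(a, b):
    """Multiply an n×m matrix by an m×p matrix, both given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    # The product is defined only if the columns of A match the rows of B.
    if any(len(row) != m for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


A = [[1, 2, 3],
     [4, 5, 6]]      # 2×3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3×2
C = matmul(A, B)     # 2×2
```

Here n=2, m=3 and p=2, so the result has shape 2×2.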

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Matrix multiplication with Objects
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Matrix multiplication is a binary operation that takes a pair of matrices and produces another matrix.

If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying A and B, the ...

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Word Count
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Wordcount is an application that counts the number of words for a given set of files.

To allow parallelism, every file is processed separately and the partial results are merged afterwards.
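The count-per-file-then-merge pattern can be sketched in plain Python as follows. This is only an illustration under assumptions: the helper names `count_words` and `merge` and the two sample files are hypothetical, and the real application would run the per-file counts as COMPSs tasks rather than sequentially.

```python
import os
import tempfile
from collections import Counter


def count_words(path):
    """Count the words of a single file (one independent unit of work)."""
    with open(path, encoding="utf-8") as handle:
        return Counter(handle.read().split())


def merge(partials):
    """Merge the per-file counters into one overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical two-file dataset, created on the fly for the demonstration.
with tempfile.TemporaryDirectory() as dataset_path:
    for name, text in [("a.txt", "to be or not"), ("b.txt", "to be")]:
        with open(os.path.join(dataset_path, name), "w", encoding="utf-8") as f:
            f.write(text)
    files = sorted(os.path.join(dataset_path, n) for n in os.listdir(dataset_path))
    result = merge(count_words(p) for p in files)
```

Because each `count_words` call reads only its own file, the calls are independent of each other, which is what makes the per-file step parallelisable.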

# Execution instructions

Usage:

```
runcompss --lang=python src/wordcount.py datasetPath
```

where:

- datasetPath: Absolute path of the file to parse (e.g. /home/compss/tutorial_apps/python/wordcount/data/) ...

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Contact Person:** support-compss@bsc.es
**Access Level:** public
**License Agreement:** Apache2
**Platform:** COMPSs

# Description

Simple is an application that takes one value and increases it by five units. The purpose of this application is to show how tasks are managed by COMPSs.

# Execution instructions

Usage:

```
runcompss --lang=python src/simple.py initValue
```

where:

- initValue: Initial value for counter

# Execution Examples

```
runcompss --lang=python src/simple.py 1
runcompss
...
```

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Increment
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Increment is an application that takes three different values and increases each of them a given number of times.

The purpose of this application is to show parallelism between the different increments.
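A rough sketch of that idea, using a standard-library thread pool in place of COMPSs tasks. Everything here is an assumption made for illustration: the function names are hypothetical, and the step of one unit per increment is assumed because the description does not state it.

```python
from concurrent.futures import ThreadPoolExecutor


def increment(value):
    """One unit of work: return the value increased by one (assumed step)."""
    return value + 1


def increase_n_times(n, value):
    """Chain n increments on one counter; each step depends on the previous."""
    for _ in range(n):
        value = increment(value)
    return value


def run(n, init_values):
    # The chains share no data, so they can advance concurrently,
    # while the increments inside one chain stay sequential.
    with ThreadPoolExecutor(max_workers=len(init_values)) as pool:
        return list(pool.map(lambda v: increase_n_times(n, v), init_values))


final_values = run(5, [1, 2, 3])
```

The point of the sketch is the dependency structure: increments on the same counter form a chain, while the three counters are independent of each other.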

# Execution instructions

Usage:

```
runcompss --lang=python src/increment.py N initValue1 initValue2 initValue3
```

where:

- N: Number of times to increase ...

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Word Count
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Wordcount is an application that counts the number of words for a given set of files.

To allow parallelism, the file is divided into blocks that are processed separately and merged afterwards.

Results are written to a Pickle binary file, so they can be checked using: python -m pickle result.txt
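The block-wise variant and the Pickle output can be sketched as below. This is a sketch under assumptions, not the application's code: `split_in_blocks` is a hypothetical helper, the block size is counted in characters, and an in-memory buffer stands in for the result file.

```python
import io
import pickle
from collections import Counter


def split_in_blocks(text, block_size):
    """Split text into blocks of about block_size characters, cutting only at
    whitespace so that no word is broken across two blocks."""
    blocks, start = [], 0
    while start < len(text):
        end = min(start + block_size, len(text))
        while end < len(text) and not text[end].isspace():
            end += 1
        blocks.append(text[start:end])
        start = end
    return blocks


text = "one two three two one two"
# Count each block independently, then merge the partial counters.
partials = [Counter(block.split()) for block in split_in_blocks(text, 8)]
result = sum(partials, Counter())

# Serialise with pickle and read the result back.
buffer = io.BytesIO()
pickle.dump(dict(result), buffer)
buffer.seek(0)
restored = pickle.load(buffer)
```

Cutting blocks only at whitespace is a design choice of this sketch; it keeps the merged counts identical to a single pass over the whole text.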

This example also shows how to manually add input or ...

**Type**: COMPSs

**Creators:** Javier Conejero, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

Wordcount reduce version COMPSs application

K-means COMPSs application

Cholesky factorisation COMPSs application

Cluster Comparison COMPSs application

Lysozyme in water sample COMPSs application

PyCOMPSs implementation of Probabilistic Tsunami Forecast (PTF). PTF explicitly treats data- and forecast-uncertainties, enabling alert level definitions according to any predefined level of conservatism, which is connected to the average balance of missed-vs-false-alarms. Run of the Boumerdes-2003 event test-case with 1000 scenarios, 8h tsunami simulation for each and forecast calculations for partial and full ensembles with focal mechanism and tsunami data updates.

**Type**: COMPSs

**Creators:** Louise Cordrie, Jorge Ejarque, Carlos Sánchez Linares, Jacopo Selva, Jorge Macías, Steven J. Gibbons, Fabrizio Bernardi, Roberto Tonini, Rosa M. Badia, Sonia Scardigno, Stefano Lorito, Finn Løvholt, Fabrizio Romano, Manuela Volpe, Alessandro D'Anca, Marc de la Asunción, Manuel J. Castro

**Submitter**: Jorge Ejarque

PyCOMPSs implementation of Probabilistic Tsunami Forecast (PTF). PTF explicitly treats data- and forecast-uncertainties, enabling alert level definitions according to any predefined level of conservatism, which is connected to the average balance of missed-vs-false-alarms. Run of the Kos-Bodrum 2017 event test-case with 1000 scenarios, 8h tsunami simulation for each and forecast calculations for partial and full ensembles with focal mechanism and tsunami data updates.

**Type**: COMPSs

**Creators:** Louise Cordrie, Jorge Ejarque, Carlos Sánchez Linares, Jacopo Selva, Jorge Macías, Steven J. Gibbons, Fabrizio Bernardi, Roberto Tonini, Rosa M. Badia, Sonia Scardigno, Stefano Lorito, Finn Løvholt, Fabrizio Romano, Manuela Volpe, Alessandro D'Anca, Marc de la Asunción, Manuel J. Castro

**Submitter**: Jorge Ejarque

**Name:** Dislib Distributed Training - Cache OFF
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: Minotauro-MN4

PyTorch distributed training of a CNN on GPUs. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101.

Average task execution time: 84 seconds

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** Dislib Distributed Training - Cache ON
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: Minotauro-MN4

PyTorch distributed training of a CNN on GPUs, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101.

Average task execution time: 36 seconds

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** K-Means GPU Cache ON
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: Minotauro-MN4

K-Means running on GPUs, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200), one block per GPU. Total dataset shape: (32_000_000, 1200). Version: dislib-0.9.

Average task execution time: 16 seconds

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** K-Means GPU Cache OFF
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: Minotauro-MN4

K-Means running on GPUs. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200), one block per GPU. Total dataset shape: (32_000_000, 1200). Version: dislib-0.9.

Average task execution time: 194 seconds

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** Matmul GPU Case 1 Cache-OFF
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs 3.3
**Machine**: Minotauro-MN4

Matmul running on GPUs without the cache. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where:

- A: shape (320, 56_900_000), block_size (10, 11_380_000)
- B: shape (56_900_000, 10), block_size (11_380_000, 10)
- C: shape (320, 10), block_size (10, 10)

Total dataset size 291 ...
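The block layout implied by these shapes can be checked with a few lines of arithmetic. The helper `block_grid` is a hypothetical name introduced for this sketch; it simply divides each dimension by the corresponding block size, rounding up.

```python
def block_grid(shape, block_size):
    """Number of blocks along each axis (ceiling division)."""
    return tuple(-(-dim // blk) for dim, blk in zip(shape, block_size))


a_grid = block_grid((320, 56_900_000), (10, 11_380_000))  # blocks of A
b_grid = block_grid((56_900_000, 10), (11_380_000, 10))   # blocks of B
c_grid = block_grid((320, 10), (10, 10))                  # blocks of C

# Each block of C is the sum of one block product per block along the
# shared dimension of A and B.
products_per_c_block = a_grid[1]
total_block_products = c_grid[0] * c_grid[1] * products_per_c_block
```

With these shapes A splits into 32×5 blocks, B into 5×1 and C into 32×1, so there are as many C blocks as GPUs in the run (32) and 160 block products in total.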

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** Matmul GPU Case 1 Cache-ON
**Contact Person**: cristian.tatu@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: Minotauro-MN4

Matmul running on GPUs, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where:

- A: shape (320, 56_900_000), block_size (10, 11_380_000)
- B: shape (56_900_000, 10), block_size (11_380_000, 10)
- C: shape (320, 10), block_size ...

**Type**: COMPSs

**Creators:** Cristian Tatu, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Cristian Tatu

**Name:** Matrix multiplication with Files, reproducibility example
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Matrix multiplication is a binary operation that takes a pair of matrices and produces another matrix.

If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying ...

COMPSs Matrix Multiplication, out-of-core using files. Hypermatrix size: 2x2 blocks (MSIZE=2). Block size: 2x2 elements (BSIZE=2).
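To make the MSIZE and BSIZE parameters concrete, here is a minimal blocked multiplication in plain Python. It is a sketch under assumptions: the blocks live in memory rather than in files, the helper names are hypothetical, and the fill values 1.0 and 2.0 are arbitrary.

```python
MSIZE = 2  # blocks per side of the hypermatrix
BSIZE = 2  # elements per side of each block


def make_hypermatrix(value):
    """MSIZE×MSIZE grid of BSIZE×BSIZE blocks, every element set to value."""
    return [[[[value] * BSIZE for _ in range(BSIZE)] for _ in range(MSIZE)]
            for _ in range(MSIZE)]


def multiply_accumulate(a, b, c):
    """c += a @ b for one triple of blocks (c is updated in place)."""
    for i in range(BSIZE):
        for j in range(BSIZE):
            for k in range(BSIZE):
                c[i][j] += a[i][k] * b[k][j]


A = make_hypermatrix(1.0)
B = make_hypermatrix(2.0)
C = make_hypermatrix(0.0)

# Same triple loop as the element-wise product, but over whole blocks.
for i in range(MSIZE):
    for j in range(MSIZE):
        for k in range(MSIZE):
            multiply_accumulate(A[i][k], B[k][j], C[i][j])
```

Each call to `multiply_accumulate` touches only three blocks, which is the unit of work a file-based version would read from and write back to disk.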

**Name:** KMeans
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs
**Machine**: MareNostrum5

KMeans for clustering the housing.csv dataset (https://github.com/sonarsushant/California-House-Price-Prediction/blob/master/housing.csv). This application used dislib-0.9.0.

Lysozyme in water full COMPSs application

Lysozyme in water full COMPSs application, using dataset_small

Lysozyme in water full COMPSs application

**Name:** Matrix Multiplication
**Contact Person:** support-compss@bsc.es
**Access Level:** public
**License Agreement:** Apache2
**Platform:** COMPSs

# Description

If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying A and B, the elements of the ...

**Type**: COMPSs

**Creators:** Jorge Ejarque, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing)

**Submitter**: Raül Sirvent

**Name:** SparseLU
**Contact Person:** support-compss@bsc.es
**Access Level:** public
**License Agreement:** Apache2
**Platform:** COMPSs

# Description

The Sparse LU application computes an LU matrix factorization on a sparse blocked matrix. The matrix size (number of blocks) and the block size are parameters of the application.

As the algorithm progresses, the area of the matrix that is accessed is smaller; concretely, at each iteration, the 0th row and column of the current matrix are discarded. ...
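The shrinking working area can be seen in a scalar LU factorisation. The sketch below is a deliberate simplification and not the application's algorithm: it works on a small dense matrix element by element instead of on sparse blocks, uses no pivoting, and the function name `lu_in_place` is hypothetical.

```python
def lu_in_place(m):
    """LU factorisation without pivoting; L (unit diagonal, stored below the
    diagonal) and U (stored on and above it) overwrite m."""
    n = len(m)
    for k in range(n):
        # Row k and column k are final after this step. Later iterations
        # only touch the trailing (n-k-1)×(n-k-1) submatrix.
        for i in range(k + 1, n):
            m[i][k] /= m[k][k]
            for j in range(k + 1, n):
                m[i][j] -= m[i][k] * m[k][j]
    return m


M = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
lu_in_place(M)
```

After the call, the strictly lower part of `M` holds the multipliers of L and the rest holds U. In a blocked version the same loop structure applies, with each scalar operation replaced by an operation on a block.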

**Type**: COMPSs

**Creators:** Jorge Ejarque, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing)

**Submitter**: Raül Sirvent

**Name:** K-means
**Contact Person**: support-compss@bsc.es
**Access Level**: Public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

K-means clustering is a method of cluster analysis that aims to partition *n* points into *k* clusters in which each point belongs to the cluster with the nearest mean. It follows an iterative refinement strategy to find the centers of natural clusters in the data.
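The iterative refinement can be sketched with the standard assign-then-update loop (Lloyd's iteration). This is an illustration under assumptions rather than the application's code: it is sequential, the sample points and initial centers are hard-coded, and the function name `kmeans` is hypothetical.

```python
def kmeans(points, centers, iterations=10):
    """Refine the centers by alternating assignment and update steps."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

The assignment step is independent per point, so a distributed version can compute it per fragment of the dataset and combine the partial sums before updating the centers.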

When executed with COMPSs, K-means first generates the input points by means of ...

**Type**: COMPSs

**Creators:** Jorge Ejarque, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Java Wordcount
**Contact Person**: support-compss@bsc.es
**Access Level**: public
**License Agreement**: Apache2
**Platform**: COMPSs

# Description

Wordcount application. There are two versions of Wordcount, depending on how the input data is given.

## Version 1

*Single input file*, where all the text is given in the same file and the chunks are calculated with a BLOCK_SIZE parameter.

## Version 2

*Multiple input files*, where the text fragments are already in different files under ...

**Type**: COMPSs

**Creators:** Jorge Ejarque, The Workflows and Distributed Computing Team (https://www.bsc.es/discover-bsc/organisation/scientific-structure/workflows-and-distributed-computing/)

**Submitter**: Raül Sirvent

**Name:** Matrix Multiplication
**Contact Person:** support-compss@bsc.es
**Access Level:** public
**License Agreement:** Apache2
**Platform:** COMPSs

# Description

If A is an n×m matrix and B is an m×p matrix, the result AB of their multiplication is an n×p matrix defined only if the number of columns m in A is equal to the number of rows m in B. When multiplying A and B, the elements of the ...