User's guide MondriaanOpt

This page is continuously being improved and updated; therefore, a more recent version may be obtained online. This offline version is bundled with the software for your convenience.

Whereas Mondriaan uses heuristics to obtain good partitionings for sparse matrix-vector multiplication for any number of processors, MondriaanOpt calculates an optimal solution to this partitioning problem with 2 processors. More precisely, it calculates a partitioning with minimum communication volume among all solutions that obey the imbalance constraint.

A database with already solved problems with use of MondriaanOpt can be found online.

How to install MondriaanOpt

MondriaanOpt comes packaged with the Mondriaan software. Refer to this page for instructions on using Mondriaan. MondriaanOpt is automatically compiled when you compile Mondriaan. The executable is then available at tools/MondriaanOpt.

How to run MondriaanOpt

The MondriaanOpt program has the following interface:

% ./tools/MondriaanOpt matrix [P [eps]] [options]

One, two or three parameters may be passed, after which further options may be given. Either [eps], -e or -k must be passed, and it is advised to pass -v (see options). Take note that while MondriaanOpt may be called with the same parameters as Mondriaan, the actual problem being solved may be slightly different.

Some equivalent examples are:

% ./tools/MondriaanOpt tests/arc130.mtx 2 0.03 -v 17
% ./tools/MondriaanOpt tests/arc130.mtx -e 0.03 -v 17
% ./tools/MondriaanOpt tests/arc130.mtx -k 660 -v 17

The above examples partition the arc130.mtx matrix (Matrix Market file format) for 2 processors with at most 3% load imbalance, knowing that solutions must exist with volume at most 17. The matrix should be the full relative path; in the above example output is saved in the Mondriaan tests folder (../tests/).

Output

The MondriaanOpt tool yields, after a successful run on an input matrix, various output files. All possible output files are described below. Typically, the output filenames are that of the input matrix filename, modified with a small descriptor and the number of parts (=2).

Formats with free nonzeros

All assigned nonzeros are assigned a processor-number to be assigned to, either 1 or 2. All free nonzeros will be assigned index 3. (Free nonzeros are nonzeros that are not assigned to a processor because assigning it to either one will not influence communication volume.) To stress the potential presence of free nonzeros, the number of processors (2) in the filename is followed by a suffix f.

Processor indices (`-I2f`)

The MondriaanOpt program writes the processor indices of each nonzero to the Matrix Market file input-2f.mtx where the value of each nonzero is replaced by the processor index to which the nonzero has been assigned.

Graphical output (`-2f.svg`)

If the option -svg is given, at the end of the algorithm an SVG graphic is written to the file input-2f.svg, containing a visualisation of the partitioning.

Formats without free nonzeros

Here, the free nonzeros of a partitioning are distributed among the two processors in such a way that load imbalance is kept at a minimum. Note that whenever we write P for the number of processors below, it implicitly equals 2.

Distributed matrix (`-P2`)

The MondriaanOpt program writes the distributed matrix to a file called input-P2, where input is the name of the input matrix. We use an adapted Matrix Market format, with this structure:
%%MatrixMarket distributed-matrix coordinate real general m n nnz P Pstart[0] ( this should be 0 )
...
...
...
Pstart[P]( this should be nnz )
A.i[0] A.j[0] A.value[0] ...
...
...
A.i[nnz-1] A.j[nnz-1] A.value[nnz-1]
Here, Pstart[k] points to the start of the nonzeroes of processor k.

Processor indices (`-I2`)

The MondriaanOpt program also writes the processor indices of each nonzero to the Matrix Market file input-I2 where the value of each nonzero is replaced by the processor index to which the nonzero has been assigned. The order of the nonzeroes is exactly that of the distributed matrix (-P2).

Cartesian submatrices (`-C2`)

The program writes the row index sets I(q) and column index sets J(q) of the Cartesian submatrix I(q) x J(q) for the processors q=1,...,P to the file called input-C2. This file is additional information, useful e.g. for visualisation, and you may not need it.

Graphical output (`-2.svg`)

If the option -svg is given, at the end of the algorithm an SVG graphic is written to the file input-2.svg, containing a visualisation of the partitioning.

Output to `stdout`/`stderr`

In a successful run, at the end of execution general statistics are written to stdout. Also, during such a run, every 2^23 = 8388608 iterations the current depth in the tree is written to stderr, in the format `current depth`/`maximum depth`. Last but not least, every time a new solution is found which improves on the previous solution regarding total volume, a message is written to stderr reporting the newly found volume and load distribution in the format `load P0`, `load P1`, `load Free`.

Program options

The MondriaanOpt program has the following interface:

% ./tools/MondriaanOpt matrix [P [eps]] [options]

One, two or three parameters may be passed, after which further options may be given. An overview of the available parameters and options is given below.

Parameters

Parameter	Name	Description
`matrix`	Matrix file	Required. The input matrix file in Matrix Market (.mtx) format
`P`	Number of processors	Present for consistency with other Mondriaan* commands. This parameter, if given, must be equal to 2.
`eps`	Load imbalance	The maximum allowed load imbalance

Options

Apart from the matrix, at least one of [eps], -e or -k must be given, defining the maximum allowed load imbalance.

Option	Value	Description
-v	Volume	Recommended. The starting upper bound volume. This defaults to `min(m,n)+1`, with `m` and `n` denoting the dimensions of the matrix to be partitioned. While this is a valid upper bound, you may wish to pass a tighter upper bound to reduce computing time.
-e	Load imbalance	The maximum allowed load imbalance
-k	Number of nonzeros	The maximum allowed number of nonzeros per part
-t	Seconds	Max running time in seconds
-h	None	Show help
-svg	None	Write visualisations of the partitioning to `.svg` files

Difference in imbalance constraints

While the command line interface of MondriaanOpt can be used just as Mondriaan, there is a subtle difference in the problem being solved in these two. More precisely, with N being the total number of nonzeros, p being the total number of processors (which equals 2) and load the number of nonzeros assigned to a processor, compare:

the imbalance constraint load <= (1+epsilon) (N/p) which is used in Mondriaan (In the code, this amounts to floor( ((1+epsilon)*N)/p ).), and
the imbalance constraint load <= (1+epsilon) ceil(N/p) which is used in MondriaanOpt.

As p=2, this difference may only be of importance whenever N is odd. In [1,p.2] it is explained that this different choice was made to ensure feasibility of the problem, even if epsilon=0.

Using MondriaanOpt in MATLAB

For more information about MATLAB usage, please see the Mondriaan MATLAB guide. MondriaanOpt is available in Matlab using the MatlabMondriaanOpt MEX routine. Example matlab files are given as mondriaanOpt.m and mondriaanOptPlot.m.

References

[1] An exact algorithm for sparse matrix bipartitioning, Daniel M. Pelt and Rob H. Bisseling, Journal of Parallel and Distributed Computing, 85 (2015) pp. 79-90.

Last updated: September 7, 2017.

November 4, 2016 by Marco van Oort,
September 7, 2017 by Marco van Oort.

To the Mondriaan package home page.