Parallel Algorithms (WISM 459), 2021/2022
Time and place
Every Wednesday from 10.00-12.45 hour on location at Utrecht Science Park:
Room BBG017 (Buys Ballot Building), except on Oct. 13, when we occupy room HFG611.
The format is two 45-minute lectures (10.00-10.45 hour, 11.00-11.45 hour),
followed by a 45-minute laboratory class (12.00-12.45 hour),
which is an interactive session. In between we have 15-minute breaks.
We follow the current RIVM rules to fight the pandemic.
Last year's lectures were recorded, and they will be made available
on the Mastermath Vimeo server. If you miss a class because you have a cold, or are otherwise
indisposed, you can view these recordings instead, and contact me for interaction
by any of the other electronic means (email, Teams, Zoom).
But nothing beats a live lecture, with human interaction,
and you will get a lot more from class if you physically attend.
First lecture: September 15, 2021.
Midterm exam: Oct 20, 11.30-13.30 hour. Room EDU ALFA in the Educatorium.
Retake exam: Dec 15, 11.00-13.00 hour. Room Minnaert 206.
The exams will be open book.
This is a Master's course which is an elective course for students
of the Master's programme Mathematical Sciences
of Utrecht University,
part of the tracks Applied Mathematics and Complex Systems.
The course is recommended for students in Mathematics or Computer Science
who are interested in scientific computing
and would like to obtain a first `hands-on experience'
in parallel programming.
It is also highly suitable for the Honours master
programme in Theoretical Physics and Mathematical Sciences
and for students interested in Computational Physics.
The course is part of the Dutch National Master's programme
and hence is open to all interested Master's students in Mathematics
in the Netherlands.
Registration for everybody is through Mastermath.
Students will learn how to design a parallel algorithm for a
problem from the area of scientific computing
and how to write a parallel program that
solves the problem.
After completion of the course, the student is able to:
- design a parallel algorithm for a problem from the area of scientific computing or big data;
- analyse the cost of the algorithm in terms of computing time, communication time, and synchronisation time;
- write a parallel program based on an algorithm that solves the problem;
- write a report on the algorithm, its analysis, and the numerical experiments performed, in a concise and readable form;
- present the results in an oral presentation, in a clear manner, highlighting the most important ideas.
You get 8 ECTS credit points.
Today, parallel computers are appearing on our desktops. The advent of dual-core, quad-core, and octa-core computers and the expected increase in the number of cores in the coming years are causing a major change in our approach to software, such as the mathematical software we use in scientific computations and in the emerging area of big data computations.
Parallel computers drastically increase our computational capabilities and thus enable us to model more realistically in many application areas. To make efficient use of parallel computers, it is necessary to reorganise the structure of our computational methods. In particular, attention must be paid to the division of work among the different processors solving a problem in parallel and to the communication between them. Suitable parallel algorithms and systems software are needed to realise the capabilities of parallel computers.
We will discuss extensively the most recent developments in the area of parallel computers, ranging from multicore laptop or desktop PCs to clusters of PCs connected by switching devices, to massively parallel computers with distributed memory such as our national supercomputer Snellius at SURF in Amsterdam, to be officially opened by our Queen on September 16, 2021.
The following subjects will be treated:
- Types of existing parallel computers.
- Principles of parallel computation: distributing the work evenly
and avoiding communication.
- The Bulk Synchronous Parallel (BSP) model as an idealised model of a parallel computer.
- Use of BSPlib software as the basis for architecture-independent parallel programming.
- Parallel algorithms for the following problems: inner product computation, sorting, prime number sieving, LU decomposition, sparse matrix-vector multiplication, iterative solution of linear systems, graph matching.
- Analysis of the computation, communication, and synchronisation time requirements of these algorithms by the BSP model.
- Hands-on experience using your own laptop or the Snellius supercomputer.
The course will be given in English.
All reading material will be in English.
Introductory course in linear algebra. Some knowledge of algorithms. Knowledge of a modern programming language such as C, C++, C#, Python, or Java. Basic knowledge of C is helpful, as we will use this language in class, but we will organise the course in such a way that those coming from another language will have time to adapt.
There are plenty of books on C;
you may consider the book
Practical C Programming by Steve Oualline.
You can also use C++ in class, if you already know that language.
Some knowledge of the UNIX/Linux operating system is helpful.
We will make use of the
MulticoreBSP for C software developed by Albert-Jan Yzelman, version 2.0.4 (March 2019).
This software runs on shared-memory multicore PCs, and you can also run
your program without any changes on distributed-memory machines
such as Snellius.
C++ programmers can use
Bulk, version 2.0.0 (March 27, 2020)
as a communications library, instead of MulticoreBSP for C.
It is a very modern library that can run on both shared-memory and distributed-memory machines,
and on hybrids of the two, without any change.
Bulk: a Modern C++ Interface for Bulk-Synchronous Parallel Programs by Jan-Willem Buurlage, Tom Bannink, and Rob H. Bisseling,
in Proceedings Euro-Par 2018, Lecture Notes in Computer Science, Vol. 11014, Springer, 2018, pp. 519-532.
Recommended: Use your own device!
Please install the MulticoreBSP library for C (in most cases)
or Bulk for C++ (only if you are an experienced C++ user).
On Macs and Linux computers this is straightforward.
On Macs you have UNIX through your terminal.
On Windows machines you may want to use the Windows Subsystem for Linux (WSL).
For Windows, it is more difficult to get the software running, and
you may want to ask for assistance.
You will get access to the national supercomputer Snellius,
where you can
use BSPonMPI which will be installed by SURF,
or Bulk, both giving access to thousands of cores.
The examination is in the form of an initial assignment (20%), a final assignment (40%), a presentation on the final assignment (20%), and a midterm exam (20%). The two assignments are carried out at home.
A written report must be returned for each assignment before the deadline. Students can work individually or in pairs (but not in larger teams) on the computer programs and can hand in a single report, provided each did a fair share of the work and can account for that. The contribution of each individual must then be stated briefly in the report. Each participant has to give an individual presentation on the chosen topic of the final assignment.
Presentations will be scheduled on 8 and 22 December 2021 (with additional
dates outside the regular schedule if needed).
The midterm exam must be passed with a grade of at least 5.0 (on a scale of 10).
All students should submit reports for the assignments
electronically in PDF format through the ELO system of Mastermath.
There is no further homework.
All students must use LaTeX for the assignments.
The midterm exam can be retaken near the end of the course (in December) by any course participant.
The maximum of the two exam grades counts as the midterm exam grade.
In case of illness, or for another urgent reason, the final presentation can be retaken
on an individual basis in January 2022.
In case of an insufficient final grade for the whole course (5 or lower), the participant
is entitled to a retake, which means that one assignment can be redone within six weeks
after the final grading of the course.
We closely follow the book
Scientific Computation: A Structured Approach using BSP (second edition)
by Rob H. Bisseling,
Oxford University Press,
September 2020. ISBN 9780198788355 (for the paperback version).
The book is available as paperback, hardcover, ebook, and also in Oxford Scholarship Online.
Supplementary material such as all slides accompanying
the book and the software package BSPedupack 2.0 with all the programs of the book are available on the supplementary material page.
Here you can find all the lecture slides of this course.
Weekly schedule and links
This year we will cover parts of chapters 1, 2, 4, 5
of the second edition.
Wednesday September 15, 2021
The Bulk Synchronous Parallel model
(PSC2 section 1.2)
Parallel Inner Product Computation
(PSC2 section 1.3)
Laboratory class.
Try your luck on the theoretical exercise 1.3.
- Top 500 of supercomputers
in the world.
- SURF in Amsterdam, home of the Snellius national supercomputer,
which we will use in our laboratory class and homework. It is a Lenovo machine.
Snellius will contain 76,832 AMD cores and 144 Nvidia A100 GPUs.
Its theoretical peak performance will be 14 Pflop/s.
Wednesday September 22, 2021.
BSPlib, the Bulk Synchronous Parallel library.
(PSC2 section 1.4)
(PSC2 sections 1.5-1.7)
Laboratory class. Starting with BSPlib.
This is your first hands-on session with BSPlib.
Download the latest version of
my package of educational programs that
teaches how to use BSP.
Solve Exercise 1.3 from PSC2: try to run the benchmark,
exploring your first parallel environment: your own laptop.
Modify your program to send larger messages.
How does this influence g?
Start working on the first assignment, Exercise 1.7 from PSC2.
Design a parallel algorithm
for prime number generation.
Write your first parallel program. It should generate all prime numbers below 10,000,000 in parallel.
Choosing the right distribution is the name of the game.
You should carry out the experiments of your first assignment
on your own parallel computer.
(This may be your laptop or desktop,
or a compute server of your own university.)
Snellius will only be available for the second assignment.
Hand in a report on Exercise 1.7 presenting your algorithm and the experimental results,
before the deadline Wednesday October 27, 10.00 hour.
Note that you can work in pairs.
Wednesday September 29, 2021.
Parallel sample sort
(PSC2 section 1.8)
Program bspsort and bulk synchronous message passing
(PSC2 section 1.9)
Laboratory class. Discussion of the solution of Exercise 1.2.
Wednesday October 6, 2021.
Sequential LU Decomposition
(PSC2 sections 2.1-2.2)
Parallel LU Decomposition
(PSC2 section 2.3)
Laboratory class. Discussion of report requirements
Wednesday October 13, 2021. Please note the exceptional location Hans Freudenthal building Room 611
(PSC2 section 2.4)
High-performance LU decomposition
(PSC2 section 2.5)
Laboratory class. Discussion of the solution of the midterm exam of October 24, 2018.
We will discuss Questions 2, 3, and 4, which are representative of this year's
open-book exam. Please first try to solve the questions yourself.
Wednesday October 20, 2021. Location: Educatorium Alfa
11.30-13.30 Midterm exam on Chapters 1 and 2.
The exact exam material will be announced.
The exam will be held on location and you may use the book either as hardcopy
(easiest to use) or ebook/OSO,
the accompanying slides, your own notes,
but no other materials or the internet.
Interesting links: previous exams from the archive. The exams from 2012 and 2013 cover only Chapter 1 of the material (60 minutes).
The 2017 exam covers Chapters 1 and 3 (90 minutes).
Please note that this year's exam (with open book)
will have no knowledge questions, which usually formed the first question
of the exams up to and including 2019 below.
Wednesday October 27, 2021.
Sequential sparse matrix-vector multiplication
(PSC2 section 4.1)
Sparse matrices and their data structures
(PSC2 section 4.2)
Discussion of midterm exam
Wednesday November 3, 2021.
Parallel sparse matrix-vector multiplication
(PSC2 section 4.3)
Cartesian matrix distributions
(PSC2 section 4.4)
Choose an assignment for your final project. Possibilities:
You can parallelise your own problem, or choose one of the exercises from the book PSC2.
Please try to discuss the chosen topic with the teacher before November 17, 2021 and register
the topic by sending an email to the teacher. You can work in pairs, but have to give individual presentations
on the topic. It is OK for teams to choose the same topic, but please try to be original in your approach.
- solve Exercise 1.5 on parallel sorting algorithms;
- solve Exercise 2.7 on Boolean matrix multiplication;
- solve Exercise 2.8 on Strassen matrix multiplication;
- solve Exercise 3.5 on the 3D Fast Fourier Transform;
- solve Exercise 3.8 on the Number Theoretic Transform;
- solve Exercise 3.10 on decimals of pi;
- solve Exercise 4.6 on connected components in a 3D image;
- solve Exercise 4.8 on optimal hypergraph partitioning;
- solve Exercise 4.10 on sparse matrix-vector multiplication in C++ using Bulk;
- solve Exercise 5.4 on the maximum independent set problem in graphs;
- solve Exercise 5.5 on solving the n-queens problem on a chessboard;
- solve Exercise 5.6 on counting self-avoiding walks (SAWs) on a graph or a 2D or 3D lattice;
- solve Exercise 5.7 on counting triangles in a social network graph;
- solve Exercise 5.8 on computing betweenness centrality on a graph;
- solve Exercise 5.9 on optimal bipartite graph matching;
- participate in the Huawei Hyper challenge.
Wednesday November 10, 2021.
Mondriaan sparse matrix distribution
(PSC2 section 4.5)
Medium-grain method for sparse matrix distribution
(PSC2 section 4.6)
Laboratory class. Discussion of first assignment
Wednesday November 17, 2021. Guest lecture Albert-Jan Yzelman (Huawei Research Zurich) "Algebraic programming"
The lecture (2x 45 min) will be followed by a question and answer session.
Albert-Jan Yzelman has a PhD from Utrecht University. He is coauthor of the MulticoreBSP for C library that we use in class and of the Mondriaan package. One of his recent contributions to open-source software is an implementation of the GraphBLAS in C.
Abstract: Algebraic Programming annotates operations with algebraic relations explicitly, while simultaneously maintaining a data-centric view of the containers being operated on. Its systematic application allows automatic parallelisation, vectorisation, and fusion. Programmers, meanwhile, maintain an easy-to-understand view of their codes, with clear semantics that follows basic mathematics.
While the recent GraphBLAS standard promotes the use of generalised linear algebra for graph algorithms specifically, the same concepts could play a role for general-purpose computations. By involving both generic programming techniques and MLIR, we furthermore transcend the boundaries of classical library approaches by, for example, allowing automated optimisations across operations.
Finally, performance semantics attached to algebraic operations allows the systematic performance characterisation of workloads, guides good algorithm design, and may guide automatic compile-time and run-time optimisations.
Wednesday November 24, 2021.
(PSC2 section 4.9)
Sequential graph matching
(PSC2 sections 5.1-5.2)
Discuss your topic for the final project with the teachers.
Wednesday December 1, 2021.
Suitors and sorting
(PSC2 section 5.3)
Parallel graph matching
(PSC2 section 5.4)
Wednesday December 8, 2021.
Presentations final project
Wednesday December 15, 2021. Location BBG020
11.00-13.00 Retake midterm exam
Wednesday December 22, 2021.
More presentations final project
Deadline second assignment Monday January 17, 2022, 12.00 hour.
Hand in a report (through the Mastermath ELO, in PDF) before the deadline.
Only one report per team. No need to send in duplicates.
Please use the batch system of Snellius and not the interactive system
for test runs of your parallel program.
The interactive system is only for development.
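Snellius is operated through the SLURM batch scheduler. A minimal job script might look like the sketch below; the job name, task count, time limit, partition name, and program name are placeholders, so check the SURF user documentation for the current values before submitting.

```shell
#!/bin/bash
#SBATCH --job-name=bspprog       # job name (placeholder)
#SBATCH --nodes=1                # number of nodes
#SBATCH --ntasks=16              # number of parallel processes
#SBATCH --time=00:10:00          # wall-clock time limit
#SBATCH --partition=thin         # partition name: check the SURF documentation

# Run the parallel program; "myprog" is a placeholder for your executable.
srun ./myprog
```

Submit the script with `sbatch`, and inspect the queue with `squeue`; the output of the run appears in a `slurm-<jobid>.out` file in the submission directory.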
Running your parallel program on hundreds of processor cores
will give you a bonus.
Please fill out the Mastermath digital evaluation form,
which will be sent to you around the deadline.
Frequently asked questions
Last update of this page: November 23, 2021