Parallel Algorithms (WISM 459), 2021/2022


Teacher

Rob Bisseling.

Teaching assistant

Katharina Klein

Time and place

Every Wednesday from 10.00-12.45 hour on location at Utrecht Science Park: Room BBG017 (Buys Ballot Building), except on Oct. 13, when we occupy room HFG611. The format is 2x45 minutes lectures (10-10.45 hour, 11-11.45 hour), followed by 45 minutes laboratory class (12-12.45 hour), which is an interactive session. In between we have 15 minute breaks. We follow the current RIVM rules to fight the pandemic. Last year's lectures were recorded, and they will be made available on the Mastermath Vimeo server. If you miss a class because you have a cold, or are otherwise ill disposed you can view these recordings instead, and contact me for interaction by any of the other electronic means (email, Teams, Zoom). But nothing beats a live lecture, with human interaction, and you will get a lot more from class if you physically attend. First lecture: September 15, 2021.

Midterm exam: Oct 20, 11.30-13.30 hour. Room EDU ALFA in the Educatorium. Retake exam: Dec 15, 11.00-13.00 hour. Room Minnaert 206. The exams will be open book.

Intended audience

This is a Master's course which is an elective course for students of the Master's programme Mathematical Sciences of Utrecht University, part of the track Applied Mathematics, Complex Systems, and Scientific Computing. The course is recommended for students in Mathematics or Computer Science who are interested in scientific computing and would like to obtain a first `hands-on experience' in parallel programming. It is also highly suitable for the Honours master programme in Theoretical Physics and Mathematical Sciences and for students interested in Computational Physics. The course is part of the Dutch National Master Mastermath and hence is open for all interested master students in Mathematics in the Netherlands. Registration for everybody is through Mastermath.

Course aim

Students will learn how to design a parallel algorithm for a problem from the area of scientific computing and how to write a parallel program that solves the problem.

Learning goals

After completion of the course, the student is able to

Credits

You get 8 ECTS credit points.

Contents

Today, parallel computers are appearing on our desktops. The advent of dualcore, quadcore, and octacore computers and the expected increase in the number of cores in the coming years causes a major change in our approach to software, such as the mathematical software we use in scientific computations and in the emerging area of big data computations.

Parallel computers drastically increase our computational capabilities and thus enable us to model more realistically in many application areas. To make efficient use of parallel computers, it is necessary to reorganise the structure of our computational methods. In particular, attention must be paid to the division of work among the different processors solving a problem in parallel and to the communication between them. Suitable parallel algorithms and systems software are needed to realise the capabilities of parallel computers.

We will discuss extensively the most recent developments in the area of parallel computers, ranging from multicore laptop or desktop PCs to clusters of PCs connected by switching devices, to massively parallel computers with distributed memory such as our national supercomputer Snellius at SURF in Amsterdam, to be officially opened by our Queen on September 16, 2021.

The following subjects will be treated:

Language

The course will be given in English. All reading material will be in English.

Prerequisites

Introductory course in linear algebra. Some knowledge of algorithms. Knowledge of a modern programming language such as C, C++, C#, Python, or Java. Basic knowledge of C is helpful, as we will use this language in class, but we will organise the course in such a way that there will be time to adapt to it for those coming from another language. There are plenty of books on C; you may consider the book Practical C Programming by Steve Oualline. You can also use C++ in class, if you already know that language. Some knowledge of the UNIX/Linux operating system is helpful.

Software

We will make use of the MulticoreBSP for C software developed by Albert-Jan Yzelman, version 2.0.4 (March 2019). This sofware runs on shared-memory multicore PCs, and you can also run your program without any changes on distributed-memory machines such as Snellius.

C++ programmers can use Bulk, version 2.0.0 (March 27, 2020) as a communications library, instead of MulticoreBSP for C. It is a very modern library that can run on both shared-memory and distributed-memory machines, and on hybrids of the two, without any change. See Bulk: a Modern C++ Interface for Bulk-Synchronous Parallel Programs by Jan-Willem Buurlage, Tom Bannink, and Rob H. Bisseling, in Proceedings Euro-Par 2018, Lecture Notes in Computer Science, Vol. 11014, Springer, 2018, pp. 519-532. Published version.

Hardware

Recommended: Use your own device! Please install the MulticoreBSP library for C (in most cases) or Bulk for C++ (only if you are an experienced C++ user). On Macs and Linux computers this is straightforward. On Macs you have UNIX through your terminal. On Windows machines you may want to use the Windows Subsystem for Linux (WSL). For Windows, it is more difficult to get the software running and you may want to ask assistance.

You will get access to the national supercomputer Snellius, where you can use BSPonMPI which will be installed by SURF, or Bulk, both giving access to thousands of cores.

Examination

The examination is in the form of an initial assignment (20%), a final assignment (40%), a presentation on the final assignment (20%), and a midterm exam (20%). The two assignments are carried out at home. A written report must be returned for each assignment before the deadline. Students can work individually or in pairs (but not in larger teams) on the computer programs and can hand in a single report, provided each did a fair share of the work and can account for that. The contribution of each individual must then be stated briefly in the report. Each participant has to give an individual presentation on the chosen topic of the final assignment. Presentations will be scheduled on 8 and 22 December 2021 (with additional dates outside the regular schedule if needed). The midterm exam must be passed with a grade of 5.0 (on a scale of 10). All students should submit reports for the assignments electronically in PDF format through the ELO system of Mastermath. There is no further homework. All students must use LaTeX for the assignments.

The midterm exam can be retaken near the end of the course (in December) by any course participant. The maximum of the two exam grades counts as the midterm exam grade. In case of illness, or for another urgent reason, the final presentation can be retaken on an individual basis in January 2021. In case of an insufficient final grade for the whole course (<=5), the participant is entitled to a retake, which means that one assignment can be redone within six weeks after the final grading of the course.

Literature

We closely follow the book Parallel Scientific Computation: A Structured Approach using BSP (second edition) (PSC2), by Rob H. Bisseling, Oxford University Press, September 2020. ISBN 9780198788355 (for the paperback version). The book is available as paperback, hardcover, ebook, and also in Oxford Scholarship Online.

Supplementary material such as all slides accompanying the book and the software package BSPedupack 2.0 with all the programs of the book are available on the supplementary material page. Here you can find all the lecture slides of this course.

Weekly schedule and links

This year we will cover parts of chapters 1, 2, 4, 5 of the second edition.

Wednesday September 15, 2021

The Bulk Synchronous Parallel model

(PSC2 section 1.2)

Parallel Inner Product Computation

(PSC2 section 1.3)

hour. Laboratory class.

Try your luck on the theoretical exercise 1.3.

Interesting links

Wednesday September 22, 2021.

BSPlib, the Bulk Synchronous Parallel library.

(PSC2 section 1.4)

Benchmarking

(PSC2 sections 1.5-1.7)

Laboratory class. Starting with BSPlib.

This is your first hands-on session with BSPlib. Download the latest version of BSPedupack, my package of educational programs that teaches how to use BSP. Solve Exercise 1.3 from PSC2: try to run the benchmark, exploring your first parallel environment: your own laptop. Modify your program to send larger messages. How does this influence g?

Start working on the first assignment, Exercise 1.7 from PSC2. Design a parallel algorithm for prime number generation. Write your first parallel program. This program should generate all prime numbers below 10,000,000 in parallel. Choosing the right distribution is the name of the game. You should carry out the experiments of your first assignment on your own parallel computer. (This may be your laptop or desktop, or a compute server of your own university.) Snellius will only be available for the second assignment.

Hand in a report on Exercise 1.7 presenting your algorithm and the experimental results, before the deadline Wednesday October 27, 10.00 hour. Note that you can work in pairs.

Interesting links:

Wednesday September 29, 2021.

Parallel sample sort

(PSC2 section 1.8)

Program bspsort and bulk synchronous message passing

(PSC2 section 1.9)

Laboratory class. Discussion of Solution Exercise 1.2

Interesting links:

Wednesday October 6, 2021.

Sequential LU Decomposition

(PSC2 sections 2.1-2.2)

Parallel LU Decomposition

(PSC2 section 2.3)

Laboratory class. Discussion of report requirements

Wednesday October 13, 2021. Please note the exceptional location Hans Freudenthal building Room 611

Two-Phase Broadcasting

(PSC2 section 2.4)

High-performance LU decomposition

(PSC2 section 2.5)

Laboratory class. Discussion of solution midterm exam from October 24, 2018

We will discuss Questions 2, 3, 4, which are representative for this year's open-book exam. Please try first to solve the questions yourself.

Interesting links:

Wednesday October 20, 2021. Location: Educatorium Alfa

11.30-13.30 Midterm exam on Chapters 1 and 2.

The exact exam material will be announced. The exam will be held on location and you may use the book either as hardcopy (easiest to use) or ebook/OSO, the accompanying slides, your own notes, but no other materials or the internet.

Interesting links: previous exams from the archive. Exams from 2012, 2013 are only on Chapter 1 of the material (60 minutes). Exams 2017 on Chapter 1, 3 (90 minutes). Please note that this year's exam (with open book) will have no knowledge questions, which is usually the first question of the exams <= 2019 below.

Wednesday October 27, 2021.

Sequential sparse matrix–vector multiplication

(PSC2 section 4.1)

Sparse matrices and their data structures

(PSC2 section 4.2)

Discussion of midterm exam

Wednesday November 3, 2021.

Parallel sparse matrix-vector multiplication

(PSC2 section 4.3)

Cartesian matrix distributions

(PSC2 section 4.4)

Laboratory class

Choose an assignment for your final project. Possibilities: You can parallelise your own problem, or choose one of the exercises from the book PSC2. For example:

Please try to discuss the chosen topic with the teacher before November 17, 2021 and register the topic by sending an email to the teacher. You can work in pairs, but have to give individual presentations on the topic. It is OK for teams to choose the same topic, but please try to be original in your approach.

Wednesday November 10, 2021.

Mondriaan sparse matrix distribution

(PSC2 section 4.5)

Medium-grain method for sparse matrix distribution

(PSC2 section 4.6)

Laboratory class. Discussion of first assignment

Interesting links:

Wednesday November 17, 2021. Guest lecture Albert-Jan Yzelman (Huawei Research Zurich) "Algebraic programming"

The lecture (2x 45 min) will be followed by a question and answer session.

Albert-Jan Yzelman has a PhD from Utrecht University. He is coauthor of the MulticoreBSP for C library that we use in class and of the Mondriaan package. One of his recent contributions to open-source software is an implementation of the GraphBLAS in C.

Abstract: Algebraic Programming annotates operations with algebraic relations explicitly, while simultaneously maintaining a data-centric view of the containers being operated on. Its systematic application allows automatic parallelisation, vectorisation, and fusion. Programmers, meanwhile, maintain an easy-to-understand view of their codes, with clear semantics that follows basic mathematics. While the recent GraphBLAS standard promotes the use of generalised linear algebra for graph algorithms specifically, the same concepts could play a role for general-purpose computations. By involving both generic programming techniques and MLIR, we furthermore transcend the boundaries of classical library approaches by, for example, allowing automated optimisations across operations. Finally, performance semantics attached to algebraic operations allows the systematic performance characterisation of workloads, guides good algorithm design, and may guide automatic compile-time and run-time optimisations.

Interesting links

Wednesday November 24, 2021.

Laplacian matrices

(PSC2 section 4.9)

Sequential graph matching

(PSC2 sections 5.1-5.2)

Exercise class

Discuss your topic for the final project with the teachers.

Wednesday December 1, 2021.

Suitors and sorting

(PSC2 section 5.3)

Parallel graph matching

(PSC2 section 5.4)

Wednesday December 8, 2021.

Presentations final project

Wednesday December 15, 2021. Location BBG020

11.00-13.00 Retake midterm exam

Wednesday December 22, 2021.

More presentations final project

Deadline second assignment Monday January 17, 2022, 12.00 hour.

Hand in a report (through the Mastermath ELO, in PDF) before the deadline. Only one report per team. No need to send in duplicates. Please use the batch system of Snellius and not the interactive system for test runs of your parallel program. The interactive system is only for development. Running your parallel program on hundreds of processor cores will give you a bonus.

Please fill out the Mastermath digital evaluation form, which will be sent to you around the deadline.

Frequently asked questions


Last update of this page: November 23, 2021
Back to homepage of Rob Bisseling