Project Proposal:

Parallel Minimum Spanning Tree

Michael Choquette

Summary

I plan to write multiple parallel implementations of Minimum Spanning Tree and
compare their performance on large graphs with different structures. Initial
implementations will be for the CPU (likely based on OpenMP or pthreads), but if
time permits I will also investigate GPU-based algorithms.
Background

A spanning tree of a connected undirected graph G with n vertices is a subset of
n-1 edges of G that, together with the vertices of G, forms a tree. The Minimum
Spanning Tree problem is: given a connected undirected graph G with edge
weights, find a spanning tree of G whose total edge weight is minimal. Minimum
Spanning Tree (hereafter called MST) has a host of applications, from
travelling-salesman approximations to circuit design to utility networks. With
this in mind, it's useful to develop MST algorithms that scale well to the
amount of concurrency available on modern-day machines.

Essentially all implementations of MST work because of a theorem known as the
light-edge property: for any partition of the vertices into two nonempty sets, a
minimum-weight edge crossing the partition belongs to some MST.

Challenge

The main challenges I can foresee are:
- Dividing the work into chunks big enough to keep the available processors busy. How to do this isn't obvious, since it's often not clear which edges will be in the MST until others have already been added, but I know one way to do it from an earlier class, and hope to think of others.
- If applicable, merging together the information about added edges generated by the different processors. One implementation of MST uses a union-find data structure, and I may need to do something similar, but in parallel.
- Avoiding contention over the graph structure. Since all the processors will be reading the same data structures, having them also frequently update those same structures would cause a lot of cache misses and coherency overhead.
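For reference, the sequential union-find structure mentioned above is short; the
challenge would be layering atomics or locks on top of something like this
sketch (names and layout are my own, for illustration):

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Union-find with path halving and union by rank: near-constant
// amortized time per operation. Kruskal-style MST algorithms use it to
// test whether adding an edge would create a cycle.
struct UnionFind {
    std::vector<int> parent;
    std::vector<int> rank_;

    explicit UnionFind(int n) : parent(n), rank_(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);  // each vertex is its own root
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // Returns false if x and y were already in the same component.
    bool unite(int x, int y) {
        x = find(x);
        y = find(y);
        if (x == y) return false;
        if (rank_[x] < rank_[y]) std::swap(x, y);
        parent[y] = x;
        if (rank_[x] == rank_[y]) ++rank_[x];
        return true;
    }
};
```

A concurrent version would need, at minimum, atomic compare-and-swap on the
parent links, which is part of what makes this challenge interesting.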

Resources

I will be working on the Gates machines, starting from the graph I/O code we
used in assignment 3. I know of various sequential implementations of MST, as
well as one parallel one, from an earlier CS class (15-210), but will not be
using a reference implementation or research papers. I've done research with
Professor Guy Blelloch in the past, and if the project goes well I may ask him
for permission to run my code on the 40-core machine he uses for his research.
Goals/Deliverables

I plan to achieve the following:
- A fast sequential MST implementation for the CPU.
- At least two parallel MST implementations for the CPU, based on fundamentally different algorithms.
- Meaningful comparisons of these approaches on different types of graphs, for example random, planar, and real-world.
- Speedup of at least n/2 on n cores for small n, compared to the sequential implementation.

If the project goes well, I hope to also achieve:
- Performance meeting or beating Professor Guy Blelloch's solution on the Gates machines.
- Performance that scales well to high numbers of cores (at least 32).
- An efficient GPU implementation of MST, with performance comparing well to the CPU implementations.

Platform

I'll be working with straight C++ and pthreads on the CPU, possibly with OpenMP
at some point. I believe this is the best choice of platform and libraries
because I don't yet know which algorithms will work well, and I don't want to
restrict myself to a specific paradigm until I do.
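As an example of why OpenMP may be worth keeping in the mix: one
parallel-friendly building block (a step of Borůvka's algorithm, one likely
candidate) is having every vertex find its minimum-weight incident edge
independently, which is a single parallel loop. A sketch under my own assumed
adjacency-list types:

```cpp
#include <limits>
#include <utility>
#include <vector>

using Edge = std::pair<int, double>;  // (neighbor, weight)
using Graph = std::vector<std::vector<Edge>>;

// One Boruvka step: for every vertex v, find the index (into g[v]) of
// its minimum-weight incident edge, or -1 if v has no edges. Each
// iteration reads only its own adjacency list and writes only best[v],
// so the threads share no writable state and contention stays low.
std::vector<int> min_incident_edges(const Graph& g) {
    std::vector<int> best(g.size(), -1);
    #pragma omp parallel for schedule(dynamic, 64)
    for (long v = 0; v < static_cast<long>(g.size()); ++v) {
        double best_w = std::numeric_limits<double>::infinity();
        for (int i = 0; i < static_cast<int>(g[v].size()); ++i) {
            if (g[v][i].second < best_w) {
                best_w = g[v][i].second;
                best[v] = static_cast<int>(i);
            }
        }
    }
    return best;
}
```

Without `-fopenmp` the pragma is ignored and the loop simply runs sequentially,
which is exactly the kind of low-commitment parallelism I want while the choice
of algorithm is still open.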
Proposed Schedule


Week        | What We Plan To Do
Apr 1-7     | Write/compare sequential implementations
Apr 8-14    | Write first parallel implementation
Apr 15-21   | Write second parallel implementation
Apr 22-28   | Write third parallel implementation
Apr 29-May 5 | Write GPU implementation
May 6-11    | Schedule padding