Project Proposal:
Parallel Minimum Spanning Tree
Michael Choquette
I plan to write multiple parallel implementations of Minimum Spanning Tree and compare their performance on large graphs with different structures. Initial implementations will target the CPU (likely using OpenMP or pthreads), but if time permits I will also investigate GPU-based algorithms.
A spanning tree of a connected undirected graph G is a subset of n-1 of G's edges (where n is the number of vertices) that, together with the vertices of G, forms a tree. The Minimum Spanning Tree problem is: given a connected undirected graph G with edge weights, find a spanning tree of G whose total edge weight is minimal. Minimum Spanning Tree (hereafter called MST) has a host of applications, from travelling-salesman approximations to circuit design to utility networks. With this in mind, it's useful to develop MST algorithms that scale well to the amount of concurrency available on modern-day machines.

Essentially all implementations of MST work because of a theorem known as the Light Edge Theorem: partition the vertices of any connected weighted graph into two nonempty subsets, and an edge of minimum weight among those crossing the partition must be in a minimum spanning tree. This theorem implies a few easy sequential algorithms, but it hints at parallel ones as well. In particular, it suggests that edges from different parts of the graph could be selected for inclusion in the MST in parallel.
The main challenges I can foresee are:
I will be working on the Gates machines, starting from the graph I/O code we used in assignment 3. I know of various sequential implementations of MST, as well as one parallel one, from an earlier CS class (15-210), but will not be using a reference implementation or research papers. I've done research with Professor Guy Blelloch in the past, and if the project goes well I may ask him for permission to run my code on the 40-core machine he uses for his research.
I plan to achieve the following:
I hope to achieve the following:
I'll be working with straight C++ and pthreads on the CPU, possibly with OpenMP at some point. I believe this is the best choice of platform and libraries because I don't yet know which algorithms will work well, and I don't want to restrict myself to a specific paradigm until then.
Proposed Schedule

Week What We Plan To Do
Apr 1-7 Write/compare sequential implementations
Apr 8-14 Write first parallel implementation
Apr 15-21 Write second parallel implementation
Apr 22-28 Write third parallel implementation
Apr 29-May 5 Write GPU implementation
May 6-11 Schedule padding