Lecture videos for 6.172 Performance Engineering of Software Systems, Fall 2010. Also includes one session introducing industry mentors to 6.172.
Saman Amarasinghe, Charles Leiserson
Information about the final project, followed by a guest lecture covering fractal trees, which combine the strengths of B-trees and append-to-file. Analysis of search and insert in a simplified fractal tree.
Students vote on which ray-tracer images are accurate enough to be included in the competition, and these entries are tested against each other for speed. Students also provide feedback on the course.
Meeting for 6.172 industry mentors. Description of mentorship role, expectations, overview of course and how it fits into EECS curriculum.
Primer on ray tracing techniques, given to prepare students for the final project. Includes some ray tracing background and a code overview covering classes and high-level execution.
Lecture covering distributed systems at the cluster, data center, and planet scales. Topics include message passing, examples of the need to scale, Google's programming model, and cloud computing.
Lecture covering nondeterministic programming, including mutual exclusion, implementation of mutexes, and locking anomalies.
Lecture covering analysis of multithreaded algorithms, including divide-and-conquer recurrences, loop parallelism in Cilk++, and matrix multiplication and merge sort examples.
The first part of the lecture covers parallelism analysis, caches, and synchronization correctness. The second focuses on compiler optimization questions: is the optimization legal, faster, and automatic?
Guest lecture by Ravi Soundararajan of VMware, covering ten case studies in performance engineering.
Discussion of project 3 beta. Lecture covering multicore programming, including shared-memory hardware, concurrency platforms, and race conditions.
Lecture covering the impact of synchronization and memory on parallel performance, using OpenMP instead of Cilk. Topics include granularity of parallelism, true and false sharing, and load balancing.
Lecture covering dynamic storage allocation, including reference counting, a graph abstraction, and updating pointers.
Lecture covering synchronizing without locks, including memory consistency, lock-free protocols, the ABA problem, and reducer hyperobjects.
Lecture covering cache-efficient algorithms, with tiled and recursive matrix multiplication examples.
Lecture covering parallelism, scheduling theory, the Cilk++ runtime system, and Cilk chess programs.
Basic performance engineering. Bentley's rules (modifying data, modifying code) and the traveling salesman problem.
Lecture covering memory systems, including cache concepts, access pattern concepts, data structure transformations, and computation transformations.
Lecture covering compiler hacks, when to optimize, data-flow analysis and optimizations, and instruction scheduling. Discussion of quiz, including common mistakes.
Lecture covering bit hacks. Examples include swap, population count, and the n-queens problem, used to highlight backtracking search and bitvector representation.
Discussion of project 2.2 beta, how pair programming can lead to less time debugging, and importance of working well in groups. Lecture covering more cache-efficient algorithms, including a heat diffusion simulation.
Overview of computer architecture, profiling a program, and a set of example programs. Student performance on project 1 is also discussed.
Theory and background of profiling tools. Two interactive walkthroughs: matrix multiply and branchless sorting.
Lecture covering single threaded performance and assembly language. Discussion of project results, emphasizing importance of writing tests first and pair programming.
Introduction to course, administrative information. Lecture covering matrix multiply as a case study, including matrix representation, performance counters, instruction level optimizations, and parallel execution.