Practical Elegance: Designing programming models for large-scale systems

Michael Isard, Microsoft Research
Monday, November 11, 4:30pm
Computer Science 105

The growing need for systems that operate on large datasets is well known, but most programming models for such systems are either too low level to appeal to non-expert programmers, or are specialized to tightly restricted problem domains. I will describe the evolution of a programming model, and its associated systems, that I have worked on at Microsoft Research. The model encourages programmers to describe an algorithm as a series of data-parallel steps, embedded within a familiar language and programming environment (C#/.NET). In order to write code this way, programmers must think indirectly about data dependencies, but can leave out the messy details of concurrency. We believe this elegantly balances our desire to insulate programmers from implementation details with the need for the system to automatically infer safe parallel and distributed execution strategies.

We have been able to design systems that take programs written in this model and execute them efficiently on large datasets and clusters of hundreds of computers. As the model has evolved, we have made it more expressive, so that it now encompasses both incremental and iterative computation, while keeping the ability to execute these richer programs efficiently, at scale. Our ultimate goal is to integrate data-parallelism seamlessly into all aspects of a general-purpose programming language.

Michael Isard started out as a computer vision researcher, but for the last few years has mostly been building distributed execution engines and thinking about how to program them. He received his D.Phil. in computer vision from the Oxford University Engineering Science Department in 1998, and worked at the Compaq Systems Research Center in Palo Alto for three years before joining Microsoft Research in Silicon Valley in 2002.