[talks] Jonathan Ragan-Kelley , TODAY March 22, 12:30pm

Mitra D. Kelly mkelly at CS.Princeton.EDU
Tue Mar 22 09:37:38 EDT 2016


Jonathan Ragan-Kelley, Stanford

March 22, 12:30pm

Computer Science 105

 

Title: Organizing Computation for High-Performance Visual Computing

 

Abstract:

Future visual computing applications, from photorealistic real-time
rendering, to 4D light field cameras, to pervasive sensing and computer
vision, demand orders of magnitude more computation than we currently have.
From data centers to mobile devices, performance and energy scaling is
limited by locality (the distance over which data has to move, e.g., from
nearby caches, far-away main memory, or across networks) and parallelism.
Because of this, I argue that we should think of the performance and
efficiency of an application as determined not just by the algorithm and the
hardware on which it runs, but critically also by the organization of
computations and data. For algorithms with the same complexity, even the
exact same set of arithmetic operations and data, executing on the same
hardware, the order and granularity of execution and placement of data can
easily change performance by an order of magnitude because of locality and
parallelism. To extract the full potential of our machines, we must treat
the organization of computation as a first-class concern while working
across all levels, from algorithms and data structures, to compilers, to
hardware.

 

This talk will present facets of this philosophy in systems I have built for
visual computing applications, from image processing and vision, to 3D
rendering, simulation, optimization, and 3D printing. I will show that, for
data-parallel pipelines common in graphics, imaging, and other
data-intensive applications, the organization of computations and data for a
given algorithm is constrained by a fundamental tension between parallelism,
locality, and redundant computation of shared values. I will focus
particularly on the Halide language and compiler for image processing, which
explicitly separates what computations define an algorithm from the choices
of organization which determine parallelism, locality, memory footprint, and
synchronization. I will show how this approach can enable much simpler
programs to deliver performance often many times faster than the best prior
hand-tuned C, assembly, and CUDA implementations, while scaling across
radically different architectures, from ARM cores, to massively parallel
GPUs, to FPGAs and custom ASICs.
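To make the abstract's central tension concrete, here is a small
illustrative sketch in plain Python (not Halide itself; the function names
and the toy pipeline are invented for illustration). It runs the same
two-stage algorithm, a horizontal then a vertical 3-point blur, under two
different organizations: breadth-first (compute all of the intermediate
stage, then the output; no redundant work, but the whole intermediate must
travel through memory) and fused per output row (better locality, but the
intermediate rows shared by adjacent outputs are recomputed). Both produce
identical results, which is exactly the point: the algorithm is fixed,
only the organization changes.

```python
def blur_x_row(img, y):
    """Horizontal 3-point blur of one row, clamping at the edges."""
    w = len(img[0])
    return [(img[y][max(x - 1, 0)] + img[y][x] + img[y][min(x + 1, w - 1)]) / 3
            for x in range(w)]

def breadth_first(img):
    """Compute the entire blur_x stage, then blur_y: no redundant work,
    but the full intermediate is materialized (poor locality)."""
    h, w = len(img), len(img[0])
    bx = [blur_x_row(img, y) for y in range(h)]
    return [[(bx[max(y - 1, 0)][x] + bx[y][x] + bx[min(y + 1, h - 1)][x]) / 3
             for x in range(w)]
            for y in range(h)]

def fused(img):
    """Compute blur_x on demand per output row: only three intermediate
    rows live at a time (good locality), but rows shared by adjacent
    outputs are recomputed (redundant work)."""
    h = len(img)
    out = []
    for y in range(h):
        r0 = blur_x_row(img, max(y - 1, 0))
        r1 = blur_x_row(img, y)
        r2 = blur_x_row(img, min(y + 1, h - 1))
        out.append([(a + b + c) / 3 for a, b, c in zip(r0, r1, r2)])
    return out

img = [[float((x * 7 + y * 3) % 11) for x in range(8)] for y in range(6)]
assert breadth_first(img) == fused(img)  # same algorithm, different organization
```

In Halide, choosing between these organizations (and tiled, vectorized, or
parallel variants in between) is a one-line schedule change rather than a
rewrite of the algorithm.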

 

 

Mitra Kelly

Academic Secretary

Princeton University

Computer Science Dept

35 Olden Street

Princeton NJ 08540

mkelly at cs.princeton.edu

609-258-4562

 
