In this talk, I will present recent work on
incorporating analysis into the ChucK music programming language. A consequence
of this project is that composers and musicians can perform live audio analysis,
even coding on the fly, using a powerful and flexible programming language
framework, and then use the output of analysis tasks to drive synthesis
parameters or to provide features for training and running classifiers in
real time. This work also provides music information retrieval researchers with
a tool for fast prototyping of music analysis and modeling algorithms, which can
be incorporated into live performance settings as well as off-line music
analysis. As such, this project draws heavily on prior work in music information
retrieval (MIR), applied machine learning, human-computer interaction, and
computer music systems and performance practices.
In the past decade, MIR has blossomed as a
research area, drawing on disciplines including machine learning, library
science, signal processing, HCI, systems design, and information retrieval.
Popular research problems include learning semantically meaningful
categorizations of acoustic or symbolic data (e.g., genre labels), extracting
musically or cognitively salient features from audio (e.g., pitch and harmony
transcription, audio source separation), and providing new interfaces and
systems for people to search for music and organize large music collections
(e.g., query-by-humming, content-based playlist generation). Many of these
problems have been addressed in recent years using standard or novel machine
learning and modeling techniques.
Since the 1950s, a variety of programming frameworks and
languages have been developed for creating music.
These languages are used in live musical performance as well as in interactive
sound and art installations. Recently, languages such as ChucK have enabled
performers to write and modify code during performances, a practice known as
live coding. Most music programming languages offer tight control over sound
synthesis and modification and draw on established metaphors for exposing this
control. These languages tend not to offer much support for analysis tasks
(e.g., feature extraction); when musicians do integrate analysis into real-time
performance (e.g., pitch tracking), they typically rely on customized,
task-specific systems.
Computer music and MIR have proceeded as largely
independent fields, despite the potential benefits of cross-pollination. Many MIR
systems would be appropriate and useful in a performance context, and MIR
research would be enriched by a consideration of the goals and constraints
unique to live music, but no existing MIR or computer music framework is
suited to both performance and analysis. Therefore, we have worked to add
analysis capabilities to ChucK, lowering the barrier for MIR researchers who
wish to prototype or port algorithms to a real-time performance context, and
for computer musicians who wish to incorporate such algorithms into their
performances. The first step of this work was to design new classes and data flow
models for analysis that were in harmony with ChucK's existing conventions for
syntax, synchronization, objects, and controlling events in time. We were then
able to implement common analysis-driven synthesis tasks and extractors for
standard MIR features. Following this, we incorporated the ability to perform
“on-the-fly” machine learning using the extracted features. At present, several
standard computer music and MIR tasks have been implemented using ChucK's
growing collection of feature extractors and classifiers.
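To give a flavor of this style, the following minimal sketch connects the
microphone input through an FFT to a spectral centroid extractor and polls it
at a regular hop, using ChucK's unit-analyzer syntax (the =^ connection and
the FFT, Centroid, and UAnaBlob classes are part of the public distribution;
the particular window size and hop here are illustrative choices, not
requirements):

    // microphone input through an FFT into a centroid extractor;
    // blackhole pulls audio through the patch without making sound
    adc => FFT fft =^ Centroid cent => blackhole;

    // set analysis parameters
    1024 => fft.size;
    Windowing.hann( 1024 ) => fft.window;

    while( true )
    {
        // trigger analysis at precisely this point in time
        cent.upchuck() @=> UAnaBlob blob;

        // the result could drive a synthesis parameter or be
        // appended to a feature vector for a classifier
        <<< "centroid:", blob.fval(0) >>>;

        // advance time by a half-window hop
        ( fft.size() / 2 )::samp => now;
    }

In the same style, extracted values can be accumulated into feature vectors and
used to train and run classifiers on the fly.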
We hope this work will open new
avenues for both artistic expression and technical exploration. We have released
the new version of ChucK, with its analysis capabilities, to the public, and we
plan to take advantage of these capabilities in future compositions for the
Princeton Laptop Orchestra. We are also excited to explore the practical issues
raised by “on-the-fly” learning, including the computational constraints of
real-time performance. Next steps in this work include efficiency-driven
refinements to the language and new infrastructure to accommodate more
sophisticated musical modeling tasks.