In this talk, I will present recent work on
incorporating analysis into the ChucK music programming language. A consequence
of this project is that composers and musicians can perform live audio analysis,
even coding on the fly, using a powerful and flexible programming language
framework, and then use the output of analysis tasks to drive synthesis
parameters or to provide features for training and running classifiers in
real time. This work also provides music information retrieval researchers with
a tool for fast prototyping of music analysis and modeling algorithms, which can
be incorporated into live performance settings as well as off-line music
analysis. As such, this project draws heavily on prior work in music information
retrieval (MIR), applied machine learning, human-computer interaction, and
computer music systems and performance practices.
In the past decade, MIR has blossomed as a
research area, drawing on disciplines including machine learning, library
science, signal processing, HCI, systems design, and information retrieval.
Popular research problems include learning semantically meaningful
categorizations of acoustic or symbolic data (e.g., genre labels), extracting
musically or cognitively salient features from audio (e.g., pitch and harmony
transcription, audio source separation), and providing new interfaces and
systems for people to search for music and organize large music collections
(e.g., query-by-humming, content-based playlist generation). Many of these
problems have been addressed in recent years using standard or novel machine
learning and modeling techniques.
Since the 1950s, a variety of programming frameworks and
languages have been developed for creating music.
These languages are used in live musical performance as well as in interactive
sound and art installations. Recently, languages such as ChucK have enabled
performers to write and modify code during performances, a practice known as
live coding. Most music programming languages offer tight control over sound
synthesis and modification and draw on established metaphors for exposing this
control. These languages tend not to offer much support for analysis tasks
(e.g., feature extraction); when musicians do integrate analysis into real-time
performance (e.g., pitch tracking), they typically rely on customized,
task-specific systems.
Computer music and MIR have proceeded as largely
independent fields, despite the potential benefits of cross-pollination. Many MIR
systems would be appropriate and useful in a performance context, and MIR
research would be enriched by a consideration of the goals and constraints
unique to live music, but no existing MIR or computer music framework is
suited to both performance and analysis. Therefore, we have worked to add
analysis capabilities to ChucK, lowering the barrier for MIR researchers who
wish to prototype or port algorithms to a real-time performance context, and
for computer musicians who wish to incorporate such algorithms into their
performances. The first step of this work was to design new classes and data flow
models for analysis that were in harmony with ChucK's existing conventions for
syntax, synchronization, objects, and controlling events in time. We were then
able to implement common analysis-driven synthesis tasks and extractors for
standard MIR features. Following this, we incorporated the ability to perform
“on-the-fly” machine learning using the extracted features. At present, several
standard computer music and MIR tasks have been implemented using ChucK's
growing collection of feature extractors and classifiers.
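To give a flavor of this style, the following minimal sketch connects the
microphone input through an FFT to a spectral centroid extractor and polls it
at a regular hop, using ChucK's unit-analyzer syntax (the =^ connection and
the FFT, Centroid, and UAnaBlob classes are part of the public distribution;
the particular window size and hop here are illustrative choices, not
requirements):

    // microphone input through an FFT into a centroid extractor;
    // blackhole pulls audio through the patch without making sound
    adc => FFT fft =^ Centroid cent => blackhole;

    // set analysis parameters
    1024 => fft.size;
    Windowing.hann( 1024 ) => fft.window;

    while( true )
    {
        // trigger analysis at precisely this point in time
        cent.upchuck() @=> UAnaBlob blob;

        // the result could drive a synthesis parameter or be
        // appended to a feature vector for a classifier
        <<< "centroid:", blob.fval(0) >>>;

        // advance time by a half-window hop
        ( fft.size() / 2 )::samp => now;
    }

In the same style, extracted values can be accumulated into feature vectors and
used to train and run classifiers on the fly.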
We hope this work will open new
avenues for both artistic expression and technical exploration. We have released
the new version of ChucK, with its analysis capabilities, to the public, and we
plan to take advantage of these capabilities in future compositions for the
Princeton Laptop Orchestra. We are also excited to explore the practical issues
raised by “on-the-fly” learning, including the computational constraints of
real-time performance. Next steps in this work include efficiency-driven
refinements to the language and new infrastructure to accommodate more
sophisticated musical modeling tasks.