Network Musical Performance

By John Lazzaro and John Wawrzynek, CS Division, UC Berkeley.

Introduction

A Network Musical Performance (NMP) occurs when musicians in different locations interact over the Internet to perform as they would if they were in the same room.

An NMP system unavoidably introduces time delays between the musicians, due to the network latency of the links connecting the players and the local latency at each host. The total latency must be kept reasonably short for the NMP system to be usable.

However, some latency is always present in conventional musical performance: the acoustic latency due to the speed of sound.

One way to think about NMP is to ask what physical separation between players in a room would yield an acoustic latency equal to the network latency between their hosts. For example, data packets travel from the Stanford University campus to the UC Berkeley campus, located 40 miles apart, in the time it takes for sound to travel 2.4 feet.

However, the quality of NMP depends on the total system latency: network delays plus the local latency at each host. If we take local delays into account, we find a total latency between Berkeley and Stanford that corresponds to a musician separation of about 7 feet, a typical distance between two players in rehearsal.
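The distance analogy can be made concrete with a short conversion between latency and acoustic distance. This is a minimal sketch, assuming a speed of sound of about 1125 feet per second (dry air near room temperature); the constant and the function names are illustrative, and the millisecond values it prints are back-calculated from the distances quoted above rather than taken from measurements.

```python
# Assumed speed of sound in air near room temperature, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1125.0


def acoustic_distance_ft(latency_ms):
    """Distance sound travels in the given time (milliseconds)."""
    return latency_ms / 1000.0 * SPEED_OF_SOUND_FT_PER_S


def acoustic_latency_ms(distance_ft):
    """Time (milliseconds) sound needs to cover the given distance."""
    return distance_ft / SPEED_OF_SOUND_FT_PER_S * 1000.0


# Distances quoted in the text: network-only and total system latency.
for label, feet in [("network only", 2.4), ("total system", 7.0)]:
    print(f"{label}: {feet} ft is about {acoustic_latency_ms(feet):.1f} ms")
```

Under this assumption, each millisecond of latency corresponds to a little more than one foot of separation between players.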

Resiliency vs. Latency

In many networks, occasional packet delays and losses are inevitable, as other users transiently consume resources. Internet telephony copes with congestion delay by using large audio buffers at the receiver.

Buffer delay is tolerable for telephony, but is not acceptable for network musical performance. A key research issue in NMP is to design systems that handle lost and late packets gracefully, using methods that do not increase total system latency.

Gestural Coding

Concealing packet loss is easier if the musical performance is sent across the Internet at a higher level of abstraction, one that describes the physical gestures musicians use to manipulate their instruments.

Gestural data sent across the network should be tagged with timestamps and sequence numbers, and should include contextual information about recently sent gestures, so that late and lost packets can be detected and concealed.
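The following sketch illustrates how a receiver might use these three fields. It is a simplified illustration, not the coding used by our system: the GesturePacket and Receiver names, the max_age_ms threshold, and the shape of the "recent" history are all hypothetical, and the sender and receiver clocks are assumed to be synchronized. The receiver acts on each packet as it arrives instead of buffering it, so detection adds no latency.

```python
from dataclasses import dataclass


@dataclass
class GesturePacket:
    seq: int                # sequence number assigned by the sender
    timestamp_ms: float     # sender time of the gesture (clocks assumed synchronized)
    gesture: str            # e.g. "note_on C4"
    recent: tuple = ()      # (seq, gesture) pairs for recently sent gestures


class Receiver:
    def __init__(self, max_age_ms):
        self.max_age_ms = max_age_ms  # hypothetical lateness threshold
        self.next_seq = 0             # next sequence number we expect

    def receive(self, pkt, now_ms):
        """Classify one packet on arrival; returns (kind, seq, gesture) tuples."""
        if pkt.seq < self.next_seq:
            # Already handled or already concealed: arrived out of order.
            return [("stale", pkt.seq, pkt.gesture)]
        events = []
        history = dict(pkt.recent)
        # A gap in sequence numbers means packets were lost (or are very late).
        for missing in range(self.next_seq, pkt.seq):
            if missing in history:
                events.append(("recovered", missing, history[missing]))
            else:
                events.append(("lost", missing, None))
        self.next_seq = pkt.seq + 1
        # The timestamp tells us whether the gesture is still worth playing as is.
        late = now_ms - pkt.timestamp_ms > self.max_age_ms
        events.append(("late" if late else "play", pkt.seq, pkt.gesture))
        return events


rx = Receiver(max_age_ms=10.0)
print(rx.receive(GesturePacket(0, 0.0, "note_on C4"), now_ms=3.0))
# Packet 1 is lost; packet 2 carries it as context, so it can be recovered.
print(rx.receive(GesturePacket(2, 20.0, "note_on E4", ((1, "note_off C4"),)), now_ms=23.0))
```

What the receiver does with a "recovered", "lost", or "late" event (for example, ending a stuck note) is the concealment policy, which this sketch leaves open.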

We have implemented a system for network musical performance based on these ideas. In this system, the musicians play electronic instruments that produce MIDI control data. MIDI data is sent to the remote players, using a resilient coding to protect against packet loss. Audio software on each host turns both local and remote MIDI data into sound.

To Learn More

Our paper A Case for Network Musical Performance, presented at the NOSSDAV 2001 workshop, describes our NMP system in detail.

Our paper An RTP Payload Format for MIDI, presented at the 117th AES convention, describes the RTP MIDI payload format used by our NMP system.

RTP MIDI is an IETF Proposed Standard (RFC 4695 and RFC 4696). Read about RTP MIDI here.

Unfortunately, we no longer run the servers necessary to host NMP sessions, and so we have disabled networking support in sfront.

Read about related work on NMP systems at CCRMA, CNMAT, SoftSynth, and USC/ISI.

References

John Lazzaro and John Wawrzynek (2004). An RTP Payload Format for MIDI. The 117th Convention of the Audio Engineering Society, October 28-31, 2004, San Francisco, CA. [PDF].

John Lazzaro and John Wawrzynek (2001). A Case for Network Musical Performance. The 11th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV 2001), June 25-26, 2001, Port Jefferson, New York. [PDF] [ps.gz] [ps].