RTP MIDI: An RTP Payload Format for MIDI

By John Lazzaro and John Wawrzynek, CS Division, UC Berkeley.

RTP MIDI

Internet telephony and video-conferencing programs send audio and video over the net using the Real-time Transport Protocol (RTP). RTP is an Internet Engineering Task Force (IETF) standard, whose payload formats are developed in the Audio-Video Transport payload working group (payload).

We have worked within AVT-payload to standardize RTP MIDI, a payload format for sending MIDI over networks using RTP. MIDI is a standard for coding the gestures of musical performance (pressing piano keys, striking drum pads, moving faders, etc.).
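As a concrete illustration of how MIDI codes a gesture, the sketch below builds the three-byte MIDI 1.0 channel messages for pressing and releasing a key. The helper names `note_on` and `note_off` are hypothetical, chosen for this example. The byte layout (a status byte of 0x90 or 0x80 combined with the channel number, followed by a 7-bit note number and a 7-bit velocity) follows the MIDI 1.0 specification.

```python
def _channel_message(status: int, channel: int, data1: int, data2: int) -> bytes:
    """Build a 3-byte MIDI channel voice message (hypothetical helper)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not (0 <= data1 <= 127 and 0 <= data2 <= 127):
        raise ValueError("data bytes must be in 0..127")
    # The high nibble of the status byte is the command, the low nibble the channel.
    return bytes([status | channel, data1, data2])


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Gesture: a key is pressed with the given velocity."""
    return _channel_message(0x90, channel, note, velocity)


def note_off(channel: int, note: int, velocity: int = 64) -> bytes:
    """Gesture: a key is released (64 is used here as a neutral release velocity)."""
    return _channel_message(0x80, channel, note, velocity)


if __name__ == "__main__":
    # Middle C (note 60) pressed and released on the first channel.
    print(note_on(0, 60, 100).hex(" "))   # 90 3c 64
    print(note_off(0, 60).hex(" "))       # 80 3c 40
```

If the second message never reaches a synthesizer, the note keeps sounding, which is why losing MIDI data on a network is more serious than losing a single audio frame.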

RFC 6295 normatively defines the RTP MIDI payload format. RFC 4696 is an implementation guide for RTP MIDI. The RFCs were developed in cooperation with the MIDI Manufacturers Association (MMA) and the Moving Picture Experts Group (MPEG).

RTP MIDI is able to send MIDI over a "lossy" network (a network that loses packets). To prevent "stuck notes" and other artifacts, RTP MIDI uses a feed-forward resiliency system (the recovery journal) to recover from packet loss.
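To make the "stuck note" problem concrete, the following is a toy simulation, not the RFC 6295 wire format. It assumes a hypothetical packet shape in which each packet carries its new MIDI events plus a "journal" snapshot of the notes that should be sounding just before those events. The real recovery journal is a far more compact, structured encoding defined in RFC 6295. The names `Packet`, `Receiver`, `make_stream`, and `play` are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int             # sequence number, as in an RTP header
    events: tuple        # (("on" | "off", note), ...) carried by this packet
    journal: frozenset   # ASSUMPTION: notes sounding before this packet's events


def make_stream(event_lists):
    """Sender side: attach a journal snapshot to every outgoing packet."""
    sounding, packets = set(), []
    for seq, events in enumerate(event_lists):
        journal = frozenset(sounding)
        for kind, note in events:
            if kind == "on":
                sounding.add(note)
            else:
                sounding.discard(note)
        packets.append(Packet(seq, tuple(events), journal))
    return packets


class Receiver:
    def __init__(self, use_journal: bool):
        self.use_journal = use_journal
        self.expected_seq = 0
        self.sounding = set()

    def receive(self, packet: Packet) -> None:
        if self.use_journal and packet.seq != self.expected_seq:
            # A gap in sequence numbers means packets were lost. Repair local
            # state from the journal alone; no retransmission is requested.
            self.sounding = set(packet.journal)
        for kind, note in packet.events:
            if kind == "on":
                self.sounding.add(note)
            else:
                self.sounding.discard(note)
        self.expected_seq = packet.seq + 1


def play(packets, lost, use_journal):
    """Deliver every packet whose seq is not in `lost`; return notes left sounding."""
    rx = Receiver(use_journal)
    for p in packets:
        if p.seq not in lost:
            rx.receive(p)
    return rx.sounding


if __name__ == "__main__":
    stream = make_stream([[("on", 60)], [("off", 60)], [("on", 64)], [("off", 64)]])
    print(sorted(play(stream, lost={1}, use_journal=False)))  # [60]  (stuck note)
    print(sorted(play(stream, lost={1}, use_journal=True)))   # []    (recovered)
```

The point of the feed-forward design is visible in `Receiver.receive`: the repair uses only information in the next packet that arrives, so recovery adds no round-trip delay.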

We anticipate three major application areas for RTP MIDI:

  • MIDI over wired and wireless LANs. RTP MIDI may be used to send real-time MIDI streams over wired and wireless Local Area Networks (LANs). Apple uses RTP MIDI as the transport layer for the MIDI Network Driver, included in OS X and in iOS (the operating system for the iPhone, the iPad, and the iPod Touch).
  • Network Musical Performance. VoIP and videoconferencing applications may add support for network musical performance via RTP MIDI. In a network performance, musicians in different physical locations interact over a network to perform as they would if they were in the same room.
  • Content Streaming. Content streams may begin to use MIDI for low-bitrate music coding, perhaps in conjunction with normative sound synthesis methods such as Structured Audio. Applications include Internet broadcasting, multimedia presentations, and telephony audicons and ring tones.

To Learn More

Implementors should refer to RFC 6295 and RFC 4696 for the final version of RTP MIDI.

RFC 6295 was approved in 2011, and fixes many document errors in the first RTP MIDI RFC (RFC 4695). See Section 12 of RFC 6295 for a complete change log. Errors in RFC 4696 are documented on its errata page.

The paper presented at the 117th AES Convention (Lazzaro and Wawrzynek, 2004; see References) is a good introduction to how RTP MIDI works and how it fits into the IETF media protocol stack. The protocol it discusses is a snapshot of RTP MIDI as it existed in October 2004.

In network musical performance applications, one cause for concern is the latency between performers. The paper presented at the NOSSDAV 2001 conference (Lazzaro and Wawrzynek, 2001; see References) discusses latency and other issues in network musical performance, in the context of an application that uses a proto-version of RTP MIDI as the network transport.

Implementations

Apple uses RTP MIDI as the transport layer for the MIDI Network Driver that ships in Mac OS X and iOS.

See this Sound on Sound magazine article for a comprehensive guide to using Apple's MIDI Network Driver on OS X. A shorter introduction to the topic is presented in this article.

iOS app developers can access RTP MIDI via the MIDINetworkSession Class. Hundreds of iOS apps use RTP MIDI; one example is AC-7.

Tobias Erichsen has created a MIDI Network Driver for Windows that can interoperate with Apple's RTP MIDI implementation. His driver is free for private, non-commercial use, and is available for download here.

Kiss-Box manufactures Ethernet networking hardware that interoperates with Apple's RTP MIDI implementation. The Kiss-Box RTP MIDI stack was developed by Benoit Bouchez, who also develops embedded implementations of RTP MIDI on a consulting basis (email: beb [dot] digitalaudio [at] free [dot] fr).

nmj is a Java library that lets developers write Android apps that interoperate with Apple's RTP MIDI implementation. TouchDAW is an example of an Android app that is based on nmj technology.

Jim Young has written an RTP MIDI stack for Windows 8 using the WinRT API (video).

Wireshark now includes an RTP MIDI dissector, written by Tobias Erichsen, that interoperates with Apple's RTP MIDI implementation.

MidiShare, a realtime operating system for musical applications, includes an RTP MIDI library in its development branch.

The (unofficial) reference implementation for RTP MIDI is the network stack in sfront, an MPEG-4 Structured Audio decoder.

Networking is no longer enabled in the sfront distribution, because we no longer host the required network services. However, the networking source code still ships in the distribution. Developers wishing to examine the network code can download sfront here, and follow these instructions for locating the network source code. Alternatively, we offer a smaller distribution that contains only the network source code (click here to download). Note that the network code (and sfront itself) is BSD-licensed.

References

John Lazzaro and John Wawrzynek (2011). RTP Payload Format for MIDI. RFC 6295, IETF Proposed Standard Protocol [document].

John Lazzaro and John Wawrzynek (2006). An Implementation Guide for RTP MIDI. RFC 4696, IETF Standards-Track (Informative) [document] [errata].

John Lazzaro and John Wawrzynek (2006). RTP Payload Format for MIDI. RFC 4695, IETF Proposed Standard Protocol [document] [errata]. Obsoleted by RFC 6295.

John Lazzaro (2006). Framing RTP and RTCP Packets over Connection-Oriented Transport. RFC 4571, IETF Proposed Standard Protocol [document].

John Lazzaro and John Wawrzynek (2004). An RTP Payload Format for MIDI. The 117th Convention of the Audio Engineering Society, October 28-31, 2004, San Francisco, CA. [PDF].

John Lazzaro and John Wawrzynek (2001). A Case for Network Musical Performance. The 11th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV 2001), June 25-26, 2001, Port Jefferson, New York. [PDF].