The Xiph.Org Foundation just announced their latest improvement to the Opus audio codec with the release of their libopus 1.2 encoder. With this release, Xiph.Org has managed to make Opus usable for fullband stereo audio at just 32 kb/s, which will pair well with the upcoming royalty-free AV1 video format in the WebM container to bring higher quality audio and video to slower connections.
For those of you who are not familiar with the format, Opus is an IETF-standard, royalty-free audio codec that came about by merging the Xiph.Org Foundation’s CELT codec and Skype’s SILK codec, in an attempt to create one royalty-free format for all lossy audio. It was designed to scale well with changing bitrates, to require extremely low throughput, and to be encoded and decoded with very little processing power, all of which are critical for video conferencing, mobile streaming, and other real-time audio applications. In the 5 years since Opus was standardized, it has already spread throughout the web, with adoption by streaming services, IP phones, media players, and others.
Opus 1.2 brings substantial improvements to both music quality and speech quality. As mentioned above, Opus is now usable for fullband stereo audio at just 32 kb/s, something that was thought to be unachievable just a few years ago. The enhancements in libopus 1.2 enable the use of VBR encoding at 32 kb/s, which was previously avoided because of the mistaken impression that it would damage audio quality at extremely low bitrates, a problem Opus is able to avoid.
Audio demos, music at 32 kb/s: Opus 1.2, MP3 (LAME 3.99.5 VBR), Opus 1.1, and Opus 1.0, plus a high-bitrate MP3 reference (source file converted from FLAC).

Opus 1.2 also brings speech quality to the point where it is usable for fullband speech at just 14 kb/s, down from 21 kb/s in Opus 1.1 and 29 kb/s in Opus 1.0. This is due in part to improvements to Opus’ hybrid mode, which uses SILK for frequencies below 8 kHz and CELT for frequencies from 8 kHz to 20 kHz. The tuning done in libopus 1.2 allows it to use CELT and SILK together at bitrates as low as 16 kb/s, which is half the previous limit of 32 kb/s.
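To put the bitrates quoted above in perspective, here is a minimal sketch that converts them into the average number of bytes available per audio frame. The 20 ms frame duration is an assumption made for illustration (the article does not state one), and the function name `bytes_per_frame` is hypothetical, not part of libopus.

```python
def bytes_per_frame(bitrate_bps: int, frame_ms: float = 20.0) -> float:
    """Average payload size, in bytes, of one frame at the given bitrate.

    frame_ms defaults to 20 ms, which is an assumption for this sketch.
    """
    # bits per frame = bitrate * frame_ms / 1000; divide by 8 for bytes.
    return bitrate_bps * frame_ms / 8000.0


# Bitrates taken from the article.
quoted_bitrates = [
    ("fullband stereo music", 32_000),
    ("fullband speech", 14_000),
    ("lower limit for hybrid mode", 16_000),
]

for label, bitrate in quoted_bitrates:
    size = bytes_per_frame(bitrate)
    print(f"{label}: {bitrate // 1000} kb/s is about {size:.0f} bytes per frame")
```

Under that assumption, fullband stereo music at 32 kb/s has to fit into roughly 80 bytes per frame on average, which gives a sense of how little data the encoder has to work with.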
Audio demos, speech: Opus 1.2 and Speex at 12 kb/s; Opus 1.2, Speex, Opus 1.1, and Opus 1.0 at 16 kb/s; plus a high-bitrate MP3 reference (source file converted from FLAC).

One interesting thing to note is that this improvement cannot be attributed to any single major change. While Opus 1.1’s improvements came primarily from a small selection of changes, Opus 1.2 is the result of iterative development and a plethora of minor tweaks that added up to a massive improvement.
Despite those substantial quality improvements, work on libopus has actually resulted in Opus requiring even less processing power than it previously did. Opus was already a market leader in terms of how little processing power it used, but with the 1.2 update to libopus you can decode 128 kb/s fullband stereo music in real time with just ~11 MHz of processing power on an Intel Haswell CPU in floating-point mode (or just ~33 MHz on an ARM Cortex-A53 in fixed-point mode), and 12 kb/s wideband mono speech with just ~2 MHz on an Intel Haswell CPU in floating-point mode (or just ~6 MHz on an ARM Cortex-A53 in fixed-point mode). Similarly, the cost of encoding has also decreased for most situations, with some of the more extreme cases being cut roughly in half (such as encode complexity 5 for 128 kb/s fullband stereo music on an Intel Haswell CPU in floating-point mode, which dropped from ~40 MHz with libopus 1.0 to just ~21 MHz with libopus 1.2).
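As a rough illustration of what those MHz figures mean in practice, the sketch below expresses each one as a share of a single CPU core. The clock speeds (3.0 GHz for the Haswell, 1.2 GHz for the Cortex-A53) are assumptions chosen only for this example and do not come from the Opus announcement; the helper names are hypothetical as well.

```python
# Assumed clock speeds, for illustration only (not from the announcement).
HASWELL_CLOCK_MHZ = 3000.0
CORTEX_A53_CLOCK_MHZ = 1200.0


def core_share_percent(required_mhz: float, clock_mhz: float) -> float:
    """Percentage of one core consumed by a task needing required_mhz."""
    return 100.0 * required_mhz / clock_mhz


def reduction_percent(before_mhz: float, after_mhz: float) -> float:
    """Relative drop in processing cost between two versions."""
    return 100.0 * (before_mhz - after_mhz) / before_mhz


# (description, MHz quoted in the article, assumed clock in MHz)
decode_figures = [
    ("128 kb/s stereo music, Haswell, float", 11.0, HASWELL_CLOCK_MHZ),
    ("128 kb/s stereo music, Cortex-A53, fixed", 33.0, CORTEX_A53_CLOCK_MHZ),
    ("12 kb/s mono speech, Haswell, float", 2.0, HASWELL_CLOCK_MHZ),
    ("12 kb/s mono speech, Cortex-A53, fixed", 6.0, CORTEX_A53_CLOCK_MHZ),
]

for label, mhz, clock in decode_figures:
    share = core_share_percent(mhz, clock)
    print(f"decode {label}: ~{mhz:.0f} MHz = {share:.2f}% of one core")

# Encoder cost at complexity 5, libopus 1.0 vs 1.2 (figures from the article).
print(f"encode cost reduction: {reduction_percent(40.0, 21.0):.1f}%")
```

With these assumed clocks, every decode case uses well under 5% of a single core, and the quoted drop in encoder cost from ~40 MHz to ~21 MHz works out to a reduction of a little under half.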
The continuing development of Opus with libopus 1.2 is exciting to see, and hopefully we will see Opus continue to gain adoption as time goes on. Royalty-free codecs are crucial to the development of an open and interoperable internet. They are the only codecs that can be implemented on all devices, as patent-encumbered codecs frequently run into showstopping issues, ranging from content distributors and streaming services not wanting to pay the exorbitant licensing fees that some demand, to open-source software frequently being unable to guarantee proper licensing on behalf of its users, or even being completely unable to integrate such codecs without violating its own licensing terms. These problems cause fragmentation instead of collaboration, as different groups create and implement their own codecs in order to avoid the licensing fees and other problems that patent-encumbered codecs bring. The result is codecs that require specific browsers, operating systems, and/or hardware to use, and which can completely lock large swaths of users out of certain content. The only way for a truly universal codec to emerge is if it is royalty-free, and widespread adoption of the few codecs in use is vital to a healthy internet where all users have the capability to access any content. Open standards are the only way to guarantee a consistent user experience across the market, and it is fantastic when the royalty-free option is also the best one.