Hi, my name is R.R. Perkins.

I’ve been an acoustic engineer, sound designer, and audio producer for the last fifteen-plus years.

I hope this blog gives you some valuable insight into the potential of spatial audio within emerging immersive AR/VR technologies, and that you enjoy some of the content.

The Effects of Bitrate Per Channel & Compression on Localization Accuracy

When a high bit rate optimized for spatial resolution is paired with an inefficient compression scheme, the results can be quite disappointing…

A conference paper (from a collaborative team of researchers at the Dublin Institute of Technology and Google) presented at the 2017 23rd International Conference on Virtual System & Multimedia (VSMM) discusses the difficulties of “delivering a 360-degree soundscape that matches full sphere visuals” which “take into account the azimuth and elevation of sound sources, portraying source location above and below as well as around the horizontal plane of the listener.”

From Fig. 3. showing the Listening Quality MUSHRA scores aggregated by encoding scheme.

From Fig. 4. showing the Localization Accuracy MUSHRA scores aggregated by encoding scheme.

This is an essential conversation to have, because streaming ambisonics across multiple platforms is currently often quite inefficient without a robust processing system to troubleshoot every step of the way prior to code injection. More video editing platforms, such as Final Cut Pro X, need to offer support for “efficient encoding techniques that compress the raw audio content without compromising quality of experience.”

As the two figures above show, HOA512 (third-order ambisonics) and FOA128 (first-order ambisonics) are the two standout encoding schemes that perform well in regard to listening quality. When it comes to localization accuracy, none of the other encoding schemes hold a candle to HOA512. Third-order ambisonics is likely the near future of professional ambisonics. As of the publishing of this paper, the researchers hope “these results will be used to validate and optimize work currently underway to develop a full reference objective spatial audio quality metric.” Overall, a really simple, straightforward, and good reference going forward.
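To put the "bitrate per channel" idea in concrete terms, here is a rough Python sketch. It relies on the standard fact that a full-sphere ambisonic signal of order N carries (N + 1)² channels. It also assumes that the number in a label such as HOA512 or FOA128 is the total stream bitrate in kbps; that reading is my assumption, not something stated in this post. The function names are made up for illustration.

```python
def ambisonic_channels(order: int) -> int:
    """Number of channels in a full-sphere ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("ambisonic order must be non-negative")
    return (order + 1) ** 2


def bitrate_per_channel(total_kbps: float, order: int) -> float:
    """Average bitrate available to each channel, in kbps."""
    return total_kbps / ambisonic_channels(order)


# Assumption: the label suffix is the total bitrate in kbps.
schemes = {
    "FOA128": (1, 128),  # first-order, 128 kbps total (assumed)
    "HOA512": (3, 512),  # third-order, 512 kbps total (assumed)
}

for name, (order, total_kbps) in schemes.items():
    channels = ambisonic_channels(order)
    per_channel = bitrate_per_channel(total_kbps, order)
    print(f"{name}: {channels} channels, {per_channel:.1f} kbps per channel")
```

Under that assumption both schemes work out to the same average budget per channel, so the higher total bitrate of the third-order stream pays for more channels (and therefore more spatial resolution) rather than more bits per channel.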

You can view the full article here.

Narbutt, M., O'Leary, S., Allen, A., Skoglund, J., & Hines, A. (2017). Streaming VR for immersion: Quality aspects of compressed spatial audio. 2017 23rd International Conference on Virtual System & Multimedia (VSMM), 1-6.




Newly Standardized Immersive Sound Extension Tools for MPEG‐H 3D

New 3D Coding Approach | Concentric Spheres Spatial Quantization (CSSQ) Method
