Effectively Using Spatial Audio - Interview with Michael Wohl

Photographers and videographers often say that audio is the most important element of their work. How does this translate into the world of 360° video? The new RICOH THETA V boasts 4 microphones internal to the camera, and spatial audio is a key feature. When used with earphones or with a VR headset, the directionality of audio significantly enhances the overall experience.

To learn how to use spatial audio effectively, we turned to award-winning filmmaker Michael Wohl for more details.

Michael has over 15 years of experience writing, directing, and editing independent films. He helped develop the Emmy award-winning Final Cut Pro while at Apple, and taught at UCLA’s School of Film and Television. He has presented at film festivals and conferences including Sundance Film Festival, South by Southwest, DV Expo, and many others. In his recent book The 360° Video Handbook, he brings the professional video producer’s perspective to 360° video.

We’re planning to give away a free copy of Michael’s book, The 360° Video Handbook.

Sign up for your chance to win

Wohl pictured here with his demanding Hollywood agent

What are your main tips for positioning the 360° camera for best audio results? Are the camera and the mic in the same location?

The ambisonic mic should absolutely be placed as close to the camera as possible. That audio will create a soundscape that matches the directionality of the 360° video. So the sounds from objects that make noise will naturally seem to originate in the correct location.

One critical tip is that it’s very important to note which direction the mic is facing, so that if you make any adjustments to the center point of the video when editing, you can ensure that the audio is rotated as well.
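As a rough illustration of what "rotating the audio" means, here is a minimal Python sketch that applies a yaw rotation to a single first-order ambisonic sample. It assumes the AmbiX channel order (W, Y, Z, X) with X pointing forward, Y pointing left, and Z pointing up; the function name `rotate_yaw` is hypothetical, and the sign convention for the angle can differ between editing tools, so check which direction your software treats as positive.

```python
import math

def rotate_yaw(w, y, z, x, yaw_degrees):
    """Rotate one first-order AmbiX sample (W, Y, Z, X) about the vertical axis.

    Assumption: positive yaw turns the sound field counterclockwise
    (toward the left) when seen from above.
    """
    theta = math.radians(yaw_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x_rot = x * cos_t - y * sin_t
    y_rot = x * sin_t + y * cos_t
    # W (omnidirectional) and Z (up/down) are unaffected by a yaw rotation.
    return w, y_rot, z, x_rot

# A sound straight ahead (X = 1, Y = 0), rotated by 90 degrees, moves to the left.
w, y, z, x = rotate_yaw(1.0, 0.0, 0.0, 1.0, 90.0)
```

In practice the same rotation would be applied to every sample of the recording, using the same angle by which the video's center point was shifted.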

In your webinar on 360° Video Production, you mentioned that the camera is a head. If people are sitting at a dinner table, the camera is one of the people. Does this change with spatial audio?

Recording spatial (ambisonic) audio is a powerful tool for creating an immersive audio mix, but for best results you’ll want to combine the output of your ambisonic mic with the output of traditional mono mics like wireless lavaliers and plant (hidden) mics.

A spatial mic creates a great directional background, but it’s not going to be the best mic for capturing high-quality audio for voices or important sound effects. As mentioned above, it must be placed as close to the camera as possible (so the directions and distances of sounds match those of the visuals), but this means that by definition it can’t be very close to the source of specific sounds in the location. If it’s locked to the camera position, it also can’t follow moving sound sources without the sound quality changing.

Just like in traditional video, you should always try to position mics as close as possible to the source of the sound (especially human voices, but also any kind of sound effects such as a whirring espresso machine in a cafe or a tractor’s blades cutting through the soil on a farm).

Putting the mic close to those sound sources ensures a clean, wide-spectrum recording with minimal reverb. Those sound elements can be mixed so they appear to originate in the correct part of the sphere during post production.
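To make the idea of placing a mono recording "in the correct part of the sphere" more concrete, here is a minimal Python sketch of first-order ambisonic panning. It assumes AmbiX output (channel order W, Y, Z, X with SN3D normalization), azimuth measured counterclockwise from straight ahead, and elevation measured upward from the horizon. The function name `encode_mono_ambix` is hypothetical; spatial audio plugins handle this step for you, usually with more refinement.

```python
import math

def encode_mono_ambix(sample, azimuth_degrees, elevation_degrees):
    """Pan one mono sample to a direction, returning (W, Y, Z, X).

    Assumptions: azimuth 0 is straight ahead and 90 is directly left;
    elevation 0 is the horizon and 90 is straight up.
    """
    az = math.radians(azimuth_degrees)
    el = math.radians(elevation_degrees)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left/right
    z = sample * math.sin(el)                 # up/down
    x = sample * math.cos(az) * math.cos(el)  # front/back
    return w, y, z, x

# A lavalier recording placed directly to the viewer's left, at ear height.
left_voice = encode_mono_ambix(1.0, 90.0, 0.0)
```

Each encoded source can then be summed, channel by channel, with the recording from the ambisonic mic.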

What’s the best length of a 360° video with spatial audio on YouTube? Will the viewer/listener get tired of moving around?

I don’t think the presence or absence of spatial audio has much impact on the length of a video. If anything, having a natural, well-mixed audio track will make the viewing experience more immersive and thus less taxing to watch.

In your webinar, you mentioned that with visual cues, a 15° turn was good. People don’t like to turn their heads too much. For spatial audio, are there some guidelines? 45°?

Spatial audio is always going to be 360°, and sounds should come from wherever they would naturally occur within the scene. If you’ve got sounds in your piece that are going to cause the viewer to turn and look (presumably to see some specific subject within the video) then trying to keep the movement to the same 15° - 20° from the center point is going to prevent the viewer from getting frustrated by having to keep looking over their shoulder.

The key is to create a rich, immersive soundscape that incorporates the primary audio for your piece (voices, etc.), any practical sound effects, ambiances, and sound design elements, and finally any music or score. I discuss this whole sound design process (for 360°) in great detail in my book: The 360° Video Handbook.

Hearing about spatial audio, people generally think of left and right. Do you have tips for the audio coming from an up and down direction? It seems like people would be more willing to look up and down.

I think the best way to think about spatial audio is to simply close your eyes and pay attention to where sounds are coming from around you. Plenty of sounds may originate above you (e.g., birds or airplanes outside, air conditioning systems indoors).

Remember that sounds are constantly bouncing off of the surfaces around you, so even a sound that is coming from directly in front of you is naturally reverberating off the floor and ceiling (and walls) too. You can add artificial reverb to a sound to simulate this effect, but one of the best aspects of recording with an ambisonic mic is that you are recording the actual sound waves as they come at you from all those different directions.

How do you mix background music into spatial audio? Does the background music have a source?

With rare exception, I think that music should be mixed so it is not coming from a particular place in the room. Turning your head should not change where the music appears to originate (unless it’s purposefully sourced—like coming from a bluetooth speaker or radio that’s visible in the scene).
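One simple way to keep music from having a location in a first-order ambisonic mix is to send it only to the omnidirectional W channel, leaving the directional channels untouched. The Python sketch below illustrates that idea under the assumption of AmbiX channel order (W, Y, Z, X); the function name `add_omni_music` and the default gain are hypothetical, and formats that carry a separate stereo pair would normally use that pair for music instead.

```python
def add_omni_music(bed_frame, music_sample, gain=0.5):
    """Mix a mono music sample into one AmbiX frame (W, Y, Z, X).

    The music is added to W only, so it carries no directional
    information and does not move when the sound field is rotated.
    """
    w, y, z, x = bed_frame
    return (w + gain * music_sample, y, z, x)

# The ambisonic bed keeps its directional channels; only W changes.
mixed = add_omni_music((0.25, 0.5, 0.0, -0.5), 0.5)
```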

How you mix it in depends somewhat on the specific software you use and the way your final spatial audio is to be encoded (AmbiX, FuMa, Dolby Atmos, etc.). Some encoding formats include a pair of stereo channels (in addition to the spatialized channels), while others require you to just direct the music source to come out equally from all directions. The details are specific to the software you use to mix (see next question) but it’s very easy to do regardless.
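The encoding format matters because the same four first-order channels are stored differently in each. As a sketch, assuming first-order material only: FuMa orders the channels W, X, Y, Z with W attenuated by 1/√2, while AmbiX orders them W, Y, Z, X with W at full level. The function name `fuma_to_ambix_first_order` is hypothetical, and higher-order material needs additional per-channel scaling that is not shown here.

```python
import math

def fuma_to_ambix_first_order(frame):
    """Convert one first-order frame from FuMa (W, X, Y, Z) to AmbiX (W, Y, Z, X)."""
    w, x, y, z = frame
    # FuMa stores W about 3 dB lower than AmbiX (SN3D), so scale it back up,
    # then reorder the directional channels.
    return (w * math.sqrt(2.0), y, z, x)

# A full-level source straight ahead in FuMa: W = 1/sqrt(2), X = 1.
ambix_frame = fuma_to_ambix_first_order((1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0))
```

Feeding FuMa audio to a player that expects AmbiX (or the reverse) without this kind of conversion will put sounds in the wrong places, which is why it is worth confirming the format your hosting service expects.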

What do you use for editing spatial audio? Premiere Pro?

If all you’re doing is passing through audio recorded on an ambisonic mic (and you don’t want to change the positions of the sound or add any elements) you can work in Premiere or another video editor. (Though beware: Apple’s Final Cut Pro and Compressor may strip the ambisonic encoding out of your audio when you export your finished file.)

If you want to mix mono sources into a spatialized mix (as I recommend above) you need to use an audio editor like Pro Tools, Logic, or Reaper, and furthermore you need to use special plugins that are designed to handle spatialized sound. One of the best is Blue Ripple Sound’s O3A plugin. If you’re on a budget, Facebook has a free tool that works with most popular audio editors.

The Facebook package also includes an app to combine your final audio output with your final video output to make a master file ready for uploading to a video hosting service, or to use with an HMD (head-mounted display) like Daydream View, Rift or any of the Windows VR headsets.

Is it realistic to use spatial audio in a web page? Or, should we target VR headset users primarily?

For the most part, directional audio is best experienced using an HMD, but remember that videos on YouTube, Facebook or Vimeo can be viewed in a headset (even a Cardboard or Gear VR) if you’ve got one. As long as you’re wearing headphones you’ll hear spatialized sound.

Also, on Facebook (with properly encoded audio) playback is automatically adjusted so audio from certain parts of the sphere sounds louder or softer depending on which portion of the sphere is currently showing in the video player.

If your video has spatial audio and someone watches it in a player that doesn’t support it, it will generally default to a stereo (or even mono) audio version, so you don’t lose anything by including the directional information. And any viewer who does choose to view your work wearing headphones will get the added immersiveness and improved experience that spatial audio provides.
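To give a sense of why a stereo version can always be derived from a spatial mix, here is a minimal Python sketch that folds one first-order AmbiX frame down to left and right channels using two virtual cardioid microphones. This is only one simple approach, shown under the assumption of AmbiX channel order (W, Y, Z, X); the function name `ambix_to_stereo` and the `spread_degrees` parameter are hypothetical, and actual players may derive their fallback audio differently.

```python
import math

def ambix_to_stereo(frame, spread_degrees=90.0):
    """Fold one AmbiX frame (W, Y, Z, X) down to (left, right).

    Uses two virtual cardioid mics aimed spread_degrees to the left
    and right of straight ahead. Height (Z) is ignored.
    """
    w, y, z, x = frame
    a = math.radians(spread_degrees)
    left = 0.5 * (w + x * math.cos(a) + y * math.sin(a))
    right = 0.5 * (w + x * math.cos(a) - y * math.sin(a))
    return left, right

# A source hard left (W = 1, Y = 1) lands entirely in the left channel.
left, right = ambix_to_stereo((1.0, 1.0, 0.0, 0.0))
```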

For further information please see Part 6, “Use spatialized sound,” starting at 29:22, in Michael Wohl’s 360° Video Production Webinar

For a chance to win a free copy of Michael’s book, The 360° Video Handbook, use the sign-up link below.

Sign up for your chance to win

Really useful interview!

thanks!

Daniel,

Are you doing any work with spatial audio? If you have any comments to add, definitely interested.

Jesse

Hi @jcasman! I’m really interested, but it’s a whole different world! I have a starter kit for Spatial Sound with the Zoom H2n, but haven’t quite figured it all out!
As soon as I achieve something I’ll write a post :slight_smile:

Thanks!

Daniel, what kind of setup do you use for audio normally? Or do you mostly work with images, not video?

Jesse,

I do mostly 360 video.

It depends on the project, but mainly I use six old-fashioned GoPro Hero 3+ cameras in a case called Freedom360, plus a Zoom H2n for spatial audio. If the production needs actors, we use lavalier mics. If the occasion is not so PRO, I use a Samsung Gear 360 or Xiaomi Mijia Sphere. I would love to use the Ricoh Theta V, but its chromatic aberration scares me too much. I’m keen for a GoPro Fusion instead :slight_smile:

You can watch our channel for a mix of cameras in use: www.youtube.com/3govideo