Core Audio: a quick example using Extended Audio File Services

By vb, 01/01/2017

This post is more a reminder to myself: every now and then I run into Core Audio and have to start all over again, because I can hardly remember anything from the last encounter… Yes, Core Audio is quite convoluted, as it goes very deep, but it opens up a lot of possibilities.

For this project I wanted to make use of the automatic file conversion capabilities of Core Audio, which is supposedly not one of the more complicated use cases. Next to the usual, invaluable, if by now slightly outdated resource “Learning Core Audio” by Adamson and Avila, the internet already offers a number of coding examples showing how format conversions can be done, e.g. converting an MP3 file to a file in linear PCM format and so on.
But to give it a little twist, I wanted to find out how to export raw audio samples residing in memory to an audio file in a compressed format. The process is almost the same as in the usual audio converter examples, but not quite, and it took me some time to figure out the fine adjustments that needed to be made.
So hopefully this example can serve as a little memory hook for using Extended Audio File Services to accomplish the task.

As a practical use case, let’s implement the desired functionality in a little external object for MaxMSP that can access the raw audio samples residing in a buffer~ object and export them to an MPEG-4/AAC file. To keep it simple, no attention will be paid to possible threading issues etc. The complete code can be found on GitHub.


Here is an outline of the steps to be taken:

  1. fill out AudioStreamBasicDescriptions (ASBDs) for the source and the destination formats
  2. create an output file reference (ExtAudioFileRef) with the destination format
  3. set the client data format using the source ASBD
  4. create a destination buffer and wrap it in an AudioBufferList
  5. copy the source sample data to the destination buffer in blocks and write them to the output file (ExtAudioFileWrite)
  6. clean up


Step 5 can be simplified in most cases: we don’t really need to copy the source data to the destination buffer. We can simply let the AudioBufferList’s mData point to the source data with appropriate offsets.

By the way, the necessary conversion work is done for us automatically behind the scenes by the Extended Audio File Services – quite convenient.


Filling out AudioStreamBasicDescriptions for the source and the destination formats

Although MaxMSP uses double precision for audio processing, a buffer~ still stores its samples as 32-bit floats. So our source format is always linear PCM with 32 bits per channel.

Zeroing out the AudioStreamBasicDescription structure before populating the specific fields is a good idea: with some formats we don’t know (and can’t know) every single field in advance and need to set those to zero anyway.

A frame is a collection of time-coincident samples. For instance, a linear PCM stereo sound file has two samples per frame, one for the left channel and one for the right channel. Most audio files store their data in “interleaved format”, which means that the frames are stored one after the other, so the sample bytes of the respective channels alternate in a single stream.

In an uncompressed format we always have exactly one packet per frame. In compressed formats this is different, and since we don’t know anything about the packet layout in advance, we let Extended Audio File Services figure it out for us by leaving those fields blank.


Create an output file ref and set source and destination formats

Next we create an output file with the correct destination format.

The next step is very important! We must tell the ExtAudioFile API the format that we will be sending samples in, i.e. we tell it the source format with a call to ExtAudioFileSetProperty.
In Core Audio this is called the ClientDataFormat, which I find slightly confusing…


Fill destination buffer in blocks and write to output file

Ok, now we are actually ready to start the conversion and write the resulting audio data to the output file.

Don’t forget to call ExtAudioFileDispose when you are done with it.

Let’s have a closer look at the convertAndWriteToDisk function:

The call to ExtAudioFileWrite not only writes samples to a file but also takes care of the necessary format conversions. For this to work we need to hand it an AudioBufferList containing the blocks of samples that we want to write to disk.
So first we create an AudioBufferList (“convertedData”). As the whole process is an offline task, one buffer is enough; make sure the number of channels matches and that there is enough space to hold one block of samples (in the original format, which is float32!).
Then, before handing the AudioBufferList over to ExtAudioFileWrite, we need to fill its destination buffer repeatedly with a block’s worth of sample data. We can do this by actually copying the original sample data (“pcmData”) to the destination buffer’s mData, or simply by letting mData point into the original pcmData at the appropriate block offsets, like in the example above.

The last block of audio samples is most likely going to be a little smaller, as the number of exported samples will probably not be an exact integer multiple of the chosen block size, so some special care must be taken here.

OK, that’s it! Hope this is helpful.
The complete code can be found on GitHub.
