The target audio encoding format of iTunes is AAC, and Apple recommends using 96 kHz, 24-bit source files, even though the resulting AAC file is encoded at 44.1 kHz. The process therefore includes a sampling rate conversion (SRC) stage. Let's test the quality of this conversion (SRC analysis by Alexey Lukin).
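To see why this SRC stage is non-trivial, it helps to look at the conversion ratio involved. A quick computation (plain Python, purely illustrative) shows that 96 kHz to 44.1 kHz is the awkward rational ratio 147/320:

```python
from math import gcd

src, dst = 96000, 44100
g = gcd(src, dst)                # common factor: 300
up, down = dst // g, src // g    # interpolation / decimation factors

# A rational resampler must upsample by 147, low-pass filter,
# then downsample by 320 (or use an equivalent polyphase structure).
print(f"ratio: {up}/{down}")     # prints: ratio: 147/320
```

The quality of the low-pass filter in that structure is exactly what the tests below measure.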
Apple's afconvert utility offers several SRC quality settings, including norm and bats (selected via its --src-complexity option). Here are the results in norm mode:
[Test plots: afconvert, norm mode — transition band, tone, pulse, phase, and passband responses]
CONCLUSION: not bad, but not quite the super quality Apple declares; the results are merely average compared to modern professional audio editors.
The second mode is bats; here are the results:
[Test plots: afconvert, bats mode — transition band, tone, pulse, phase, and passband responses]
Here everything looks much better: all the results are on par with the best existing sampling rate converters.
CONCLUSION: there is no difference whether you convert the sampling rate with an external high-quality converter or with Apple's recommended converter in bats mode.
The next issue is signal levels.
Apple recommends keeping the maximum peak level of the source file (including all intersample peaks) at −1 dBFS. Can this level be exceeded? Is 1 dB of headroom enough to prevent clipping during the subsequent conversion?
To answer this question one can use another Apple utility, afclip. It measures signal levels, including intersample peaks.
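The core idea behind such a measurement is that the continuous waveform reconstructed by a DAC (or a sample rate converter) can swing above the highest stored sample value. The sketch below estimates intersample peaks by windowed-sinc interpolation; this is only an illustration of the principle, not Apple's actual afclip algorithm. A sine at a quarter of the sampling rate with a 45-degree phase offset is a classic worst case: every sample lands at ±0.707 (−3 dBFS), yet the true peak is 0 dBFS.

```python
import math

def true_peak(x, oversample=8, taps=32):
    """Estimate intersample peaks by windowed-sinc interpolation
    (a sketch; real true-peak meters use optimized polyphase filters)."""
    peak = max(abs(s) for s in x)
    half = taps // 2
    for i in range(len(x) * oversample):
        t = i / oversample                      # fractional sample position
        acc = 0.0
        for k in range(int(t) - half, int(t) + half):
            if not (0 <= k < len(x)):
                continue
            d = t - k
            if d == 0:
                acc += x[k]
            elif abs(d) < half:
                w = 0.5 + 0.5 * math.cos(math.pi * d / half)  # Hann window
                acc += x[k] * w * math.sin(math.pi * d) / (math.pi * d)
        peak = max(peak, abs(acc))
    return peak

# Sine at fs/4 with a 45-degree phase offset: every sample is +/-0.707,
# but the continuous waveform reaches 1.0 between samples.
x = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(256)]
sp = 20 * math.log10(max(abs(s) for s in x))   # sample peak: about -3 dBFS
tp = 20 * math.log10(true_peak(x))             # true peak: close to 0 dBFS
print(f"sample peak {sp:.1f} dBFS, true peak {tp:.1f} dBFS")
```

This is why a file whose sample peaks sit safely below 0 dBFS can still clip after conversion.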
We take three source files at 44.1 kHz for this experiment: the first has hard clipping at 0 dBFS; the second has been limited so that its true peak level is −0.5 dBFS ("true peak" level includes intersample peaks); and the third has been limited so that its true peak level is −1 dBFS (any limiter with intersample peak detection could be used for this).
The analysis of the source files produces the expected results: the first file shows thousands of clipping intersample peaks, while the second and third show none. However, after converting them to AAC and back, the picture is different: the first file looks even worse (as expected). The second file now shows thousands of clipped samples, which indicates that 0.5 dB of headroom (even when intersample peaks have been limited) is not sufficient. The third file has performed up to Apple's standards: no clipping was detected and the audio signal has been preserved in the best possible way.
CONCLUSION: Apple's recommendation on signal peak levels is valid: 1 dB of headroom is both necessary and sufficient to prevent clipping in musical material.
CONCLUSION: the process of adapting existing published CDs for iTunes distribution should include limiting the signal to −1 dBTP (true peak) and saving it in a 24-bit format.
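As a sketch of that last step, here is how floating-point samples can be packed into 24-bit little-endian PCM (illustrative Python using only the standard library; a real workflow would of course rely on a mastering tool or audio library):

```python
import struct

def float_to_24bit_le(samples):
    """Pack float samples in [-1.0, 1.0) as little-endian 24-bit PCM bytes."""
    out = bytearray()
    for s in samples:
        # Clamp to the representable range, then scale to 24-bit integers.
        s = max(-1.0, min(s, 1.0 - 2**-23))
        i = int(round(s * (1 << 23)))
        out += struct.pack("<i", i)[:3]   # keep the low 3 bytes of a 32-bit int
    return bytes(out)

data = float_to_24bit_le([0.0, 0.5, -0.5])
print(data.hex())   # prints: 0000000000400000c0
```

Keeping the full 24-bit depth matters because the limiting stage produces intersample-accurate peaks that 16-bit truncation would re-quantize.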
Now let's switch from adapting already published material to mastering new material for iTunes, without the need for CD publication.
What is the difference? The main one is that compression and limiting "just for loudness" no longer make sense. The loudness of tracks played in iTunes is controlled by the built-in Sound Check function, which automatically matches the loudness of all tracks in the playlist.
So, our first clipped file has simply been attenuated by 8 dB, but the clipping distortion is still there. We made a copy of it attenuated by a further 10 dB, and Sound Check automatically turned it up by 2 dB. Sound Check does not simply normalize peak or RMS levels to match loudness: it uses a more elaborate subjective measure based on Fletcher-Munson equal-loudness curves. If we put two tones at 50 and 5000 Hz with the same peak level of −10 dBFS into our playlist, their resulting output levels will be −1 and −20.4 dBFS respectively.
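Apple does not publish Sound Check's exact loudness model, but the behavior above is what any equal-loudness weighting produces: a low-frequency tone measures as much quieter than a mid/high-frequency tone of the same peak level, so a loudness normalizer boosts it. As a rough illustration (not Sound Check's actual curve, and the numbers will not match the −1 / −20.4 dBFS figures above), here is the standard A-weighting formula evaluated at those frequencies:

```python
import math

def a_weight_db(f):
    """IEC 61672 A-weighting curve; used here only as a stand-in
    for an equal-loudness weighting."""
    f2 = f * f
    ra = (12194**2 * f2**2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194**2)
    )
    return 20 * math.log10(ra) + 2.00

for f in (50, 1000, 5000):
    print(f"{f} Hz: {a_weight_db(f):+.1f} dB")
```

The 50 Hz tone reads roughly 30 dB quieter than the 5000 Hz tone under this weighting, so a loudness matcher raises the former and lowers the latter, which is qualitatively what Sound Check does.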
CONCLUSION: compression and limiting no longer make sense for increasing loudness; they should only be used for artistic purposes, such as shaping the overall dynamics of a track.
The above recommendation also applies to other compressed audio formats, not just AAC.