Friday, September 23, 2011

Interleaving audio files to different channels

Audio files have inherent characteristics such as the number of channels, sample size, frame size, sample rate, file type, and number of samples, to name a few.


Here is a nice description of samples and channels from the Java Sound FAQ.

Each second of sound has so many (on a CD, 44,100) digital samples of sound pressure per second. The number of samples per second is called sample rate or sample frequency. In PCM (pulse code modulation) coding, each sample is usually a linear representation of amplitude as a signed integer (sometimes unsigned for 8 bit).  
There is one such sample for each channel, one channel for mono, two channels for stereo, four channels for quad, more for surround sound. One sample frame consists of one sample for each of the channels in turn, by convention running from left to right.
Each sample can be one byte (8 bits), two bytes (16 bits), three bytes (24 bits), or maybe even 20 bits or a floating-point number. Sometimes, for more than 16 bits per sample, the sample is padded to 32 bits (4 bytes). The order of the bytes in a sample is different on different platforms. In a Windows WAV soundfile, the less significant bytes come first from left to right ("little endian" byte order). In an AIFF soundfile, it is the other way round, as is standard in Java ("big endian" byte order).
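These characteristics map directly onto the AudioFormat class in the Java Sound API. As a small sketch (the values below simply mirror the CD-quality example from the FAQ quote and are not tied to any particular file):

import javax.sound.sampled.AudioFormat;

public class FormatExample {
    public static void main(String[] args) {
        // CD-quality PCM: 44,100 samples per second, 16 bits per sample,
        // 2 channels (stereo), signed samples, little-endian byte order (as in WAV).
        AudioFormat cdQuality = new AudioFormat(44100.0f, 16, 2, true, false);

        // One frame holds one sample for each channel:
        // 2 channels * 2 bytes per sample = 4 bytes per frame.
        System.out.println("Frame size in bytes: " + cdQuality.getFrameSize());
        System.out.println("Frame rate: " + cdQuality.getFrameRate());
    }
}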

Some more audio file fundamentals from the javadocs...

frameSize is the number of bytes in each frame of a sound that has this format.

sampleRate is the number of samples played or recorded per second, for sounds that have this format.

More definitions can be found here.
Audio files are made up of samples and samples are made up of bytes.
Audio files come in two types: mono and stereo. Mono files have only one channel (perhaps recorded with a single microphone). Stereo audio files have two or more channels.

When a mono audio file is played, samples from its single channel are automatically duplicated and sent to all output channels. For example, if two speakers are attached to a computer, that would be two channels. A stereo audio file, on the other hand, is pre-recorded with separate samples for multiple channels.

Now, what if you want to interleave two different audio files into one audio file in such a way that each audio file is played on a separate channel? There are many possible use cases for doing that. One could be to get a remix effect; for example, you can take a song and mix it with some background music or add a percussion effect. Of course, there are many sophisticated audio applications out there that can do this and much more, but the goal here is to demonstrate a simple Java program that interleaves audio files.

Here I am going to take two mono audio files with the same format (WAV), encoding (PCM), and sample rate, and interleave them to produce another WAV file with different audio playing on each channel.

I am using the Java Sound API for this exercise. Java natively supports only the AU, AIFF and WAV formats. Extensions/plug-ins for MP3 and other audio formats are available through third-party vendors.
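If you want to check programmatically which file types your Java installation can write, AudioSystem can list them. A minimal sketch:

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;

public class SupportedTypes {
    public static void main(String[] args) {
        // Lists the audio file types the installed providers can write,
        // typically WAVE, AU and AIFF out of the box.
        for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes()) {
            System.out.println(type + " (." + type.getExtension() + ")");
        }
    }
}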
The audio files used in this example are leftChannelAudio.wav and rightChannelAudio.wav.

The format details for the audio files are:

rightChannelAudio.wav

nbChannel = 1
frameRate = 44100.0
frameSize = 2
sampleSize(bits) = 16
nbSamples = 1105408
encoding = PCM_SIGNED
sample rate = 44100.0

leftChannelAudio.wav

nbChannel = 1
frameRate = 44100.0
frameSize = 2
sampleSize(bits) = 16
nbSamples = 1069056
encoding = PCM_SIGNED
sample rate = 44100.0
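The details above can be read straight from the files with the Java Sound API. Here is a small sketch of how such a dump could be produced; the field names simply mirror the listing above, and the exact utility used to print it is not part of this post:

import java.io.File;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

public class DumpFormat {
    public static void main(String[] args) throws Exception {
        File file = new File("rightChannelAudio.wav");

        // AudioFileFormat describes the file as a whole (including frame length),
        // while AudioFormat describes the encoding of the audio data inside it.
        AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(file);
        AudioFormat format = fileFormat.getFormat();

        System.out.println("nbChannel = " + format.getChannels());
        System.out.println("frameRate = " + format.getFrameRate());
        System.out.println("frameSize = " + format.getFrameSize());
        System.out.println("sampleSize(bits) = " + format.getSampleSizeInBits());
        System.out.println("nbSamples = " + fileFormat.getFrameLength());
        System.out.println("encoding = " + format.getEncoding());
        System.out.println("sample rate = " + format.getSampleRate());
    }
}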

Here is the main method of the class.
public static void main(String[] args) {
    try {
        String soundFileLeft = "leftChannelAudio.wav";
        File fileLeft = new File(soundFileLeft);    // This is the file we'll be playing on the left channel.
        String soundFileRight = "rightChannelAudio.wav";
        File fileRight = new File(soundFileRight);  // This is the file we'll be playing on the right channel.

        // Target format of the interleaved output: 44.1 kHz, 16-bit, stereo,
        // signed PCM, little-endian (the standard WAV layout).
        float sampleRate = 44100.0f;
        int sampleSizeInBits = 16;
        int channels = 2;
        boolean signed = true;
        boolean bigEndian = false;
        AudioFormat targetFormat = new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
        AudioMixer mixAudio = new AudioMixer(fileLeft, fileRight, targetFormat);

        // Interleave the two inputs and write the result out as a WAV file.
        File outFile = new File("outSingleSingleMixer.wav");
        mixAudio.mixnWrite(AudioFileFormat.Type.WAVE, outFile);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
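The AudioMixer class itself is in the downloadable source, and its mixnWrite method is not reproduced in full here. Conceptually it opens the two input files as streams, calls mixIntoStereoAudio (shown below), and writes the result with AudioSystem.write. A rough, hypothetical sketch, assuming the constructor stores the two files in fileLeft and fileRight:

// Hypothetical sketch of what mixnWrite might look like; the real
// implementation is in the downloadable source.
public void mixnWrite(AudioFileFormat.Type fileType, File outFile) throws Exception {
    AudioInputStream leftStream = AudioSystem.getAudioInputStream(fileLeft);
    AudioInputStream rightStream = AudioSystem.getAudioInputStream(fileRight);

    // Interleave the two mono streams into one stereo stream.
    AudioInputStream stereoStream = mixIntoStereoAudio(leftStream, rightStream);

    // Write the interleaved stream out in the requested file format.
    AudioSystem.write(stereoStream, fileType, outFile);
}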

The interleaving of audio bytes is done in the mixIntoStereoAudio method, as shown below.
 private AudioInputStream mixIntoStereoAudio(AudioInputStream leftAudioInputStream,
                                                AudioInputStream rightAudioInputStream) throws IOException{
        // Holds the raw bytes of each input stream: index 0 = left, index 1 = right.
        ArrayList byteArrays = new ArrayList();
        int nbChannels = 1;
        byte[] compiledStream = null;
        int leftAudioBytes = -1;
        int rightAudioBytes = -1;
        
        byteArrays.add(convertStream(leftAudioInputStream));
        byteArrays.add(convertStream(rightAudioInputStream));
        
        // The output must be long enough to hold the longer of the two inputs.
        long maxSamples;
        if (leftAudioInputStream.getFrameLength() > rightAudioInputStream.getFrameLength()) {
            maxSamples = leftAudioInputStream.getFrameLength();
            nbChannels = leftAudioInputStream.getFormat().getChannels();
        } else {
            maxSamples = rightAudioInputStream.getFrameLength();
            nbChannels = rightAudioInputStream.getFormat().getChannels();
        }
        long maxOutputSizeinBytes = maxSamples * sampleSizeinBytes;
        if (nbChannels == 1)
            maxOutputSizeinBytes = maxOutputSizeinBytes * 2; // two output channels for mono input
        
        compiledStream = new byte[(int) maxOutputSizeinBytes]; // max size of number of bytes
            
        log.info("Output bytes size: " + compiledStream.length);
        // Each pass through the loop writes one stereo frame: one sample for the
        // left channel followed by one sample for the right channel.
        for(int i = 0; i < compiledStream.length; i += sampleSizeinBytes){
            leftAudioBytes = writeSamplestoChannel(byteArrays, 0,
                    sampleSizeinBytes, compiledStream, leftAudioBytes, i);
            
            i += sampleSizeinBytes;
            
            rightAudioBytes = writeSamplestoChannel(byteArrays, 1, 
                    sampleSizeinBytes, compiledStream, rightAudioBytes, i);
        }

        AudioInputStream newaudioStream = generateNewAudioStream(compiledStream);
        return newaudioStream;
    }



In this case each sample has two bytes. In a stereo audio file, samples are written out in an interleaved fashion: the first sample is for the left channel, the second for the right, and so on.
This example uses the simple technique of filling in one sample from each audio file for each channel in turn, writing a silence value whenever one of the audio files runs out of samples. This is necessary because the two audio files being interleaved might not have the same number of samples.
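The writeSamplestoChannel helper is not shown above; its job, as just described, is to copy one sample's worth of bytes from the given source into the output, or write silence once that source is exhausted. A hypothetical sketch of such a helper, with names and signature mirroring the call sites above (the real implementation is in the downloadable source):

// Hypothetical sketch: copies one sample (sampleSizeinBytes bytes) from the
// source at index 'source' into 'output' at position 'pos', or writes silence
// (zero bytes for signed PCM) once the source has no bytes left.
// 'bytesWritten' tracks how far into the source we have read so far (starts at -1).
private int writeSamplestoChannel(ArrayList byteArrays, int source, int sampleSizeinBytes,
                                  byte[] output, int bytesWritten, int pos) {
    byte[] sourceBytes = (byte[]) byteArrays.get(source);
    for (int b = 0; b < sampleSizeinBytes; b++) {
        int next = bytesWritten + 1;
        if (next < sourceBytes.length) {
            output[pos + b] = sourceBytes[next];
            bytesWritten = next;
        } else {
            output[pos + b] = 0; // silence for signed PCM
        }
    }
    return bytesWritten;
}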

This technique can easily be extended to handle stereo audio files, or a list of audio files for each channel.
Go ahead and run this program with the sample audio files provided, and listen to how the interleaved output file plays.

The complete program and sample audio files are available here.

Some good resources on using the Java Sound API are listed below.

http://www.jsresources.org/faq_audio.html
http://www.builogic.com/java/javasound-read-write.html