Converting audio samples using libav*

Hello guys,
I've been looking for a solution for several days now and haven't gotten any smarter, so I hope one of you is able to help me.
What I'm currently working on is an application that generates audio fingerprints (specifically AcoustID fingerprints). To do so, I started with the sample
code that demonstrates the basic fingerprinting capability. You can find it here.
It is part of a library called chromaprint.
As you will see, a lot of this code uses deprecated functions. As a newbie to sound conversion, the purpose of a lot of the calculations in the file is lost on me.
Still, I want to rework the code to use functions that are up to date. Ideally I'd understand the code more thoroughly that way, or so I thought.
I started modifying the decode_audio_file function from fpcalc.c and got this far, where I am now stuck:
int decode_audio_file(ChromaprintContext *chromaprint_ctx, int16_t *buffer1, int16_t *buffer2, const char *file_name, int max_length, int *duration)
{
    AVFormatContext *format_ctx = NULL;
    AVCodecContext *codec_ctx = NULL;
    AVAudioConvert *convert_ctx = NULL;
    AVStream *stream = NULL;
    AVCodec *codec = NULL;
    AVPacket avpacket;
    AVFrame *decoded_frame = NULL;
    FILE *f;
    int frameFinished = 0;
    int stream_id, ok = 0;
    int buffersize = AVCODEC_MAX_AUDIO_FRAME_SIZE + FF_INPUT_BUFFER_PADDING_SIZE;
    uint8_t inbuf[buffersize];

    /* initialize data packet that is read from the stream */
    av_init_packet(&avpacket);
    avpacket.data = inbuf;
    avpacket.size = buffersize;

    /* make space for frame that contains decoded data */
    decoded_frame = avcodec_alloc_frame();

    /* tell format_ctx about the input */
    if (avformat_open_input(&format_ctx, file_name, NULL, NULL) < 0) {
        fprintf(stderr, "ERROR: couldn't open the file");
        goto done;
    }
    if (avformat_find_stream_info(format_ctx, 0) < 0) {
        fprintf(stderr, "ERROR: couldn't find stream information in the file");
        goto done;
    }

    for (int i = 0; i < format_ctx->nb_streams; ++i) {
        codec_ctx = format_ctx->streams[i]->codec;
        if (codec_ctx && codec_ctx->codec_type == AVMEDIA_TYPE_AUDIO) {
            stream = format_ctx->streams[i];
            break;
        }
    }
    if (!stream) {
        fprintf(stderr, "ERROR: couldn't find any audio stream in the file\n");
        goto done;
    }

    codec = avcodec_find_decoder(codec_ctx->codec_id);

    /* chromaprint expects signed 16 bit samples */
    codec_ctx->request_sample_fmt = AV_SAMPLE_FMT_S16;
    if (codec_ctx->sample_fmt != AV_SAMPLE_FMT_S16) {
        convert_ctx = av_audio_convert_alloc(AV_SAMPLE_FMT_S16, codec_ctx->channels,
                                             codec_ctx->sample_fmt, codec_ctx->channels, NULL, 0);
        if (!convert_ctx) {
            fprintf(stderr, "ERROR: couldn't create sample format converter");
            goto done;
        }
    }

    if (!codec) {
        fprintf(stderr, "ERROR: unknown codec");
        goto done;
    }
    if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        goto done;
    }

    chromaprint_start(chromaprint_ctx, codec_ctx->sample_rate, codec_ctx->channels);
    *duration = stream->time_base.num * stream->duration / stream->time_base.den;

    int len;
    while (av_read_frame(format_ctx, &avpacket) >= 0) {
        if (avpacket.stream_index == stream->id) {
            len = avcodec_decode_audio4(codec_ctx, decoded_frame, &frameFinished, &avpacket);
            if (frameFinished) {
                int data_size = av_samples_get_buffer_size(NULL, codec_ctx->channels,
                                                           decoded_frame->nb_samples,
                                                           codec_ctx->sample_fmt, 1);
                if (convert_ctx) {
                    const void *ibuf[6] = { decoded_frame->data };
                    void *obuf[6] = { buffer2 };
                    int istride[6] = { av_get_bytes_per_sample(codec_ctx->sample_fmt) };
                    int ostride[6] = { 2 };
                    len = data_size / istride[0];
                    if (av_audio_convert(convert_ctx, obuf, ostride, ibuf, istride, 4) < 0) {
                        fprintf(stderr, "WARNING: unable to convert %d samples\n", len);
                        break;
                    }
                    if (!chromaprint_feed(chromaprint_ctx, buffer2, decoded_frame->nb_samples / 2)) {
                        fprintf(stderr, "ERROR: fingerprint calculation failed\n");
                        goto done;
                    }
                } else if (!chromaprint_feed(chromaprint_ctx, decoded_frame->extended_data, decoded_frame->nb_samples)) {
                    fprintf(stderr, "ERROR: fingerprint calculation failed\n");
                    goto done;
                }
            }
        }
    }
    ok = 1;

done:
    avformat_close_input(&format_ctx);
    avcodec_free_frame(&decoded_frame);
    return ok;
}
You can find the documentation for the chromaprint functions I used here.
As I said, I am not really sure whether I have understood all of what is going on in the unmodified version of the function, so please bear with me
Any answers or suggestions for reading material will be gladly appreciated!
PS: goto marks will be removed once the code works
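PPS: For what it's worth, here is a rough, untested sketch of how I imagine the sample-format conversion could be done without av_audio_convert, using swr_convert from FFmpeg's libswresample (on Libav the analogous library would apparently be libavresample). The swr_* calls and the way I size and count the buffers below are my own assumptions, not something taken from fpcalc.c:

    /* new includes at the top of the file */
    #include <libswresample/swresample.h>
    #include <libavutil/channel_layout.h>

    /* setup, replacing the av_audio_convert_alloc() block */
    SwrContext *swr_ctx = swr_alloc_set_opts(NULL,
            av_get_default_channel_layout(codec_ctx->channels), /* output layout               */
            AV_SAMPLE_FMT_S16,                                   /* output format for chromaprint */
            codec_ctx->sample_rate,                              /* output rate (unchanged)     */
            av_get_default_channel_layout(codec_ctx->channels), /* input layout                */
            codec_ctx->sample_fmt,                               /* input format from decoder   */
            codec_ctx->sample_rate,                              /* input rate                  */
            0, NULL);
    if (!swr_ctx || swr_init(swr_ctx) < 0) {
        fprintf(stderr, "ERROR: couldn't create sample format converter\n");
        goto done;
    }

    /* inside the decode loop, once frameFinished is set */
    uint8_t *out[] = { (uint8_t *)buffer2 };
    int converted = swr_convert(swr_ctx, out, decoded_frame->nb_samples,
                                (const uint8_t **)decoded_frame->data,
                                decoded_frame->nb_samples);
    if (converted < 0) {
        fprintf(stderr, "WARNING: unable to convert samples\n");
        break;
    }
    /* I assume chromaprint_feed() wants the interleaved sample count across all channels */
    if (!chromaprint_feed(chromaprint_ctx, buffer2, converted * codec_ctx->channels)) {
        fprintf(stderr, "ERROR: fingerprint calculation failed\n");
        goto done;
    }

    /* in the cleanup section */
    swr_free(&swr_ctx);

As far as I understand it, swr_convert would also take care of interleaving planar decoder output (e.g. AV_SAMPLE_FMT_FLTP) into the packed S16 that chromaprint wants, but please correct me if I got that wrong.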
Last edited by n0stradamus (2013-03-29 23:10:36)


Similar Messages

  • Converting audio samples to EXS24 instrument

    I have an audio track that contains a sequence of samples from a live instrument. The samples run from C0 to C7 in increasing order.
    What's the easiest way to turn this into an EXS24 instrument?
    Thanks
    Eric

    First slice the track using the Strip Silence function (in L9 you could also use Flex mode to splice the track), then convert all the resulting regions into new Audio files using the copy/convert function. Then drag all those files into the EXS (hit the edit button first) using the contiguous mapping feature, set the lowest note to C0 or wherever you want the mapping to start.

  • Importing audio - sample rate/bit depth

    Hi forum,
    I am working on project at 44.1K, 24 bit. Audio elements are being sent to me to be added. Some have come in incorrectly; at 48K, 16 bit. I can convert easily but didn't think I needed to.
    I thought by selecting the "Convert Audio Sample Rate When Importing" option when creating the project that would all be worked out.
    That is what I've done - and the file seems to be the correct pitch -- yet it shows up in the audio window with its original specs (48/16). Also ... will Logic keep it at 16 bit and play all other files at 24 bit?
    I want to be sure about this ... something seems fishy.
    Cheers
    Dee, Ottawa

    Hi,
    I am re-posting.
    Regarding the same project: must everything in the arrange window of a project be the same sample rate (and bit depth)? My understanding is that there is real-time conversion during playback - that all file types supported by Logic, and all virtual instrument samples, are converted in real time to conform to the selected bit depth and sample rate of the project.
    I ask only as I received reference sound files to temporarily place in a mix to see how the mix will sit when going to Post. Two audio files are almost a semitone higher than they are supposed to be (which is odd - so I am pretty sure it was just quickly sung in the wrong key at their end). And one file which was supposed to be timed out is not lining up.
    I can work around this on this project. And I can simply convert in the sample editor and re-import to compare.
    But again ... I just want to check my understanding for future reference. The manual indicates differing rates etc. should not be a problem (i.e. that Logic allows one to have differing rates and bit depths). Conversely, the M Sitter video implies just the opposite.
    I just want to be sure of this for future reference. Any advice?
    Thanks in advance.
    Cheers
    Dee

  • Logic Pro X - Is there a way to convert a sample kick from audio to MIDI?

    Is there a way to convert a sample that you drag into Logic Pro? For example, I'm using a sample kick for a dance track and want to use it as MIDI, because it's a **** of a lot easier to edit than an audio file. Is this possible?

    Not exactly.
    This site explains MIDI in a straightforward and easy way:
    http://morganstudios.com/Tutorials/MIDI.htm
    But for a "solution" of sorts, you can use the EXS24 to sample the kick.
    Video of how to add the EXS24 on a track in your project (Logic X):
    http://www.youtube.com/watch?v=3vbjlZZs_24
    How to add your own sample to the EXS24 (Don't worry that it's Logic 9):
    http://www.youtube.com/watch?feature=player_detailpage&v=tB4_2IYBess#t=53
    Hopefully this helps you out!

  • Q: Convert audio bit & sample rate during import

    I have a project that's in 24bit/48KHz, and want to import an audio file (mp3) that's in 16bit/44.1KHz.
    When I import the mp3 file into the project either through drag & drop or the Add Audio File dialog box, it doesn't convert the file into the project's native bit & sample rate. Consequently, when I play it back in the Arrange window it plays in chipmunk voice.
    I thought Logic used to automatically convert audio files into native resolution. How do I do this in Logic 8?
    Thanks,
    Steven

    Thanks for the response. My project currently does have those options checked, and it's still not converting.
    To double check, the mp3 file does get copied to the project's Audio Files directory. It also creates a file with the same name and the file extension .ovw - but doesn't seem to be converted.
    Any ideas? Thanks again -

  • How can I convert a word or sentence into audio output using LabVIEW, like the Windows Narrator?

    I am converting Braille script to audio output using LabVIEW. So, I need to convert the script to english and then to audio. How can I do it?

    Amaravian wrote:
    I am converting Braille script to audio output using LabVIEW. So, I need to convert the script to english and then to audio. How can I do it?
    From what I've seen in my brief search, you need to go the ActiveX route.

  • Converting audio using QT 10

    Hi
    How do I convert audio files in QT 10? It was possible in older versions using the export command, but that function has gone in this version. I'm trying to convert a WAV to AIFF.
    Many Thanks
    Adrian

    Quicktime X can't do this.... You need to install Quicktime 7 from your Leopard installation disk, here's how:
    http://support.apple.com/kb/HT3678
    And then you need to buy (if you don't already have) the Pro key from Apple's online store:
    http://store.apple.com/us/product/D3380Z/A?fnode=NDQ4OTY4OA&mco=MTM3NDgwNTg
    You can convert the audio file from a WAV to AIFF in iTunes for free.
    Message was edited by: David M Brewer

  • What is the best way to deal with different audio sample rates on the same timeline ?

    what is the best way to deal with different audio sample rates on the same timeline ?

    You don't have to do anything special. If possible, start your project with a clip that has the desired target frame rate and audio sample rate, and your project parameters will be set automatically. Other sample rates will be converted under the covers.
    For example, if your video is shot at 48khz, you can add music files at 44.1khz with no problem.
    If you are recording audio that you want to synch with video (multicam), you will get best results if everything is 48khz, but you can use 44.1 if that is all you have. Once I forgot to reset my Zoom to 48,000 and it still worked.

  • javax.sound.sampled - Write PCM audio samples to an AIFF file?

    Greetings,
    I've been going at this for awhile and seem to be stuck.
    I'm importing data from an old system that stored audio recordings in raw PCM. I can easily use a Clip object to convert that PCM into a DataLine (Clip) object and play it.
    Now I need to write out that audio data as an AIFF file (actually I need to capture it and store it in a database, but the AIFF file format seems like a good, portable, format).
    To write a file using AudioSystem.write() you need an AudioInputStream. To create an AudioInputStream you need a TargetDataLine. However, I can see no direct way of creating a TargetDataLine that can be populated with raw data.
    Reading the documentation (I know, it's a bad habit), it seems like what I needed to do is get a mixer, create a SourceDataLine with my data, then pump the SourceDataLine with the audio samples. The mixer would do nothing but pass the data on to the TargetDataLine, where it would be captured and written to disk. Here's my code:
    public void writeAIFF( OutputStream out ) throws IOException
    {
        try
        {
            // First, find a mixer (we assume it will be a software mixer) that can provide us with
            // both source and target datalines that can handle the desired format.
            AudioFormat format = getAudioFormat(); // This is the format of our data
            DataLine.Info sourceInfoRequest = new DataLine.Info(SourceDataLine.class, format);
            DataLine.Info targetInfoRequest = new DataLine.Info(TargetDataLine.class, format);
            Mixer ourMixer = null;
            // Get all of the available mixers
            Mixer.Info[] info = AudioSystem.getMixerInfo();
            for (int i = 0; i < info.length; i++)
            {
                Mixer mixer = AudioSystem.getMixer(info[i]);
                if (mixer.isLineSupported(sourceInfoRequest) && mixer.isLineSupported(targetInfoRequest))
                {
                    ourMixer = mixer;
                    break;
                }
            }
            if (ourMixer == null)
                throw (new IOException("can't obtain audio components"));
            // Get the source and target lines from the mixer
            SourceDataLine sourceLine = (SourceDataLine)ourMixer.getLine(sourceInfoRequest);
            TargetDataLine targetLine = (TargetDataLine)ourMixer.getLine(targetInfoRequest);
            AudioInputStream targetStream = new AudioInputStream(targetLine);
            // Load up the source line with the data
            sourceLine.open(format);
            sourceLine.write(samples, 0, samples.length);
            // Write our data out as an AIFF file
            AudioSystem.write(targetStream, AudioFileFormat.Type.AIFF, out);
            // So what happens to all of these lines and mixers when we're done?
        }
        catch ( LineUnavailableException noline )
        {
            throw (new IOException("audio line unavailable: " + noline.getMessage()));
        }
    }
    My problem is that the code never finds a mixer. The statement (mixer.isLineSupported(sourceInfoRequest) && mixer.isLineSupported(targetInfoRequest)) is always false.
    So here's my question (finally): Am I going about this the right way? My other thought is just to create my own TargetDataLine object and populate it with the data myself.
    Any thoughts or suggestions?
    P.S. I hope this is the right place for this question -- this is the first time I've used this forum
    P.P.S. Java 1.3.1 (MacOS X and Windows 2000)

    Solved -
    This was way over-engineered! Overlooked in my reading was a constructor for AudioInputStream that can use any input stream as its source.
    This was all I needed:
         AudioFormat format = getOurAudioFormat();               // Get the format of our audio data
         AudioInputStream targetStream = new AudioInputStream(new ByteArrayInputStream(samples),format,samples.length);
         AudioSystem.write(targetStream,AudioFileFormat.Type.AIFF,out);

  • Audio upsampling using Quicktime: best export settings?

    I'm using Quicktime to convert audio from 32 kHz to 48 kHz, for use in FCE/FCP. Quicktime gives me several options for export settings. I want to make the conversion lossless, without increasing the size of the audio file too much.
    These are the settings I'm using so far:
    Format: Linear PCM
    Channels: Stereo (L R)
    Rate: 48 kHz
    Sample Rate Converter Settings: Quality: Best
    Linear PCM Settings: Sample size: 16 bits
    Are these settings the best choices? Does anyone have experience doing this kind of upsampling?
    Here are the other options in Quicktime:
    Format: Linear PCM, A-Law 2:1, IMA 4:1, MACE 3:1, MACE 6:1, QDesign Music 2, Qualcomm PureVoice, and mu-Law 2:1
    Channels: 2 Discrete Channels, Mono, Stereo (L R)
    Rate: there are other choices, but I know 48 kHz is what I want
    Advanced Settings:
    Sample Rate Converter Settings: Faster, Fast, Normal, Better, Best
    Linear PCM Settings (when Linear PCM is the format): 8 bit, 16 bit, 24 bit, 32 bit (floating point checked), 32 bit (floating point unchecked).
    The file size grows quickly as you increase the Linear PCM bit settings.
    PowerMac G5 Quad 2.5 GHz 3GB RAM   Mac OS X (10.4.5)   NVIDIA GeForce 7800GT

    What you have picked makes sense. The file will be 50% larger.

  • How to convert audio only avi to wav?

    How to convert audio only avi to wav?
    Can I extract only the audio from an avi and insert in to a wav file?
    Regards,
    Achi

    Sure.
    You just need to program a Processor to turn off any video tracks, transcode to a WAV-acceptable audio encoding, and then use a DataSink to store the output of the processor as a WAV file.
    Also, I bet the following sample program will do it without any modification...
    [http://java.sun.com/javase/technologies/desktop/media/jmf/2.1.1/solutions/Transcode.html]

  • Creative Video/Audio Samples

    Can someone please upload the Creative video and audio samples that came with your Creative products when you first bought them?
    I recently formatted my X-Fi 2 and forgot to back them up..
    thanks!

    Zipert wrote:
    Hello Everyone,
    I've got a SDHC card 8GB in my ZEN and I converted some family guy videos which are perfect if I play them on the computer.
    I have put all the episodes of family guy on my ZEN and when I played it, it just played but the longer it plays the more the video stays behind on the audio. But when I play the SAME episodes which I put on my ZEN, on the computer it just works fine...
    HEEEEEELP!!!
    Zipert
    That sounds like a software conversion problem. Try using this program http://forums.creative.com/creativel...thread.id=2983 to encode the video & then load it on the card. I've used it & it works really well, fast & with no playback issues. It also has a preset for the Zen.

  • Convert Audio to Aiff

    Hello,
    I imported some sound into FCE 4. However, whenever I try to play the current Sequence without rendering, the video does play, but I don't hear any audio.
    I take it that this is because the audio needs to be converted to AIFF? If so, how do I convert to AIFF? I looked for it in iTunes and it's not there. Garage band also only seems to let you export to AAC or Mp3 formats.
    Any help? Thanks.

    I ask again, where did this video come from? It doesn't conform to any standard that I'm aware of, and it's certainly not FCE-native. You need to convert the video so that it conforms to one of FCE's Easy Setups (DV-NTSC or DV-PAL, depending on where you are).
    I've never worked with H.264 media, but I think a free utility called MPEG Streamclip can do the conversion. (http://www.squared5.com/)
    Open your video in Streamclip, then export to QuickTime using the appropriate Compression, frame size, frame rate, audio sample rate, etc. to match your Easy Setup. Import the resulting QuickTime file into FCE.

  • Audio sample rate does not match (HDcam to dvcam)

    I'm trying to import clips and keep getting this message:
    "The audio sample rate of one or more of your captured media files does not match the sample rate on your source tape. This may cause the video and audio of these media files to be out of sync. Make sure the audio sample rate of your capture preset matches the sample rate of your tape."
    Footage was originally shot on HDCAM and transferred to DVCAM elsewhere. Using FCP 5, am importing via firewire from a Sony DSR-11 deck. Using DV NTSC 48kHz Anamorphic as capture settings (though I've tried everything that I thought might possibly work with no success). The audio does not seem to drift over the course of several 5 minute or so clips. Clip settings show audio at 48 kHz (don't know if that's from capture settings or from actual data). Seems to me all audio should be 48 kHz 16 bits, so can't figure out what's going on. I have to export an EDL for the project to be finished in HD. Read some similar threads that ended in December, seemingly without much resolution. My broader concern is why this is happening; my immediate concern is do I need to worry about this right now since the media files will need to be recaptured in HD anyway. Any thoughts?
    Thanks

    A little more info. I'm having this problem on 4 tapes (from different cameras) that were transferred to DVCAM in a squished format to appear full screen on a 4x3 monitor. Video that was letterboxed and I can bring in with the standard DV NTSC capture settings does not have this problem. Still have the problem if I try to import the clips from the squished video with standard settings. Any thoughts?

  • How can I compress an audio sample?

    Hi, I have a raw audio sample in a byte array. I want to compress this sample. More specifically, I want to develop a method which takes this byte array as input and returns the compressed sample as a byte array again. Please advise if anyone knows how to do this.
    The code which I am using for capturing the data through the microphone is as follows. After executing it, a byteArrayOutputStream is created, from which I get the captured audio byte array just by calling the byteArrayOutputStream.toByteArray() method.
    public void captureAudio(){
        try{
          //Get everything set up for
          // capture
          audioFormat = getAudioFormat();
          DataLine.Info dataLineInfo =
                    new DataLine.Info(
                      TargetDataLine.class,
                       audioFormat);
          targetDataLine = (TargetDataLine)
                       AudioSystem.getLine(
                             dataLineInfo);
          targetDataLine.open(audioFormat);
          targetDataLine.start();
          //Create a thread to capture the
          // microphone data and start it
          // running.  It will run until
          // the Stop button is clicked.
          Thread captureThread =
                    new Thread(
                      new CaptureThread());
          captureThread.start();
        } catch (Exception e) {
          System.out.println(e);
          System.exit(0);
        }//end catch
      }//end captureAudio method
    class CaptureThread extends Thread{
      //An arbitrary-size temporary holding
      // buffer
      byte tempBuffer[] = new byte[10000];
      public void run(){
        byteArrayOutputStream =
               new ByteArrayOutputStream();
        stopCapture = false;
        try{//Loop until stopCapture is set
            // by another thread that
            // services the Stop button.
          while(!stopCapture){
            //Read data from the internal
            // buffer of the data line.
            int cnt = targetDataLine.read(
                        tempBuffer,
                        0,
                        tempBuffer.length);
            if(cnt > 0){
              //Save data in output stream
              // object.
              byteArrayOutputStream.write(
                       tempBuffer, 0, cnt);
            }//end if
          }//end while
          byteArrayOutputStream.close();
        }catch (Exception e) {
          System.out.println(e);
          System.exit(0);
        }//end catch
      }//end run
    }//end inner class CaptureThread
    I am new to the sound API and I hope you got my question. Please ask if anything is required from my side.
    Regards

    Thanks Andrew and captfoss, I agree with both of you and I am really a newbie. Andrew, you are talking about decreasing the quality, like what we do in video conferencing applications where the quality of an image does not really matter [we set the JPEG quality to 0.5 etc.]. I'll definitely do that. Please tell me the values for lower-quality sound. The audio format I am using currently is as follows:
    private AudioFormat getAudioFormat(){
        float sampleRate = 8000.0F;
        //8000,11025,16000,22050,44100
        int sampleSizeInBits = 16;
        //8,16
        int channels = 1;
        //1,2
        boolean signed = true;
        //true,false
        boolean bigEndian = false;
        //true,false
        return new AudioFormat(
                          sampleRate,
                          sampleSizeInBits,
                          channels,
                          signed,
                          bigEndian);
      }//end getAudioFormat
    captfoss, I also thought of ULAW compression, but I can't see how to apply this compression to a piece of sound stored in a byte[]. I am capturing the sound piece by piece through the microphone; each piece is stored in a byte[]. Now I want to compress each byte[] using ULAW. How can I do that? I don't have any file on the hard disk, as I am capturing through the mic. Also, I don't want the compressed output stored on the hard disk; I want it in a byte[].
