Extract Audio from Video using FFmpeg

Modifying and processing audio and video is a daily task for media professionals. One common operation is extracting audio from video, because there are many audio-only applications such as creating podcasts, remixing audio, transcription, dubbing, and so much more!

And extracting audio is not difficult, especially if you have FFmpeg installed on your computer.

In this guide, I’ll walk you through the process of extracting audio from a video using FFmpeg. We’ll cover the basics (installing FFmpeg), extract audio, and explore auxiliary topics like converting to different audio formats, adjusting the bitrate, and extracting multiple audio tracks.

Let’s get started with installing FFmpeg. If you already have FFmpeg on your machine, skip the following section and go directly to the code.

Step 0: Installing FFmpeg on your machine

It all begins with ensuring FFmpeg is installed on your computer, whether it runs Linux, Windows, or Mac.

For Linux users (Ubuntu or Debian), you can install it using apt as follows –

sudo apt-get install ffmpeg

For Mac users, you can use brew –

brew install ffmpeg

For Windows users, you can download the latest precompiled build from FFmpeg’s official website, unpack it, and add the folder containing ffmpeg.exe to your PATH so that the ffmpeg command is available from any terminal.

Assuming you have completed the installation process, let’s move on to the code.
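Before moving on, it can help to confirm that your shell can actually find FFmpeg. The snippet below is a minimal sketch of such a check; have_tool is a hypothetical helper name that I made up for this example, not something FFmpeg provides.

```shell
# have_tool is a hypothetical helper: it succeeds if the named program is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ffmpeg; then
    # Show only the first line of the version banner.
    ffmpeg -version | head -n 1
else
    echo "ffmpeg was not found on the PATH" >&2
fi
```

If the second branch runs, revisit the installation step for your operating system before continuing.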

Extract Audio from Video Using FFmpeg

Before you extract audio from a video, you should first understand which format the audio is stored in inside the audio-video container. In other words, if you are extracting an audio track from an MP4 file, do you know whether the audio is in mp3 or aac format? If you are unsure about audio codecs, check out this beginner’s guide to audio codecs and containers.

To find out which codec your audio track is using, run the command –

ffmpeg -i inputVideo.mp4 -hide_banner

Here is the output of the command for the video that I am using (a file named inpVideo.mp4):

[krishna@debian Downloads]$ ffmpeg -i inpVideo.mp4 -hide_banner
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'inpVideo.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2021-09-19T10:50:04.000000Z
  Duration: 00:01:00.05, start: 0.000000, bitrate: 2440 kb/s
  Stream #0:0[0x1](und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], 2309 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      creation_time   : 2021-09-19T10:50:04.000000Z
      handler_name    : ISO Media file produced by Google Inc. Created on: 09/19/2021.
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      creation_time   : 2021-09-19T10:50:04.000000Z
      handler_name    : ISO Media file produced by Google Inc. Created on: 09/19/2021.
      vendor_id       : [0][0][0][0]

From the output, we can see that the audio is in the aac format.
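If you would rather not read through the whole listing, the audio lines can be pulled out automatically. The following is a rough sketch that assumes the "Stream #... Audio: ..." line layout shown above; list_audio_codecs is a hypothetical helper name, and the pattern may need adjusting for other FFmpeg versions.

```shell
# list_audio_codecs is a hypothetical helper. It reads FFmpeg's stream listing
# on stdin and prints one line per audio stream: "<input:stream> <codec>".
list_audio_codecs() {
    sed -n 's/^ *Stream #\([0-9]*:[0-9]*\)[^ ]*: Audio: \([A-Za-z0-9_]*\).*/\1 \2/p'
}

# Usage sketch: FFmpeg writes this listing to stderr, hence the 2>&1.
#   ffmpeg -i inputVideo.mp4 -hide_banner 2>&1 | list_audio_codecs
```

For the file used in this guide, the helper would reduce the listing to the single audio line, giving the stream position and the codec name.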

Now that you know your audio is in aac format, you need to decide the output audio format. Do you want your audio as an aac file or mp3 or wav? Let’s try all three options –

Extract audio to AAC

If both the input and the output are in aac format, then the command line is very simple. The command is:

ffmpeg -i inputVideo.mp4 -acodec copy outputAudio.aac

In this command,

inputVideo.mp4 is the name of the input video file that contains the audio track that we want to extract.
outputAudio.aac is the file name where we want to store the extracted audio file. Here it matches the input audio format (aac).
-acodec copy tells FFmpeg not to perform any re-encoding. Since the input and output audio formats are the same (aac), we can instruct FFmpeg to simply copy the audio stream to the output.
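Copying without re-encoding only works when the output file type can hold the codec that is already in the video. As an illustration, here is a small sketch that picks an output extension from the codec name; copy_ext_for_codec is a hypothetical helper, and its list of codecs is deliberately short rather than exhaustive.

```shell
# copy_ext_for_codec is a hypothetical helper: it maps an audio codec name
# (as reported by FFmpeg) to a file extension that can hold that codec
# without re-encoding.
copy_ext_for_codec() {
    case $1 in
        aac)    echo aac ;;
        mp3)    echo mp3 ;;
        ac3)    echo ac3 ;;
        flac)   echo flac ;;
        opus)   echo opus ;;
        vorbis) echo ogg ;;
        *)      echo "no mapping for codec: $1" >&2; return 1 ;;
    esac
}

# Usage sketch (file names are placeholders):
#   ext=$(copy_ext_for_codec aac) &&
#       ffmpeg -i inputVideo.mp4 -vn -acodec copy "outputAudio.$ext"
```

When the helper has no mapping for a codec, the safer route is to re-encode to a format of your choice instead of copying.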

If we want to re-encode the content, then skip forward to the next section where we show you how to extract audio from video with re-encoding and changing the audio quality.

Extract audio to MP3

If you are going to save the audio as an MP3 file, simply give the output file the .mp3 extension. FFmpeg will infer the format from the extension, choose an MP3 encoder, and re-encode the audio track from aac to mp3. In the next section, we will learn how to control the audio quality via re-encoding and adjusting the bitrate.

ffmpeg -i inputVideo.mp4 outAudio.mp3

Extract audio to WAV

If you are going to save the audio as a WAV file, simply give the output file the .wav extension. FFmpeg will decode the aac audio and write it as uncompressed PCM (16-bit pcm_s16le by default) inside the WAV container. Again, we can control the parameters via re-encoding, which we will learn next.

ffmpeg -i inputVideo.mp4 outAudio.wav
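For WAV output, the parameters you are most likely to control are the sample format, the sample rate, and the number of channels. Below is a minimal sketch using the -acodec, -ar, and -ac options; extract_wav and the FFMPEG override variable are hypothetical names I introduced for illustration, and the 16 kHz mono values are only an example.

```shell
# extract_wav is a hypothetical wrapper.
# Arguments: input file, output file, sample rate in Hz, channel count.
extract_wav() {
    # ${FFMPEG:-ffmpeg} is an illustrative hook: set FFMPEG=echo to print
    # the arguments instead of running FFmpeg.
    ${FFMPEG:-ffmpeg} -i "$1" -vn -acodec pcm_s16le -ar "$3" -ac "$4" "$2"
}

# Usage sketch: 16-bit PCM at 16 kHz, mono.
#   extract_wav inputVideo.mp4 outAudio.wav 16000 1
```

Here -vn skips the video stream, -ar sets the sample rate, and -ac sets the channel count.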

Okay, let’s move on to the next section now.

Adjusting Bitrate and Re-Encoding Audio

When you want to extract audio from video and store it in a different format such as mp3 or wav, or change its bitrate, you need to re-encode it.

For this, you’ll need to specify the bitrate and the audio codec you want to use.

Bitrate determines the quality and size of the audio file. You can adjust it to balance file size against audio quality: if you set a high bitrate, the quality will be better, but the file size will also increase. On the other hand, if you set a lower bitrate, the file will be smaller, but the quality will drop.

For example, to set the audio bitrate to 192 kbps, use the command below. In it, -vn tells FFmpeg to skip the video stream, -acodec libmp3lame selects the LAME MP3 encoder, and -b:a 192k sets the audio bitrate:

ffmpeg -i inputVideo.mp4 -vn -acodec libmp3lame -b:a 192k outAudio.mp3
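To get a feel for the trade-off between bitrate and file size, you can estimate the output size from the bitrate and the duration. This is a back-of-the-envelope sketch that ignores container overhead and metadata; estimate_audio_bytes is a hypothetical helper name.

```shell
# estimate_audio_bytes is a hypothetical helper.
# Arguments: bitrate in kbps, duration in whole seconds.
estimate_audio_bytes() {
    # kilobits/s -> bits/s (x1000) -> bytes/s (/8) -> bytes (x seconds)
    echo $(( $1 * 1000 / 8 * $2 ))
}

estimate_audio_bytes 192 60   # prints 1440000 (about 1.4 MB for a 60-second clip)
estimate_audio_bytes 128 60   # prints 960000
```

The real file will differ slightly from the estimate, but the proportion holds: doubling the bitrate roughly doubles the size.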

If you do not want to specify the audio bitrate, then you can use the -q:a parameter, which sets a variable-bitrate quality level instead. Here we specify the codec as libmp3lame and set the quality to 2 using -q:a, which produces high-quality output (for libmp3lame the scale runs from 0 to 9, and lower values mean higher quality). We should also set the output file’s extension to mp3.

ffmpeg -i inputVideo.mp4 -vn -acodec libmp3lame -q:a 2 outAudio.mp3
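The two MP3 commands in this section differ only in one flag, so they can be folded into a single wrapper. The following is a sketch under that assumption; extract_mp3, its cbr/vbr mode names, and the FFMPEG override variable are all hypothetical names chosen for this example.

```shell
# extract_mp3 is a hypothetical wrapper.
# Arguments: input file, output file, mode (cbr or vbr), value.
#   cbr 192k -> fixed bitrate via -b:a
#   vbr 2    -> LAME quality level via -q:a
extract_mp3() {
    case $3 in
        cbr) flag=-b:a ;;
        vbr) flag=-q:a ;;
        *)   echo "mode must be cbr or vbr" >&2; return 2 ;;
    esac
    # ${FFMPEG:-ffmpeg} is an illustrative hook: set FFMPEG=echo to print
    # the arguments instead of running FFmpeg.
    ${FFMPEG:-ffmpeg} -i "$1" -vn -acodec libmp3lame "$flag" "$4" "$2"
}

# Usage sketch:
#   extract_mp3 inputVideo.mp4 outAudio.mp3 cbr 192k
#   extract_mp3 inputVideo.mp4 outAudio.mp3 vbr 2
```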

Handle Multiple Audio Tracks Using the -map Option in FFmpeg

We can use the -map option to precisely target a particular audio track in a video that has multiple audio tracks.

In the output shown above, there is one input file, which contains two streams: one video and one audio. So, the way to target the audio is by using the -map option and setting it to -map 0:a:0. This means that you are targeting the first audio track (a:0) in the first input file (0).

If there were two audio tracks, you could use -map 0:a:1 to target the second one.

Note: Remember that FFmpeg counts from 0 and not 1, so the second audio track is a:1. Python programmers will recognize this as the same zero-based indexing used for lists 🙂
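If you tend to think of tracks as "track 1" and "track 2", a tiny conversion helper can prevent off-by-one mistakes. This is only a sketch; audio_map_spec is a hypothetical name, and it assumes a single input file (input 0).

```shell
# audio_map_spec is a hypothetical helper: given a 1-based audio track
# number, it prints the matching zero-based -map specifier for input file 0.
audio_map_spec() {
    echo "0:a:$(( $1 - 1 ))"
}

audio_map_spec 1   # prints 0:a:0
audio_map_spec 2   # prints 0:a:1
```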

Extract Multiple Audio Tracks from Video

Now that you know how to use the -map option, we can easily extract multiple audio tracks if they exist in a video. Sometimes, videos have multiple audio tracks, like different languages or commentary, and FFmpeg allows you to extract these tracks into separate audio files.

First, list the audio streams:

ffmpeg -i input_video.mp4 -hide_banner

Look for the audio streams in the output, and then extract each one using the -map option with the appropriate stream specifier:

ffmpeg -i input_video.mp4 -map 0:a:0 output_audio_1.mp3
ffmpeg -i input_video.mp4 -map 0:a:1 output_audio_2.mp3

In these commands, 0:a:0 and 0:a:1 refer to the first and second audio streams, respectively.
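When a video has more than two audio tracks, repeating the command by hand gets tedious. One possible approach is a small loop, sketched below; extract_all_audio and the FFMPEG override variable are hypothetical names for this example, and the track count has to be supplied by you.

```shell
# extract_all_audio is a hypothetical helper.
# Arguments: input file, number of audio tracks it contains.
extract_all_audio() {
    input=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # ${FFMPEG:-ffmpeg} is an illustrative hook: set FFMPEG=echo to print
        # the arguments instead of running FFmpeg.
        ${FFMPEG:-ffmpeg} -i "$input" -map "0:a:$i" "output_audio_$(( i + 1 )).mp3"
        i=$(( i + 1 ))
    done
}

# Usage sketch: count the audio streams with ffprobe, then extract them all.
#   count=$(ffprobe -v error -select_streams a -show_entries stream=index \
#       -of csv=p=0 input_video.mp4 | wc -l)
#   extract_all_audio input_video.mp4 "$count"
```

The output files are numbered from 1 to match the file names used above, while the stream specifiers still count from 0.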

Conclusion

By now, you’ve gained a solid understanding of how to extract audio from a video using FFmpeg. Starting with the basics, you’ve learned how to extract the audio track, convert it to different audio formats, adjust the bitrate, and re-encode it. Moreover, you’re equipped to deal with videos containing multiple audio tracks.

To learn more about FFmpeg, head over to our Recipes in FFmpeg section.

Until next time, happy streaming!

Krishna Rao Vijayanagar

Founder at OTTVerse

Krishna Rao Vijayanagar, Ph.D., is the Editor-in-Chief of OTTVerse, a news portal covering tech and business news in the OTT industry.

With extensive experience in video encoding, streaming, analytics, monetization, end-to-end streaming, and more, Krishna has held multiple leadership roles in R&D, Engineering, and Product at companies such as Harmonic Inc., MediaMelon, and Airtel Digital. Krishna has published numerous articles and research papers and speaks at industry events to share his insights and perspectives on the fundamentals and the future of OTT streaming.