I’ve always been fascinated with how YouTube encodes the endless hours of videos uploaded to the site. YouTube employs the best and the brightest and their task is positively Sisyphean.
For years, I’ve gotten a glimpse of YouTube’s encoding practices via download tools like the Wondershare Uniconverter, but it’s always been a very restrictive view. Recently, I learned of a tool called youtube-dl, which reveals how YouTube encodes every file that it makes available on the service: every rung, every codec, audio, video, and subtitles. For a researcher like myself, it was (to paraphrase Springsteen and show my age yet again) like finding the keys to the universe in the engine of an old parked car.
This article explains what youtube-dl is and details how to perform some basic operations.
As stated here:
youtube-dl is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter (2.6, 2.7, or 3.2+), and it is not platform-specific. We also provide a Windows executable that includes Python. youtube-dl should work in your Unix box, in Windows, or in Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
youtube-dl downloads the individual fragments that make up the audio and video and then consolidates them into a single file. Here’s a list of over 700 sites that youtube-dl supports, which includes Vimeo and Udemy, though I only tested on YouTube.
youtube-dl does not “crack” files encrypted with PlayReady, Widevine, or presumably FairPlay, so if the video files are encrypted with these technologies, you may be able to download them but the result won’t be playable. Again, I tested only on YouTube and didn’t encounter any encrypted files.
There are many tools that let you download videos from YouTube. What’s special about youtube-dl is that:
- You can generate a file list of all encodes performed by YouTube for a particular file, which I find useful for research (see here).
- You can download the specific file that you want; for example, if you want the 1080p AV1 version, you can get it.
You can download youtube-dl here. I downloaded the Windows version mentioned so I didn’t need Python installed (though it was). Installation for other platforms is covered on the linked page. If you need more detailed help on getting the program installed, check here.
Note that some articles on youtube-dl indicate that FFmpeg is needed, which makes sense during the file consolidation phase. I have FFmpeg installed at c:\ffmpeg\bin on my test computer and it’s in the file path, so presumably, youtube-dl could easily find it. However, when I changed the folder to ffmpegx and didn’t update the path, youtube-dl worked as normal. So, I’m not sure if you need FFmpeg installed or not.
Here’s the program documentation, which is very well organized and specific. If you’ll be experimenting with multiple functions, you’ll want to print this out.
There’s an extensive and useful discussion here on what comprises a youtube-dl command string and the program’s core functions. I’m just going to cover a basic feature set that I found very useful in my research.
Downloading a Video
Navigate to the YouTube video that you want to download. Let’s work with this video: https://youtu.be/hWbCSAcNOCU
To download the highest quality video iteration, use this string:
youtube-dl.exe https://youtu.be/hWbCSAcNOCU
Here, you’re calling the command and listing the URL of the YouTube video to download. Here’s what it looks like in the Command window.
The program downloads Jan Ozer - MPEG-DASH Explained and HTML5 Video-hWbCSAcNOCU.mp4 with the file details shown in Figure 2, most notably that the program downloads both video and audio and muxes them into a file that’s ready for viewing. Note that the file contains H.264-encoded video and AAC-encoded audio.
I make a big point about both audio and video because when you download specific output files, as we’ll do in a moment, you only get that file, whether audio or video. If you take this route, you’ll have to mux the files together manually (click here for a useful tutorial).
Downloading the File List
Now let’s download a list of all audio/video files encoded by YouTube for this file and save it locally in a file called jan.txt. Use this command, adding the -F option and redirecting the output to jan.txt:
youtube-dl.exe -F https://youtu.be/hWbCSAcNOCU > jan.txt
If you’re trying to figure out how YouTube encodes the files that are uploaded to the service, this data is the Rosetta Stone, with all format, codec, resolution, and data rate information. You can see that in addition to the 720p MP4 file that youtube-dl downloaded above, there’s a WebM file encoded using VP9, which is file 247. Let’s download that.
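Once the list is saved, it can also be searched with a short script instead of by eye. The sketch below is one hypothetical way to do that in Python. It assumes the listing has a header row beginning with "format code", followed by one whitespace-separated row per format whose first two fields are the format code and the extension. That layout is an assumption and the function names are illustrative, so adjust the parsing if your version of youtube-dl prints something different.

```python
def parse_format_list(text):
    """Parse text saved from `youtube-dl -F` into a list of dicts.

    Assumed layout (not confirmed by the article): a header row that
    starts with "format code", then one row per format whose first two
    whitespace-separated fields are the format code and the extension.
    """
    formats = []
    in_table = False
    for line in text.splitlines():
        if line.lower().startswith("format code"):
            in_table = True  # rows after the header describe formats
            continue
        if not in_table or not line.strip():
            continue  # skip status lines before the table and blank lines
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        details = fields[2].strip() if len(fields) > 2 else ""
        formats.append({
            "code": fields[0],
            "ext": fields[1],
            "details": details,
            "video_only": "video only" in details,
            "audio_only": "audio only" in details,
        })
    return formats


def formats_with(formats, keyword):
    """Return the rows whose details mention keyword (case-insensitive)."""
    return [f for f in formats if keyword.lower() in f["details"].lower()]


def load_format_list(path):
    """Read a saved listing such as jan.txt and parse it."""
    # errors="replace" keeps the sketch from failing on odd console encodings.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return parse_format_list(handle.read())
```

With jan.txt in the current folder, formats_with(load_format_list("jan.txt"), "vp9") would return only the rows that mention VP9, each with its format code ready to pass to the download command.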
Downloading a Specific File
To download a specific file in the list, use the command string below.
youtube-dl.exe -f 247 https://youtu.be/hWbCSAcNOCU
This downloads the file shown in Figure 5, which, as noted, is video only. If you want to add audio, you’ll have to download that separately and then use FFmpeg to mux the two streams together.
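The mux step can be scripted as well. As a minimal sketch, the Python helper below builds an FFmpeg command that copies one video stream and one audio stream into a single file without re-encoding. The function names and file names are hypothetical, and running it assumes that ffmpeg is installed and on the path.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an FFmpeg command that combines one video and one audio file.

    "-c copy" copies both streams as they are, so nothing is re-encoded.
    """
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the video-only download
        "-i", audio_path,   # input 1: the audio-only download
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c", "copy",
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run the command; assumes ffmpeg is installed and on the path."""
    command = build_mux_command(video_path, audio_path, output_path)
    subprocess.run(command, check=True)
```

A call such as mux("video.webm", "audio.m4a", "muxed.mkv") uses placeholder names. The .mkv output is an assumption chosen because Matroska accepts most codec combinations; whether .webm or .mp4 also works depends on the codecs in the two files.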
To download the subtitles, use this command string. Note that write-auto-sub is YouTube only; check the documentation for commands for other services (which should be
youtube-dl.exe --write-auto-sub https://youtu.be/hWbCSAcNOCU
The program downloads the VTT file shown in Figure 6.
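The invocations covered so far differ only in their options, so they are easy to script. As a rough sketch, the hypothetical Python helper below assembles the same argument lists shown above and can hand them to subprocess. The function names and the executable default are illustrative, and only the -F, -f, and --write-auto-sub options from this article are used.

```python
import subprocess


def build_youtube_dl_command(url, list_formats=False, format_code=None,
                             auto_subs=False, executable="youtube-dl.exe"):
    """Assemble a youtube-dl argument list for the operations in this article."""
    command = [executable]
    if list_formats:
        command.append("-F")                     # list every available format
    if format_code is not None:
        command += ["-f", str(format_code)]      # download one specific format
    if auto_subs:
        command.append("--write-auto-sub")       # YouTube-only automatic subtitles
    command.append(url)
    return command


def run_and_capture(command):
    """Run a command and return what it printed; assumes the tool is on the path."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, build_youtube_dl_command("https://youtu.be/hWbCSAcNOCU", format_code=247) produces the same arguments as the command used to download file 247, and run_and_capture on the -F variant returns the file list as text rather than writing it to jan.txt.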
That’s as far as I got with the program, which was all I needed to complete my research. As noted in several of the referenced articles, I’m barely scratching the surface, and youtube-dl is capable of a whole lot more.
Develops training courses for streaming media professionals; provides encoding-related testing services to encoder developers; helps video producers perfect their encoding ladders and deploy new codecs. Jan blogs primarily at the Streaming Learning Center.