Introduction
Okay, so I had already encountered ffmpeg
early in my career. And now, although I have moved away from that industry, I still love to do some physics simulation for fun.
If you have worked with 3D software before, you might know that we render the output as an image sequence, or simply, frames. To do anything with my simulation, I need to convert that sequence of images into a video.
What is ffmpeg?
ffmpeg
is one of the industry-standard transcoders. According to Wikipedia:
FFmpeg is a free and open-source software project consisting of a suite of libraries and programs for handling video, audio, and other multimedia files and streams. At its core is the command-line ffmpeg tool itself, designed for processing of video and audio files. It is widely used for format transcoding, basic editing (trimming and concatenation), video scaling, video post-production effects and standards compliance (SMPTE, ITU).
Let’s not get into jargon now. Just understand that we can use ffmpeg to convert media from one codec or container to another. We can also use it to convert an image sequence into a video.
Basic commands to get comfortable
1. See metadata about any video/audio/image:
$ ffmpeg -i story.mp4
It will print the file’s metadata, preceded by a banner about the ffmpeg build itself.
If you want to avoid the initial banner about the ffmpeg build itself, you can do so by appending -hide_banner
to the command. I actually like to create an alias for it in my ~/.bashrc file.
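For reference, the alias in my ~/.bashrc looks something like this (the alias name is just my own pick, call it whatever you like):

```shell
# ~/.bashrc -- always hide the build banner
alias ffm='ffmpeg -hide_banner'
```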
$ ffmpeg -i story.mp4 -hide_banner
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/santosh/Videos/249774011_924906985088194_8434993645291029887_n.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.20.100
Duration: 00:00:13.78, start: 0.000000, bitrate: 797 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p(tv, bt709), 480x854, 744 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 48 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
At least one output file must be specified
In the output you can see the video has one video stream and one audio stream. The video is encoded with h264
and the audio with aac
.
For the video stream, you can also see that the resolution is 480x854 and the bitrate is 744 kb/s. It’s also a 30 fps video. The audio is sampled at 48000 Hz, and it’s stereo.
2. Convert video from one format to another
You can convert from one format to another simply by specifying the output file name with the new extension. For more granularity, you can even specify the encoders to use.
ffmpeg -i story.mp4 -c:v copy -c:a libvorbis story.avi
In the above example, we take story.mp4 and convert it into story.avi. Let’s interpret the other two parameters. -c:v copy -c:a libvorbis
can be split into two parts.
-c:v
specifies the codec for the video stream. We pass copy
, which means the video stream is copied from the source as-is, without re-encoding. For the audio, we re-encode with libvorbis
.
You can find the entire list on the ffmpeg codecs documentation.
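As a sketch of the more granular form, here both streams are re-encoded instead of copied (libx264 and aac are common choices, assuming your ffmpeg build includes them):

```shell
# Re-encode video with H.264 and audio with AAC,
# instead of copying the source streams
ffmpeg -i story.mp4 -c:v libx264 -c:a aac story_reencoded.mp4
```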
3. Separate audio/video from a video
There could be a moment when you want to extract the audio from a video, e.g. the audio track of a music video from YouTube. With ffmpeg
you can achieve that with the following command:
ffmpeg -i video.mp4 -vn audio.mp3
Now for -vn
, you can memorize it as video: no.
Similarly, -an stands for audio: no. The resultant file will be the video with no audio.
ffmpeg -i video.mp4 -an video_noaudio.mp4
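One note: -vn audio.mp3 re-encodes the audio to MP3. If you only want to pull out the audio stream untouched, you can copy it into a matching container (here I assume the source audio is AAC, as in the earlier metadata dump):

```shell
# Extract the audio stream without re-encoding
# (.m4a matches the AAC codec shown in the metadata)
ffmpeg -i video.mp4 -vn -c:a copy audio.m4a
```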
4. Convert video into sequence of images
ffmpeg -i video.mp4 image%d.jpg
This is the simplest kind of conversion. At the start of the command, as always, we take the video input. For the output, we write images in the pattern image%d.jpg
. Note that the result will be a sequence of images, not a single image. We put a %d
between the sequence name and the file extension. In this case, the image names will go from image1.jpg to imageN.jpg.
You can modify the image name like the following:
ffmpeg -i video.mp4 image%04d.jpg
The resulting images will now start with image0001.jpg. Note that the number is padded with leading zeros.
If you want to take this further, you can also do this:
ffmpeg -i video.mp4 -r 24 -f image2 image-%04d.png
The -r 24
here is the output frame rate. Without it, ffmpeg writes one image per input frame, at the input’s own frame rate. The output format is forced with -f image2
.
image2 is ffmpeg’s image-sequence muxer, which handles numbered jpg/png files like these. ffmpeg usually infers it from the filename pattern, so the flag is optional here.
5. Cut media files
If you want a 1-minute chunk from the middle of a 6-minute video, you should consider something like this:
ffmpeg -i video.mp4 -ss 00:03:00 -to 00:04:00 video_out.mp4
The resultant video will start from the 3rd minute and go till the 4th minute of the source video.
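The command above re-encodes the clip. If frame-accurate cuts are not critical, you can cut much faster by seeking before the input and copying the streams:

```shell
# -ss before -i seeks quickly to the nearest keyframe;
# -t 60 keeps 60 seconds; -c copy avoids re-encoding entirely
ffmpeg -ss 00:03:00 -i video.mp4 -t 60 -c copy video_out_copy.mp4
```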
6. Record screen/webcam/audio
You can capture all of those with ffmpeg, yes.
6.1 Let’s start with the screen recording first
The simplest way to capture your screen is:
ffmpeg -f x11grab -s 1600x900 -i :0.0 output.mp4
Here, the input device is x11grab
(the X11 screen grabber; it is not an encoder). -s
is the capture size, in my case 1600x900
. The input is quite cryptic here. :0.0
actually means display 0, screen 0, i.e. the first screen of my X session. For those with a single-monitor setup, this command works as-is; with multiple monitors you usually grab a region of the combined desktop by adding an offset.
A more advanced version of this is to capture only part of the screen. For that, you lower the capture size, and append the top-left coordinate of the region to the input. The input part then looks like -i :0.0+100,200
, which takes the top-left point to be 100 in x and 200 in y. If you also pass -s 500x500
, the region extends 500 pixels in both x and y from that point. That gives the bottom-right corner of the recording.
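Put together, a region capture looks like this (coordinates and size are just the example values from above):

```shell
# Capture a 500x500 region whose top-left corner
# is at (100, 200) on the first screen
ffmpeg -f x11grab -s 500x500 -i :0.0+100,200 region.mp4
```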
6.2 Capture audio
The most basic command:
ffmpeg -f alsa -i default output.mp3
But what if you want to capture screen, as well as audio?
ffmpeg -f x11grab -s 1600x900 -i :0.0 -f alsa -ac 2 -i default output.mkv
The -ac
above sets the number of audio channels, here 2 (stereo).
6.3 Capture webcam
Capturing the webcam is quite simple.
ffmpeg -i /dev/video0 output.mkv
This simply captures the webcam, which is /dev/video0
in Linux.
But what I noticed in my case is that, by default, it does not capture the full resolution my webcam supports. For that, I had to use the following command.
ffmpeg -f v4l2 -framerate 24 -video_size 1280x720 -i /dev/video0 webcam.mkv
Here I’m explicitly setting the -video_size
. -framerate
(or -r
) is also set, though I think the device would have delivered 24 by default anyway. I have also specified the input format, v4l2 (Video4Linux2); note that it is the capture device, not an encoder. By the way, I haven’t seen any difference without it, since ffmpeg detects it automatically for /dev/video0.
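To find out which resolutions and pixel formats your webcam actually supports, the v4l2 device can list them (this queries the device, so the output varies by camera):

```shell
# List the formats /dev/video0 advertises
ffmpeg -f v4l2 -list_formats all -i /dev/video0
```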
7. Convert sequence of images into video/gif
cat my_photos/* | ffmpeg -framerate 24 -f image2pipe -i - -c:v copy video.mkv
Here the images in my_photos/ are piped into ffmpeg, which copies them as frames into an mkv at 24 fps. If I simply drop the framerate low enough, I am effectively making a slideshow.
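The heading also promises a gif; here is a minimal sketch (the filenames assume the image%04d.jpg sequence from section 4):

```shell
# Build an animated gif from a numbered image sequence;
# scale keeps the file small (-1 preserves the aspect ratio)
ffmpeg -framerate 10 -i image%04d.jpg -vf scale=480:-1 output.gif
```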
Conclusion
FFmpeg is a free and open-source command-line tool used for processing and converting multimedia files. It can handle various types of audio, video, and image formats and can perform tasks such as converting one format to another, resizing, cropping, adding subtitles, and more. Essentially, it’s a powerful tool for working with media files.
FFmpeg is the backend to many applications. If you are using a GUI for video editing or screen recording, there is a good chance that it is running ffmpeg behind the scenes.