Compressing mp4 files, listing directory contents, and copying files to a remote server in Python

Bear with me.

In this tutorial, I’m going to create a ridiculously specialized app that might not have relevance to anyone but me. But, as I describe how I programmed the application, hopefully I’ll communicate the basics of media compression and file system operations in Python effectively enough that you can take these lessons and apply them to whatever you want to create.

In this article, I create an application that takes mp4 files and compresses them further before posting them to a web server. As a teacher who records his lectures, I need to do this. I have a device that records my lectures straight to mp4, capturing whatever I put on my computer screen and whatever I say into the mic, but the resulting mp4s are not as small as they could be. I seldom include full-motion video in my lectures, and the video changes far more slowly than the audio. So, I should be able to make do with far fewer frames per second than my recording device assumes when it produces the mp4. It is important that I compress the mp4 recordings to a reasonable size without compromising the quality I need, because I post these recordings on a website so that my students can watch them later. If the files are too big, they will take too long to download and watch, even if students view them as the video continues to buffer. So, I needed a solution that could reduce the recordings to a smaller size, sacrificing the video quality that I do not need for my application.

My first thought was to use a tool like Camtasia, which I’ve used for a long time to record and produce my lectures. But, as wonderful as Camtasia is, it doesn’t have an API that would allow me to automate it, and so I’d be doing a lot of manual work in Camtasia to compress the lectures I record. Since I typically give 4 or 5 lectures per day, using Camtasia seemed too inconvenient. So, I started exploring more automated approaches that didn’t depend on Camtasia, and I eventually decided to write my own.

Since I was writing my own application, I then realized I could automate the rest of the use case, too. After I record and compress the lecture recordings, I upload them to our university department’s web server. I have always done this manually using WinSCP. Why not automate that, too, while I was at it? And so I did, and you’ll see how in this post.

I’m a big fan of Python, mostly because it has a vibrant developer community that has helped produce an unrivaled collection of libraries that enable one to add features to applications quickly. So, I wrote the app I’m presenting here in Python.

To compress the mp4s, I used the fabulous ffmpy library. ffmpy is a Python library that provides access to the ffmpeg command-line utility. ffmpeg is a command-line application that can perform many different kinds of transformations on video files, including video compression, which is one of its most common uses. The resource https://unix.stackexchange.com/questions/28803/how-can-i-reduce-a-videos-size-with-ffmpeg provides some examples of using ffmpeg to compress a video file. The most helpful example presented on that page is this one:

ffmpeg -i input.mp4 -vcodec libx264 -crf 20 output.mp4

Notice the format of this instruction. The -i option that follows ffmpeg specifies the name of the input file. After the input file name, the command specifies output options. In this case, the output options specify the video codec to use and the constant rate factor, or CRF, to use to compress the video file. The CRF in this case is 20. The libx264 encoder accepts CRF values from 0 to 51, but values between about 18 and 24 are a sensible working range; a higher number compresses the output to a smaller size at the cost of some quality. The last part of the instruction is the name of the resulting compressed file.

Note that ffmpeg has nothing to do with Python. If you install ffmpeg on your computer from https://www.ffmpeg.org/download.html, you’ll be able to run this command at the command line to compress files in this way.
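Because ffmpeg is an ordinary executable, a script can check that it is installed before trying to use it. Here is a minimal sketch that uses only Python’s standard library. The helper names ffmpeg_available and build_ffmpeg_args are hypothetical (they are not part of ffmpeg or ffmpy), and the sketch assumes the executable is simply called ffmpeg and lives somewhere on your PATH.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the named executable can be found on the PATH."""
    return shutil.which(executable) is not None


def build_ffmpeg_args(input_name, output_name, crf=20, executable="ffmpeg"):
    """Build the argument list for the command shown above.

    Equivalent to: ffmpeg -i input.mp4 -vcodec libx264 -crf 20 output.mp4
    """
    return [executable, "-i", input_name,
            "-vcodec", "libx264", "-crf", str(crf),
            output_name]


if __name__ == "__main__":
    if ffmpeg_available():
        # Only print the command; nothing is compressed here.
        print(" ".join(build_ffmpeg_args("input.mp4", "output.mp4")))
    else:
        print("ffmpeg was not found on the PATH.")
```

If you wanted to run the command without ffmpy, the list returned by build_ffmpeg_args could be handed to subprocess.run(args, check=True).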

In this example, however, we want to compress files not by typing commands at the command line but by issuing instructions from a Python script we write. To do that, we need to use the ffmpy library. A good resource on the ffmpy library is https://pypi.python.org/pypi/ffmpy/0.0.4.

First, we need to install the ffmpy library. Assuming you’ve already installed Python, install ffmpy by issuing the following command at the command prompt.

pip install ffmpy

Once you’ve installed it, you can use it in a Python program. To do so, you first have to import ffmpy. Then, once you’ve set the names of the input and output files and the video codec and constant rate factor to use, you need to set up two dictionaries, one describing the options to apply to the input file, and a second identifying the options to apply to the output file. For this example, we don’t have any options to apply to the original input file (we just want to use it), but we have to specify the video codec and CRF to use for the resulting output file. If our input and output file names are stored in variables input_name and output_name, then we could set up a dictionary called inp to store the parameters to apply to the input file, and a dictionary called outp to apply to the output file. Here is the code:

inp = {input_name: None}
outp = {output_name: '-vcodec libx264 -crf %d' % crf}

We pass these dictionaries to the FFmpeg function defined in the ffmpy library to create an FFmpeg object. Once the object is created, we ask it to execute its run function, which actually does the compression.

ff=ffmpy.FFmpeg(inputs=inp,outputs=outp)
ff.run()

For debugging purposes, it might be handy to see what corresponding command-line instruction the FFmpeg object is executing. You can do so by printing the value of ff.cmd.

Here’s the full code to take one mp4 file and compress it using a user-specified constant rate factor:

import ffmpy
input_name = input("Enter name of input file: ")
crf = int(input("Enter constant rate factor between 18 and 24: "))
output_name = input("Enter output file name: ")
inp = {input_name: None}
outp = {output_name: '-vcodec libx264 -crf %d' % crf}
ff = ffmpy.FFmpeg(inputs=inp, outputs=outp)
print(ff.cmd)  # just to verify that it produces the correct ffmpeg command
ff.run()
print("done!")
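The script above trusts whatever the user types for the constant rate factor. As a small, optional refinement, the input could be checked before it is handed to ffmpeg. The helper below is a sketch only: parse_crf is a hypothetical name, and the default limits of 18 and 24 are simply the range the prompt asks for.

```python
def parse_crf(text, low=18, high=24):
    """Convert user input to an integer CRF, rejecting out-of-range values."""
    try:
        crf = int(text.strip())
    except ValueError:
        raise ValueError("CRF must be a whole number, got %r" % text) from None
    if not low <= crf <= high:
        raise ValueError("CRF must be between %d and %d, got %d" % (low, high, crf))
    return crf
```

In the script, the line that reads the CRF would then become crf = parse_crf(input("Enter constant rate factor between 18 and 24: ")).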

In my application, however, there are usually multiple mp4s that need to be compressed. These could be all the mp4s I recorded during a day of teaching, for example, and I want to compress them all and then upload them to a remote server. To do this, we need to list all the files in a particular folder and then, for each of them, execute the instructions we just learned to compress mp4s. Once all the files are compressed, we then need to upload them to a remote server.

A very easy way to list all the mp4 files in a particular directory is to use Python’s glob library:

import glob
file_names = glob.glob("%s\\*.mp4" % the_dir)

The value of the_dir specifies the directory whose mp4 files we want to list. The resulting file_names will be a Python list, and we can move through this, one at a time, using a for loop. Inside that for loop, we’ll execute code to compress each.

for f in file_names:
    print("Compressing", f)
    # run the ffmpy compression code shown above on f here
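To make the loop a little more concrete, here is a sketch that lists the mp4 files and pairs each one with a suggested output name. It is not the author’s code: the function names and the "_small" suffix are assumptions made for illustration (the application described here asks the user for each output name instead). It also assumes the_dir is a full directory path such as "d:\\" or "/home/me/lectures", since os.path.join treats a bare drive letter like "d:" as relative to the current directory on that drive.

```python
import glob
import os


def list_mp4s(the_dir):
    """Return the sorted paths of the .mp4 files directly inside the_dir."""
    return sorted(glob.glob(os.path.join(the_dir, "*.mp4")))


def default_output_name(input_path, suffix="_small"):
    """Suggest an output path next to the input, e.g. a.mp4 -> a_small.mp4."""
    root, ext = os.path.splitext(input_path)
    return root + suffix + ext


def plan_compression(the_dir, suffix="_small"):
    """Pair each input file with a suggested output file.

    Files that already carry the suffix are skipped so that a second
    run does not try to compress earlier output again.
    """
    pairs = []
    for f in list_mp4s(the_dir):
        if os.path.splitext(f)[0].endswith(suffix):
            continue
        pairs.append((f, default_output_name(f, suffix)))
    return pairs
```

Each (input, output) pair returned by plan_compression could then be fed to the ffmpy code shown earlier.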

As we compress each file, we might want to delete the original. Most file system tasks are handled by the os library in Python, which we can import for use in our own programs. For example, in os, there is a remove function which takes the name of the file we want to remove. So, to remove a file called junk.txt, for example, we could issue the following command in Python:

import os
os.remove("junk.txt")
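Note that os.remove raises a FileNotFoundError if the file is not there. If that would be an inconvenience in your own script, one option is a small wrapper like the following sketch; remove_if_present is a hypothetical helper name, not something the os library provides.

```python
import os


def remove_if_present(path):
    """Delete path and return True, or return False if it does not exist."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False
```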

Finally, how about transferring our reduced videos to a remote web server? To do that, we need to import Python’s ftplib library into our program. Once we have access to ftplib, we can create a session using ftplib’s FTP class, which takes the name of the server, the username, and the user’s password and connects to the remote server. Once the connection is established, we can use the resulting session’s storbinary method to send the targeted file to the remote location.

Here is the code to set up the connection in the first place:

import ftplib
try:
    session = ftplib.FTP(config["server"], config["user"], config["passwd"])
    print("Done connecting to remote server.")
except ftplib.all_errors:
    print("Could not connect to the remote server.")

Assuming that worked and a connection has been established, we can send a file to the remote server as follows. Let’s assume the mp4 file’s name is stored in a variable called outfile, and the desired remote location’s name is stored in a variable called remote_file. Then, the following code will copy the local mp4 file to the remote location:

try:
    # the with statement closes the file even if the upload fails
    with open(outfile, "rb") as compressed:
        session.storbinary("STOR %s" % remote_file, compressed)
except ftplib.all_errors:
    print("An error occurred while uploading.")
session.quit()
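When several files need to go up in one sitting, it can be tidier to wrap these steps in functions and connect only once. The following is a sketch under a few assumptions: upload_file and upload_all are hypothetical names, the remote file simply keeps the local file’s base name, and the server accepts the remote_dir path as given.

```python
import ftplib
import os


def upload_file(session, local_path, remote_dir=None):
    """Send one local file over an open FTP session.

    session is expected to behave like ftplib.FTP (cwd and storbinary).
    Returns the name used for the file on the remote side.
    """
    if remote_dir:
        session.cwd(remote_dir)
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as compressed:
        session.storbinary("STOR %s" % remote_name, compressed)
    return remote_name


def upload_all(server, user, passwd, local_paths, remote_dir=None):
    """Connect once, upload every file, and close the session afterwards."""
    # ftplib.FTP works as a context manager and closes the connection on exit.
    with ftplib.FTP(server, user, passwd) as session:
        if remote_dir:
            session.cwd(remote_dir)
        for path in local_paths:
            upload_file(session, path)
```

Keeping the session as a parameter of upload_file also makes the function easy to try out with a stand-in object before pointing it at a real server.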

Let’s now put it all together. The following code is available at https://github.com/klumpra/lecture_recording_processor

This is the application that helps me post appropriately compressed versions of my lecture recordings with as little manual intervention as possible. The program reads a variety of configuration settings from a config file whose contents look like this:

dir=d:
crf=24
server=my.server.name
user=myusername
passwd=mypassword
folder=~/public_html/somesubdirectory

The lines of the configuration file are name-value pairs. The configuration setting appears to the left of the equal sign, and its corresponding value appears to the right. Here are what the various parameters mean:

  • dir: where the original mp4 files appear, and where the reduced ones will be written
  • crf: the constant rate factor, typically between 18 and 24, with higher values leading to tighter compression (smaller files, lower quality)
  • server: the ip address or host name of where the files should be posted
  • user: the username for the account for logging into the server
  • passwd: the password for the account for logging into the server
  • folder: where the files should be placed on the remote server
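A file in this format is simple to parse by hand. The function below is a minimal sketch of one way to do it and is not necessarily how the author’s program reads its settings; read_config is a hypothetical name. Every value comes back as a string, so crf would still need to be converted with int.

```python
def read_config(path):
    """Parse name=value lines into a dictionary of strings."""
    config = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or "=" not in line:
                continue  # skip blank or malformed lines
            # Split on the first '=' only, so values may contain '='.
            name, value = line.split("=", 1)
            config[name.strip()] = value.strip()
    return config
```

With the sample file above, read_config would return a dictionary whose "dir" entry is "d:" and whose "crf" entry is the string "24".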

The code (available in the repository linked above) first reads this configuration file. It then lists all the mp4 files in the dir directory and asks the user for the name of the compressed file to produce from each. It then compresses the files. After that, it uploads them to the remote directory. Once that is done, it asks the user if he or she wants to delete the original and compressed files.

 

Sometimes it helps to remember the particular use case for which code was written. In my situation, I record several mp4s of lectures each day. I wanted a way to compress these further and then upload them automatically to our department web server. I’d then like to minimize disk usage by at least having the option to delete the original and/or compressed files once the compressed files have been copied to the remote server. This code does that.

Like I said, this was an extremely specialized application. But, from it, you can see how to use ffmpeg to compress files, how to list the contents of directories, how to remove files, and how to copy files to a remote server. I’m sure you have an application in mind that could make use of these lessons. Happy coding!

About Ray Klump

Professor and Chair of Mathematics and Computer Science; Director, Master of Science in Information Security; Lewis University. http://online.lewisu.edu/ms-information-security.asp, http://online.lewisu.edu/resource/engineering-technology/articles.asp, http://cs.lewisu.edu. You can find him on Google+.

2 thoughts on “Compressing mp4 files, listing directory contents, and copying files to a remote server in Python”

  1. daveclark966
    August 24, 2019 at 4:16 am

    To compress MP4, a professional MP4 compressor, like Avdshare Video Converter, is needed.

  2. Craig Coffman
    June 19, 2019 at 7:32 pm

    This is awesome. It is technically over my head, but this is something I am looking to do on a project. I was not sure how to describe it to my developer in ‘his language.’ This is a great outline for him to follow.

    A question about the FTP. Is it possible to upload to YouTube as well? Much like you mention, deleting is a great way to reduce overhead. I would like a cold storage as backup, but YouTube to be the ‘live’ setting for viewing. While this is perhaps beyond the scope of your tutorial, it seems like this should be possible, correct? I am not expecting you to update the code, but if it is possible I feel I can set him on it.

    Thank you in advance and for your write up.

    – Craig Coffman
