Learning Python: argparse module

If you process and analyze data with Python, sooner or later you will have to go beyond Jupyter Notebook and turn your code into scripts that can be run from the command line. This is where the argparse module comes in handy. For beginners who are used to Jupyter Notebook, that step means leaving the comfort zone and moving to a new environment. The material we are publishing today in translation was written to make that transition easier.

Argparse module

The argparse module can be compared to the forces of nature that raised mountain peaks above the clouds. Thanks to this module, a script can work with things that would otherwise remain hidden from its code.

Note that argparse is the module the Python standard library recommends for working with command-line arguments. I could not find a good beginner's guide to argparse, so I decided to write one myself.

Life outside Jupyter Notebook

When I first came across argparse in a Python script I needed for a side project, I thought, “What is this mysterious construct?” I quickly moved the code into Jupyter Notebook, but that turned out to be a poor decision.

I needed to be able to simply run the script rather than work with it through Jupyter Notebook. A standalone script built on argparse would have been much easier to use and to work on than one relying on Jupyter Notebook. But I was in a hurry, and when I looked at the argparse documentation I could not grasp its essence right away, so I did not use the original version of the script.

Since then I have figured argparse out, and I really like the module. I now consider it truly vital, and it is not that hard to master.

Why do we need the argparse module?

The argparse module lets you parse the arguments passed to a script when it is launched from the command line and use those arguments inside the script. In other words, you can hand the script some data at launch, and the script can use that data as it runs. The argparse module is a channel of communication between the author of a program and the person using it, for example between you writing a script today and you launching it tomorrow and passing something to it.

With argparse, when the script's behavior needs to change or some data needs to be passed to it, the user does not have to edit the program code, provided the script's author has allowed for this. As a result, scripts gain a degree of flexibility.

Example

Suppose you want to write a script that converts video files into ordinary images using the OpenCV library. To do its job, the script needs to know where the video files are stored and where the finished images should be placed. In other words, it needs two folder paths. These paths could be hard-coded in the script, which is not very convenient, or, much better, the user could supply them as command-line arguments when running the script. To give the script this ability, we need the argparse module. Here is what the part of the script that parses command-line arguments might look like (let's call the script videos.py):

 # videos.py
 import argparse
 parser = argparse.ArgumentParser(description='Videos to images')
 parser.add_argument('indir', type=str, help='Input dir for videos')
 parser.add_argument('outdir', type=str, help='Output dir for image')
 args = parser.parse_args()
 print(args.indir)

Here, the argparse module is imported at the top of the file. Then argparse.ArgumentParser() creates a parser object with a description. Next, the parser.add_argument() method declares the indir argument, which will hold the path to the folder with the video files. The call states that the argument is a string and sets its help text. The outdir argument is declared the same way; it will hold the path to the folder where the script should put the images created from the video files. In the next step, the result of parsing the command-line arguments is stored in the args variable. Whatever is passed to the script at startup is now available as the indir and outdir attributes of the args object, and you can work with these values. In this case we simply print the value of the indir argument to the console.

Here's how to run this script from the command line:

 python videos.py /videos /images

Note that the strings /videos and /images do not need to be enclosed in quotes. Run this way, the script prints /videos to the terminal, confirming that the arguments passed to it can be used in your code. This is the magic of argparse in action.
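When experimenting, for instance in a notebook cell or a test, it can help to know that parse_args() also accepts an explicit list of strings in place of the real command line. The sketch below reuses the parser from videos.py; the list passed to parse_args() is only a stand-in for what the shell would supply.

```python
import argparse

parser = argparse.ArgumentParser(description='Videos to images')
parser.add_argument('indir', type=str, help='Input dir for videos')
parser.add_argument('outdir', type=str, help='Output dir for image')

# Stands in for running: python videos.py /videos /images
args = parser.parse_args(['/videos', '/images'])

print(args.indir)   # /videos
print(args.outdir)  # /images
```

Calling parse_args() with no list, as videos.py does, makes argparse read the arguments from sys.argv instead.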

Details about argparse

We have just reviewed a simple example of working with argparse. Now let's discuss some of the details regarding argparse.

▍ Positional arguments

A call such as parser.add_argument('indir', type=str, help='Input dir for videos') from the videos.py script creates a positional argument. When the script is called, the order in which such arguments are given matters: the first value passed to the script becomes the first positional argument, and the second value becomes the second positional argument.
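A small sketch of why the order matters. It assumes the same two positional arguments as videos.py and feeds parse_args() explicit lists (an illustration device, not part of the original script) so that both orderings can be compared side by side.

```python
import argparse

parser = argparse.ArgumentParser(prog='videos.py')
parser.add_argument('indir')
parser.add_argument('outdir')

# Values are assigned strictly by position, not by what they look like
as_intended = parser.parse_args(['/videos', '/images'])
swapped = parser.parse_args(['/images', '/videos'])

print(as_intended.indir, as_intended.outdir)  # /videos /images
print(swapped.indir, swapped.outdir)          # /images /videos
```

argparse has no way of knowing which folder is which, so swapping the values on the command line silently swaps their meaning.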

What happens if the script is run without any arguments at all by running the python videos.py command in the terminal?

In this case, the following error message will be displayed:

 videos.py: error: the following arguments are required: indir, outdir

In other words, positional arguments are required by default: a script that declares them cannot be run without them.
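Under the hood, argparse reports a missing positional argument by printing the usage line and the error to stderr and then exiting the program. The sketch below makes that visible by catching the resulting SystemExit; passing an empty list to parse_args() is an assumption used here to simulate running the script with no arguments.

```python
import argparse

parser = argparse.ArgumentParser(prog='videos.py')
parser.add_argument('indir')
parser.add_argument('outdir')

try:
    parser.parse_args([])  # simulates: python videos.py
    exit_code = None
except SystemExit as exc:
    # argparse has already written the usage and error text to stderr
    exit_code = exc.code

print(exit_code)  # 2
```

In a real script you would normally not catch this exception; letting the program exit with the error message is the expected behavior.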

▍ Optional arguments

What happens if we run our script with the python videos.py --help command?

In response, information about the script is displayed. This is exactly the information about the positional arguments that we provided when declaring them:

 usage: videos.py [-h] indir outdir

 Videos to images

 positional arguments:
   indir       Input dir for videos
   outdir      Output dir for image

 optional arguments:
   -h, --help  show this help message and exit

The script told us a lot about what it expects from the user. The --help flag itself is an example of an optional argument. Note that --help (or -h) is the only optional argument argparse provides out of the box; if you need others, you create them yourself. (Starting with Python 3.10, the help output titles this section "options:" instead of "optional arguments:".)
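The help text is assembled from the description and the help strings passed to add_argument(). As a rough illustration, the same text can be obtained as a string with the parser's format_help() method. The prog='videos.py' value is set explicitly here only so that the usage line matches the article when the snippet is run outside the script file.

```python
import argparse

parser = argparse.ArgumentParser(prog='videos.py',
                                 description='Videos to images')
parser.add_argument('indir', type=str, help='Input dir for videos')
parser.add_argument('outdir', type=str, help='Output dir for image')

# The same text that "python videos.py --help" would print
help_text = parser.format_help()
print(help_text)
```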

Optional arguments are created in the same way as positional ones. The main difference in the commands that create them is that the names of such arguments begin with the character sequence --, or, for short forms, with a single - character. For example, an optional argument can be created like this:

 parser.add_argument('-m', '--my_optional')
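A quick sketch of how the short and long forms behave. Both spellings fill the same attribute, which argparse names after the long form, and an optional argument that is omitted and has no default is set to None. The value 'hello' is an arbitrary placeholder.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-m', '--my_optional')

short_form = parser.parse_args(['-m', 'hello'])
long_form = parser.parse_args(['--my_optional', 'hello'])
omitted = parser.parse_args([])

print(short_form.my_optional)  # hello
print(long_form.my_optional)   # hello
print(omitted.my_optional)     # None
```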

Here is an example of how to create and use optional arguments. Note that in declaring the optional argument we specified its type as int, that is, an integer. Other Python types can be used here as well.

 # my_example.py
 import argparse
 parser = argparse.ArgumentParser(description='My example explanation')
 parser.add_argument(
     '--my_optional',
     type=int,
     default=2,
     help='provide an integer (default: 2)'
 )
 my_namespace = parser.parse_args()
 print(my_namespace.my_optional)

The argument declared as --my_optional is available in the program as the my_optional attribute of the my_namespace object.

Optional arguments can be given default values. In our case, if the my_example.py script is not given a value for --my_optional, the argument takes the value 2, which is printed to the console. To set the value of this argument when launching the script, use the following form:

 python my_example.py --my_optional=3
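The following sketch exercises the same argument definition as my_example.py with three inputs: no value, the --name=value form, and the space-separated form, which argparse also accepts. The explicit lists passed to parse_args() are stand-ins for the command line.

```python
import argparse

parser = argparse.ArgumentParser(description='My example explanation')
parser.add_argument('--my_optional', type=int, default=2,
                    help='provide an integer (default: 2)')

default_case = parser.parse_args([])                     # python my_example.py
equals_form = parser.parse_args(['--my_optional=3'])     # --my_optional=3
space_form = parser.parse_args(['--my_optional', '3'])   # --my_optional 3

print(default_case.my_optional)      # 2
print(equals_form.my_optional)       # 3
print(type(space_form.my_optional))  # <class 'int'>
```

Because of type=int, the string typed on the command line is converted before it reaches your code, so the value can be used in arithmetic right away.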

What else can you use argparse for?

The argparse module can be used when developing Python applications that will be packaged in Docker containers. For example, if a script needs command-line arguments while the image is being built, you can pass them in the Dockerfile with the RUN instruction. To pass arguments to a script when the container runs, use the CMD or ENTRYPOINT instructions. Details about Dockerfiles can be found in the Docker documentation.

Results

We have looked at the basic ways of working with the argparse module, which lets you give your scripts the ability to accept and process command-line arguments. The capabilities of argparse do not end there. For example, the nargs parameter lets an argument accept a list of values, while the choices parameter restricts an argument to a fixed set of allowed values. Having mastered the basics, you can now explore the module in more depth using its documentation.
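As a brief taste of nargs and choices, here is a minimal sketch. The --frames and --format options are hypothetical and do not come from the videos.py script; they are used only to show the two parameters.

```python
import argparse

parser = argparse.ArgumentParser(prog='videos.py')
# Hypothetical options, introduced only for this illustration
parser.add_argument('--frames', type=int, nargs='+', default=[])
parser.add_argument('--format', choices=['png', 'jpg'], default='png')

args = parser.parse_args(['--frames', '10', '20', '30', '--format', 'jpg'])
print(args.frames)  # [10, 20, 30]
print(args.format)  # jpg

try:
    parser.parse_args(['--format', 'bmp'])  # not among the allowed choices
    rejected = False
except SystemExit:
    # argparse prints an "invalid choice" error to stderr and exits
    rejected = True
print(rejected)  # True
```

With nargs='+' the option collects one or more values into a list, and choices makes argparse reject anything outside the listed set before your code ever sees it.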

If you are used to working with Jupyter Notebook and want to move away from it, it is also worth reading up on environment variables and on repo2docker, a tool that converts Jupyter Notebook repositories into Docker images.

Dear readers! How do you work with command line arguments in Python scripts?

Source: https://habr.com/ru/post/440654/