Python - Argparse
Author(s): Helena Rasche
Editor(s): Bazante Sanders
Tester(s): Donny Vrins
Overview
Questions:
- How do I make a proper command line script?
- How do I use argparse?
- What problems does it solve?
Objectives:
- Learn how sys.argv works
- Write a simple command line program that sums some numbers
- Use argparse to make it nicer.
Time estimation: 30 minutes
Level: Intermediate
Last modification: Oct 18, 2022
argparse is an argument parsing library for Python that’s part of the stdlib. It lets you make command line tools significantly nicer to work with.
Unlike previous modules, this lesson won’t use a Jupyter/CoCalc notebook, because we’ll be parsing command lines! You’ll need to open a code editor on your platform of choice (nano, vim, emacs, and VSCode are all options) and use the following blocks of code to construct your command line tool.
sys.argv
Whenever you run a Python script on the command line, it has a special variable available to it named argv, which lives in the sys module. This is a list of all of the arguments used when you run a command line program.
Hands-on: Print out argv
- Create / open the file run.py in your text editor of choice.
- There we’ll create a simple Python script that:
  - imports sys, the system module needed to access argv.
  - prints out sys.argv

    import sys
    print(sys.argv)

Run this with different command line arguments:

    python run.py
    python run.py 1 2 3 4
    python run.py --help
Question: What did you notice about the output?
There are two main points.
- The name of the script (run.py) is included as the first value every time.
- All of the arguments are passed as strings, not numbers.
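To make those two observations concrete, here is a small sketch that splits an argv-style list into the script name and the remaining arguments. The helper name split_argv is made up for this illustration and is not part of Python.

```python
import sys


def split_argv(argv):
    """Split an argv-style list into (script_name, arguments).

    split_argv is a hypothetical helper for this illustration only.
    """
    script_name, *arguments = argv
    return script_name, arguments


if __name__ == "__main__":
    script_name, arguments = split_argv(sys.argv)
    print(f"script name: {script_name}")
    for value in arguments:
        # Every value arrives as a str, even if it looks like a number.
        print(f"{value!r} has type {type(value).__name__}")
```

Saved as run.py and run as `python run.py 1 2`, this should report both '1' and '2' as having type str.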
Simple tasks
Let’s sum up all of the numbers passed on the command line. We’ll do this by hand, and then we’ll replace it with argparse to see how much effort that saves us.
Hands-on
Update your script to sum up every number passed to it on the command line.
It should handle:
- 1 or more numbers
- nothing (and maybe print out a message?)
- invalid values (print out an error message that the value couldn’t be processed.)
Hints:
- Skip the program name
- Use try and except to try converting the string to a number.
Question: How does your updated script look?

    import sys

    result = 0
    if len(sys.argv) == 1:
        # Only the script name is present, so no numbers were given.
        print("no arguments were supplied")
    else:
        # Skip sys.argv[0], which is the script name.
        for arg in sys.argv[1:]:
            try:
                result += float(arg)
            except ValueError:
                # float() raises ValueError for strings that are not numbers.
                print(f"Could not parse {arg}")
        print(result)
Argparse
Argparse saves us a lot of work, because it can handle a number of things for us!
- Ensuring that the correct number of arguments is provided (and printing a nice error message otherwise)
- Ensuring that the correct types of arguments are provided (no strings for a number field)
- Providing a help message describing your program
Argparse is used as follows. First we need to import it:
import argparse
And then we can define a ‘parser’ which will parse our command line. Additionally we can provide a description field which tells people what our tool does:
parser = argparse.ArgumentParser(description='Process some integers.')
And finally we can define some arguments that are available. Just like we have arguments to functions, we have arguments to command lines. These come in two flavours:
- required (without a --)
- optional “flags” (prefixed with --)
Here we have an argument named ‘integers’, which validates that all input values are of the type int. nargs is the number of arguments, and + means ‘1 or more’. And we have some help text as well:
parser.add_argument('integers', type=int, nargs='+',
help='an integer for the accumulator')
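As a quick check of how type=int and nargs='+' behave, the sketch below parses a hand-written list of strings rather than the real command line. The example values are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', type=int, nargs='+',
                    help='an integer for the accumulator')

# Passing a list to parse_args() parses that list instead of sys.argv[1:],
# which is handy for experimenting.
args = parser.parse_args(['1', '2', '3'])
print(args.integers)  # [1, 2, 3]
```

Each string has been converted to an int, and because of nargs='+' the values are collected into a list even when only one is supplied.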
We can also define an optional flag, here called --sum. Its value goes to a destination named ‘accumulate’, the name we’ll use to access the value of this argument. It has an action of ‘store_const’, which stores a fixed value (the const) whenever the flag is supplied.
The const parameter is set to sum, which is actually the function sum(); this is what the value will be if we run the command with --sum. Otherwise it will default to the function max(). We again have some help text to tell us how it behaves:
parser.add_argument('--sum', dest='accumulate', action='store_const',
const=sum, default=max,
help='sum the integers (default: find the max)')
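To see the effect of const and default in isolation, this sketch defines only the flag and parses two hand-written argument lists, one without --sum and one with it.

```python
import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

without_flag = parser.parse_args([])
with_flag = parser.parse_args(['--sum'])

# The stored value is the function object itself, not the result of calling it.
print(without_flag.accumulate is max)  # True
print(with_flag.accumulate is sum)     # True
```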
Finally we parse the arguments, which reads sys.argv and processes it according to the above rules. The output is stored in args.
args = parser.parse_args()
We have two main variables we can use now:
args.integers # A list of integers.
args.accumulate # Actually a function!
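Putting the pieces together, one way to use these two variables is to call the stored function on the list of integers. The sketch below passes explicit lists to parse_args() so it can be tried without a command line; the input values are made up.

```python
import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

max_args = parser.parse_args(['1', '2', '3', '4'])
sum_args = parser.parse_args(['1', '2', '3', '4', '--sum'])

# args.accumulate is either max or sum, so we can simply call it.
print(max_args.accumulate(max_args.integers))  # 4
print(sum_args.accumulate(sum_args.integers))  # 10
```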
Using argparse
Let’s go back to our script, and replace sys with argparse.
Hands-on: Replacing argv.
Given the following script, replace the use of argv with argparse.

    import sys

    result = 0
    if len(sys.argv) == 1:
        print("no arguments were supplied")
    else:
        for arg in sys.argv[1:]:
            try:
                result += float(arg)
            except ValueError:
                print(f"Could not parse {arg}")
        print(result)

You should have one argument: numbers (type=float)
And print out the sum of those numbers.
Question: How does your final script look?

    import argparse

    parser = argparse.ArgumentParser(description='Sum some numbers')
    parser.add_argument('integers', type=float, nargs='+',
                        help='a number to sum up.')
    args = parser.parse_args()

    print(sum(args.integers))
Try running the script with various values:

    python run.py
    python run.py 1 3 5
    python run.py 2 4 O
    python run.py --help
Wow that’s a lot simpler! We have to learn how argparse is invoked, but it handles a lot of cases for us:
- No arguments provided
- Responding to --help
- Raising an error for invalid values
--help is even written for us, without us writing any special code to handle that case! This is why you need to use argparse:
- It handles a lot of cases and input validation for you
- It produces a nice --help text that can help you if you’ve forgotten what your tool does
- It’s nice for users of your scripts! They don’t have to read the code to know how it behaves if you document it well.
There is a lot of documentation in the argparse module for all sorts of use cases!
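For the curious, the cases listed above can be observed from within Python. When argparse rejects the input or prints the help text, it ends the program by raising SystemExit, so the sketch below catches that exception and reports the exit code. The helper exit_code_for and the explicit prog='run.py' are additions for this illustration only.

```python
import argparse

# prog is set explicitly so the usage text always shows run.py (an assumption
# made for this sketch; normally argparse derives it from sys.argv[0]).
parser = argparse.ArgumentParser(prog='run.py', description='Sum some numbers')
parser.add_argument('integers', type=float, nargs='+',
                    help='a number to sum up.')


def exit_code_for(argv):
    """Return the exit code argparse requests for argv, or None on success.

    exit_code_for is a hypothetical helper for this illustration only.
    """
    try:
        parser.parse_args(argv)
    except SystemExit as exit_request:
        return exit_request.code
    return None


print(exit_code_for(['1', '3', '5']))  # None: parsed successfully
print(exit_code_for([]))               # 2: no numbers were given
print(exit_code_for(['2', '4', 'O']))  # 2: the letter O is not a valid float
print(exit_code_for(['--help']))       # 0: the help text was printed
```

Alongside these lines you should also see the usage and error messages that argparse writes for the failing cases, and the generated help text for --help.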
Why Argparse?
Using argparse can be a big change to your tool, but there are some benefits to using it!
- Standardised interface to your tool that’s familiar to everyone who uses command line tools
- Automatic Help page
- Automatic Galaxy Tools?
Generating Automatic Galaxy Tools (Optional)
With the argparse2tool project, and eventually pyGalGen which will be merged into planemo, you can generate Galaxy tools automatically from argparse-based Python scripts.
Hands-on: Generate a Galaxy tool wrapper from your script
- Write out the Python script to a file named main.py:

    import argparse

    parser = argparse.ArgumentParser(description='Sum some numbers')
    parser.add_argument('integers', type=float, nargs='+',
                        help='a number to sum up.')
    args = parser.parse_args()

    print(sum(args.integers))

- Create a virtual environment, just in case:

    python -m venv .venv
    . .venv/bin/activate

- Install argparse2tool via pip:

    pip install argparse2tool

- Generate the tool interface:

  Input: Command

    PYTHONPATH=$(argparse2tool) python main.py --generate_galaxy_xml

  Output: Galaxy XML

    <tool name="main.py" id="main.py" version="1.0">
      <description>Sum some numbers</description>
      <stdio>
        <exit_code range="1:" level="fatal"/>
      </stdio>
      <version_command><![CDATA[python main.py --version]]></version_command>
      <command><![CDATA[python main.py
    #set repeat_var_1 = '" "'.join([ str($var.integers) for $var in $repeat_1 ])
    "$repeat_var_1"
    > $default]]></command>
      <inputs>
        <repeat title="repeat_title" min="1" name="repeat_1">
          <param label="a number to sum up." value="0" type="float" name="integers"/>
        </repeat>
      </inputs>
      <outputs>
        <data name="default" format="txt" hidden="false"/>
      </outputs>
      <help><![CDATA[TODO: Write help]]></help>
    </tool>
Key points
- If you are writing a command line script, no matter how small, use argparse.
- --help is even written for us, without us writing any special code to handle that case
- It handles a lot of cases and input validation for you
- It produces a nice --help text that can help you if you’ve forgotten what your tool does
- It’s nice for users of your scripts! They don’t have to read the code to know how it behaves if you document it well.
Citing this Tutorial
- Helena Rasche, 2022 Python - Argparse (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-argparse/tutorial.html Online; accessed TODAY
- Batut et al., 2018 Community-Driven Data Analysis Training for Biology. Cell Systems. 10.1016/j.cels.2018.05.012
Funding
These individuals or organisations provided funding support for the development of this resource
A number of our employees contribute directly to the Galaxy Training Network and seek to make our higher education learning materials more accessible to a wider audience through the GTN platform. avans.nl