Python Scripts vs. Modules

This tutorial shows how importing Python code that lacks an if __name__ == "__main__": guard (often called “dunder main”) causes that code to run.

What You’ll Learn

What are Python scripts?
What are Python modules?
How to use __name__ == "__main__"

What You’ll Need

Basic understanding of Python
Access to Python version 3.6+

If you have worked on an infrastructure team, you have probably run across someone who built scripts for the team to use. The scripts may have done things like verify connectivity or make a simple change that the team has to make often. The term “script” is often used casually to refer to an executable file that accomplishes a simple task. Scripts are usually named after the language they are written in: a Python script, for example, is a standalone executable file that uses Python to accomplish its task. Other common scripting languages are Bash and Perl.

Some of the attributes of a Python script are:

Standalone file (or directory)
Purpose-built for the business need
Not imported into a larger framework or library
Often executable with command-line parameters for users who don’t need to understand how it works
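As a rough illustration of the last attribute, here is a minimal sketch of a script that accepts command-line parameters through the standard-library argparse module. The --host and --port options and the describe_check() function are hypothetical names chosen for this example, and the script only builds a message rather than touching the network. The last three lines use the guard that is covered later in this tutorial.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Describe a connectivity check (illustrative only)."
    )
    # Both options are hypothetical and have defaults,
    # so the script also runs with no arguments at all.
    parser.add_argument("--host", default="localhost", help="host to check")
    parser.add_argument("--port", type=int, default=22, help="TCP port to check")
    return parser.parse_args(argv)


def describe_check(host, port):
    """Return the message the script prints; no network traffic is generated."""
    return f"Would check {host} on port {port}"


if __name__ == "__main__":
    args = parse_args()
    print(describe_check(args.host, args.port))
```

A teammate could run this with --help to see the available options without ever reading the source.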

Some of the benefits of building a Python script are:

Easier to maintain because it is focused on a business purpose and often not used by other code bases
Helps many other engineers who may not need or want to code but can use the script
Often easier to write and put into production than more comprehensive solutions

Some of the pitfalls of building a Python script are:

If the business purpose changes, they are often difficult to maintain and end up not being used anymore.
They can be difficult to reuse or extend for use cases that are adjacent or similar, depending on the design.
They encourage a small number of maintainers; if those maintainers leave the company, the script goes to waste.

Python modules are Python files that are written with a modular programming mindset. Rather than putting everything in one big file, the code is broken up into smaller and more focused files called “modules.” Technically, Python can import any .py file as a module; in this tutorial, “module” means a file that is designed to be imported. To a new Python developer, a module and a script may look alike because they are both Python files (.py) and both contain Python code to execute. A whole course could be written on Python package management and modules, but this tutorial will focus on the practical differences between modules and scripts in the life of an infrastructure engineer.

Some of the attributes of a Python module are:

Importing the module only defines its functions, classes, and constants; it does not perform any work as a side effect.
A module is not meant to be run on its own; it is imported into another executable Python file, often called “main.py.”
The code is often broken up into a series of functions or classes that have descriptive names and parameters, so they are easy to identify when imported.

Some of the benefits of building a Python module are:

Simpler to maintain and easier to delegate development to other teammates
Easier to reuse code across multiple projects (some functions or features can be designed to be independent of the use case like authentication or logging)
Testing is often easier; each module can be tested individually with mock inputs and expected outputs.
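To make the testing point concrete, here is a small sketch that uses the standard-library unittest module. The compile_data() function is a hypothetical stand-in for a function that would live in a module; in a real project, the test file would import it instead of defining it.

```python
import unittest


def compile_data(imported_data, new_data):
    """Hypothetical module function: combine two lists without modifying either."""
    return list(imported_data) + list(new_data)


class CompileDataTest(unittest.TestCase):
    def test_appends_new_data(self):
        self.assertEqual(compile_data([42, 1337], [9999]), [42, 1337, 9999])

    def test_leaves_inputs_unchanged(self):
        original = [42, 1337]
        compile_data(original, [9999])
        self.assertEqual(original, [42, 1337])


# Run the tests programmatically instead of calling unittest.main(),
# which would exit the interpreter when it finishes.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompileDataTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the function takes its inputs as parameters and returns a value, the tests need no network access or real report data.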

Some of the pitfalls of building a Python module are:

Newer developers might struggle to follow the code because it is broken up across multiple files or directories.
Introducing import statements also introduces the problem of pathing and importing the right module.
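When an import picks up the wrong file, or no file at all, it can help to ask Python where it is looking. The sketch below is one way to do that with sys.path and importlib.util.find_spec; the locate_module() helper and the name no_such_module_xyz are made up for this example.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Python searches the entries of sys.path in order; the first match wins.
for entry in sys.path[:3]:
    print("search path entry:", repr(entry))

print(locate_module("json"))                # a standard-library module
print(locate_module("no_such_module_xyz"))  # hypothetical name that does not exist
```

When a script is run directly, the first entry of sys.path is normally the directory that contains the script, which is why the examples in this tutorial keep their files in the same directory.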

Whether you are building a Python script or a Python module, it is important to know the value of using if __name__ == "__main__": in your code. First, let us look at a simple example.

First Script Example

In your text editor, open up a new file called my_script.py with the following contents:

# Build report
print("Gathering Data")
data = 42
second_data = 1337
print("Building report!")
data_report = [data, second_data]
print("Report Built.")
print("The report is ")
print(data_report)

# Send report
print("Loading Report")
print("...")
print("...")
print("Report Sent")

Try running the code above. You should get the following output:

$ python my_script.py
Gathering Data
Building report!
Report Built.
The report is
[42, 1337]
Loading Report
...
...
Report Sent
$

The code is simple and prints mock output to keep the example short. You can imagine much more complex code that accomplishes the same thing with API calls and Python’s many libraries.

When people are first learning Python, this is often how they write their code, all in one file, with sequential statements executing from top to bottom. They may use comments to indicate where a new logical piece begins, but nothing is abstracted into functions or classes because it just works.

Module Example

Now imagine that you are given the script above and asked to reuse the code in a new report-building Python package. You have multiple teams’ scripts that you need to compile into a single report.

Open a new file in the same directory as the my_script.py file, calling the new file my_module.py with the following contents:

from my_script import data, second_data

new_data = 9999
compiled_data = [data, second_data, new_data]
print("Now building Compiled Report")
print(compiled_data)
print("...")
print("Now sending Compiled Report")

Run the my_module.py module. The output should look like this:

$ python my_module.py
Gathering Data
Building report!
Report Built.
The report is
[42, 1337]
Loading Report
...
...
Report Sent
Now building Compiled Report
[42, 1337, 9999]
...
Now sending Compiled Report
$

Note that even though we imported only the two variables data and second_data, Python had to run all of my_script.py to define them. Because my_script was written as one long sequential set of instructions, the import also executed every other line of code, including the print statements.
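The same effect can be reproduced in a single file. The sketch below writes a throwaway module (the name demo_unguarded is hypothetical) into a temporary directory, imports one variable from it, and captures whatever the import printed. It also imports the module a second time to show that Python runs a module's top-level code only on the first import.

```python
import contextlib
import importlib
import io
import pathlib
import sys
import tempfile

# A throwaway module with one top-level print and one variable.
SOURCE = 'print("Gathering Data")\ndata = 42\n'

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "demo_unguarded.py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # recommended after creating modules at runtime
    try:
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            # Importing a single name still runs the whole file first.
            from demo_unguarded import data
            # A second import reuses the cached module and prints nothing more.
            importlib.import_module("demo_unguarded")
    finally:
        sys.path.remove(tmp)

print(repr(captured.getvalue()))  # 'Gathering Data\n'
print(data)                       # 42
```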

New Script with main

What my_script.py is missing is an if __name__ == "__main__": guard, with the work abstracted into functions or classes that the guarded block calls.

Create a new Python file in the same location as the other ones called my_script_with_main.py, with the following contents:

def build_report():
    print("Gathering Data")
    data = 42
    second_data = 1337
    print("Building report!")
    data_report = [data, second_data]
    print("Report Built.")
    print("The report is ")
    return data_report

def send_report():
    print("Loading Report")
    print("...")
    print("...")
    print("Report Sent")

if __name__ == "__main__":
    print(build_report())
    send_report()

Run the script and note that the output is the same as before, but now the code lives in functions, and the functions are called under if __name__ == "__main__":. With this structure, importing the file only defines build_report() and send_report(); nothing is printed or sent until a function is called.
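The guard works because Python sets the variable __name__ to "__main__" in the file that is run directly and to the module's own name in a file that is imported. The following sketch shows both cases by writing a one-line file (the name show_name.py is hypothetical) to a temporary directory and running it each way in a child Python process.

```python
import pathlib
import subprocess
import sys
import tempfile

# A one-line file that reports the name Python assigned to it.
SOURCE = 'print("__name__ is", __name__)\n'

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "show_name.py").write_text(SOURCE)

    # Case 1: run the file directly, as in "python show_name.py".
    as_script = subprocess.check_output(
        [sys.executable, "show_name.py"], cwd=tmp, universal_newlines=True
    ).strip()

    # Case 2: import the file from another program.
    as_import = subprocess.check_output(
        [sys.executable, "-c", "import show_name"], cwd=tmp, universal_newlines=True
    ).strip()

print(as_script)  # __name__ is __main__
print(as_import)  # __name__ is show_name
```

The comparison in the guard is therefore true only in the first case, which is why code placed under it is skipped on import.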

Let’s see the new structure in action. Replace the contents of my_module.py with the following, which imports from the new script:

from my_script_with_main import build_report, send_report

def compile_report():
    new_data = [9999]
    imported_data = build_report()
    compiled_data = imported_data + new_data
    print("Now building Compiled Report")
    print(compiled_data)
    print("...")
    print("Now sending Compiled Report")

if __name__ == "__main__":
    compile_report()

The output should look like this:

$ python my_module.py
Gathering Data
Building report!
Report Built.
The report is
Now building Compiled Report
[42, 1337, 9999]
...
Now sending Compiled Report
$

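Note that it all works as intended now. The first four lines of output come from the call to build_report() inside compile_report(), not from the import itself; “The report is” still appears because build_report() prints it before returning the data. No extra report is sent because the send_report() function was never called from my_module.py.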

Conclusion