Show how importing code without dunder main causes it to run.
__name__ == "__main__"
If you have worked on an infrastructure team, you have probably run across someone who built scripts for the team to use. The scripts may have done things like verify connectivity or make some simple change that the team is required to make often. The term “script” is often used casually to refer to an executable file that accomplishes a simple task. Scripts are usually named after the programming language used to write them: a Python script, for example, is a standalone executable file that uses Python to accomplish its task. Other common scripting languages are Bash and Perl.
Some of the attributes of a Python script are:
Some of the benefits of building a Python script are:
Some of the pitfalls of building a Python script are:
Python modules are Python files that are written with a modular programming mindset. Rather than putting everything in one big file, the files are broken up into smaller and more focused files called “modules.” To a new Python developer, both a module and a script may look alike because they are both Python files (.py) and have Python code to execute. A whole course could be written on Python package management and modules, but this tutorial will focus on the practical pieces with respect to modules versus scripts in the life of an infrastructure engineer.
Some of the attributes of a Python module are:
Some of the benefits of building a Python module are:
Some of the pitfalls of building a Python module are:
Whether you are building a Python script or a Python module, it is important to know the value of using if __name__ == "__main__": in your code. First, let us look at a simple example.
In your text editor, open up a new file called my_script.py with the following contents:
# Build report
print("Gathering Data")
data = 42
second_data = 1337
print("Building report!")
data_report = [data, second_data]
print("Report Built.")
print("The report is ")
print(data_report)
# Send report
print("Loading Report")
print("...")
print("...")
print("Report Sent")
Try running the code above. You should get the following output:
$ python my_script.py
Gathering Data
Building report!
Report Built.
The report is
[42, 1337]
Loading Report
...
...
Report Sent
$
The code is simple and prints mock output for the sake of illustration. You can imagine much more complex code that accomplishes the same thing with API calls and Python’s many libraries.
When people are first learning Python, this is often how they write their code, all in one file, with sequential statements executing from top to bottom. They may use comments to indicate where a new logical piece begins, but nothing is abstracted into functions or classes because it just works.
Now imagine that you are given the script above and asked to reuse the code in a new report-building Python package. You have multiple teams’ scripts that you need to compile into a single report.
Open a new file in the same directory as my_script.py, call it my_module.py, and give it the following contents:
from my_script import data, second_data
new_data = 9999
compiled_data = [data, second_data, new_data]
print("Now building Compiled Report")
print(compiled_data)
print("...")
print("Now sending Compiled Report")
Run the my_module.py module. The output should look like this:
$ python my_module.py
Gathering Data
Building report!
Report Built.
The report is
[42, 1337]
Loading Report
...
...
Report Sent
Now building Compiled Report
[42, 1337, 9999]
...
Now sending Compiled Report
$
Note that even though we were merely importing the two variables data and second_data, the import also executed every other line of my_script.py, including the print statements. This happens because Python runs all of a module's top-level statements the first time the module is imported, and my_script.py was written as one long sequential set of instructions.
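The behavior described above can be reproduced in a self-contained way. The following is a minimal sketch, not part of the tutorial's files: it writes a small throwaway module (the name demo_unguarded_script and the helper import_and_capture are hypothetical) to a temporary directory, imports it, and captures whatever the import printed.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module body, used only for this demonstration.
MODULE_SOURCE = 'print("Gathering Data")\ndata = 42\nprint("Report Sent")\n'


def import_and_capture(module_name, source):
    """Write source to a temporary module, import it, and return (module, printed_text)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        Path(tmp_dir, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()  # make sure the new file is visible to the import system
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                module = importlib.import_module(module_name)
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(tmp_dir)
            sys.modules.pop(module_name, None)
    return module, captured.getvalue()


module, printed = import_and_capture("demo_unguarded_script", MODULE_SOURCE)
print(repr(printed))   # text printed as a side effect of the import
print(module.data)     # the variable we actually wanted
```

The import yields the variable we asked for, but the print calls at the top level of the module run as well, which is the same side effect seen with my_script.py.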
One important thing missing from my_script.py is the if __name__ == "__main__": check, along with abstracting the execution into classes or functions.
Create a new Python file called my_script_with_main.py in the same location as the other ones, with the following contents:
def build_report():
    print("Gathering Data")
    data = 42
    second_data = 1337
    print("Building report!")
    data_report = [data, second_data]
    print("Report Built.")
    print("The report is ")
    return data_report


def send_report():
    print("Loading Report")
    print("...")
    print("...")
    print("Report Sent")


# Runs only when this file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(build_report())
    send_report()
Run the script and note that the output is the same as before, but now the code lives inside functions, and those functions are called under the if __name__ == "__main__": check. With this new structure, the script no longer executes all of its lines when it is imported; importing it only defines the functions.
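The reason the check works is that Python sets the __name__ variable to "__main__" when a file is run directly, and to the module's own name when the file is imported. Here is a minimal sketch of that difference. It uses the standard library's runpy.run_path to execute the same source under two different names; this only approximates a real import, and the file name demo_guarded_script and the helper run_with_name are hypothetical.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

# Hypothetical script body, used only for this demonstration.
SCRIPT_SOURCE = '''
def build_report():
    return [42, 1337]

print("__name__ is", __name__)

if __name__ == "__main__":
    print("guarded block ran")
'''


def run_with_name(source, run_name):
    """Execute source with __name__ set to run_name and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        script_path = Path(tmp_dir, "demo_guarded_script.py")
        script_path.write_text(source)
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            runpy.run_path(str(script_path), run_name=run_name)
    return captured.getvalue()


# Simulates "python demo_guarded_script.py".
as_script = run_with_name(SCRIPT_SOURCE, "__main__")
# Simulates the value __name__ would have under "import demo_guarded_script".
as_import = run_with_name(SCRIPT_SOURCE, "demo_guarded_script")

print(as_script)
print(as_import)
```

In the first run the guarded block executes; in the second it is skipped, and only the unguarded top-level print runs.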
Let’s see it in action with the new structure. Update my_module.py so that it imports from the new script and wraps its own logic in a function:
from my_script_with_main import build_report, send_report


def compile_report():
    new_data = [9999]
    imported_data = build_report()
    compiled_data = imported_data + new_data
    print("Now building Compiled Report")
    print(compiled_data)
    print("...")
    print("Now sending Compiled Report")


if __name__ == "__main__":
    compile_report()
The output should look like this:
$ python my_module.py
Gathering Data
Building report!
Report Built.
The report is
Now building Compiled Report
[42, 1337, 9999]
...
Now sending Compiled Report
$
Note that it all works as intended now. There is no extra report sent because the send_report() function was never called from my_module.py.
The main takeaway from this series of examples is the difference between a script and a module, and the importance of using if __name__ == "__main__": even if you don’t think your script needs it. You never know whether someone will later want to import your code and reuse pieces of it, even if it was not originally designed to be imported as a module.
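As a starting point for new scripts, one common layout is sketched below. It is only one possible arrangement: the main() function is a hypothetical entry point introduced here for illustration, and build_report() is reduced to the mock data used in this tutorial.

```python
def build_report():
    """Return the mock report data used throughout this tutorial."""
    return [42, 1337]


def main():
    """Hypothetical entry point holding everything the script does when run directly."""
    report = build_report()
    print(report)
    return report


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

With this layout, another file can import build_report() or main() and call them on its own terms, and nothing executes at import time beyond the function definitions.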