4. Developing Python applications
In the section Running code on ZeroCloud we saw some
examples of running basic Python programs. This tutorial will cover Python
application development in more depth, including more detailed descriptions
of the directives in the application template (zapp.yaml). It will also
cover how to include third-party Python libraries in your zapp.
4.1. Python application template
The most convenient way to build Python applications on ZeroCloud is to use
the utilities built into the zpm tool.
To create a new application, simply run:
$ zpm new --template python
Created './zapp.yaml'
Created './.zapp'
Created './.zapp/tox.ini'
Notice that this creates two files and a directory. zapp.yaml is the
application template, and most of the time you’ll be modifying this file to
change your application config. .zapp/ is a “hidden” directory which
contains extra artifacts to assist with bundling your application, including
tox.ini, which is used to download and cache third-party (pure) Python
dependencies. For the most part, you will not need to change anything in the
.zapp/ directory directly, although some corner cases (see Exceptional cases)
require this.
The default zapp.yaml should look something like this:
# This describes the type of application. Bundling and deployment
# behavior can vary between application types.
project_type: python
# This section describes the runtime behavior of your zapp: which
# groups of nodes to create and which nexe to invoke for each.
execution:
  # Your application can consist of multiple groups. This is typically
  # used for map-reduce style jobs. This is a list of groups, so
  # remember to add "-" infront of each group name.
  groups:
    # Name of this group. This is used if you need to connect groups
    # with each other.
    - name: ""
      # The NaCl executable (nexe) to run on the nodes in this group.
      path: file://python2.7:python
      # Command line arguments for the nexe.
      args: ""
      # Input and output devices for this group.
      devices:
        - name: python2.7
        - name: stdout
# Meta-information about your zapp.
meta:
  Version: ""
  name: "myapp"
  Author-email: ""
  Summary: ""
help:
  # Short description of your zapp. This is used for auto-generated
  # help.
  description: ""
  # Help for the command line arguments. Each entry is a two-tuple
  # with an option name and an option help text.
  args: []
# Files to include in your zapp. Your can use glob patterns here, they
# will be resolved relative to the location of this file.
bundling: []
Let’s look at each section in a bit more detail:
# This describes the type of application. Bundling and deployment
# behavior can vary between application types.
project_type: python
The project_type directive indicates that this is a Python project. zpm uses
it to decide how to handle steps that are specific to the project type, such
as application bundling.
# This section describes the runtime behavior of your zapp: which
# groups of nodes to create and which nexe to invoke for each.
execution:
  # Your application can consist of multiple groups. This is typically
  # used for map-reduce style jobs. This is a list of groups, so
  # remember to add "-" infront of each group name.
  groups:
    # Name of this group. This is used if you need to connect groups
    # with each other.
    - name: ""
      # The NaCl executable (nexe) to run on the nodes in this group.
      path: file://python2.7:python
      # Command line arguments for the nexe.
      args: ""
      # Input and output devices for this group.
      devices:
        - name: python2.7
        - name: stdout
The execution section can contain one or more groups. We call them “groups”
because certain configurations can result in the creation of many ZeroVM
instances, as is the case with MapReduce applications on ZeroCloud. Each
group must define a name which is unique among all of the groups.
path defines the base image to use for execution. The format of this field
is as follows: file://<image-name>:<exe-name>
In this example, we indicate that the python2.7 base image shall be used,
and from that, we execute the python binary contained within that image.
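To make the file://<image-name>:<exe-name> format concrete, here is a minimal Python sketch that splits such a value into its two parts. The function name parse_nexe_path is hypothetical and is not part of zpm; it only illustrates the format described above.

```python
def parse_nexe_path(path):
    """Split 'file://<image-name>:<exe-name>' into (image_name, exe_name).

    Hypothetical helper for illustration only; zpm does not expose it.
    """
    prefix = "file://"
    if not path.startswith(prefix):
        raise ValueError("expected a value starting with %r: %r" % (prefix, path))
    # Everything before the first ':' is the image, the rest is the executable.
    image_name, sep, exe_name = path[len(prefix):].partition(":")
    if not sep or not image_name or not exe_name:
        raise ValueError("expected file://<image-name>:<exe-name>: %r" % path)
    return image_name, exe_name


image, exe = parse_nexe_path("file://python2.7:python")
print("image=%s exe=%s" % (image, exe))
```

For the default template, this yields the python2.7 image and the python executable.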
args is used to supply additional arguments to the python executable.
In most cases, we will simply invoke a Python script by setting args to
something like foo.py, but you can supply additional positional arguments
as well, just as if you were running a Python script from a command line
(python foo.py arg1 arg2 etc.).
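As a rough illustration, a script invoked this way can read its positional arguments from sys.argv like any other command-line Python script. The file name foo.py and the argument values are placeholders, and the sketch assumes args is set to something like "foo.py arg1 arg2".

```python
import sys


def describe_args(argv):
    """Return a summary of the script name and its positional arguments."""
    # argv[0] is the script name; everything after it is a positional argument.
    script, positional = argv[0], argv[1:]
    return "script=%s args=%s" % (script, ",".join(positional))


if __name__ == "__main__":
    print(describe_args(sys.argv))
```

Keeping the argument handling in a function that takes argv explicitly makes the script easy to exercise locally before bundling it.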
devices defines which I/O devices are to be made available to the ZeroVM
instances in this group, and how the devices should be configured. See
I/O Devices for more detail.
# Meta-information about your zapp.
meta:
  Version: ""
  name: "myapp"
  Author-email: ""
  Summary: ""
The meta section contains metadata about the application and should be
largely self-explanatory. The only required property here is name, which
is used to construct the name of the .zapp file when zpm bundle is called.
For example, if the name is foo, then zpm bundle will bundle the
application as foo.zapp.
help:
  # Short description of your zapp. This is used for auto-generated
  # help.
  description: ""
  # Help for the command line arguments. Each entry is a two-tuple
  # with an option name and an option help text.
  args: []
The help section is deprecated. You can ignore it for now.
# Files to include in your zapp. Your can use glob patterns here, they
# will be resolved relative to the location of this file.
bundling: []
The bundling section defines which files/directories within the project
directory should be included in the zapp at bundle time. You can list
individual files or entire directories, and glob patterns are resolved
relative to the location of zapp.yaml.
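To preview which files a set of bundling patterns would match, something like the following sketch can help. It is not zpm's implementation; resolve_bundling is a hypothetical helper that uses the standard glob module and assumes patterns are expanded relative to the directory containing zapp.yaml.

```python
import glob
import os


def resolve_bundling(project_dir, patterns):
    """Expand glob patterns relative to project_dir.

    Returns a sorted list of matching paths, relative to project_dir.
    Hypothetical helper; zpm's own matching rules may differ.
    """
    matches = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(project_dir, pattern)):
            matches.add(os.path.relpath(path, project_dir))
    return sorted(matches)


if __name__ == "__main__":
    # Example patterns; replace them with the entries from your zapp.yaml.
    for name in resolve_bundling(".", ["*.py", "data/*"]):
        print(name)
```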
4.2. Including third party (pure) Python dependencies
zapp.yaml has an optional directive called dependencies. In this
section you can list third-party Python dependencies, which will be fetched
from PyPI. Note that third-party Python code must be pure Python. Here are a
few examples:
dependencies: [
    "pngcanvas",
]
In this example, we declare the pngcanvas library as a dependency. This is the simplest and most typical example.
Here is a more complicated example:
dependencies: [
    "pngcanvas",
    ["glibc", "glibc", "pyglibc"],
    ["purepng", "png"],
]
In this example, we declare pngcanvas, glibc, and purepng as dependencies.
Because glibc installs both a glibc.py module and a pyglibc package, we
specify both of those in the tail of the list.
Similarly, we also want to include purepng. The difference here is that
while the package name on PyPI is purepng, the only Python module installed
is simply called png.
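One way to read the two entry forms is sketched below in Python. The helper normalize_dependency is hypothetical, and it assumes that a plain string entry means the installed module has the same name as the PyPI package, while a list entry holds the PyPI name followed by the modules/packages to include.

```python
def normalize_dependency(entry):
    """Return (pypi_name, [module_or_package_names]) for one dependencies entry.

    Hypothetical helper for illustration; it mirrors the examples in the
    text, not zpm's actual parsing code.
    """
    if isinstance(entry, str):
        # Simple form: the module name is assumed to match the PyPI name.
        return entry, [entry]
    if not entry:
        raise ValueError("empty dependency entry")
    # List form: the head is the PyPI name, the tail lists what to include.
    pypi_name, names = entry[0], list(entry[1:])
    return pypi_name, names or [pypi_name]


deps = ["pngcanvas", ["glibc", "glibc", "pyglibc"], ["purepng", "png"]]
for dep in deps:
    print(normalize_dependency(dep))
```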
If you don’t know which modules/packages to include from a given Python
package, you can either look at its setup.py (a glibc example:
https://github.com/zyga/python-glibc/blob/1097a1e5d1e243f08a4872fdb0f088c3c019bc12/setup.py#L35-36)
or look in the .zapp/.zapp/venv/lib/python2.7/site-packages directory after
you run zpm bundle (which will install and cache the dependencies you
specify). This may take some trial and error and several
zpm bundle --refresh-deps runs (see Refreshing dependencies below).
Fortunately, you won’t need to do this often.
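If inspecting the site-packages directory by hand gets tedious, a small script along these lines can list candidate names. The helper top_level_names is hypothetical; it simply treats directories containing an __init__.py as packages and *.py files as modules, which is an assumption about what you would want to list in dependencies.

```python
import os


def top_level_names(site_packages):
    """List top-level module and package names found in site_packages."""
    names = set()
    for entry in os.listdir(site_packages):
        full = os.path.join(site_packages, entry)
        if os.path.isdir(full) and os.path.exists(os.path.join(full, "__init__.py")):
            names.add(entry)        # a package directory
        elif os.path.isfile(full) and entry.endswith(".py"):
            names.add(entry[:-3])   # a single-file module
    return sorted(names)


if __name__ == "__main__":
    path = ".zapp/.zapp/venv/lib/python2.7/site-packages"
    if os.path.isdir(path):
        print("\n".join(top_level_names(path)))
```

The output may also include modules installed by the packaging tools themselves, so compare it against the package you actually added.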
Note
The dependency management feature of zpm could be the target of future
improvement. The initial implementation works for a lot of cases, but may
be inefficient for more complex corner cases and varied Python packaging
configurations. If you run into a case which doesn’t work, or otherwise
have problems with or questions about this feature, please file a bug
report.
4.2.1. Refreshing dependencies
Dependencies are cached in the .zapp/ directory so that zpm doesn’t
redundantly re-fetch dependencies each time you call zpm bundle. However,
at times you will need to add/remove dependencies, and therefore refresh the
cached Python packages. To do this, you can simply run:
$ zpm bundle --refresh-deps
This will clear the cache, re-fetch all dependencies per the dependencies
directive in the zapp.yaml, and bundle your zapp as usual.
Tip
To double-check whether a change in dependencies is reflected correctly in
your zapp, you can use tar tf <myapp>.zapp to check the contents of the
archive.
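The same check can be scripted with Python's standard tarfile module, which is handy if you want to assert on the contents in a build step. This is a minimal sketch; list_zapp is a hypothetical helper, and foo.zapp stands in for your own bundle name.

```python
import os
import tarfile


def list_zapp(path):
    """Return the member names of a .zapp archive, like `tar tf` does."""
    with tarfile.open(path) as archive:
        return archive.getnames()


if __name__ == "__main__":
    zapp = "foo.zapp"  # placeholder name; use your own bundle here
    if os.path.exists(zapp):
        for name in list_zapp(zapp):
            print(name)
```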
4.2.2. Exceptional cases
Some Python packages on PyPI may specify an external download location.
The BitVector library is one such example. This can cause problems when
zpm bundle is called. Here is an excerpt of one such error:
Downloading/unpacking BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
Could not find any downloads that satisfy the requirement BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
Some externally hosted files were ignored (use --allow-external BitVector to allow).
To work around this, modify your .zapp/tox.ini and add a custom
install_command to the [testenv:venv] section:
[tox]
toxworkdir={toxinidir}/.zapp
envlist = venv
skipsdist = true
[testenv:venv]
deps = -r{toxinidir}/deps.txt
install_command = pip install
    --allow-external BitVector
    {opts} {packages}
Similar problems can occur if the external source is unverified, which can
result in errors like the following. (Indeed, adding the
--allow-external BitVector option alone is not enough to successfully
install this specific dependency.)
Downloading/unpacking BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
Could not find any downloads that satisfy the requirement BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
Some insecure and unverifiable files were ignored (use --allow-unverified BitVector to allow).
In this case, the final step of the workaround is to edit .zapp/tox.ini
again and add the suggested --allow-unverified option to the pip install
command:
[tox]
toxworkdir={toxinidir}/.zapp
envlist = venv
skipsdist = true
[testenv:venv]
deps = -r{toxinidir}/deps.txt
install_command = pip install
    --allow-external BitVector
    --allow-unverified BitVector
    {opts} {packages}
After making these changes, run zpm bundle -r and everything should work
correctly.
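Before re-running the bundle, it can be worth confirming that the edited tox.ini parses the way you intended, since the continuation lines of install_command must be indented. The sketch below uses Python 3's standard configparser; install_command_tokens is a hypothetical helper, and this is only a sanity check, not how tox itself reads the file.

```python
import configparser


def install_command_tokens(tox_ini_text):
    """Return the install_command of [testenv:venv] as a list of tokens.

    Returns an empty list if the section or option is missing.
    """
    # Disable interpolation so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(tox_ini_text)
    if not parser.has_option("testenv:venv", "install_command"):
        return []
    # Indented continuation lines are joined with newlines; split() flattens them.
    return parser.get("testenv:venv", "install_command").split()


SAMPLE = """\
[tox]
toxworkdir={toxinidir}/.zapp
envlist = venv
skipsdist = true
[testenv:venv]
deps = -r{toxinidir}/deps.txt
install_command = pip install
    --allow-external BitVector
    --allow-unverified BitVector
    {opts} {packages}
"""

print(install_command_tokens(SAMPLE))
```

If either flag is missing from the printed list, the continuation lines were probably not indented.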