4. Developing Python applications

In the section Running code on ZeroCloud we saw some examples of running basic Python programs. This tutorial will cover Python application development in more depth, including more detailed descriptions of the directives in the application template (zapp.yaml). It will also cover how to include third party Python libraries in your zapp.

4.1. Python application template

The most convenient way to build Python applications on ZeroCloud is to use the utilities built into the zpm tool.

To create a new application, simply run:

$ zpm new --template python
Created './zapp.yaml'
Created './.zapp'
Created './.zapp/tox.ini'

Notice that this creates two files and a directory. zapp.yaml is the application template, and most of the time you’ll be modifying this file to change your application’s configuration. .zapp/ is a “hidden” directory that contains extra artifacts to assist with bundling your application, including tox.ini, which is used to download and cache third-party (pure) Python dependencies. For the most part, you will not need to change anything in the .zapp/ directory directly, although some corner cases (see Exceptional cases) require this.

The default zapp.yaml should look something like this:

# This describes the type of application. Bundling and deployment
# behavior can vary between application types.
project_type: python

# This section describes the runtime behavior of your zapp: which
# groups of nodes to create and which nexe to invoke for each.
execution:

  # Your application can consist of multiple groups. This is typically
  # used for map-reduce style jobs. This is a list of groups, so
  # remember to add "-" infront of each group name.
  groups:

      # Name of this group. This is used if you need to connect groups
      # with each other.
    - name: ""

      # The NaCl executable (nexe) to run on the nodes in this group.
      path: file://python2.7:python

      # Command line arguments for the nexe.
      args: ""

      # Input and output devices for this group.
      devices:
      - name: python2.7
      - name: stdout

# Meta-information about your zapp.
meta:
  Version: ""
  name: "myapp"
  Author-email: ""
  Summary: ""

help:
  # Short description of your zapp. This is used for auto-generated
  # help.
  description: ""

  # Help for the command line arguments. Each entry is a two-tuple
  # with an option name and an option help text.
  args: []

# Files to include in your zapp. Your can use glob patterns here, they
# will be resolved relative to the location of this file.
bundling: []

Let’s look at each section in a bit more detail:

# This describes the type of application. Bundling and deployment
# behavior can vary between application types.
project_type: python

The project_type directive indicates that this is a Python project. zpm uses it to decide how to handle steps that are specific to the project type, such as application bundling.

# This section describes the runtime behavior of your zapp: which
# groups of nodes to create and which nexe to invoke for each.
execution:

  # Your application can consist of multiple groups. This is typically
  # used for map-reduce style jobs. This is a list of groups, so
  # remember to add "-" infront of each group name.
  groups:

      # Name of this group. This is used if you need to connect groups
      # with each other.
    - name: ""

      # The NaCl executable (nexe) to run on the nodes in this group.
      path: file://python2.7:python

      # Command line arguments for the nexe.
      args: ""

      # Input and output devices for this group.
      devices:
      - name: python2.7
      - name: stdout

The execution section can contain one or more groups. We call them “groups” because certain configurations can result in the creation of many ZeroVM instances, as is the case with MapReduce applications on ZeroCloud. Each group must define a name which is unique among all of the groups.
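The uniqueness rule for group names can be checked with a few lines of Python before bundling. This is only a sketch: the function name and the group names mapper and reducer are hypothetical, and the groups are assumed to be already loaded from zapp.yaml as a list of dictionaries.

```python
def find_duplicate_group_names(groups):
    """Return group names that appear more than once, in first-seen order."""
    seen = set()
    duplicates = []
    for group in groups:
        name = group["name"]
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


# Hypothetical groups, as they might look after loading zapp.yaml.
groups = [{"name": "mapper"}, {"name": "reducer"}, {"name": "mapper"}]
print(find_duplicate_group_names(groups))
```

An empty result means every group name is unique.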

path defines the base image to use for execution. The format of this field is file://<image-name>:<exe-name>. In this example, we indicate that the python2.7 base image shall be used, and from that image we execute the python binary it contains.
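To make the file://<image-name>:<exe-name> format concrete, here is a small sketch that splits such a value into its two parts. The helper name parse_nexe_path is hypothetical and is not part of zpm; it only illustrates the format described above.

```python
def parse_nexe_path(path):
    """Split 'file://<image-name>:<exe-name>' into (image_name, exe_name)."""
    prefix = "file://"
    if not path.startswith(prefix):
        raise ValueError("expected a path starting with %r: %r" % (prefix, path))
    # Everything before the first ':' is the image, the rest is the executable.
    image, sep, exe = path[len(prefix):].partition(":")
    if not sep or not image or not exe:
        raise ValueError("expected file://<image-name>:<exe-name>: %r" % path)
    return image, exe


print(parse_nexe_path("file://python2.7:python"))
```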

args is used to supply additional arguments to the python executable. In most cases, we will simply invoke a Python script by setting args to something like foo.py, but you can supply additional positional arguments as well, just as if you were running a Python script from a command line (python foo.py arg1 arg2 etc.).
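As an illustration, a script such as the hypothetical foo.py below would see those values in sys.argv. The sketch assumes that arguments reach the script exactly as they would from a regular command line, as described above; the function name describe_args is made up for this example.

```python
import sys


def describe_args(argv):
    """Return the script name and the list of positional arguments."""
    script = argv[0] if argv else ""
    positional = list(argv[1:])
    return script, positional


if __name__ == "__main__":
    script, positional = describe_args(sys.argv)
    print("script: %s" % script)
    print("arguments: %s" % ", ".join(positional))
```

With args set to "foo.py arg1 arg2", the script would be expected to report arg1 and arg2 as its positional arguments.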

devices defines which I/O devices are to be made available to the ZeroVM instances in this group, and how the devices should be configured. See I/O Devices for more detail.

# Meta-information about your zapp.
meta:
  Version: ""
  name: "myapp"
  Author-email: ""
  Summary: ""

The meta section contains metadata about the application and should be largely self-explanatory. The only required property here is name, which is used to construct the name of the .zapp file when zpm bundle is called. For example, if the name is foo, then zpm bundle will bundle the application as foo.zapp.
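The naming rule can be written down as a tiny sketch. The function bundle_filename is hypothetical and only mirrors the behavior described above; it is not taken from the zpm source.

```python
def bundle_filename(meta):
    """Derive the bundle file name from the meta section of zapp.yaml."""
    name = meta.get("name")
    if not name:
        # name is the only required property in the meta section.
        raise ValueError("meta.name is required")
    return name + ".zapp"


print(bundle_filename({"name": "foo", "Version": "", "Summary": ""}))
```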

help:
  # Short description of your zapp. This is used for auto-generated
  # help.
  description: ""

  # Help for the command line arguments. Each entry is a two-tuple
  # with an option name and an option help text.
  args: []

The help section is deprecated. You can ignore it for now.

# Files to include in your zapp. Your can use glob patterns here, they
# will be resolved relative to the location of this file.
bundling: []

The bundling section defines which files/directories within the project directory should be included in the zapp at bundle time. You can include individual files in this way, or entire directories.
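To preview which paths a set of bundling patterns would match, the following sketch resolves glob patterns relative to the project directory using Python's standard glob module. It assumes Python 3 on your development machine and is not zpm's actual implementation; the function name resolve_bundling is hypothetical, and zpm may treat directories or edge cases differently.

```python
import glob
import os


def resolve_bundling(patterns, project_dir):
    """Return paths matched by the patterns, relative to project_dir."""
    matched = []
    # Escape the directory so only the pattern itself is treated as a glob.
    base = glob.escape(project_dir)
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(base, pattern))):
            relative = os.path.relpath(path, project_dir)
            if relative not in matched:
                matched.append(relative)
    return matched
```

For example, resolve_bundling(["*.py", "data"], ".") would list every top-level .py file followed by the data directory, if they exist.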

4.2. Including third party (pure) Python dependencies

zapp.yaml has an optional directive called dependencies. In this section you can list third party Python dependencies, which will be fetched from PyPI. Note that third party Python code must be pure Python. Here are a few examples:

dependencies: [
    "pngcanvas",
]

In this example, we declare the pngcanvas library as a dependency. This is the simplest and most typical example.

Here is a more complicated example:

dependencies: [
    "pngcanvas",
    ["glibc", "glibc", "pyglibc"],
    ["purepng", "png"],
]

In this example, we declare pngcanvas, glibc, and purepng as dependencies.

Because glibc installs both a glibc.py module and a pyglibc package, we list both of those after the package name.

Similarly, we also want to include purepng. The difference here is that while the package name on PyPI is purepng, the only Python module installed is simply called png.
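The two entry forms can be summarized in a short sketch that normalizes each entry into a PyPI package name and the modules or packages to include. The function name is hypothetical, and the sketch assumes that a bare string means the module has the same name as the PyPI package, as in the pngcanvas example.

```python
def normalize_dependency(entry):
    """Return (pypi_name, [module_or_package_names]) for one entry."""
    if isinstance(entry, str):
        # Assumption: a bare string names both the package and its module.
        return entry, [entry]
    # List form: the first item is the PyPI name, the rest are modules.
    return entry[0], list(entry[1:])


dependencies = [
    "pngcanvas",
    ["glibc", "glibc", "pyglibc"],
    ["purepng", "png"],
]
for entry in dependencies:
    print(normalize_dependency(entry))
```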

If you don’t know which modules/packages to include from a given Python package, you can either look at the setup.py (a glibc example: https://github.com/zyga/python-glibc/blob/1097a1e5d1e243f08a4872fdb0f088c3c019bc12/setup.py#L35-36) or have a look at the .zapp/.zapp/venv/lib/python2.7/site-packages directory after you run zpm bundle (which will install and cache the dependencies you specify). This may take some trial and error and multiple zpm bundle --refresh-deps commands (see below). Fortunately, you won’t need to do this often.
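Inspecting the site-packages directory can also be scripted. The sketch below lists top-level modules (.py files) and packages (directories containing __init__.py); the function name is hypothetical, and the approach deliberately ignores other layouts such as namespace packages or compiled extensions.

```python
import os


def list_importable_names(site_packages):
    """List top-level module and package names found in a directory."""
    names = []
    for entry in sorted(os.listdir(site_packages)):
        full_path = os.path.join(site_packages, entry)
        if os.path.isdir(full_path):
            # Only directories with an __init__.py count as packages here.
            if os.path.isfile(os.path.join(full_path, "__init__.py")):
                names.append(entry)
        elif entry.endswith(".py"):
            names.append(entry[:-3])
    return names
```

Running it against .zapp/.zapp/venv/lib/python2.7/site-packages after zpm bundle should show candidate names to list for each dependency, alongside tooling such as pip that is installed in the same environment.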

Note

The dependency management feature of zpm could be the target of future improvement. The initial implementation works for a lot of cases, but may be inefficient for more complex corner cases and varied Python packaging configurations. If you run into a case which doesn’t work, or otherwise have problems with or questions about this feature, please file a bug report.

4.2.1. Refreshing dependencies

Dependencies are cached in the .zapp/ directory so that zpm doesn’t redundantly re-fetch dependencies each time you call zpm bundle. However, at times you will need to add/remove dependencies, and therefore refresh the cached Python packages. To do this, you can simply run:

$ zpm bundle --refresh-deps

This will clear the cache, re-fetch all dependencies per the dependencies directive in the zapp.yaml, and bundle your zapp as usual.

Tip

To double-check if a change in dependencies is reflected correctly in your zapp, you can use tar tf <myapp>.zapp to check the contents of the archive.
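The same check can be done from Python with the standard tarfile module. This sketch assumes only what the tip implies, namely that the .zapp file is a tar archive; the function name list_zapp_contents is hypothetical.

```python
import tarfile


def list_zapp_contents(zapp_path):
    """Return the member names of a .zapp archive, like `tar tf`."""
    with tarfile.open(zapp_path) as archive:
        return archive.getnames()
```

For example, "pngcanvas.py" in list_zapp_contents("myapp.zapp") is one way to confirm that a dependency made it into the bundle, assuming the module is stored under that name.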

4.2.2. Exceptional cases

Some Python packages on PyPI may specify an external download location. The BitVector library is one such example. This can cause problems when zpm bundle is called. Here is an excerpt of one such error:

 Downloading/unpacking BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
   Could not find any downloads that satisfy the requirement BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
   Some externally hosted files were ignored (use --allow-external BitVector to allow).

To work around this, modify your .zapp/tox.ini and add a custom install_command to the [testenv:venv] section:

 [tox]
 toxworkdir={toxinidir}/.zapp
 envlist = venv
 skipsdist = true

 [testenv:venv]
 deps = -r{toxinidir}/deps.txt
 install_command = pip install
                   --allow-external BitVector
                   {opts} {packages}

Similar errors can occur if the external source is unverified, which can result in errors like the following. (Indeed, adding --allow-external BitVector alone is not enough to successfully install this specific dependency.)

 Downloading/unpacking BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
   Could not find any downloads that satisfy the requirement BitVector (from -r /home/user1/projects/foo/.zapp/deps.txt (line 1))
   Some insecure and unverifiable files were ignored (use --allow-unverified BitVector to allow).

In this case, the final step of the workaround is to edit .zapp/tox.ini again and add the suggested --allow-unverified option to the pip install command:

 [tox]
 toxworkdir={toxinidir}/.zapp
 envlist = venv
 skipsdist = true

 [testenv:venv]
 deps = -r{toxinidir}/deps.txt
 install_command = pip install
                   --allow-external BitVector
                   --allow-unverified BitVector
                   {opts} {packages}

After making these changes, run zpm bundle -r and everything should work correctly.
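If you want to sanity-check the edited tox.ini before bundling, the sketch below parses it with Python's standard configparser module and returns the install_command as a list of arguments. It assumes Python 3 on your development machine; the function name is hypothetical, and the embedded text simply repeats the configuration shown above.

```python
import configparser

TOX_INI = """\
[tox]
toxworkdir={toxinidir}/.zapp
envlist = venv
skipsdist = true

[testenv:venv]
deps = -r{toxinidir}/deps.txt
install_command = pip install
                  --allow-external BitVector
                  --allow-unverified BitVector
                  {opts} {packages}
"""


def install_command_args(tox_ini_text):
    """Return the install_command of [testenv:venv] split into arguments."""
    # Interpolation is disabled so the text is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(tox_ini_text)
    return parser.get("testenv:venv", "install_command").split()


args = install_command_args(TOX_INI)
print("--allow-external" in args and "--allow-unverified" in args)
```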