Automating Work with Tapis Actors¶
Introduction to Tapis Actors¶
What is Tapis?¶
- Tapis is TACC’s Application Programming Interface
Science-as-a-Service platform
Web services that provide access to TACC resources
Can provide access to other resources
Supports common file and compute job operations
Used by TACC Portals
More information about Tapis and its current and future capabilities is available at: The Tapis Project
What is Tapis Actors?¶
Tapis Actors is a form of serverless computing like AWS Lambda, Firebase Functions, or OpenFaas, but are tailored to support the needs of research computing.
What is the use case?¶
- I need X to happen when Y occurs.
I need to load a JSON file into a database when it is uploaded to my $WORK
I need to send an email to my supervisor when this analysis detects something interesting
I need to launch the next stage in workflow when the current stage completes successfully
I need to generate and email a report when the time is 1:00 PM each day
I need to compute a value N when given an input P
The Tapis Actors API lets you deploy functions or services that accomplish these (and many other) use cases.
- Tapis Actors also lets you:
- Avoid lockin
Built as Docker containers
Underlying platform is free and open-source
- Deploy any code
Write new function code in any language
Bring in legacy software and binaries
- Run at any scale
Automatically scale up when needed
Scale to zero when not
Tapis Actors can run standalone, be composed into complex workflows, or integrated with external third-party platforms. They serve as connecting threads amongst the complex systems in our commercial and research computing ecosystems.
What is an Actor?¶
For our purposes, an Actor is a container-based function-as-a-service deployed on a software platform called Abaco, where it will follow the actor model of concurrent computation.
- In response to a message it receives, an Actor can
Make local decisions
Create more actors
Send more messages
Determine how to respond to the next message received.
Actors may modify their own private state, but can only affect each other indirectly through messaging. The actor model is characterized by concurrency of computation within and among actors, dynamic actor creation, requirement for actor addresses in messages, and interaction only through direct message passing.
In the Tapis Actors implementation, each actor registered in the system is associated with a Docker image. Actor containers are executed in response to messages posted to their inbox, which itself is given by a URI exposed via the system. In the process of executing each actor container state, logs and execution statistics are collected.
Typically, functions performed by actors are quick and require little processing power. Use cases with more substantial run times or resource requirements are usually best addressed using Tapis Apps.
How Does an Actor Work?¶
The function an actor performs is specified as the default command in a Docker container. Code for this function is written based around a handful of core assumptions:
The message is passed via an environment variable
MSG
Supplemental environment variables can be specified when the actor is deployed
Parameters can be provided alongside the message, which are passed as additional environment variables
The execution environment is read-only and unpriveleged
Inbound network connections are disallowed
Outbound network connections are unrestricted
The execution environment is destroyed when the function has completed
STDERR
andSTDOUT
are captured by the Abaco platform for later review
Depending on the configuration of your specific Tapis Actors tenant, the following
additional assumptions may apply. For the tacc.prod
tenant running at TACC, they
absolutely do.
A Tapis access token is available via an environment variable
The TACC
$WORK
filesystem is mounted and writeable at/work
Code will run as your TACC-default user and group ID
Workflow¶
The workflow for bworking with Actors will be covered in detail in this tutorial, but briefly, is as follows:
Write code and package into a Docker container
Push the container to a public container registry (DockerHub)
Register an actor to use the container
Send a message to the actor
Verify execution by inspecting the logs
(Optional) Update container or actor
(Optional) Share the actor with other users
(Optional) Delete the actor
Learn More¶
For a full reference guide to actors, see the Tapis Actors online documentation.
Set Up Your Environment¶
Building, deploying, managing, and using Tapis Actors is comprehensively supported via the Tapis CLI and a local installation of Docker. In this section, we will cover preparation of your working environment.
Prerequisites¶
You will need the following to work with Tapis Actors:
A terminal emulator and/or SSH client
Local Development Environment¶
Many people develop and use Actors right on their local laptop computer using the Tapis CLI and Docker Desktop
Note
You can also install the latest Tapis CLI from source but we recommend using the version available on PyPi.
Configure the Tapis CLI¶
Tapis CLI must be configured to talk to the Tapis APIs and DockerHub. The
auth init
command is an interactive workflow that guides you through that
process.
$ tapis auth init --interactive
Configure Tapis API access:
===========================
+---------------+--------------------------------------+----------------------------------------+
| Name | Description | URL |
+---------------+--------------------------------------+----------------------------------------+
| 3dem | 3dem Tenant | https://api.3dem.org/ |
| a2cps | Acute to Chronic Pain Signatures | https://api.a2cps.org/ |
| bridge | Bridge | https://api.bridge.tacc.cloud/ |
| designsafe | DesignSafe | https://agave.designsafe-ci.org/ |
| iplantc.org | CyVerse Science APIs | https://agave.iplantc.org/ |
| irec | iReceptor | https://irec.tenants.prod.tacc.cloud/ |
| portals | Portals Tenant | https://portals-api.tacc.utexas.edu/ |
| sd2e | SD2E Tenant | https://api.sd2e.org/ |
| sgci | Science Gateways Community Institute | https://sgci.tacc.cloud/ |
| tacc.prod | TACC | https://api.tacc.utexas.edu/ |
| vdjserver.org | VDJ Server | https://vdj-agave-api.tacc.utexas.edu/ |
+---------------+--------------------------------------+----------------------------------------+
Enter a tenant name [tacc.prod]: tacc.prod
Verify SSL connections [Y/n]: Y
tacc.prod username: vaughn
tacc.prod password for vaughn: <password>
Container registry access:
--------------------------
Registry Url [https://index.docker.io]:
Registry Username [mwvaughn]: mwvaughn
Registry Password []: <password>
Registry Namespace [sd2e]: mwvaughn
+--------------------+----------------------------------+
| Field | Value |
+--------------------+----------------------------------+
| tenant_id | tacc.prod |
| username | vaughn |
| api_key | k_Mt8PGe_e1T4fSFpnvfSkVOyIQa |
| access_token | cbbb521f1f4df4ad278d6dbf30168812 |
| expires_at | Wed Aug 25 13:21:00 2021 |
| verify | True |
| registry_url | https://index.docker.io |
| registry_username | mwvaughn |
| registry_password | p******d |
| registry_namespace | mwvaughn |
+--------------------+----------------------------------+
Now, confirm that Tapis CLI is properly configured by retrieving your
user data from the tapis profiles
service.
$ tapis profiles show me
+--------------+------------------------+
| Field | Value |
+--------------+------------------------+
| first_name | Matthew |
| last_name | Vaughn |
| email | vaughn@tacc.utexas.edu |
| mobile_phone | |
| phone | |
| username | vaughn |
+--------------+------------------------+
Using a VM¶
If you find that your local system does not support the Tapis CLI or Docker, it is possible to use a virtual machine. Please do feel free to reach out to us at TACC for assistance.
Build Hello World Actor¶
Let us build our hello-world actor!
Create a New Actor¶
The function of an actor is exposed as the default command in a Docker container. Here, we will create an actor from an existing Docker container image called tacc/hello-world:1.0 available on Docker Hub. The default command for this container simply prints the message “Hello, World” or the message sent to it, which will be captured in the actor logs.
Create the actor as:
$ tapis actors create --repo tacc/hello-world:1.0 \
-n hello-world-actor \
-d "Test actor that says Hello, World"
+----------------+-----------------------------+
| Field | Value |
+----------------+-----------------------------+
| id | NN5N0kGDvZQpA |
| name | hello-world-actor |
| owner | taccuser |
| image | tacc/hello-world:1.0 |
| lastUpdateTime | 2021-07-14T22:25:06.171534 |
| status | SUBMITTED |
| cronOn | False |
+----------------+-----------------------------+
The --repo
flag points to the Docker Hub repo on which this actor is based,
the -n
flag and -d
flag attach a human-readable name and description to
the actor.
The resulting actor is assigned an id: NN5N0kGDvZQpA
. The actor id can be
queried by:
$ tapis actors show NN5N0kGDvZQpA
+-----------------+-----------------------------------+
| Field | Value |
+-----------------+-----------------------------------+
| id | NN5N0kGDvZQpA |
| name | hello-world-actor |
| description | Test actor that says Hello, World |
| owner | taccuser |
| image | tacc/hello-world:1.0 |
| createTime | 2021-09-21T20:05:10.738Z |
| lastUpdateTime | 2021-09-21T20:05:10.738Z |
| gid | 859336 |
| link | |
| privileged | False |
| queue | default |
| stateless | True |
| status | READY |
| statusMessage | |
| token | True |
| uid | 859336 |
| useContainerUid | False |
| webhook | |
| cronOn | False |
| cronSchedule | None |
+-----------------+-----------------------------------+
Above, you can see the plain text name, description that were passed on the command line. In addition, you can see the “status” of the actor is “READY”, meaning it is ready to receive and act on messages. Finally, you can list all actors visible to you with:
$ tapis actors list
+---------------+-------------------+----------+-----------------------------+----------------------------+--------+-------+
| id | name | owner | image | lastUpdateTime | status | cronOn|
+---------------+-------------------+----------+-----------------------------+----------------------------+--------+-------+
| NN5N0kGDvZQpA | hello-word-actor | taccuser | tacc/hello-world:1.0 | 2021-07-14T22:25:06.171Z | READY | False |
+---------------+-------------------+----------+-----------------------------+----------------------------+--------+-------+
Submit a Message to the Actor¶
Next, let’s craft a simple message to send to the reactor. Messages can be plain text or in JSON format. When using the python actor libraries as in the example above, JSON-formatted messages are made available as python dictionaries.
# Submit the message to the actor
$ tapis actors submit -m "Hello, World" NN5N0kGDvZQpA
+-------------+---------------+
| Field | Value |
+-------------+---------------+
| executionId | N4xQ5WM5Np1X0 |
| msg | Hello, World |
+-------------+---------------+
The id of the actor (N4xQ5WM5Np1X0
) was used on the command line to specify
which actor should receive the message. In response, an “execution id”
(N4xQ5WM5Np1X0
) is returned. An execution is a specific instance of an actor.
List all the executions for a given actor as:
$ tapis actors execs list NN5N0kGDvZQpA
+---------------+----------+
| executionId | status |
+---------------+----------+
| N4xQ5WM5Np1X0 | COMPLETE |
+---------------+----------+
Show detailed information for the execution with:
$ tapis actors execs show -v NN5N0kGDvZQpA N4xQ5WM5Np1X0
{
"actorId": "NN5N0kGDvZQpA",
"apiServer": "https://api.tacc.utexas.edu",
"cpu": 121748743,
"exitCode": 0,
"finalState": {
"Dead": false,
"Error": "",
"ExitCode": 0,
"FinishedAt": "2021-07-14T22:32:45.602Z",
"OOMKilled": false,
"Paused": false,
"Pid": 0,
"Restarting": false,
"Running": false,
"StartedAt": "2021-07-14T22:32:45.223Z",
"Status": "exited"
},
"id": "N4xQ5WM5Np1X0",
"io": 176,
"messageReceivedTime": "2021-07-14T22:32:37.051Z",
"runtime": 1,
"startTime": "2021-07-14T22:32:44.752Z",
"status": "COMPLETE",
"workerId": "JABKl4BeDwXJD",
"_links": {
"logs": "https://api.tacc.utexas.edu/actors/v2/NN5N0kGDvZQpA/executions/N4xQ5WM5Np1X0/logs",
"owner": "https://api.tacc.utexas.edu/profiles/v2/sgopal",
"self": "https://api.tacc.utexas.edu/actors/v2/NN5N0kGDvZQpA/executions/N4xQ5WM5Np1X0"
}
}
We can see here that the above execution has already completed.
Check the Logs for an Execution¶
An execution’s logs will contain whatever was printed to STDOUT / STDERR by the actor. In our demo actor, we just expect the actor to print the message passed to it.
$ tapis actors execs logs NN5N0kGDvZQpA N4xQ5WM5Np1X0
Logs for execution N4xQ5WM5Np1X0
Actor received message: Hello, World
In a normal scenario, the actor would then act on the contents of a message to, e.g., kick off a job, perform some data management, send messages to other actors, or more.
Run Synchronously¶
The previous message submission (with tapis actors submit
) was an
asynchronous run, meaning the command prompt detached from the process after
it was submitted to the actor. In that case, it was up to us to check the execution
to see if it had completed and manually print the logs.
There is also a mode to run actors synchronously using tapis actors run
,
meaning the command line stays attached to the process awaiting a response after
sending a message to the actor.
Delete and Update an Actor¶
Actors can be deleted with the following:
$ tapis actors delete NN5N0kGDvZQpA
+----------+-------------------+
| Field | Value |
+----------+-------------------+
| deleted | ['NN5N0kGDvZQpA'] |
| messages | [] |
+----------+-------------------+
This will delete the actor and any associated executions.
Actors can also be updated with the tapis actors update
command to make changes once created.
Need help? Ask your questions using the [TACCSTER 2021 Slack Workspace] using the #tutorial-automating-work-at-tacc-with-tapis-actors channel.
Slackbot Actor¶
In this section, we will create an actor that sends a message to a Slack Channel using a webhook provided by Slack. A pre-requisite requirement is an active webhook for your Slack workspace.
Writing Customized Actors¶
Recall that Tapis Actors are essentially bits of code wrapped in lightweight environments called containers.
When we deployed the hello-world
Actor in the previous section, we used a pre-built Docker container (tacc/hello-world:1.0
).
However, a more common use case is to write custom code that runs whenever the Actor executes. To do this, we must first build a
Docker image using our custom actor.py
, push the Docker image to a public image registry (DockerHub), and instruct Tapis to
use our custom image when executing the container.
If you are interested in learning more about Docker containers, please attend our deep dive Docker container session tomorrow (Friday Sept. 24th), where we will cover custom image creation, common use cases, and more. The documentation for this session can be found here .
Initialize a New Actor slackbot-actor
¶
$ mkdir slackbot_actor
$ cd slackbot_actor
$ touch Dockerfile actor.py
$ find -L .
.
./Dockerfile
./actor.py
Edit the Actor Source Code in actor.py
¶
An example of a functional actor that says sends a slack message is:
# actor.py
"""Forward message from Actor inbox to Slack"""
from agavepy.actors import get_context
import requests
import os
import simplejson as json
def post_to_slack(message: str):
"""Forward string type `message` to Slack channel"""
webhook_url = os.environ.get('SLACK_WEBHOOK')
print("Actor sending message to Slack: {0}".format(message))
response_from_slack = requests.post(
webhook_url, data=json.dumps({'text': message}),
headers={"Content-type": "application/json"})
print("Response from Slack:")
print(response_from_slack)
def main():
"""Main entrypoint"""
context = get_context()
message = context['raw_message']
post_to_slack(message)
if __name__ == '__main__':
main()
Here inject the necessary environment variable SLACK_WEBHOOK
set from tapis actors create
command.
How do we get the webhook into our Actor? We don’t. Instead of embedding it in the underlying container, it is defined as an environment variable in
the Actor. We must set SLACK_WEBHOOK
while creating the actor. This can be done using Tapis CLI.
In the tapis actors create
command, we can pass it as an environment variables via the -e flag.
Write the Dockerfile¶
The first step to building our customized Actor is to build a customized Docker image.
To do this, we write a Dockerfile, which is
a list of instructions for building our customized environment.
The below Dockerfile constructs an image containing a Python3 runtime
environment, Python package dependencies installed by pip, and our actor.py
script.
The pip-installed requirements are agavepy, requests
simplejson python libraries, which are
available through
PyPi.
# pull base image, which provides a Python3 runtime
FROM python:3.6
# install package dependencies using pip
RUN pip3 install agavepy simplejson requests
# add our custom Python script
ADD actor.py /actor.py
# command to run the python script
CMD ["python", "/actor.py"]
Build Docker Container¶
We can now use our Dockerfile to build a custom Docker image:
Note
In the below command, make sure to replace taccuser
with your DockerHub username.
# Build and tag the image
$ docker build -t taccuser/slackbot-actor:1.1 .
Sending build context to Docker daemon 4.096kB
Step 1/5 : FROM python:3.6
...
Successfully built b0a76425e8b3
Successfully tagged taccuser/slackbot-actor:1.1
# Push the tagged image to Docker Hub
$ docker push taccuser/slackbot-actor:1.1
The push refers to repository [docker.io/taccuser/slackbot-actor]
...
1.1: digest: sha256:67cc6f6f00589d9ae83b99d779e4893a25e103d07e4f660c14d9a0ee06a9ddaf size: 1995
Create the Actor¶
We pass the SLACK_WEBHOOK
as an environment variable during the time of actor creation.
$ tapis actors create --repo taccuser/slackbot-actor:1.1 \
-n slackbot-actor \
-d "Send a message containing text to Slack channel" \
-e SLACK_WEBHOOK="https://hooks.slack.com/services/${XXXsecretXtokenXXX}"
+----------------+----------------------------+
| Field | Value |
+----------------+----------------------------+
| id | ww15Ex5oLxJ6b |
| name | slackbot-actor |
| owner | taccuser |
| image | taccuser/slackbot-actor:1.1|
| lastUpdateTime | 2021-08-24T14:31:58.248860 |
| status | SUBMITTED |
+----------------+----------------------------+
$ tapis actors show ww15Ex5oLxJ6b
+-----------------+-------------------------------------------------+
| Field | Value |
+-----------------+-------------------------------------------------+
| id | ww15Ex5oLxJ6b |
| name | slackbot-actor |
| description | Send a message containing text to Slack channel |
| owner | taccuser |
| image | taccuser/slackbot-actor:1.1 |
| createTime | 2021-09-21T20:27:05.613Z |
| lastUpdateTime | 2021-09-21T20:27:05.613Z |
| gid | 859336 |
| link | |
| privileged | False |
| queue | default |
| stateless | True |
| status | READY |
| statusMessage | |
| token | True |
| uid | 859336 |
| useContainerUid | False |
| webhook | |
| cronOn | False |
| cronSchedule | None |
+-----------------+-------------------------------------------------+
we can see the “status” of the actor is “READY”, meaning it is ready to receive and act on messages.
Finally, you can list all actors visible to you with:
$ tapis actors list
+---------------+---------------+----------+-----------------------------+----------------------------+--------+
| ww15Ex5oLxJ6b | slackbot-actor| taccuser | taccuser/slackbot-actor:1.1 | 2021-08-25T14:04:42.819Z | READY |
+---------------+---------------+----------+-----------------------------+----------------------------+--------+
Submit a Message to the Actor¶
# Submit the message to the actor
$ tapis actors submit -m "Hello, Slack!" ww15Ex5oLxJ6b
+-------------+---------------+
| Field | Value |
+-------------+---------------+
| executionId | EjO6yw03GKRmR |
| msg | Hello, Slack |
+-------------+---------------+
Let us grab the executionId from here to track the progress of the actor.
List Executions of Actor¶
The above execution has already completed. Show detailed information for the execution with:
$ tapis actors execs show ww15Ex5oLxJ6b EjO6yw03GKRmR
+-----------+-----------------------------+
| Field | Value |
+-----------+-----------------------------+
| actorId | ww15Ex5oLxJ6b |
| apiServer | https://api.tacc.utexas.edu |
| id | EjO6yw03GKRmR |
| status | COMPLETE |
| workerId | EbQByMAXeMVPa |
+-----------+-----------------------------+
Check the Logs for an Execution¶
In our slackbot-actor, we expect the actor to print the message passed to it and notify on the slack channel.
$ tapis actors execs logs ww15Ex5oLxJ6b EjO6yw03GKRmR
Logs for execution EjO6yw03GKRmR
Actor sending message to Slack: Hello, Slack!
Finally check your Slack channel to find your message!
Send a Message Between Actors¶
Introduction¶
While standalone Actors are useful, one can also network multiple Actors to generate more complex workflows. Actors have the ability to send messages to other Actors, allowing developers to chain together workflow steps, each contained in a separate Actor.
In this section of the tutorial, we will deploy a simple
upstream-messenger
Actor that sends a message to the
hello-world-actor
that we deployed in the previous section. We will
demonstrate that the downstream hello-world-actor
runs after it receives
the message from upstream-messenger
.
Create a New Actor Named upstream-messenger
¶
First, let’s write the code that our new Actor will run on execution. Instead of manually writing these files, we can simply adapt one of the provided Actor templates. To view all available Actor templates, issue:
$ tapis actors init -L
+--------------------+--------------------+------------------------------------------------------------+----------+
| id | name | description | level |
+--------------------+--------------------+------------------------------------------------------------+----------+
| default | Default | Basic code and configuration skeleton | beginner |
| echo | Echo | Echo message | beginner |
| hello_world | Hello World | Say Hello, World! | beginner |
| sd2e_base | sd2e_base | Default reactor context for docker://sd2e/reactors:python3 | beginner |
| tacc_reactors_base | tacc_reactors_base | Default actor context for docker://sd2e/reactors:python3 | beginner |
+--------------------+--------------------+------------------------------------------------------------+----------+
Each template is a project directory for a different type of Actor. For
this Actor, let’s use the default
template:
$ tapis actors init --template default --actor-name upstream-messenger
+-------+---------------------------------------------------------------+
| stage | message |
+-------+---------------------------------------------------------------+
| setup | Project path: ./upstream_messenger |
| setup | CookieCutter variable name=upstream-messenger |
| setup | CookieCutter variable project_slug=upstream_messenger |
| setup | CookieCutter variable docker_namespace=taccuser |
| setup | CookieCutter variable docker_registry=https://index.docker.io |
| clone | Project path: ./upstream_messenger |
+-------+---------------------------------------------------------------+
$ cd upstream_messenger
$ find -L .
.
./requirements.txt
./Dockerfile
./project.ini
./message.jsonschema
./default.py
./.gitignore
./secrets.jsonsample
./config.yml
We see that the tapis actors init
command has initialized an Actor
project directory for us, and it already contains many files that we
could have written by hand such as the Dockerfile
or the Python
source code in default.py
.
The only file that was not provided by the template is secrets.json
.
Let’s make an empty one now:
echo '{}' > secrets.json
Edit Actor Source¶
The Actor we just created doesn’t do much; it just says “hello world,”
like the hello-world-actor
we deployed previously. Let’s change its
behavior so it does something more interesting, like message another
Actor. We will use the Python API command
sendMessage
to implement this. Using your favorite text editor, edit the
default.py
script so it looks like:
import os
from agavepy.actors import get_context, get_client
def main():
"""Main entrypoint"""
context = get_context()
m = context['raw_message']
print("Actor received message: {}".format(m))
# Get an active Tapis client
client = get_client()
# Pull in the downstream Actor ID from the environment
downstream_actor_id = context['DOWNSTREAM_ACTOR_ID']
# alternatively:
# downstream_actor_id = os.environ['DOWNSTREAM_ACTOR_ID']
# Using our Tapis client, send a message to the downstream Actor
message = 'greetings, hello-world-actor!'
print("Sending message '{}' to {}".format(message, downstream_actor_id))
response = client.actors.sendMessage(actorId=downstream_actor_id, body={"message": message})
print("Successfully triggered execution '{}' on actor '{}'".format(response['executionId'], downstream_actor_id))
if __name__ == '__main__':
main()
All we’ve done is add a block of code that calls the Tapis/Agave API so that it sends a message to another Actor. Notice that we are mimicking the CLI workflow from before:
Action |
CLI |
Python API |
---|---|---|
Get an authenticated
Tapis client
|
tapis auth init
|
client = get_client()
|
Using the client,
send message to an
Actor
|
tapis actors submit
|
client.actors.sendMessage()
|
Using the client,
submit HPC job to
Tapis Application
|
tapis jobs submit
|
client.jobs.submit()
|
In fact, the CLI is making the same calls to the Python API under the hood!
Notice that we haven’t actually defined which Actor ID we want to
send the message to. Per best practice, we’ve chosen not to “hard code”
the Actor ID into default.py
, but rather read it from the Actor
environment, which we access via context['DOWNSTREAM_ACTOR_ID']
or
alternatively os.environ['DOWNSTREAM_ACTOR_ID']
. To set the
DOWNSTREAM_ACTOR_ID
, we need only define it in the Actor environment
when we deploy in the next step. The downstream Actor is the
hello-world-actor
we deployed previously, and we can retrieve its ID
using the CLI:
$ tapis actors list
+---------------+--------------------+-------+-------------------------------+--------------------------+--------+--------+
| id | name | owner | image | lastUpdateTime | status | cronOn |
+---------------+--------------------+-------+-------------------------------+--------------------------+--------+--------+
| MqqbarbazBB8x | hello-world-actor | eho | tacc/hello-world:latest | 2021-08-24T19:13:44.036Z | READY | False |
+---------------+--------------------+-------+-------------------------------+--------------------------+--------+--------+
We will need this Actor ID (MqqbarbazBB8x
in my case, yours will be
different) when we deploy in the next section.
Deploy Actor¶
Our new upstream-messenger
Actor is now ready to deploy. Just like
before, we want to:
Build the Docker image
Push the Docker image
Register the Docker image as a new Actor
Remember to replace the DOWNSTREAM_ACTOR_ID
with the appropriate
Actor ID from above, and the placeholder taccuser
with your
DockerHub username.
$ docker build -t taccuser/upstream-messenger:0.0.1 .
$ docker push taccuser/upstream-messenger:0.0.1
$ tapis actors create --repo taccuser/upstream-messenger:0.0.1 \
-n upstream-messenger \
-d "Sends message to another actor" \
-e DOWNSTREAM_ACTOR_ID=MqqbarbazBB8x
+----------------+-----------------------------------+
| Field | Value |
+----------------+-----------------------------------+
| id | MDfoobar7AOwx |
| name | upstream-messenger |
| owner | taccuser |
| image | taccuser/upstream-messenger:0.0.1 |
| lastUpdateTime | 2021-08-26T20:33:20.320620 |
| status | SUBMITTED |
| cronOn | False |
+----------------+-----------------------------------+
If deployment was successful, we should now see our new Actor:
$ tapis actors list
+---------------+--------------------+-------+-----------------------------------+--------------------------+--------+--------+
| id | name | owner | image | lastUpdateTime | status | cronOn |
+---------------+--------------------+-------+-----------------------------------+--------------------------+--------+--------+
| MqqbarbazBB8x | hello-world-actor | eho | tacc/hello-world:latest | 2021-08-24T19:13:44.036Z | READY | False |
| MDfoobar7AOwx | upstream-messenger | eho | taccuser/upstream-messenger:0.0.1 | 2021-08-24T20:23:07.619Z | READY | False |
+---------------+--------------------+-------+-----------------------------------+--------------------------+--------+--------+
$ tapis actors show -v MDfoobar7AOwx
{
"id": "MDfoobar7AOwx",
"name": "upstream-messenger",
"description": "Sends message to another actor",
"owner": "eho",
"image": "enho/upstream-messenger:0.0.1",
"createTime": "2021-09-21T20:35:33.39Z0",
"lastUpdateTime": "2021-09-21T20:35:33.39Z0",
"defaultEnvironment": {
"DOWNSTREAM_ACTOR_ID": "MqqbarbazBB8x"
},
"gid": 859336,
"hints": [],
"link": "",
"mounts": [],
"privileged": false,
"queue": "default",
"stateless": true,
"status": "READY",
"statusMessage": " ",
"token": true,
"uid": 859336,
"useContainerUid": false,
"webhook": "",
"cronOn": false,
"cronSchedule": null,
"cronNextEx": null,
"_links": {
"executions": "https://api.tacc.utexas.edu/actors/v2/MDfoobar7AOwx/executions",
"owner": "https://api.tacc.utexas.edu/profiles/v2/eho",
"self": "https://api.tacc.utexas.edu/actors/v2/MDfoobar7AOwx"
}
}
Send Message to upstream-messenger
Using CLI¶
Once the upsteam_messenger
Actor is READY, we can trigger a new
execution by sending it a message:
$ tapis actors submit -m 'hello, upstream-messenger!' MDfoobar7AOwx
+-------------+----------------------------+
| Field | Value |
+-------------+----------------------------+
| executionId | MDanexec7AOwx |
| msg | hello, upstream-messenger! |
+-------------+----------------------------+
As usual, we check the status of the execution, and show the logs when it finishes:
$ tapis actors execs show MDfoobar7AOwx MDanexec7AOwx
+-----------+-----------------------------+
| Field | Value |
+-----------+-----------------------------+
| actorId | MDfoobar7AOwx |
| apiServer | https://api.tacc.utexas.edu |
| id | MDanexec7AOwx |
| status | COMPLETE |
| workerId | wZvworker1KmQ |
+-----------+-----------------------------+
$ tapis actors execs logs MDfoobar7AOwx MDanexec7AOwx
Actor received message: hello, upstream-messenger!
Sending message 'greetings, hello-world-actor!' to MqqbarbazBB8x
Successfully triggered execution '5P7foobarrrA6' on actor 'MqqbarbazBB8x'
Check Execution of Downstream hello-world-actor
¶
The goal of this tutorial was to send a message to
upstream-messenger
and have it trigger an execution on
hello-world-actor
. Let’s check the status of the execution and inspect
the logs:
$ tapis actors execs show MqqbarbazBB8x 5P7foobarrrA6
+-----------+-----------------------------+
| Field | Value |
+-----------+-----------------------------+
| actorId | MqqbarbazBB8x |
| apiServer | https://api.tacc.utexas.edu |
| id | 5P7foobarrrA6 |
| status | COMPLETE |
| workerId | DJPworkerzKlN |
+-----------+-----------------------------+
$ tapis actors execs logs MqqbarbazBB8x 5P7foobarrrA6
Logs for execution 5P7foobarrrA6
Actor received message: hello, hello-world-actor!
Conclusion¶
Congratulations! We have successfully deployed a workflow that sends a message between two Actors. Of course, real-world multi-Actor workflows will send much more useful information than “hello, world.” In practice, messages contain file paths, names of analyses to run, and other metadata. It is also possible for one Actor to send messages to multiple other Actors, allowing for a single action such as a file upload to trigger many downstream processes, such as file management, running analyses, logging, and more.
Deploying a Sequencing Pipeline¶
Introduction¶
Thus far, we have demonstrated several useful features of actors:
Deployment, execution, and handling using Tapis CLI
Logging to 3rd party services (Slack in our example)
Inter-Actor messaging
In this section of the tutorial, we will demonstrate this toolbox in a realistic example: a basic pipeline for validating DNA sequencing data.

We will leverage a few additional Actor capabilities beyond what we have covered previously:
- Actors can interact with TACC filesystems such as
/work
using the Tapis API We won’t cover this in depth today, but if you’re interested in Tapis Files, please consult the Tapis documentation
- Actors can interact with TACC filesystems such as
Actors can submit HPC jobs, by submitting to Tapis Applications
Reusing as much of our previous work as possible, pipeline deployment will consist of the following steps:
Deploy a Tapis App
eho-fastqc-0.11.9
that runs the FastQC tool as HPC jobDeploy new actor
fastqc-launcher
Change
upstream-messenger
so it uploads a fastq data file from the web to TACC Stockyard filesystemRe-deploy
upstream-messenger
so it sends the path to data file to thefastqc-launcher
Since deploying Tapis Apps is outside the scope of this tutorial, I have already completed the first step. If you are interested in learning how to deploy Tapis Applications, I encourage you to consult one of our in-depth Tapis tutorials .
Deploy a New Actor fastqc-launcher
¶
Here, we will create an Actor that launches an submits an HPC job to a Tapis App. Recall that Tapis Apps are wrappers around computationally intensive processes. They are similar to Actors in that they can be managed using the Tapis API, but unlike Actors, they are HPC jobs under the hood.
Similar to the upstream-messenger
, this new Actor (which we will call fastqc-launcher
) leverages the active Tapis client
to interact with the Tapis ecosystem. In this case, we will submit to a
Tapis Application instead of messaging another Actor. We instantiate a new
Actor as before:
tapis actors init --template default --actor-name fastqc-launcher
cd fastqc_launcher
echo '{}' > secrets.json
We edit the Actor source code in default.py
so it resembles:
import os
from agavepy.actors import get_context, get_client
def main():
context = get_context()
fastq_uri = context['raw_message']
print("Actor received message: {}".format(fastq_uri))
# Usually, one would perform some input validation before submitting
# a job to a Tapis App. Here, we simply validate that the path looks
# like a Tapis/Agave URI
assert fastq_uri.startswith('agave://')
# Get an active Tapis client
client = get_client()
# Using our Tapis client, submit a job to Tapis App eho-fastqc-0.11.9
body = {
"name": "fastqc-test",
"appId": "eho-fastqc-0.11.9",
"archive": False,
"inputs": {
"fastq": "agave://eho.work.storage/{}".format(os.path.basename(fastq_uri))
}
}
response = client.jobs.submit(body=body)
print("Successfully submitted job {} to Tapis App {}".format(response['id'], response['appId']))
if __name__ == '__main__':
main()
We can deploy this new Actor as usual, by building, pushing, and registering the custom Docker image as a new Actor:
$ docker build -t taccuser/fastqc-launcher:0.0.1 .
$ docker push taccuser/fastqc-launcher:0.0.1
$ tapis actors create --repo taccuser/fastqc-launcher:0.0.1 \
-n fastqc-launcher \
-d "Submits job to FastQC Tapis App"
Edit upstream-messenger
Source¶
Using your favorite text editor, edit the default.py
for upstream-messenger
so it looks like:
import os
from agavepy.actors import get_context, get_client
import requests
def main():
"""Main entrypoint"""
context = get_context()
m = context['raw_message']
print("Actor received message: {}".format(m))
# Get an active Tapis client
client = get_client()
# Pull in the downstream Actor ID from the environment
downstream_actor_id = context['DOWNSTREAM_ACTOR_ID']
# alternatively:
# downstream_actor_id = os.environ['DOWNSTREAM_ACTOR_ID']
# Using our Tapis client,
# upload our fastq file to TACC Stockyard using
url = "https://raw.githubusercontent.com/eho-tacc/fastqc_app/main/tests/data_R1_001.fastq"
systemId = 'eho.work.storage'
files_resp = client.files.importData(
fileName='example.fastq',
filePath='/',
systemId=systemId, urlToIngest=url)
# Using our Tapis client, send the message containing file path
# to the downstream Actor
message = "agave://{}/{}".format(systemId, files_resp['path'])
print("Sending message '{}' to {}".format(message, downstream_actor_id))
response = client.actors.sendMessage(actorId=downstream_actor_id, body={"message": message})
print("Successfully triggered execution '{}' on actor '{}'".format(response['executionId'], downstream_actor_id))
if __name__ == '__main__':
main()
Re-deploy Actor upstream-messenger
¶
Our Actor upstream-messenger
is still configured to send messages to hello-world-actor
.
We would instead like it to send messages to our new actor fastqc-launcher
, so we must
update it with a new DOWNSTREAM_ACTOR_ID
. Instead of deleting and deploying a new
Actor, we can instead:
Build and push an updated Docker image
Update the
DOWNSTREAM_ACTOR_ID
variable usingtapis actors update
$ docker build -t enho/upstream-messenger:0.0.2 .
$ docker push enho/upstream-messenger:0.0.2
$ tapis actors update --repo taccuser/upstream-messenger:0.0.2 \
-e DOWNSTREAM_ACTOR_ID=$FASTQC_LAUNCHER_ID \
MDfoobar7AOwx
+----------------+-----------------------------------+
| Field | Value |
+----------------+-----------------------------------+
| id | MDfoobar7AOwx |
| name | upstream-messenger |
| owner | taccuser |
| image | taccuser/upstream-messenger:0.0.2 |
| lastUpdateTime | 2021-08-26T20:33:20.320620 |
| status | SUBMITTED |
| cronOn | False |
+----------------+-----------------------------------+
Test the Pipeline¶
Send Message to upstream-messenger
Using CLI¶
Once the upsteam_messenger
Actor is READY, we can trigger a new
execution by sending it a message:
$ tapis actors submit -m 'hello, FastQC pipeline!' MDfoobar7AOwx
+-------------+----------------------------+
| Field | Value |
+-------------+----------------------------+
| executionId | MDanexec7AOwx |
| msg | hello, FastQC pipeline! |
+-------------+----------------------------+
As usual, we check the status of the execution, and show the logs when it finishes:
$ tapis actors execs show MDfoobar7AOwx MDanexec7AOwx
+-----------+-----------------------------+
| Field | Value |
+-----------+-----------------------------+
| actorId | MDfoobar7AOwx |
| apiServer | https://api.tacc.utexas.edu |
| id | MDanexec7AOwx |
| status | COMPLETE |
| workerId | wZvworker1KmQ |
+-----------+-----------------------------+
$ tapis actors execs logs MDfoobar7AOwx MDanexec7AOwx
Actor received message: hello, FastQC pipeline!
Sending message 'greetings, hello-world-actor!' to MqqbarbazBB8x
Successfully triggered execution '5P7foobarrrA6' on actor 'MqqbarbazBB8x'
Check File Upload¶
Using the Tapis CLI, we can check that the upstream-messenger
created the expected file:
$ tapis files show agave://eho.work.storage/example.fastq
+--------------+-------------------------------+
| Field | Value |
+--------------+-------------------------------+
| name | example.fastq |
| path | /work/06634/eho/example.fastq |
| lastModified | 20 seconds ago |
| length | 64431 |
| permissions | READ_WRITE |
| mimeType | application/octet-stream |
| type | file |
+--------------+-------------------------------+
Check Execution of Downstream fastqc-launcher
¶
Let’s check the status of the execution and inspect the logs:
$ tapis actors execs logs MqqbarbazBB8x wKoAJD5NykAKN
Logs for execution wKoAJD5NykAKN
Actor received message: agave://eho.work.storage/example.fastq
Successfully submitted job 6c9d5842-XXXX-XXXX-XXXX-f07b1f73b948-007 to Tapis App eho-fastqc-0.11.9
Check Job Submitted to FastQC Tapis App¶
Finally, let’s check that our test job using the FastQC Tapis App successfully finished. We can check the status of this job using the CLI:
$ tapis jobs show 6c9d5842-XXXX-XXXX-XXXX-f07b1f73b948-007
+--------------------+---------------------------------------------------------------------------------+
| Field | Value |
+--------------------+---------------------------------------------------------------------------------+
| accepted | 2021-09-22T21:39:06.147Z |
| appId | eho-fastqc-0.11.9 |
| appUuid | 4765625137153839596-XXXXXXXX-0001-005 |
| archive | False |
| archivePath | eho/archive/jobs/job-XXXXXXXX-XXXX-XXXX-afbf-f07b1f73b948-007 |
| archiveSystem | None |
| blockedCount | 0 |
| created | 2021-09-22T21:39:06.152Z |
| ended | 4 minutes ago |
| failedStatusChecks | 0 |
| id | 6c9d5842-3066-XXXX-XXXX-f07b1f73b948-007 |
| lastStatusCheck | 4 minutes ago |
| lastStatusMessage | Transitioning from status CLEANING_UP to FINISHED in phase ARCHIVING. |
| lastUpdated | 2021-09-22T21:40:52.194Z |
| maxHours | 0.5 |
| memoryPerNode | 1.0 |
| name | fastqc-test |
| nodeCount | 1 |
| owner | eho |
| processorsPerNode | 1 |
| remoteEnded | 4 minutes ago |
| remoteJobId | 8493758 |
| remoteOutcome | FINISHED |
| remoteQueue | normal |
| remoteStarted | 2021-09-22T21:39:42.702Z |
| remoteStatusChecks | 2 |
| remoteSubmitted | 5 minutes ago |
| schedulerJobId | None |
| status | FINISHED |
| submitRetries | 0 |
| systemId | eho.stampede2.execution |
| tenantId | tacc.prod |
| tenantQueue | aloe.jobq.tacc.prod.submit.DefaultQueue |
| visible | True |
| workPath | /scratch/06634/eho/eho/job-XXXXXXXX-XXXX-XXXX-XXXX-f07b1f73b948-007-fastqc-test |
+--------------------+---------------------------------------------------------------------------------+
We can also download and inspect the job outputs:
$ tapis jobs outputs download 6c9d5842-XXXX-XXXX-XXXX-f07b1f73b948-007
+-------------+-------+
| Field | Value |
+-------------+-------+
| downloaded | 11 |
| skipped | 0 |
| messages | 5 |
| elapsed_sec | 59 |
+-------------+-------+
$ cd 6c9d5842-3066-XXXX-XXXX-f07b1f73b948-007
$ cat ./*.err
# ...
Started analysis of example.fastq
$ cat ./*.out
Analysis complete for example.fastq