# Getting Started
The Worker is a piece of Python code that runs workflows sent from AWF. You can run the worker code on any machine, whether it is a server in the cloud, an HPC cluster, or even your laptop!
💡 TIP
Setting up a worker is helpful for testing new workflows that are under development. You can run the worker code on your local machine and pick up your workflow runs for testing. You can make sure the workflow runs smoothly prior to publishing it on the main production site!
This section will help you set up your PC or laptop to temporarily act as a worker for the AWF system, which is great for local debugging of your workflow. When workflow development is done, the workflow will be pushed to production and a worker will be set up on-premise or in the cloud to run your workflow for others.
# 🔨 Installation
There are some required installations that you will need in order to run the worker code:

- Git. This can be checked by running the following command:

```sh
git --version
```

In case you do not have experience with Git and setting up repositories (or just need a refresher), Gitlab Getting Started is a good place to look.

- Anaconda/Miniconda. This can be checked by running the following command:

```sh
conda -V
```

Anaconda/Miniconda allows you to run different versions of Python and create dedicated environments to manage your Python dependencies.
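For example, you can list the environments conda already manages; a fresh install only contains the `base` environment:

```sh
conda env list   # lists all conda environments; the active one is marked with *
```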
# Clone Worker
Go to the directory you want to clone the main worker repository into. This command will clone (download) the worker:
```sh
git clone git@gitlab.arup.com:awf/awf-worker.git
```
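If you have not set up SSH keys for GitLab, cloning over HTTPS may also work. Note this is an assumption: the HTTPS URL below is inferred from the SSH address, and HTTPS access may be disabled on this GitLab instance.

```sh
git clone https://gitlab.arup.com/awf/awf-worker.git
```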
# 🔧 Worker Configuration File
You will need to configure the worker to register with AWF and describe the types of jobs it can run.
All of the worker configuration is managed through a single configuration file.
📌 IMPORTANT
The `worker/templates` directory of the repository contains a `config.example.xml` file. Rename this to `config.xml` and place it in the `worker` directory. This is your worker configuration file.
This file contains configuration settings that are necessary to get the worker running, divided into four sections:

- `<general>`
- `<authentication>`
- `<applications>`
- `<folders>`
# General Configuration
The `<general>` section contains three tags that need to be filled in:

- `<worker_id>` - A unique ID to identify the worker, given to you by an AWF admin. See Worker ID & API Key.
- `<worker_profile>` - The profile of the worker. You can call this anything, but it should match the profile in your workflow configuration.
- `<api_key>` - A special API key for your worker, given to you by an AWF admin. See Worker ID & API Key.
```xml
<general>
  <!-- Unique worker_id -->
  <worker_id></worker_id>
  <!-- The worker profile -->
  <worker_profile></worker_profile>
  <!-- Master API url -->
  <api_url>https://staging.awf.arup.com/api/</api_url>
  <!-- Disable proxy for master api domain -->
  <api_no_proxy>true</api_no_proxy>
  <!-- api key for authentication with the master -->
  <api_key></api_key>
  <!-- Maximum workflows to run simultaneously -->
  <max_workflows>1</max_workflows>
</general>
```
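For reference, a filled-in `<general>` section could look like the sketch below. The `worker_id`, `worker_profile`, and `api_key` values are hypothetical placeholders; use the values issued to you by an AWF admin (see Worker ID & API Key).

```xml
<general>
  <!-- hypothetical values, for illustration only -->
  <worker_id>1234</worker_id>
  <worker_profile>my-debug-profile</worker_profile>
  <api_url>https://staging.awf.arup.com/api/</api_url>
  <api_no_proxy>true</api_no_proxy>
  <api_key>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</api_key>
  <max_workflows>1</max_workflows>
</general>
```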
# Worker ID & API Key
💡 TIP
You will need Admin Access in order to register your worker.
This section will illustrate how to obtain the `<worker_id>` and `<api_key>` from the Admin Page, both of which are required for your worker configuration file.
1. Go to the Admin Page. On the left panel there is a Workers section; click on the ADD button.
2. The auto-generated Id at the top of the page is your worker id. This should be pasted between the `<worker_id>` tags in your worker configuration file.
3. The label field is the name of your worker. Our team likes to name AWF workers after Transformers, so it would be great if you could stick to that convention! Pick a name that has not been used before and use it to fill in the label field. The heartbeat entries are correct by default.
4. The location should be the id of your Arup computer. You can find this in the top bar if you look at "my pc" in the file explorer.
5. Finally, the worker profile is used to communicate what kind of workflows your worker can run. For debugging purposes it is best if your new workflow can only be picked up by your worker, so pick something unique, add it to the profile field, and paste it between the `<worker_profile>` tags in your `config.xml`. When setting up the workflow you'll need to fill in the same profile.
# Authentication Configuration
```xml
<!--
  AUTHENTICATIONS
  This section contains authentication information for activities such as
  Speckle api key, arup compute api key, Arup HPC service api key etc
-->
<authentications>
  <authentication>
    <name></name>
    <key></key>
  </authentication>
</authentications>
```
The authentication section is used to save authentication keys for several of the services that can be used from AWF.
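As an illustration, a filled-in entry might look like the sketch below. The service name and key are hypothetical; the exact `<name>` value expected for each service is not documented here, so check with an AWF admin.

```xml
<authentications>
  <authentication>
    <!-- hypothetical entry: an API key for Speckle -->
    <name>speckle</name>
    <key>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</key>
  </authentication>
</authentications>
```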
# Applications Configuration

The application section defines the location of an application that is used by your workflow.
```xml
<!--
  APPLICATIONS
  This section contains the applications that the worker can execute.
-->
<applications>
  <application>
    <!-- Application name -->
    <name>APPLICATION_$name_$version</name>
    <!-- Full path to application -->
    <target></target>
  </application>
</applications>
```
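For example, an entry for a locally installed application could look like the following sketch. The name and install path are hypothetical and only illustrate the `APPLICATION_$name_$version` naming pattern from the template.

```xml
<applications>
  <application>
    <!-- hypothetical example: version 10.1 of an analysis application -->
    <name>APPLICATION_gsa_10.1</name>
    <target>C:\Program Files\Oasys\GSA 10.1\GSA.exe</target>
  </application>
</applications>
```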
# Folder Configuration
```xml
<!--
  FOLDERS
  Various folder locations that are required for the worker
-->
<folders>
  <!-- Root folder where the workflows will be executed in -->
  <run>c:\awf_worker\runs</run>
  <!-- Root folder where the dump files will be placed in -->
  <dump>c:\awf_worker\dumps</dump>
  <!-- Root folder where the awf-models are located -->
  <models>c:\awf_worker\models</models>
</folders>
```
The worker needs specific folders to run. These can be configured in the `<folders>` section. To create these folders you can run the following commands:
```sh
mkdir C:\awf_worker\runs
mkdir C:\awf_worker\dumps
mkdir C:\awf_worker\models
```
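Since `mkdir` in the Windows command prompt accepts several paths at once, a single command also works:

```sh
mkdir C:\awf_worker\runs C:\awf_worker\dumps C:\awf_worker\models
```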
The paths shown are a suggestion, but you can use whatever location you like.
- The runs folder will be used to contain the data for each of the workflow runs you do. It will contain data such as the workflow input, output and work directory.
- The dumps folder contains the data dumps the worker makes in case of a failed workflow run.
- The models folder contains python repositories. When the worker runs a workflow that uses one or more python scripts it will automatically download the repository containing the required scripts.
That's it!
# PBS Configuration
For workers that run on HPC clusters with PBS installed, some additional configuration is needed to allow them to run PBS workflows:
```xml
<!-- PBS Cluster configuration described here -->
<pbs>
  <max_nodes>max number of nodes to be used by awf</max_nodes>
  <cpu_per_node>number of cpu's per compute node</cpu_per_node>
  <type>openpbs|torque</type>
</pbs>
```
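For instance, a cluster that gives AWF at most 4 compute nodes with 24 CPUs each, scheduled by OpenPBS, could be described as follows. The numbers are hypothetical and should reflect your cluster.

```xml
<pbs>
  <!-- hypothetical cluster: up to 4 nodes, 24 CPUs per node, OpenPBS scheduler -->
  <max_nodes>4</max_nodes>
  <cpu_per_node>24</cpu_per_node>
  <type>openpbs</type>
</pbs>
```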
# 👷‍♀️ Start Worker
The first step is to set up the worker conda environment and install all Python dependencies. See the Clone Worker section to download the worker code to your local machine.

The next step is to set up a dedicated Python environment using Anaconda. This is easy to do and takes the following three commands:
```sh
conda create -n awf-worker python=3.7   # Creates a python environment named awf-worker
conda activate awf-worker               # Activates environment. Use this command every time you restart
pip install -r requirements.txt         # Install python dependencies to that environment
```
That's it! If the environment is running correctly, you should see the command line is now prefixed with:
```
(awf-worker)
```
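To double-check that the environment is active, you can ask for the interpreter version; with the environment created above it should report a 3.7.x release:

```sh
python --version   # expected output similar to: Python 3.7.x
```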
You can now run your worker on your local machine, from the directory you cloned the worker into, using:
```sh
(awf-worker) python -m worker
```
If successful, you should see the following output in the command line:
```
AWF - Worker - INFO - Initializing worker
AWF - Worker - INFO - Validating worker configuration
AWF - Worker - INFO - Starting worker with profile: test and version: v1.9.0
AWF - Worker - INFO - Registering with master @ https://staging.awf.arup.com/api/
AWF - Worker.WorkflowManager - INFO - Initializing WorkflowManager
AWF - Worker - INFO - Initializing complete
```
# 🐳 Start Worker with Docker
Alternatively, you can use Docker to build the worker environment. This section assumes that you are familiar with Docker and already have Docker Desktop installed on your machine.
- The first step is to build the worker image. Simply run:
```sh
docker build --build-arg TYPE=python --build-arg ENV=staging -t awf-worker .
```
This command builds the image with two build arguments:

- `TYPE` - The type of worker to build (currently only `python` is supported, but more types will be supported in the future)
- `ENV` - The AWF environment to connect to (options are `dev`, `staging`, and `prod`)
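For example, to build an image that connects to the development environment instead, only the `ENV` argument changes (the image tag is an arbitrary choice):

```sh
docker build --build-arg TYPE=python --build-arg ENV=dev -t awf-worker-dev .
```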
📌 IMPORTANT
Building the worker with Docker will use a default Worker Configuration File located in `worker/templates/python.xml`. This is not the same configuration file that you may have edited in the Worker Configuration File section.
- The second step is to run the worker image that was built:
```sh
docker run --rm -it awf-worker
```
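If you want to manage the container from a second terminal, you can give it an explicit name. The commands below only use generic Docker flags and assume nothing about the image internals:

```sh
docker run --rm -it --name awf-worker-local awf-worker
```

From another terminal you can then follow the logs or stop the worker:

```sh
docker logs -f awf-worker-local
docker stop awf-worker-local
```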