A Quick Guide to Developing Steps for Relay

Expanding the Relay Ecosystem

Relay has a substantial library of external services and tools — as of March 2021 there are 60 integrations in our Github organization. Each integration repo can contain multiple triggers, containers that receive webhook payloads from other services, and steps, which Relay executes to get stuff done in your workflow. Some quick work with fd (see our post on CLI design for more on the awesome Rust *nix tool replacements!) shows this means there are about 150 different “atoms” of functionality in Relay’s current ecosystem, from creating JIRA issues to sending MS Teams notifications.

But what if that’s not enough? What if there’s a new service that’s not currently supported or a new API for an existing service that’s key to the workflow you’re trying to build? This post will walk you through the process of developing a new step, with an emphasis on using the same tooling and workflow that the Relay team uses. We’ll focus on step containers here since most triggers can use the generic webhook receiver in our standard library; if you need a custom trigger, check out the full developer documentation.

Laziness is a Virtue

Before you start, it’s worth asking whether you need to do anything at all. Laziness, after all, is one of the three main virtues of a programmer, according to Perl’s Larry Wall (the other two: impatience and hubris). Laziness in this case means:

  • Asking around to see if anyone else is working on a similar step - maybe you can combine forces! Check on the Puppet Community Slack and Relay Github issues. Feel free to file a new issue with your request if you don’t see a pre-existing one; it’s a great way to coordinate work and make your plans more widely visible.
  • Even if the answer’s no, there’s one more escape hatch for the lazy. The relaysh/core step containers accept an inputFile parameter, so you can pass in a shell or Python script. The integration documentation goes into this method in more detail. It might not be suitable if you have complex requirements, but if you can get started with a minimum of effort, it could save you time and energy.

Diving in

If laziness fails, it’s time to dive in. The steps we’ll follow are:

  • Setting up - creating the working repository to contain your code and associated metadata
  • Developing and testing - getting your code working, first in isolation and then as part of a workflow
  • Publishing - making your step available to the world

Fair warning: this section will get pretty far into the deep end of the techie pool, assuming you’re familiar with git/Github, Dockerfiles, and running containers.

Setting up

For existing integrations which you want to extend with a new step, for example to perform an additional action against a new API endpoint, the starting point is to fork-and-clone the existing repo from the relay-integrations namespace on Github. For new integrations, use the relay-integrations/template as your starting point — you can use the Clone this repo as template feature in Github to make this super easy.

In either case, make a new subdirectory under steps/ to contain your code. Our naming convention is that a step should be named noun-verb, like action-create. The step name maps to the name of the container image: for example, the step in the Bolt integration that runs a plan is built as the image bolt-step-plan-run. The initial contents can come from the step template directory, and should look like this:

├── steps               # subdirectory for containerized steps
│  └── noun-verb        # rename this to your own step's name
│     ├── README.md     # detail about how to use this step
│     ├── Dockerfile    # needed to build the container
│     ├── step.sh       # entrypoint (plus any additional code)
│     └── step.yaml     # step metadata
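
To make the naming convention concrete, here is a minimal sketch that derives an image name from an integration name and a step directory name. The helper step_image_name is hypothetical (it is not part of any Relay tooling), and the relaysh namespace is assumed from the image names used in this post.

```python
def step_image_name(integration: str, step: str, namespace: str = "relaysh") -> str:
    """Build the image name for a step directory such as steps/plan-run.

    Hypothetical helper: it only encodes the <integration>-step-<noun-verb>
    pattern described in the text.
    """
    for part in (integration, step):
        if not part or part != part.lower() or " " in part:
            raise ValueError(f"expected a lowercase, space-free name, got {part!r}")
    if "-" not in step:
        raise ValueError(f"step names follow the noun-verb pattern, got {step!r}")
    return f"{namespace}/{integration}-step-{step}"


print(step_image_name("bolt", "plan-run"))  # relaysh/bolt-step-plan-run
```

A check like this could run in a pre-commit hook to catch directory names that would not map cleanly onto an image name.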

Developing and testing

The code you’re writing will act as the entrypoint to a container that the Relay service executes as part of a workflow. As such, there are really two parts to the task: getting variable data from the workflow, and then using that data to do whatever you’re trying to accomplish in the step.

Relay has an internal API called the metadata service, which presents a REST endpoint to step containers. It serves up dynamic information from workflow steps’ spec sections and receives outputs from steps as well as events from trigger containers. If you’re working in shell, use the ni utility, which should be automatically available on your $PATH. We also have SDKs available for Python and Go.

In order to shorten the development loop for writing, testing, and debugging step code, the relay CLI tool has a built-in version of the metadata service that you can run locally. This will allow your step entrypoint code to run unmodified from the shell, rather than requiring a container build, push, and execute cycle up to the Relay service.

See the developer documentation for the details of the YAML mock-up data. After you’ve constructed the YAML input, start up the metadata service with:

relay dev metadata --input test-metadata.yaml --run 1 --step first-step

The last line of output contains an environment variable that you’ll need to copy-and-paste into the shell where you’re debugging your entrypoint:

No command was supplied, awaiting requests. Set environment with:
export METADATA_API_URL='http://:VeRyLoNgJWT@[::]:59025'

This is the environment variable that the Python and Go SDKs, as well as the ni shell tool, use to communicate with the metadata service in Relay; setting it locally will point your code at the mock service. Running your code should result in HTTP GETs against the /spec path for each parameter and HTTP PUTs against the /output path if the step sets output variables.
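
As a rough illustration of what a client does with that variable, the sketch below uses only the Python standard library. It makes two assumptions. First, that the URL carries the token in the password position (http://:TOKEN@HOST:PORT). Second, that the service accepts the token as HTTP Basic credentials, which is how common HTTP clients treat credentials embedded in a URL. The helpers parse_metadata_url, build_request, and fetch_spec are hypothetical names, not part of the Relay SDKs, and the shape of the JSON returned by /spec is not assumed here. For real steps, the SDKs or ni are the better choice.

```python
import base64
import json
import os
import urllib.request
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_metadata_url(url: str) -> Tuple[str, str]:
    """Split METADATA_API_URL into a credential-free base URL and the token."""
    parts = urlsplit(url)
    host = parts.hostname or "localhost"
    if ":" in host:  # IPv6 literals such as [::] need their brackets back
        host = f"[{host}]"
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}", parts.password or ""


def build_request(base: str, token: str, path: str, method: str = "GET",
                  body: Optional[bytes] = None) -> urllib.request.Request:
    """Prepare a request that sends the token as the Basic-auth password.

    Assumption: empty username plus token as password, mirroring the URL form.
    """
    credentials = base64.b64encode(f":{token}".encode()).decode()
    return urllib.request.Request(
        base + path, data=body, method=method,
        headers={"Authorization": f"Basic {credentials}"},
    )


def fetch_spec(base: str, token: str):
    """GET the /spec path from the (mock or real) metadata service."""
    with urllib.request.urlopen(build_request(base, token, "/spec")) as resp:
        return json.load(resp)


# Only talk to the service when the variable has actually been exported.
if __name__ == "__main__" and "METADATA_API_URL" in os.environ:
    base, token = parse_metadata_url(os.environ["METADATA_API_URL"])
    print(fetch_spec(base, token))
```

The same build_request helper could prepare the PUT for an output, but the exact output path and body format are best taken from the requests the mock service logs when an SDK-based step runs against it.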

Once you’re successfully interacting with the metadata service, the rest is… kind of up to you, really! There are some great examples of step code in the relay-integrations repos; I’m especially proud of the ones we’ve built for change events to PagerDuty (Python) and running kubectl from Relay.

As you’re developing your code, keep in mind that the step metadata should stay in sync with it. In particular, it’s important to add a JSON schema for the spec section of workflows that use your step. This schema documents what’s effectively the “API” for your container, because it defines the names and data types of the inputs that it accepts. The Relay service uses this metadata to validate workflows and build the Library UI for users, so an incomplete or inaccurate schema can prevent people from making use of your hard work!
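
To give a feel for what such a schema expresses, here is a minimal sketch for a hypothetical step that takes an API key, an incident ID, and a message. The field names and descriptions are illustrative, and check_spec is a hand-rolled helper that understands only the few keywords used here; it is not a full JSON Schema validator, and it says nothing about how Relay itself loads or validates schemas.

```python
import json

# Hypothetical spec schema; field names are illustrative assumptions.
SPEC_SCHEMA = {
    "type": "object",
    "properties": {
        "apiKey": {"type": "string", "description": "API key for the service"},
        "incidentID": {"type": "integer", "description": "Incident to update"},
        "message": {"type": "string", "description": "Text to post"},
    },
    "required": ["apiKey", "incidentID", "message"],
    "additionalProperties": False,
}

_PYTHON_TYPES = {"string": str, "integer": int}


def check_spec(spec: dict, schema: dict = SPEC_SCHEMA) -> list:
    """Return a list of problems; handles only the keywords used above."""
    problems = [f"missing required field: {name}"
                for name in schema["required"] if name not in spec]
    for name, value in spec.items():
        prop = schema["properties"].get(name)
        if prop is None:
            problems.append(f"unexpected field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly
        elif not isinstance(value, _PYTHON_TYPES[prop["type"]]) or isinstance(value, bool):
            problems.append(f"{name} should be of type {prop['type']}")
    return problems


print(json.dumps(SPEC_SCHEMA, indent=2))
print(check_spec({"apiKey": "abc", "incidentID": "1"}))
# ['missing required field: message', 'incidentID should be of type integer']
```

Writing the schema next to the code that reads the spec makes it easier to notice when a new parameter is added to one but not the other.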

Publishing

Now that your step code is running on your laptop, the next step is getting it into production. A simple Dockerfile that sets your code as the entrypoint could look like this one, which updates a FireHydrant.io incident timeline:

FROM relaysh/core:latest-python
COPY timeline-update.py /relay/timeline-update.py
CMD ["python3", "/relay/timeline-update.py"]

If docker build . works, and a local run of the resulting container works when you point it at the same relay dev metadata service, it’s probably time to publish. Steps in the relay-integrations Github organization use Dockerhub autobuilds to build and push new container images when commits land in the Github repo, so if you’ve forked an existing integration, just send in a pull request and we’ll get your new step added. For completely new integrations, you’ll need to push your image to a public container registry. Dockerhub is our default, but Relay workflows can pull from any registry as long as the image: field in the workflow includes its URL.

Once your image is published, try it out in a minimalist workflow on the Relay service. Create a new workflow with only your new step and only the parameters your step expects. For the FireHydrant.io example earlier, this could look like:

description: Minimal example to update a FireHydrant incident timeline
parameters:
  incidentID:
    description: "The numerical FH incident ID to update (default: 1)"
    default: 1
steps:
- name: timeline-update
  image: relaysh/firehydrant-step-timeline-update
  spec:
    apiKey: !Secret apiKey
    incidentID: !Parameter incidentID
    message: "Relay ran a workflow and fixed the problem"

In order for this to run successfully, Relay will prompt you to fill in the Secret value for the API key, which will need to come from the FireHydrant app:

Relay prompting for a missing secret

Once you enter the API key value, the workflow should run successfully:

Relay running a single update step successfully

If you need to debug further, note that Relay saves any output from STDOUT or STDERR to the logs that are visible in the workflow run, so feel free to go wild with the print statements! On the back end, our engineers can also trace step execution from container pull to user execution, so if you get stuck, don’t hesitate to ping us on Slack.
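
A minimal sketch of debug output for a Python step follows, assuming only that anything written to STDOUT or STDERR ends up in the run logs. The helpers make_logger and redact are hypothetical, and the apiKey field name is just an example. Masking secrets before printing the spec is a precaution worth taking, since those logs are visible to anyone who can view the run.

```python
import logging
import sys


def make_logger(name: str = "step", stream=None) -> logging.Logger:
    """Create a logger that writes timestamped lines to stderr by default."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate lines if called twice
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def redact(spec: dict, secret_keys=("apiKey",)) -> dict:
    """Copy the spec with secret values masked so they never reach the logs."""
    return {k: ("***" if k in secret_keys else v) for k, v in spec.items()}


log = make_logger()
log.debug("spec received: %s", redact({"apiKey": "abc123", "incidentID": 1}))
# flush=True so the line shows up promptly even if the container exits early
print("timeline updated", flush=True)
```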

The final step is to let us know that your new step is working so we can add it to the Library on the website and the app itself. If the step does what it says on the tin and its metadata is set up correctly, this is a pretty lightweight process: just file a pull request against the repository for existing integrations or a top-level Github issue for new integrations and we’ll turn ourselves inside out to get things working.

Conclusion

If you’ve made it this far - congratulations! And profuse thanks from the Relay team. Nurturing an ecosystem is a delicate task, and our goal is to make working with Relay as simple and rewarding as possible. We built Relay because we think it’s worth the effort it takes to make automation more accessible. Spreading knowledge that once was tribal, democratizing data that used to be hidden - these are goals that not only feel right, they produce better outcomes for the business.