Galaxy and Celery
Contributors
Author(s) | Mira Kuntz |
Editor(s) | Helena Rasche |
last_modification Last modification: Jul 14, 2022
Can you eat it
.pull-left[ Celery is an asynchronous distributed task queue.
It consists of:
- Application that sends tasks
- to a broker with queues
- Worker(s) that execute the tasks
- A Result backend to store the task results
It’s written in Python but supports multiple other languages via webhooks. ]
.pull-right[ ]
So many features
.pull-left[
- Celeryd – the daemonized version of Celery
- CeleryBeat a scheduler for repeated tasks
- Flower, a monitoring interface for
- Showing tasks, queues and workers
- Prometheus + Grafana Integration ]
.pull-right[ ]
How Celery Can Improve Galaxy
.pull-left[ The Problem
The Galaxy Server should respond quickly to every request. Gunicorn Workers should not do everything.
The Solution
Queue asynchronous tasks with Celery (happens currently internal on the same node)
The Improvement
Send the tasks to an external Celery-Cluster ]
.pull-right[ ]
What is Celery Used For
- Recalculating Disk Usage
- Purging Datasets
- Changing Datatypes
- Preparing compressed downloads (histories, etc.)
- Creating PDFs for Galaxy workflow reports
- Cleaning up short term storage
.pull-left [
What we need
- A Galaxy Server, configured to use an external MQ and external Celery Workers
- A properly set up broker (for example the UseGalaxy.eu RabbitMQ Ansible Role)
- A Celery-Cluster on external node with the full Galaxy repo, branch -> dev, which has run an initialisation (run.sh) and run the Celery Worker above task definition folder ]
.pull-right[ ]
.pull-left[ ] .pull-right[
- Configuration in
galaxy.yml
file - Documentation primarily within codebase and developer, not admin-focused
celery_broker
can be set, but defaults toamqp_internal_connection
- Contributions are very welcome ;)
- But: Celery, Flower and RabbitMQ are well documented ]
.pull-left[ ]
.pull-right[
What’s currently easy
- A Galaxy instance
- A RabbitMQ instance
- Running Celery via Gravity ]
To Be Completed
- Securing Celery documentation for Galaxy Admins
- Running Celery on another node, outside of Gravity (maybe.)
- The correct Telegraf configuration for monitoring
- Dashboards