DALiuGE

Scientific Workflow Graph Execution Framework

Memory Enabled

No need to write intermediate files...

Unlike many other workflow systems, DALiuGE lets you use memory to transfer data from one workflow application component to the next.

... at all

That even works across multiple computers in a cluster.

Flexible Workflow Execution

DALiuGE Engine

To execute a workflow, it has to be submitted to an instance of the DALiuGE engine. The system supports local and remote submissions to engines running on single machines or on SLURM or Kubernetes clusters.

Workflows can be nested

A workflow can import another workflow using a sub-workflow construct, so patterns or a complete set of reduction steps can be re-used. A sub-workflow construct can also use a submission application to launch the enclosed workflow on a remote platform, such as a SLURM or HPC cluster.

Dask or MPI Anyone?

Run whatever you like

DALiuGE does not prevent your code from using other parallelization technologies. It supports running MPI and Dask components and even has a Dask compatibility mode.

External and internal parallelism

The DALiuGE scatter and gather constructs can be seen as explicit external parallelism of the workflow graph. Components calling Dask or MPI are internally parallel, and that parallelism is not controlled by DALiuGE.
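The distinction can be illustrated with a minimal sketch in plain Python. It does not use the DALiuGE API: the names scatter, component and gather are hypothetical stand-ins, and a thread pool plays the role of the engine running the scattered branches.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(data, n_chunks):
    """Split a list into n_chunks nearly equal parts (the 'scatter' step)."""
    size, rest = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rest else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def component(chunk):
    """Hypothetical workflow component.

    Any parallelism used in here (Dask, MPI, threads) would be internal:
    the surrounding graph only sees one component per chunk.
    """
    return sum(x * x for x in chunk)


def gather(partials):
    """Combine the partial results of all branches (the 'gather' step)."""
    return sum(partials)


data = list(range(10))

# External parallelism: one branch per chunk, visible in the graph layout.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(component, scatter(data, 4)))

total = gather(partials)
```

The number of branches is decided outside the component, which is what makes this kind of parallelism visible to, and schedulable by, a workflow system.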

Reproducibility

Batteries included

The DALiuGE system automatically maintains a hash-tree to track reproducibility on various levels. This makes it possible, for instance, to verify whether multiple runs of a workflow produce the same results, whether the same software was used, or whether a workflow using alternate components still produces the same results.
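The general idea of a hash-tree can be sketched as follows. This is a generic Merkle-style construction written for illustration, not the scheme DALiuGE implements; the record layout (name, version, output digest) and the function names are assumptions.

```python
import hashlib


def h(data: bytes) -> str:
    """SHA-256 digest as a hex string."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaves):
    """Combine leaf hashes pairwise until a single root hash remains."""
    if not leaves:
        return h(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [
            h((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


def workflow_signature(components):
    """Hash each (name, version, output digest) record, then build the root."""
    leaves = [h(f"{name}:{version}:{output}".encode())
              for name, version, output in components]
    return merkle_root(leaves)


# Hypothetical runs: run_b repeats run_a, run_c uses a newer 'calibrate'.
run_a = [("load", "1.0", "abc"), ("calibrate", "2.1", "def"), ("image", "0.9", "ghi")]
run_b = list(run_a)
run_c = [("load", "1.0", "abc"), ("calibrate", "2.2", "def"), ("image", "0.9", "ghi")]

same = workflow_signature(run_a) == workflow_signature(run_b)
changed = workflow_signature(run_a) != workflow_signature(run_c)
```

Choosing which fields go into each leaf determines what a matching root proves: hashing only output digests checks that results agree, while including versions also checks that the same software was used.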

Seven reproducibility tenets

  • Rerun
  • Repeat
  • Recompute
  • Reproduce
  • Replicate (scientifically)
  • Replicate (computationally)
  • Replicate (totally)

For more information see "Formal Definition and Implementation of Reproducibility Tenets for Computational Workflows", Pritchard, N.; Wicenec, A., 2024.

Parallelism

Scatter, Gather, Loop and Branch

DALiuGE supports high-level workflow constructs to aid the construction of complex parallel workflows without cluttering the layout, while still exposing the parallelism.

Un-rolling, Partitioning and Scheduling

The DALiuGE Translator un-rolls the constructs and produces a Directed Acyclic Graph (DAG), which can easily grow to many thousands or millions of actual components if the degree of parallelism is very high. The translator can also partition and statically schedule such a DAG and map it onto a cluster of computers.
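As a rough illustration of un-rolling and scheduling, the sketch below expands a hypothetical "split, scatter(process), gather" template into a DAG and places the nodes round-robin in topological order. The node names and the placement strategy are assumptions for the example and far simpler than what the translator does; graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def unroll(n_copies):
    """Unroll the template into a DAG.

    The result maps each node name to the set of nodes it depends on.
    """
    dag = {"split": set()}
    for i in range(n_copies):
        dag[f"process_{i}"] = {"split"}
    dag["gather"] = {f"process_{i}" for i in range(n_copies)}
    return dag


def partition(dag, n_machines):
    """Naive static schedule: assign nodes to machines round-robin,
    walking the DAG in a valid execution order."""
    order = list(TopologicalSorter(dag).static_order())
    return {name: idx % n_machines for idx, name in enumerate(order)}


dag = unroll(4)                                   # 1 split + 4 process + 1 gather
order = list(TopologicalSorter(dag).static_order())
placement = partition(dag, 2)                     # node name -> machine index
```

The size of the un-rolled graph grows with the scatter width, which is why a compact template can expand into a very large DAG.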

Use Existing Code

Automatic workflow component generation

A stand-alone tool generates palettes of workflow components from any installed Python module, ready to be used in EAGLEπ to construct your workflows. Newly developed code does not need to import anything specific to DALiuGE or use any special decorators. Just write your functions, classes and methods.

Wide support

The tool traverses the module tree, inspects all sub-modules, classes and functions, and generates JSON descriptions of them. Existing code can range from plain Python to full pybind11 modules. Standard packages like NumPy, SciPy or Astropy can all be used without any code changes or adjustments. DALiuGE also supports direct C-library components.
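The introspection step can be approximated with the standard library alone. The sketch below is not the DALiuGE tool and does not reproduce its JSON schema: the describe_module function and the output fields are hypothetical, and only top-level functions are covered, not classes or sub-modules.

```python
import importlib
import inspect
import json


def describe_module(module_name):
    """Return JSON-serialisable descriptions of a module's public functions."""
    module = importlib.import_module(module_name)
    components = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        # Skip private helpers and functions imported from elsewhere.
        if name.startswith("_") or obj.__module__ != module.__name__:
            continue
        try:
            sig = inspect.signature(obj)
        except (TypeError, ValueError):
            continue  # no introspectable signature
        components.append({
            "name": f"{module_name}.{name}",
            "doc": (inspect.getdoc(obj) or "").split("\n")[0],
            "inputs": [
                {
                    "name": p.name,
                    "default": None
                    if p.default is inspect.Parameter.empty
                    else repr(p.default),
                }
                for p in sig.parameters.values()
            ],
        })
    return components


# The standard-library module 'textwrap' serves as an unmodified example.
palette = describe_module("textwrap")
palette_json = json.dumps(palette, indent=2)
```

Because everything is derived from signatures and docstrings, the inspected module needs no DALiuGE-specific imports or decorators.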

Data Activated

Data as workflow components

In a DALiuGE workflow, data is represented by workflow nodes, which are instantiated as DALiuGE data components at run-time. These components implement state machines and autonomously trigger their consumer applications, which is what gives DALiuGE its extreme scalability. Thanks to this autonomous behaviour, no central control is required to execute a workflow.
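A toy version of data-activated execution is sketched below. The classes DataNode and AppNode, their state names and their methods are hypothetical and not the DALiuGE classes; the point is only that completing a data node is what triggers its consumers, with no scheduler loop anywhere.

```python
class DataNode:
    """Hypothetical data component: INITIALIZED -> WRITING -> COMPLETED."""

    def __init__(self, name):
        self.name = name
        self.state = "INITIALIZED"
        self.payload = None
        self.consumers = []

    def write(self, payload):
        self.state = "WRITING"
        self.payload = payload
        self.state = "COMPLETED"
        # The data node itself notifies its consumers.
        for app in self.consumers:
            app.input_completed(self)


class AppNode:
    """Hypothetical application component that runs once all inputs are complete."""

    def __init__(self, func, inputs, output):
        self.func = func
        self.inputs = inputs
        self.output = output
        self.pending = len(inputs)
        for data in inputs:
            data.consumers.append(self)

    def input_completed(self, data):
        self.pending -= 1
        if self.pending == 0:
            result = self.func(*(d.payload for d in self.inputs))
            self.output.write(result)  # cascades to downstream consumers


a, b = DataNode("a"), DataNode("b")
total, doubled = DataNode("total"), DataNode("doubled")
AppNode(lambda x, y: x + y, [a, b], total)
AppNode(lambda x: 2 * x, [total], doubled)

a.write(3)
state_after_first = total.state  # still waiting for 'b'
b.write(4)                       # completes 'total', which completes 'doubled'
```

Each node only knows its direct neighbours, so execution proceeds through local events rather than through a central controller.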

I/O transparent

Data components perform the I/O on the underlying data payload, so the payload type can be exchanged without the application components needing to know about it: just change a memory component to an S3 component without any other change to the workflow. Out of the box, DALiuGE supports the most common data components, including Files, Memory, StreamingMemory, Plasma and S3.
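The following sketch shows the principle with two interchangeable payload types. MemoryData and FileData are hypothetical classes written for this example, not the DALiuGE data components; the producer and consumer never learn which one they are given.

```python
import os
import tempfile


class MemoryData:
    """Keeps the payload in process memory."""

    def __init__(self):
        self._buf = b""

    def write(self, data: bytes):
        self._buf = data

    def read(self) -> bytes:
        return self._buf


class FileData:
    """Keeps the payload in a file on disk."""

    def __init__(self, path):
        self.path = path

    def write(self, data: bytes):
        with open(self.path, "wb") as f:
            f.write(data)

    def read(self) -> bytes:
        with open(self.path, "rb") as f:
            return f.read()


def producer(out):
    """Application component: only uses the write() interface."""
    out.write(b"1,2,3")


def consumer(inp):
    """Application component: only uses the read() interface."""
    return sum(int(x) for x in inp.read().decode().split(","))


mem = MemoryData()
producer(mem)
result_mem = consumer(mem)

with tempfile.TemporaryDirectory() as tmp:
    disk = FileData(os.path.join(tmp, "payload.bin"))
    producer(disk)
    result_file = consumer(disk)
```

Swapping the storage backend changes one line of workflow setup and leaves both application functions untouched.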

EAGLEπ

Intuitive UI

Visual workflow development: no need to code. Just drag your components onto the canvas and connect them to build up your workflow logic! Since EAGLEπ does not touch or even know about the code, this can be done before the code exists, and the workflow graph can be used as part of the software design work.

Share and collaborate

Workflows can be shared between and developed by multiple people with full version control using built-in GitHub and GitLab functionality.
© Copyright 2025