async-graph-data-flow#

async-graph-data-flow is a Python library for executing asynchronous functions that pass data along a directed acyclic graph (DAG).

Features#

  • Functions organized as a graph 🕸

Your asynchronous functions are nodes in the DAG. Each node yields data to its destination nodes.

  • Let data flow along the graph 🥂

It’s like how champagne flows along a champagne tower. Graph execution continues as long as there’s still data between two connected nodes.

  • Customizable start nodes 🧨

By default, graph execution begins at the nodes that have no incoming edges, but you can choose to start execution from any node or set of nodes.

  • Data flow statistics

Utilities are available to keep track of data volumes at each node and optionally log such info at a regular time interval.

  • Exception handling 💥

Choose whether an exception raised at a specific node, or at any node, halts graph execution.

  • Lightweight 🪶

The source code is only about 400 lines!

  • Single-machine usage 💻

We love Big Data™ and distributed computing, though deep down we all know that practically we accomplish a ton of work on single machines without those big guns.

  • Pure Python 🐍

The library is built on top of asyncio from the Python standard library, with no third-party dependencies.
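
To make the default start-node rule from the feature list concrete, here is a minimal sketch in plain Python. It is not the library's API: the function name `default_start_nodes` and the representation of a graph as a list of node names plus a list of `(source, destination)` pairs are assumptions made for illustration only.

```python
def default_start_nodes(nodes, edges):
    """Return the nodes that no edge points to, in their original order.

    `nodes` is a list of node names and `edges` is a list of
    (source, destination) pairs. Both are hypothetical representations
    chosen for this sketch, not the library's data model.
    """
    has_incoming = {destination for _, destination in edges}
    return [node for node in nodes if node not in has_incoming]


if __name__ == "__main__":
    nodes = ["extract", "transform", "load"]
    edges = [("extract", "transform"), ("transform", "load")]
    # Only "extract" has nothing feeding into it, so it is the start node.
    print(default_start_nodes(nodes, edges))
```

Choosing custom start nodes amounts to replacing this computed list with one you supply yourself.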

Download and Install#

pip install async-graph-data-flow

Usage#

Start with Quickstart, and then get inspired by More Examples. Don’t forget to check out the API Reference as well.

Under the Hood#

async-graph-data-flow chains asynchronous functions together by placing a Queue instance between each pair of connected functions in the graph. Each queue holds the data items yielded by its source node and feeds them into its destination node.
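
The idea can be sketched with nothing but `asyncio` from the standard library. This is an illustration of the queue-between-two-nodes pattern, not the library's actual implementation: the node functions `source` and `double`, the runner coroutines, and the `_DONE` sentinel used to signal the end of the stream are all assumptions of this sketch.

```python
import asyncio

# Sentinel marking the end of the stream (an assumption of this sketch).
_DONE = object()


async def source():
    """Source node: an async generator that yields data items."""
    for i in range(3):
        yield i


async def double(item):
    """Destination node: receives one item and yields its result."""
    yield item * 2


async def run_source(queue):
    # Put every item the source node yields onto the queue,
    # then signal that no more items are coming.
    async for item in source():
        await queue.put(item)
    await queue.put(_DONE)


async def run_destination(queue, results):
    # Pull items off the queue and feed each one into the destination node.
    while True:
        item = await queue.get()
        if item is _DONE:
            break
        async for output in double(item):
            results.append(output)


async def main():
    queue = asyncio.Queue()  # the queue sitting on the edge source -> double
    results = []
    await asyncio.gather(run_source(queue), run_destination(queue, results))
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))  # [0, 2, 4]
```

Because the two runners are scheduled concurrently, the destination can start consuming items while the source is still producing them, which is what lets data keep flowing down the graph.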

License#

BSD 3-Clause License.

Authors#

This library is authored by Samuel Asirifi, Gbolahan Okerayi, and Jackson Lee at Civis Analytics.
