What is AB-QSSPN

Agent-Based Quasi-Steady State Petri Nets (AB-QSSPN) is a tool for the simulation of living cells in spatial structures. Each cell is represented by a Quasi-Steady State Petri Net, which integrates a dynamic regulatory network, expressed as a Petri net, with a Genome Scale Metabolic Network (GSMN); linear programming is used to explore the steady-state metabolic flux distributions in the whole-cell model.

The combination of a Petri net and a GSMN has already been used in simulations of single cells, but we present an extension of the model and an architecture for simulating populations of millions of interacting cells organized in spatial structures, which can be used to model tumour growth or the formation of tuberculosis lesions. The crucial element of this model is the ability of adjacent cells to communicate by sharing tokens in certain places, which represents cells communicating by producing and detecting substances such as cytokines and chemokines in their environment.

To make the simulation of such a huge model possible, we use the Spark framework and organize the computation in an agent-based, “think like a vertex” fashion, as in Pregel-like systems. Within the cluster we introduce a special kind of per-node caching to speed up the computation of the steady-state metabolic fluxes.

We demonstrate the capability of our tool by simulating the communication of liver cells through the FGF19 cytokine during the homeostatic response to a cholesterol burst. Our approach can be used for mechanistic modelling of the emergence of multicellular system behaviour from the interaction between genome and environment.

The current distribution is available at ab-qsspn.v2.0.0.tgz.

Our experiments

AB-QSSPN can be run on different infrastructures, ranging from expensive high-performance clusters to commodity hardware with relatively low specifications. We have carried out our experiments with AB-QSSPN in two different environments.

  • Clusters of various sizes hosted in a cloud environment. We have successfully run simulations on clusters of 17 and 33 computers rented from the Microsoft Azure cloud. We used Standard_D2_v2 instances with 2 cores and 7 GB of RAM per node.
  • A cluster composed of commodity hardware. We built our cluster out of 30 assorted machines with fairly low specifications (2 to 4 GB of RAM) and completed a 200-step simulation of a system with one million cells (a 100x100x100 cube).

We demonstrate the capability of our improved architecture by simulating the communication of liver cells through the FGF19 cytokine during the homeostatic response to a cholesterol burst. The model has 130 places and 163 transitions in the regulatory network part (see the visualisation of the regulatory net and the Snoopy file from which it was generated). The Snoopy file contains annotations and can be transformed with a Python script into the qsspn file that our tool uses together with the linear model of whole-cell metabolism, which in our case contains 2539 reactions over 777 metabolites.

Running Inside a Docker Container

Instead of installing all the required libraries on your own, feel free to use the abqsspn/abqsspn-singlenode:v2 Docker image. It contains a single-node installation of HDFS and Spark with all the necessary libraries and requires minimal setup. If you have never used Docker before, we strongly encourage you to spend a few minutes learning the basics from the Docker docs.

Pull the ab-qsspn image with the following command: docker pull abqsspn/abqsspn-singlenode:v2. When the download is complete, start the container with: docker run -it --rm -p 8080 -p 8081 -p 4040 abqsspn/abqsspn-singlenode:v2 /bin/bash.
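The -p flags above publish the container ports on host ports chosen by Docker. Ports 8080, 8081 and 4040 are the standard Spark master, worker and application web UI ports, which is presumably why the image exposes them; if you would rather reach the UIs at fixed, known host addresses, you can map them explicitly, for example:

docker run -it --rm -p 8080:8080 -p 8081:8081 -p 4040:4040 abqsspn/abqsspn-singlenode:v2 /bin/bash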

In the shell of the container, first start the background daemons with ./start-deamons.sh (if you are not returned to the prompt after 5 seconds, press Enter).

A script is provided for running the simulation: ./run.sh <cube_size> <steps_no> <comm_type> <neigh_type>, where the possible values for comm_type are shuffle, forever, degrading and no_comm, and the possible values for neigh_type are immediate6, immediate26 and none. Example usage: ./run.sh 10 40 shuffle immediate6. After the run completes, checkpointed system states are left in the /tmp/checkpoints directory. Every checkpoint is a CSV file in which every row has a cell identifier in the first column and boolean variables denoting the Petri net marking in the subsequent columns.
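As a quick sanity check you can count how many cells have a particular marking variable set in a checkpoint. The snippet below is only a sketch: the checkpoint file name, the column index and the tested value (1 vs. true) are placeholders that depend on your model and run.

# count cells whose 5th column (a hypothetical marking variable) is set
awk -F',' '$5 == 1 || $5 == "true" { n++ } END { print n+0, "cells" }' /tmp/checkpoints/<checkpoint_file>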

Running in a Cluster

The novel feature of AB-QSSPN is its ability to utilize a distributed computational environment in order to simulate multi-cellular systems. You may choose to deploy AB-QSSPN to a regular Spark cluster or keep it in a dockerized environment; the steps for doing so are quite similar:

  1. Make sure your Spark cluster is working as intended. Try running spark-shell and performing some RDD operations.
  2. Deploy qsspn on every worker node. If you are familiar with the Ansible framework you might find /ab-qsspn/ansible handy. If you have not used Ansible before you can still follow these instructions, but it might be a good idea to take a look at the Ansible documentation first. Running ansible-playbook worker-setup.yml will install thrift and qsspn on all nodes in parallel. You might need to first install Ansible with apt-get install ansible on the machine from which you want to bootstrap your cluster. To tell Ansible which hosts it should work with, either add a [workers] section to /etc/ansible/hosts (recommended) or directly replace workers in - hosts: workers in worker-setup.yml with your list of hosts (see the sketch after this list). Your cluster hosts might not be accessible directly, for instance if they are behind NAT and share a single public IP address. In such a case you can run Ansible from one of the hosts, although this might not be straightforward if your hosts run an older OS. Alternatively, you can populate your .ssh/config with the NAT ports on the public IP address for each host. Please note that, depending on your exact configuration, you might want to use the --ask-pass or --ask-sudo-pass options when running Ansible.
  3. Distribute the precomputed metabolism to every worker node. The default path is /tmp/cache.txt. It is likely that the contents of /tmp are cleaned by default when a node restarts, so you might want to copy the file there whenever you start QSSPN (see the sketch after this list). You could also modify this default path in the qsspn proxy; currently, however, that would require recompiling the project.
  4. Run qsspn on every worker node. During the Spark jobs, the worker nodes expect to have the qsspn server and proxy running locally; they will try to connect to localhost and the job will fail if the qsspn agents are not working properly. Therefore, you need to have both the qsspn server (./scripts/qsspn_runner.py -m server) and the qsspn proxy (./scripts/qsspn_runner.py -m proxy) running on every worker node. The qsspn servers are no longer used for computation; however, legacy code in the qsspn proxy still asserts that the server is up and running. If you are familiar with pssh, you might want to use /ansible/roles/pssh/tasks/main.yml to install it and then use it to execute commands on all workers simultaneously (see the sketch after this list). If you have not used pssh before you can study the pssh documentation; the tool is very straightforward to use and supports setting timeouts, providing passwords and redirecting the output from all hosts.
  5. Submit the Spark job. To ensure that all the required jars are available, it is possible to build a so-called fat jar via sbt assembly; it includes all the dependencies and can therefore be freely shared across the nodes. The job takes three arguments: the size of the cube, the number of steps, and the HDFS path where checkpoints should be stored (see the sketch after this list). It constructs a 3D cube of cells of the specified size and simulates the desired number of steps.
The Spark workers will always look for the QSSPN proxy at localhost:9090, so if you are running qsspn make sure this port is exposed and forwarded properly. If the Spark worker is also running inside a container, you will need to either wire the ports between the containers or modify the Scala source so that it connects elsewhere. In theory, you could even have a single central qsspn server and proxy and let all the Spark workers connect to it over the network, but that would have a huge negative impact on performance.
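The sketches below illustrate steps 2 to 5. All hostnames, IP addresses, ports, file paths and the jar and class names are placeholders standing in for your own setup, not values shipped with AB-QSSPN; adapt them before use.

# step 2: register the worker hosts with Ansible (hostnames are placeholders)
printf '%s\n' '[workers]' worker-01.example.com worker-02.example.com >> /etc/ansible/hosts

# if the workers share one public IP behind NAT, describe the forwarded ports in ~/.ssh/config, e.g.:
#   Host worker-01.example.com
#       HostName 203.0.113.10    (the shared public IP)
#       Port 2201                (the NAT port forwarded to this worker)

ansible-playbook worker-setup.yml --ask-pass --ask-sudo-pass

# step 3: copy the precomputed metabolism to the default location on every worker
for host in worker-01.example.com worker-02.example.com; do
    scp cache.txt "$host":/tmp/cache.txt
done

# step 4: start the qsspn server and proxy on all workers with pssh
# (workers.txt lists the hosts; the path assumes qsspn was deployed to the workers' home directories)
pssh -h workers.txt -t 0 'nohup ./scripts/qsspn_runner.py -m server > server.log 2>&1 &'
pssh -h workers.txt -t 0 'nohup ./scripts/qsspn_runner.py -m proxy > proxy.log 2>&1 &'

# step 5: submit the fat jar built with sbt assembly; here a 100x100x100 cube is simulated
# for 200 steps and the checkpoints are written to HDFS
spark-submit --master spark://spark-master.example.com:7077 \
    --class <main_class> \
    ab-qsspn-assembly.jar \
    100 200 hdfs:///abqsspn/checkpoints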

It should be possible to set up a cluster using the same Docker image. This is, however, not encouraged: if you intend to use your whole cluster you probably care about performance, and an extra layer of containers does not help. Having said that, there are two things to take care of. First, the Docker image runs HDFS in standalone mode by default; to have a cluster-wide consistent HDFS, you will need to supply a proper config file at $HADOOP_CONF_DIR/hdfs-site.xml. Second, you will need to wire all the containers so that they are aware of each other. Edit the slaves files in both the Hadoop and Spark config directories to match your network layout and start up the cluster. As HDFS and Spark nodes talk to each other over multiple protocols, consult the Hadoop and Spark documentation for the list of ports that have to be exposed on the containers.
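For instance, assuming the standard Hadoop 2.x and Spark standalone layouts and placeholder hostnames, listing the workers in both slaves files could look like this:

# write the worker hostnames (placeholders) to the Hadoop and Spark slaves files
for f in "$HADOOP_CONF_DIR"/slaves "$SPARK_HOME"/conf/slaves; do
    printf '%s\n' worker-01.example.com worker-02.example.com > "$f"
done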

Different Variants of the Simulation

AB-QSSPN allows running different variants of the simulation with different graph structures and communication models. If the program is run as described above, it defaults to a version where each cell has 6 neighbours and FGF19 is used for communication and is assumed to be a signal that lives forever. However, the user can provide additional boolean parameters, which by default are set to false:

communicatorsLiveOneTurn
if true, substances used as communicators deteriorate after one turn
turnOffUsedCommunicators
should be set to false as it is still an experimental feature
extraCommunicationPass
should be set to false as it is still an experimental feature
extendedNeghbourhood
if true, each cell communicates with 26 neighbours
noCommunication
if true, cells do not communicate
Additionally, the folder spark-precise_communication contains a version that allows for a more advanced communication model. In this model, each cell determines whether it needs to obtain a substance (FGF19) from one of its neighbours; extra communication steps are then taken to distribute the existing substances (FGF19) according to demand and supply. Currently it is recommended to use only the following tested configurations (see the example invocation after this list):
forever (6 neighbours)
no additional parameters are provided
degrading (6 neighbours) - true
only one additional parameter is provided
forever (26 neighbours) - false false false true
4 additional parameters are provided
degrading (26 neighbours) - true false false true
4 additional parameters are provided
no communication - false false false false true
5 additional parameters are provided
precise communication
version from spark-precise_communication should be used
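These flags are additional arguments of the Spark job; we assume here that they are appended after the three main arguments (cube size, number of steps, checkpoint path) in the order listed above, which should be confirmed against the job's usage message. With the placeholders from the spark-submit sketch earlier, the "degrading (26 neighbours)" configuration could then be submitted as:

# the four trailing booleans select degrading communication with the 26-cell neighbourhood
spark-submit --master spark://spark-master.example.com:7077 \
    --class <main_class> \
    ab-qsspn-assembly.jar \
    100 200 hdfs:///abqsspn/checkpoints true false false true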

Precomputing metabolism

The precomputed metabolism is stored in cache.txt in the root directory. The script metabolism-precomputer/metabolism-precomputer.sh allows for distributed computation of the metabolism. Thanks to the precomputed metabolism, each node can have its own copy of all metabolism inputs and the corresponding outputs. This significantly speeds up AB-QSSPN simulations, as cells often request the metabolism for the same input values. The metabolism-precomputer requires its C++ counterpart to be compiled, which happens as part of the QSSPN build process. Additionally, Spark is leveraged to distribute the computation over the cluster and is therefore also required. Three arguments must be supplied for the metabolism-precomputer to run successfully:

SPARK_MASTER
hostname of the Spark master; port 7077 is assumed
SPARK_SLAVES_FILE
path to a file containing the slaves' hostnames
MEMORY_LIMIT
the current implementation runs multiple instances of the C++ program on each host, and providing this limit helps avoid overcommitting memory. It is advised to set it to floor(slave_memory / 2).
Additionally, it might be necessary to provide a custom path to QSSPN in metabolism-precomputer.sc. The default path assumes that the SparkLP repository resides in the user's home directory.

The metabolism-precomputer stores its logs in mp.log. After successful completion, each slave should have the precomputed metabolism saved in tmp/cache.txt.
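Putting it together, a run could look like the sketch below. Whether the three values are read from the environment or passed on the command line, and in what unit MEMORY_LIMIT is expressed, should be checked against the script itself; every value here is a placeholder.

# invoke the precomputer with the three values described above (all placeholders)
./metabolism-precomputer/metabolism-precomputer.sh <spark_master_hostname> <slaves_file> <memory_limit>
tail -f mp.log    # watch the progress; results end up in tmp/cache.txt on each slave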

Input models, output data

All QSSPN workers require the same model to operate on. The documentation of the QSSPN model format may be found on the QSSPN home page. An example model is included with the software in the QSSPN/QSSPNclientBuf2/models/ directory and will be used by default. Another model may be selected by putting the model files in that directory and running qsspn_runner.py with the --model MODEL_NAME flag (run qsspn_runner.py -h for usage details).
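For instance, selecting a hypothetical model directory named MY_MODEL placed under QSSPN/QSSPNclientBuf2/models/ might look as follows; whether both modes accept the flag should be checked with qsspn_runner.py -h:

./scripts/qsspn_runner.py -m server --model MY_MODEL
./scripts/qsspn_runner.py -m proxy --model MY_MODEL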

The sfba file specifies the metabolism, while the qsspn file specifies the regulatory part and how it interacts with the metabolism. The spept file contains the regulatory Petri net in the Snoopy format, with node annotations that define the interaction of the metabolic and regulatory parts. The spept file can be edited in Snoopy, and the spept2qsspn script can then be used to generate the qsspn file from it. In the given model the cells communicate via the FGF19 place, which is hardcoded in this version of the tool. In the future, the set of places that are exposed to adjacent cells will be configurable.

All the output is written to stdout, and we suggest redirecting it to a file. During normal usage, one should only pay attention to the messages logged by biosim.Runner and biosim.Main: they report which batch of simulation steps is being run and which cells reached the observed state. For example:

INFO spark.Runner: Starting simulation with 10 steps        # first batch started
INFO biosim.Main: Saved world checkpoint after step 10      # first batch ended, checkpoint saved
INFO spark.Runner: Starting simulation with 10 steps        # second batch started
INFO spark.Runner: Homeostasis log:                         # information on which cells reached BA_HOMEOSTASIS
(582910,1)        # cell with id 582910 reached it in the first step of this batch (so in step 11 from the start)
(981362,3)        # cell with id 981362 reached it in the third step of this batch (so in step 13 from the start)

In the example usage presented in our paper, we count how many cells reached BA_HOMEOSTASIS in every step and compare the results with the same calculation without communication (by running multiple QSSPN simulations).
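Such a per-step tally can be extracted directly from the redirected output. The sketch below assumes that the per-cell (id, step) entries appear on their own lines, as in the excerpt above, and that the output was redirected to a file named simulation.log; remember that the second number is the step within the current batch, so the batch offset has to be added to obtain the absolute step number.

# count how many cells reached BA_HOMEOSTASIS in each within-batch step
grep -oE '\([0-9]+,[0-9]+\)' simulation.log | tr -d '()' \
    | awk -F',' '{ count[$2]++ } END { for (s in count) print "step " s ": " count[s] " cells" }'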

About the Authors

The AB-QSSPN model was first proposed in the paper AB-QSSPN: Integration of agent-based simulation of cellular populations with quasi-steady state simulation of genome scale intercellular networks by Wojciech Ptak, Andrzej M. Kierzek, and Jacek Sroka, presented at the 37th International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS 2016). The first implementation was created by Wojciech Ptak and reuses the earlier QSSPN software, originally created by Andrzej M. Kierzek. The current, improved version was implemented by Kamil Kędzia.