A Python Package for Visualizing Categorical Data Over Time

PyCatFlow is a visualization tool for representing temporal developments based on categorical data. The tool was conceptualized by Marcus Burkhardt and implemented in collaboration with Herbert Natta (@herbertmn). It is inspired by the Rankflow visualization tool developed by Bernhard Rieder.


PyCatFlow is available on PyPI:

$ pip3 install pycatflow

Alternatively, you can download the repository, install the requirements, and then run the install routine:

pip3 install -r requirements.txt
python3 setup.py install

Additional requirements: Visualization and export are based on the drawSvg package, which in turn requires cairo to be installed as an external dependency. Platform-specific instructions for installing cairo are available on the cairo homepage.

On macOS, cairo can be installed using Homebrew:

$ brew install cairo
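
After installing, a rough way to check whether a cairo shared library is discoverable is the standard-library `ctypes.util.find_library`. This is only a sketch and not part of PyCatFlow: the helper name `cairo_is_discoverable` is hypothetical, and a negative result is not conclusive, since libraries installed under non-standard prefixes may not be found this way.

```python
from ctypes.util import find_library


def cairo_is_discoverable():
    """Return True if the system's library search can locate cairo.

    Hypothetical helper for illustration; it does not load or test the library.
    """
    return find_library("cairo") is not None


if cairo_is_discoverable():
    print("cairo found")
else:
    print("cairo not found on the default search path; see the cairo homepage")
```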

Basic usage

The library offers many options for adjusting the visual output. A simple use case looks as follows:

import pycatflow as pcf

# Loading and parsing data:
data = pcf.read_file("sample_data_ChatterBot_Requirements.csv", columns="column", nodes="items", categories="category", column_order="column order")

# Generating the visualization
viz = pcf.visualize(data, spacing=20, width=800, maxValue=20, minValue=2)

The code and sample data are provided in the example folder. The data contains annual snapshots of the requirements of the ChatterBot framework, developed and maintained by Gunther Cox.

Running the above code creates this visualization:

Sample Visualization
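
To prepare your own input, it may help to see the kind of table the `read_file` call in the basic usage points at. The sketch below builds a small CSV whose header names match the arguments used there ("column", "items", "category", "column order"). The row values, the one-row-per-item-and-snapshot layout, and the comma delimiter are assumptions made for illustration; compare against the sample file in the example folder before relying on them.

```python
import csv
import io

# Header names taken from the read_file call in the basic usage.
FIELDNAMES = ["column", "items", "category", "column order"]

# Invented rows: one row per (snapshot, item) pair.
ROWS = [
    ("2019", "package-a", "group-1", 1),
    ("2019", "package-b", "group-2", 1),
    ("2020", "package-a", "group-1", 2),
    ("2020", "package-c", "group-2", 2),
]


def to_csv_text(rows, delimiter=","):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(FIELDNAMES)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(ROWS)
print(csv_text)
```

Writing `csv_text` to a file would give a path to pass as the first argument of `read_file`, using the same keyword arguments as in the basic usage.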

Credits & License

Cite as: Marcus Burkhardt, and Herbert Natta. 2021. “PyCatFlow: A Python Package for Visualizing Categorical Data over Time”. Zenodo.

The package is released under the MIT License.