About

Just as attrs and dataclasses use type hints to simplify data type definition, scinexus uses them to simplify writing best-practice scientific algorithms.

scinexus (pronounced 'sigh-nexus') is a Python framework for rapid development of data processing applications. It enables interoperability between apps through defined data types, which allows ecosystems of apps to be developed for a scientific domain (for examples, see cogent3 and piqtree).

Many scientific problems require repeating calculations across many files or database records. Such tasks suit data-level parallelism on multi-core CPUs, but writing robust, maintainable code for them is often tedious and quickly becomes complex.

With scinexus apps, you can write your application in a functional programming style. Combined with scinexus app composition, this greatly simplifies your program logic, making it easier to understand and thus easier to explain. And as we know:

Quote

If the implementation is easy to explain, it may be a good idea.

-- Tim Peters, "Zen of Python"
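
As a rough illustration of the idea of composing small functions and using their type hints to catch mismatches when the pipeline is assembled, rather than when it runs, here is a minimal sketch in plain Python. It does not use the scinexus API: `compose`, `parse_numbers`, `mean`, and `shout` are hypothetical names invented for this example, and the check only compares the return hint of one function with the first parameter hint of the next.

```python
from typing import Callable, get_type_hints


def compose(first: Callable, second: Callable) -> Callable:
    """Chain two single-argument functions, checking their hints up front.

    Hypothetical helper for illustration only; not part of scinexus.
    """
    produced = get_type_hints(first).get("return")
    # Annotations keep definition order, so the first non-return entry
    # is the hint of the first annotated parameter.
    expected = next(
        (hint for name, hint in get_type_hints(second).items() if name != "return"),
        None,
    )
    if produced is None or expected is None:
        raise TypeError("both functions need type hints to be composed")
    if produced != expected:
        raise TypeError(
            f"{first.__name__} returns {produced}, "
            f"but {second.__name__} expects {expected}"
        )

    def composed(value):
        return second(first(value))

    return composed


def parse_numbers(line: str) -> list[float]:
    return [float(item) for item in line.split(",")]


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


def shout(text: str) -> str:
    return text.upper()


# The hints line up, so the pipeline is built and can be called.
pipeline = compose(parse_numbers, mean)
result = pipeline("1.0,2.0,3.0")

# The hints do not line up, so the mistake surfaces before any data is processed.
message = ""
try:
    compose(mean, shout)
except TypeError as err:
    message = str(err)
```

The mismatch between `mean` and `shout` is reported as soon as the two are combined, which is the kind of early feedback that composition-time type checking is meant to give.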

What you get

Type checking at composition time
Durable computing¹
Greatly simplified data-level parallel execution
Automated logging
Automated citation tracking
Checkpointing via data stores
Customisable experience (progress bars², parallelisation³, data store representations, etc.)
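
The durable computing item in the list above refers to failures being recorded and passed along instead of halting a run. The sketch below shows that pattern in plain Python under stated assumptions: `FailureRecord` and `guarded` are hypothetical stand-ins written for this example, not the scinexus NotCompleted class or any scinexus function.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRecord:
    """Hypothetical stand-in for a failure record (not scinexus NotCompleted)."""

    origin: str   # name of the step that failed
    message: str  # what went wrong


def guarded(func):
    """Wrap a step so failures become records and earlier failures pass through."""

    @functools.wraps(func)
    def wrapper(value):
        if isinstance(value, FailureRecord):
            return value  # propagate an earlier failure unchanged
        try:
            return func(value)
        except Exception as err:
            return FailureRecord(
                origin=func.__name__,
                message=f"{type(err).__name__}: {err}",
            )

    return wrapper


@guarded
def to_int(text):
    return int(text)


@guarded
def reciprocal(number):
    return 1 / number


# One bad input does not stop the others from being processed.
raw_inputs = ["4", "zero", "0"]
results = [reciprocal(to_int(raw)) for raw in raw_inputs]
failures = [r for r in results if isinstance(r, FailureRecord)]
```

Each entry in `failures` names the step where the problem arose, so the cause can be traced after the whole batch has finished.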

Standalone utilities

scinexus also provides generally useful utilities for developers of data analysis applications. Utilities for file IO, parallel execution, and progress tracking are usable independently of the app framework.

Get started

Install scinexus -- see Installing from PyPI
Build algorithms -- see How to write apps
Build applications for others -- see Why composable apps?
Use existing apps -- see Composing apps

The `scinexus` origin story

The app infrastructure code was originally developed within cogent3, where it accumulated over seven years of development, testing, and real-world use in computational genomics before being extracted into scinexus. The design is mature and has underpinned analyses in published studies.

We acknowledge here that many members of the cogent3 community contributed to the code that now lives here, including @GavinHuttley, @rmcar17, @Nick-Foto, @KatherineCaley, @fredjaya, and @khiron.

¹ Failures are automatically recorded as NotCompleted records, which are propagated and stored in data stores. These records capture the salient details that help you identify the cause of the failure.
² tqdm is the default because of its robustness in notebooks, but you can choose rich.
³ The default is Python's standard library multiprocessing module. If you're using Jupyter Notebooks, however, it's recommended that you use loky. This is an installation option and configuration is easy.
