Data stores

API for open_data_store and data store backends.

scinexus.io.open_data_store(base_path, suffix=None, limit=None, mode=READONLY, **kwargs)

returns a DataStore instance whose type is determined by the path suffix

Parameters:

Name       Type        Description                                             Default
base_path  str | Path  path to directory or db                                 required
suffix     str | None  suffix of filenames                                     None
limit      int | None  the number of matches to return                         None
mode       str | Mode  opening mode, either r, w, a as per file opening modes  READONLY
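
The roles of suffix and limit can be pictured with a small standard-library sketch. It is not the scinexus implementation: match_members is a hypothetical helper, and treating suffix as a glob on file names is an assumption.

```python
import tempfile
from itertools import islice
from pathlib import Path


def match_members(base_path, suffix=None, limit=None):
    """Hypothetical helper: file names under base_path, filtered by suffix."""
    pattern = f"*.{suffix}" if suffix else "*"
    paths = sorted(p for p in Path(base_path).glob(pattern) if p.is_file())
    # limit=None means "no cap"; islice accepts None as its stop value.
    return [p.name for p in islice(paths, limit)]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.fasta", "b.fasta", "c.txt"):
        (Path(tmp) / name).write_text("data")
    matched = match_members(tmp, suffix="fasta", limit=1)
    everything = match_members(tmp)

print(matched)
print(everything)
```

Here limit caps the number of matches after filtering, which is one plausible reading of "the number of matches to return".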

scinexus.data_store.DataStoreABC

Bases: ABC

Abstract base class for DataStore

mode abstractmethod property

string that references datastore mode, override in subclass constructor

source abstractmethod property

string that references connecting to data store, override in subclass constructor

__contains__(identifier)

whether relative identifier has been stored

md5(unique_id) abstractmethod

Parameters:

Name       Type  Description                Default
unique_id  str   name of data store member  required

Returns:

Type Description
md5 checksum for the member, if available, None otherwise
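
The "checksum if available, None otherwise" contract can be sketched with hashlib. md5_of is a hypothetical helper, and hashing the UTF-8 encoded text of a member is an assumption about how a backend might compute the value.

```python
import hashlib


def md5_of(text):
    """Hypothetical helper: hex md5 digest of text, or None when there is no data."""
    if text is None:
        return None
    return hashlib.md5(text.encode("utf-8")).hexdigest()


checksum = md5_of("ACGT")
missing = md5_of(None)
print(checksum, missing)
```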

write_bib(dest_path)

Write stored citations as a BibTeX .bib file.

write_citations(*, data)

Write citations to the data store. Subclasses should override.

scinexus.data_store.DataStoreDirectory

Bases: DataStoreABC

data store backed by a directory on the filesystem

mode property

string that references datastore mode, override in subclass constructor

source property

path that references the data store

drop_not_completed(*, unique_id=None)

remove not-completed records from the directory

Parameters:

Name       Type        Description                                                                                       Default
unique_id  str | None  if provided, only drop the record with this identifier, otherwise drop all not-completed records  None
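
The difference between dropping one record and dropping all of them can be sketched as below. drop_failures is a hypothetical helper, and storing each not-completed record as <unique_id>.json in a single directory is an assumed layout.

```python
import tempfile
from pathlib import Path


def drop_failures(not_completed_dir, *, unique_id=None):
    """Hypothetical helper: delete failure records and return how many were removed."""
    directory = Path(not_completed_dir)
    if unique_id is not None:
        targets = [directory / f"{unique_id}.json"]
    else:
        targets = list(directory.glob("*.json"))
    removed = 0
    for path in targets:
        if path.exists():
            path.unlink()
            removed += 1
    return removed


with tempfile.TemporaryDirectory() as tmp:
    for name in ("s1", "s2", "s3"):
        (Path(tmp) / f"{name}.json").write_text("{}")
    dropped_one = drop_failures(tmp, unique_id="s2")  # only s2.json
    dropped_rest = drop_failures(tmp)  # the remaining two records

print(dropped_one, dropped_rest)
```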

md5(unique_id)

Parameters:

Name       Type  Description                Default
unique_id  str   name of data store member  required

Returns:

Type Description
md5 checksum for the member, if available, None otherwise

read(unique_id)

reads data corresponding to identifier

write(*, unique_id, data)

writes a completed record ending with .suffix

Parameters:

Name       Type  Description              Default
unique_id  str   unique identifier        required
data       str   text data to be written  required

Returns:

Type Description
a member for this record

Notes

Drops any not-completed member corresponding to this identifier
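
The behaviour in the note, where a completed write supersedes an earlier failure for the same identifier, might look like the sketch below. TinyDirStore is a hypothetical stand-in, not DataStoreDirectory itself, and the not_completed subdirectory is an assumed layout.

```python
import tempfile
from pathlib import Path


class TinyDirStore:
    """Hypothetical stand-in for a directory-backed store."""

    def __init__(self, root, suffix):
        self.root = Path(root)
        self.suffix = suffix
        self.not_completed_dir = self.root / "not_completed"  # assumed layout
        self.not_completed_dir.mkdir(parents=True, exist_ok=True)

    def write_not_completed(self, *, unique_id, data):
        path = self.not_completed_dir / f"{unique_id}.json"
        path.write_text(data)
        return path

    def write(self, *, unique_id, data):
        path = self.root / f"{unique_id}.{self.suffix}"
        path.write_text(data)
        # A completed record replaces any earlier failure for the same id.
        (self.not_completed_dir / f"{unique_id}.json").unlink(missing_ok=True)
        return path


with tempfile.TemporaryDirectory() as tmp:
    store = TinyDirStore(tmp, suffix="txt")
    failed = store.write_not_completed(unique_id="sample1", data='{"message": "failed"}')
    done = store.write(unique_id="sample1", data="result")
    completed_name = done.name
    completed_exists = done.exists()
    failure_exists = failed.exists()

print(completed_name, completed_exists, failure_exists)
```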

write_bib(dest_path)

Write stored citations as a BibTeX .bib file.

write_not_completed(*, unique_id, data)

writes a not-completed record as JSON

Parameters:

Name       Type  Description              Default
unique_id  str   unique identifier        required
data       str   text data to be written  required

Returns:

Type Description
a member for this record

scinexus.data_store.ReadOnlyDataStoreZipped

Bases: DataStoreABC

read-only data store backed by a zip archive

__contains__(identifier)

whether relative identifier has been stored

drop_not_completed(*, unique_id=None)

not supported on read-only zip data stores

md5(unique_id)

Parameters:

Name       Type  Description                Default
unique_id  str   name of data store member  required

Returns:

Type Description
md5 checksum for the member, if available, None otherwise

read(unique_id)

reads data corresponding to identifier from the zip archive
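
Reading a member by identifier from a zip archive can be sketched with the standard zipfile module. read_member is a hypothetical helper, and using the archive-relative path as the identifier is an assumption.

```python
import io
import zipfile

# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    archive.writestr("demo/sample1.txt", "ACGT")


def read_member(zipped, unique_id):
    """Hypothetical helper: return the text of one archive member."""
    with zipfile.ZipFile(zipped, mode="r") as archive:
        return archive.read(unique_id).decode("utf-8")


buffer.seek(0)
text = read_member(buffer, "demo/sample1.txt")
print(text)
```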

write_bib(dest_path)

Write stored citations as a BibTeX .bib file.

scinexus.sqlite_data_store.DataStoreSqlite

Bases: DataStoreABC

data store backed by a SQLite database

locked property

returns if lock_pid is NULL or doesn't exist.

logs property

returns all log records

mode property

string that references datastore mode, override in subclass constructor

not_completed property

returns database records of type NotCompleted

record_type property writable

class name of completed results

source property

string that references connecting to data store, override in subclass constructor

__contains__(identifier)

whether relative identifier has been stored

__del__()

close the db connection when the object is deleted

close()

close the database connection

drop_not_completed(*, unique_id=None)

remove not-completed records from the database

Parameters:

Name       Type        Description                                                                                       Default
unique_id  str | None  if provided, only drop the record with this identifier, otherwise drop all not-completed records  None

lock()

if writable, and not locked, locks the database to this pid

md5(unique_id)

Parameters:

Name       Type  Description                Default
unique_id  str   name of data store member  required

Returns:

Type Description
md5 checksum for the member, if available, None otherwise

read(unique_id)

identifier string formed from Path(table_name) / identifier

unlock(force=False)

removes the lock if the pid matches. If force is True, the pid is ignored. Has no effect if mode is READONLY
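
A pid-based lock of this kind can be sketched with sqlite3. The single-row state table, its column names, and the rule that a non-NULL lock_pid means "locked" are all assumptions for illustration, not the schema scinexus uses.

```python
import os
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: a single-row table holding the locking process id.
conn.execute("CREATE TABLE state (state_id INTEGER PRIMARY KEY, lock_pid INTEGER)")
conn.execute("INSERT INTO state (state_id, lock_pid) VALUES (1, NULL)")


def locked(conn):
    (pid,) = conn.execute("SELECT lock_pid FROM state WHERE state_id = 1").fetchone()
    return pid is not None


def lock(conn):
    # Only claim the lock when nobody holds it.
    conn.execute(
        "UPDATE state SET lock_pid = ? WHERE state_id = 1 AND lock_pid IS NULL",
        (os.getpid(),),
    )


def unlock(conn, force=False):
    if force:
        conn.execute("UPDATE state SET lock_pid = NULL WHERE state_id = 1")
    else:
        # Without force, only the process that took the lock may release it.
        conn.execute(
            "UPDATE state SET lock_pid = NULL WHERE state_id = 1 AND lock_pid = ?",
            (os.getpid(),),
        )


lock(conn)
was_locked = locked(conn)
unlock(conn)
now_locked = locked(conn)
print(was_locked, now_locked)
```

Putting the pid check inside the UPDATE's WHERE clause keeps each operation to a single statement, so a non-matching process simply updates zero rows.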

write_bib(dest_path)

Write stored citations as a BibTeX .bib file.

scinexus.data_store.DataMemberABC

Bases: ABC

Abstract base class for DataMember

A data member is a handle to a record in a DataStore. It has a reference to its data store and a unique identifier.

__eq__(other)

compares members for equality, which also supports checking whether a member is in a list of members
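
Why __eq__ matters for list membership can be shown with a minimal sketch. Member is a hypothetical class, and comparing on the pair (data store, unique identifier) is an assumption about what equality means here.

```python
class Member:
    """Hypothetical handle pairing a data store reference with a unique identifier."""

    def __init__(self, data_store, unique_id):
        self.data_store = data_store
        self.unique_id = unique_id

    def __eq__(self, other):
        if not isinstance(other, Member):
            return NotImplemented
        return (self.data_store, self.unique_id) == (other.data_store, other.unique_id)

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash((self.data_store, self.unique_id))


members = [Member("store_a", "sample1"), Member("store_a", "sample2")]
# The "in" operator falls back to __eq__ for each element of the list.
found = Member("store_a", "sample2") in members
absent = Member("store_b", "sample2") in members
print(found, absent)
```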

scinexus.data_store.DataMember

Bases: DataMemberABC

Generic DataMember class, bound to a data store. All read operations delivered by the parent.

__eq__(other)

compares members for equality, which also supports checking whether a member is in a list of members

scinexus.data_store.set_summary_display(func)

Set the function used to display data store summaries.

Parameters:

Name Type Description Default
func Callable[..., Any] | None

A callable with signature func(data, *, name) -> Any where data is a dict or list[dict] and name identifies the summary method (e.g. "describe"). Pass None to clear.

required
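
A display function matching the documented signature func(data, *, name) might look like this sketch. plain_text_display and the sample rows are hypothetical; only the signature and the dict or list[dict] input shape come from the description above.

```python
def plain_text_display(data, *, name):
    """Render summary data as tab-separated text under a title line."""
    rows = [data] if isinstance(data, dict) else list(data)
    if not rows:
        return f"{name}: (empty)"
    header = list(rows[0])
    lines = [name, "\t".join(header)]
    for row in rows:
        lines.append("\t".join(str(row.get(key, "")) for key in header))
    return "\n".join(lines)


rendered = plain_text_display(
    [{"record type": "completed", "number": 3}], name="describe"
)
print(rendered)
```

A function like this would be passed to set_summary_display, and passing None would clear it again.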

scinexus.data_store.get_summary_display()

Return the currently registered summary display function, or None.

scinexus.data_store.set_id_from_source(func)

Register a custom function for extracting unique IDs from data objects.

The registered function is consulted as the default by AppBase.as_completed and WriterApp.apply_to to derive a unique identifier for each input, and by NotCompleted to normalise the source= keyword on error records. Pass None to clear the registration and restore the built-in get_unique_id.

Parameters:

Name Type Description Default
func Callable[..., Any] | None

A callable taking a single data object and returning a string identifier (or None if no identifier can be extracted). The callable must be picklable if scinexus apps will be executed in parallel via loky / MPI.

required

Notes

Per-call overrides via the id_from_source keyword on as_completed and apply_to still take precedence over the registered function. Register before constructing apps for the cleanest behaviour.

scinexus.data_store.get_id_from_source()

Return the active unique-ID extractor.

Returns the function previously passed to set_id_from_source, or get_unique_id if nothing has been registered.
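
The register-then-fall-back pattern described for the setter and getter can be sketched as follows. Every name here (set_extractor, get_extractor, default_id, upper_id) is a hypothetical stand-in, and default_id only approximates a built-in extractor.

```python
from pathlib import Path

_registry = {"func": None}  # holds the custom extractor, if any


def default_id(data):
    """Stand-in default: the file name without its final suffix."""
    return Path(str(data)).stem


def set_extractor(func):
    # Passing None clears the registration.
    _registry["func"] = func


def get_extractor():
    # Fall back to the default when nothing is registered.
    return _registry["func"] or default_id


def upper_id(data):
    # A module-level function (unlike a lambda) can be pickled by reference.
    return default_id(data).upper()


set_extractor(upper_id)
custom = get_extractor()("reads/sample1.fasta")
set_extractor(None)
restored = get_extractor()("reads/sample1.fasta")
print(custom, restored)
```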

scinexus.data_store.get_unique_id(name)

strips any format suffixes from name
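
One way to picture suffix stripping is the sketch below. strip_format_suffixes is a hypothetical approximation: the set of compression suffixes and the rule "drop one compression suffix, then one format suffix" are assumptions, and the rules get_unique_id actually applies may differ.

```python
from pathlib import Path

# Assumed examples of compression suffixes, chosen for illustration only.
COMPRESSION = {".gz", ".bz2", ".zip", ".xz"}


def strip_format_suffixes(name):
    """Hypothetical approximation: drop a compression suffix, then one format suffix."""
    path = Path(name)
    if path.suffix.lower() in COMPRESSION:
        path = path.with_suffix("")
    return path.stem if path.suffix else path.name


compressed = strip_format_suffixes("data/brca1.fasta.gz")
plain = strip_format_suffixes("brca1.fasta")
bare = strip_format_suffixes("brca1")
print(compressed, plain, bare)
```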

scinexus.data_store.get_data_source(data)