System Collection

Container for a set of systems that share a thermodynamic state (e.g., constant temperature) and vary in composition.

SystemCollection loads a set of systems and uses SystemProperties to retrieve molecular dynamics properties as a function of composition.
  • The container first discovers molecular systems from the directory structure and input parameters, creating a list of SystemMetadata objects.

  • Topology and energy properties can then be calculated as a function of composition.

  • The object is also used to calculate Excess, Simulation, and Ideal properties.

class kbkit.systems.collection.SystemCollection(systems: list[SystemMetadata], molecules: list[str], charges: dict[str, int] | None = None)[source]

Bases: object

Registry of discovered molecular systems with semantic access patterns.

Stores and organizes SystemMetadata objects by name and kind, enabling reproducible filtering, indexing, and iteration across pure and mixture systems.

Parameters:
  • systems (list[SystemMetadata]) – List of discovered systems to register.

  • molecules (list[str]) – Globally unique list of molecules present across all systems.

  • charges (dict[str, int], optional) – Optional charge dictionary for ions. If provided, enables electrolyte basis.

classmethod load(base_path: str | None = None, base_systems: list[str] | None = None, pure_path: str | None = None, pure_systems: list[str] | None = None, rdf_dir: str = '', start_time: int = 10000, include_mode: str = 'npt', charges: dict[str, int] | None = None) SystemCollection[source]

Construct a SystemCollection object from discovered systems.

Parameters:
  • pure_path (str or Path, optional) – Path to pure component directory.

  • pure_systems (list[str], optional) – List of pure systems to include.

  • base_path (str or Path, optional) – Path to base system directory.

  • base_systems (list[str], optional) – Explicit list of system names to include.

  • rdf_dir (str, optional) – Explicit directory name that contains rdf files.

  • start_time (int, optional) – Start time for time-averaged properties.

  • include_mode (str, optional) – Optional string to filter files (.edr, .gro, .top) if multiple are found of a given type.

  • charges (dict[str, int], optional) – Optional charge dictionary for ions.

Returns:

Registry object containing global molecules and list of SystemMetadata.

Return type:

SystemCollection

property residue_molecules: list[str]

Raw MD residue basis (unique residues from topology).

property residue_counts: ndarray

(N_systems, N_residues) molecule counts in residue basis.

Type:

np.ndarray

property residue_x: ndarray

(N_systems, N_residues) mole fractions in residue basis.

Type:

np.ndarray
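The relationship between residue_counts and residue_x can be illustrated with a minimal sketch. It assumes that mole fractions are obtained by normalizing each system's counts by that system's total; the helper name mole_fractions and the sample counts are hypothetical and are not part of kbkit.

```python
def mole_fractions(counts):
    """Row-normalize an (N_systems, N_residues) table of molecule counts."""
    fractions = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("system contains no molecules")
        fractions.append([n / total for n in row])
    return fractions


# Illustrative counts for three systems of a binary mixture (made-up numbers).
residue_counts = [[500, 0], [375, 125], [0, 500]]
residue_x = mole_fractions(residue_counts)
```

Each row of the result sums to one, and the pure systems appear as rows with a single non-zero entry.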

property electrolyte_basis: dict[str, ndarray]

Build electrolyte basis.

  • new_molecules: neutral molecules + salts.

  • new_N: counts in new basis.

  • new_x: mole fractions in new basis.

  • nu: stoichiometric matrix (residue x salts).

Returns None if no charges are provided.
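As a rough illustration of what an electrolyte basis involves, the sketch below pairs one cation and one anion into a neutral salt. It is not kbkit's implementation: the function names, the salt naming scheme, the restriction to a single salt, and the choice to count a salt as formula units are all assumptions made for this example.

```python
from math import gcd


def salt_stoichiometry(z_cation, z_anion):
    """Smallest integer (nu_cation, nu_anion) that gives a neutral salt."""
    g = gcd(z_cation, abs(z_anion))
    return abs(z_anion) // g, z_cation // g


def to_electrolyte_basis(residues, counts, charges):
    """Hypothetical helper: map residue counts onto neutral molecules + one salt."""
    cations = [r for r in residues if charges.get(r, 0) > 0]
    anions = [r for r in residues if charges.get(r, 0) < 0]
    neutrals = [r for r in residues if charges.get(r, 0) == 0]
    if len(cations) != 1 or len(anions) != 1:
        raise ValueError("this sketch handles exactly one cation and one anion")
    cat, an = cations[0], anions[0]
    nu_cat, nu_an = salt_stoichiometry(charges[cat], charges[an])
    n_cat = counts[residues.index(cat)]
    n_an = counts[residues.index(an)]
    if n_cat * nu_an != n_an * nu_cat:
        raise ValueError("ion counts are not charge neutral")
    # Stoichiometric matrix with one row per residue and one column per salt.
    nu = [[nu_cat if r == cat else nu_an if r == an else 0] for r in residues]
    new_molecules = neutrals + [f"{cat}-{an}"]
    # Assumption: the salt is counted in formula units.
    new_N = [counts[residues.index(r)] for r in neutrals] + [n_cat / nu_cat]
    total = sum(new_N)
    new_x = [n / total for n in new_N]
    return new_molecules, new_N, new_x, nu


# Illustrative system: 1000 waters, 10 Ca(2+), 20 Cl(-) (made-up numbers).
new_molecules, new_N, new_x, nu = to_electrolyte_basis(
    ["SOL", "CA", "CL"], [1000, 10, 20], {"CA": 2, "CL": -1}
)
```

Here the two ionic residues collapse into one salt component, so the basis shrinks from three residues to two molecules.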

property electrolyte_molecules: list[str]

List of molecule names for electrolyte basis (neutral molecules + salts).

property electrolyte_x: ndarray

Mole fractions for electrolyte basis.

property nu: ndarray

Stoichiometric matrix (residue basis x salts) if charges provided.

property molecules: list[str]

The global order of molecules used for vectorized properties.

Type:

list[str]

get_mol_index(mol: str) int[source]

Get index of molecule in molecules.

property n_i: int

Number of components present.

Type:

int

property n_sys: int

Number of compositions.

Type:

int

property x: ndarray

Array of shape (N_systems, N_molecules) containing mole fractions, ordered to match self.molecules.

Type:

np.ndarray

property units: dict[str, str]

Master dictionary mapping energy properties to their default units.

Type:

dict[str, str]

property pures: list[SystemMetadata]

Returns a list of Metadata objects for systems where is_pure() is True.

Type:

list[SystemMetadata]

property mixtures: list[SystemMetadata]

Returns a list of Metadata objects for systems where is_pure() is False.

Type:

list[SystemMetadata]

get_units(name: str) str[source]

Get default units for a given energy property.

Parameters:

name (str) – Name of property to get units of.

Returns:

Units of desired property.

Return type:

str

get(name: str, units: str | None = None, avg: bool = True, time_series: bool = False) ndarray | list[source]

Vectorized getter for system properties with unit support via Pint.

Parameters:
  • name (str) – The name of the property (e.g., ‘Density’, ‘Potential’).

  • units (str, optional) – The target unit string for Pint conversion.

  • avg (bool, default True) – If True, returns the mean value for each system. If False, returns the full time-series.

  • time_series (bool, optional) – Returns both times and values if True (default: False).

Returns:

Vectorized property of all systems in collection.

Return type:

np.ndarray | list

has_all_required_pures() bool[source]

Check that collection has required pure components for excess properties calculation.

simulated_property(name: str, units: str | None = None, avg: bool = True) ndarray[source]

Extract raw values directly from MD simulation (EDR files).

Returns:

Values as simulated in the MD engine.

Return type:

np.ndarray

pure_property(name: str, units: str | None = None, avg: bool = True) ndarray[source]

Extract pure component properties.

Parameters:
  • name (str) – Property name (e.g., ‘Density’, ‘Volume’).

  • units (str, optional) – Target units for conversion.

  • avg (bool, default True) – Return time-averaged values.

Returns:

Pure component property values with metadata.

Return type:

np.ndarray

ideal_property(name: str, mixing_rule: Literal['linear', 'volume_weighted'] = 'linear', units: str | None = None, avg: bool = True) ndarray[source]

Calculate ideal mixing property using specified mixing rule.

Linear mixing rule:

\[\bar{P} = \sum_i x_i P_i^{pure}\]

Volume-weighted mixing rule:

\[\bar{P} = \left(\sum_i \frac{x_i}{P_i^{pure}} \right)^{-1}\]
where:
  • \(x_i\) is the mole fraction of molecule \(i\)

  • \(P_i^{pure}\) is the pure component property

  • \(\bar{P}\) is the ideal property according to the mixing rule
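The two mixing rules can be sketched for a single composition as follows. This is a standalone illustration, not kbkit code: the function name ideal_mixing and the sample values are hypothetical, and the volume-weighted rule is assumed to be the harmonic form, i.e. the inverse of the mole-fraction-weighted sum of inverse pure values.

```python
def ideal_mixing(x, pure_values, mixing_rule="linear"):
    """Ideal property for one composition from pure component values."""
    if len(x) != len(pure_values):
        raise ValueError("x and pure_values must have the same length")
    if mixing_rule == "linear":
        # Mole-fraction-weighted average of the pure values.
        return sum(xi * pi for xi, pi in zip(x, pure_values))
    if mixing_rule == "volume_weighted":
        # Assumed harmonic form: inverse of the weighted sum of inverses.
        return 1.0 / sum(xi / pi for xi, pi in zip(x, pure_values))
    raise ValueError(f"unknown mixing rule: {mixing_rule!r}")


# Illustrative equimolar binary mixture with made-up pure densities.
x = [0.5, 0.5]
pure_density = [1000.0, 800.0]
linear = ideal_mixing(x, pure_density, "linear")
harmonic = ideal_mixing(x, pure_density, "volume_weighted")
```

Both rules reduce to the pure component value when one mole fraction equals one, and the harmonic result never exceeds the linear one for positive pure values.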

Parameters:
  • name (str) – Property name.

  • mixing_rule ({"linear", "volume_weighted"}, default "linear") – Mixing rule to apply.

  • units (str, optional) – Target units.

  • avg (bool, default True) – Use time-averaged values.

Returns:

Ideal property values for each mixture composition.

Return type:

np.ndarray

excess_property(name: str, mixing_rule: Literal['linear', 'volume_weighted'] = 'linear', units: str | None = None, avg: bool = True) ndarray[source]

Calculate excess property: Excess = Real - Ideal.

Parameters:
  • name (str) – Property name.

  • mixing_rule ({"linear", "volume_weighted"}, default "linear") – Mixing rule for ideal calculation.

  • units (str, optional) – Target units.

  • avg (bool, default True) – Use time-averaged values.

Returns:

Excess property values.

Return type:

np.ndarray

Notes

For a given property, \(P\), the excess property, \(P^{E}\), is calculated according to:

\[P^{E} = P - \bar{P}\]
where:
  • \(x_i\) is the mole fraction of molecule \(i\)

  • \(P\) is the property directly from simulation

  • \(\bar{P}\) is the ideal property according to the mixing rule
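A minimal sketch of the excess calculation, assuming the linear mixing rule for the ideal part. The function names and all numbers are hypothetical and only illustrate the arithmetic; they are not taken from kbkit.

```python
def linear_ideal(x_rows, pure_values):
    """Linear mixing rule applied to each composition (row of mole fractions)."""
    return [sum(xi * pi for xi, pi in zip(row, pure_values)) for row in x_rows]


def excess(simulated, ideal):
    """Excess property per composition: simulated value minus ideal value."""
    if len(simulated) != len(ideal):
        raise ValueError("simulated and ideal must have the same length")
    return [p - p_bar for p, p_bar in zip(simulated, ideal)]


# Made-up binary example: two pure end points and one equimolar mixture.
x_rows = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
pure_values = [1000.0, 800.0]
simulated = [1000.0, 905.5, 800.0]

ideal = linear_ideal(x_rows, pure_values)
p_excess = excess(simulated, ideal)
```

In this example the excess vanishes at the pure end points, because the ideal value there equals the pure component value.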

property results: dict[str, PropertyResult]

Dictionary of PropertyResult with mapped names and values.

Returns:

Mapped property result objects for properties.

Return type:

dict[str, PropertyResult]

timeseries_plotter(system: str, start_time: int = 0) TimeseriesPlotter[source]

Create a TimeseriesPlotter for visualizing time series data for a given system.

Parameters:
  • system (str) – System to use for visualizing timeseries.

  • start_time (int) – Initial time for plotting.

Returns:

Plotter instance for visualizing simulation energy properties.

Return type:

TimeseriesPlotter