discotime.datasets.utils module
- class discotime.datasets.utils.DataConfig(*, batch_size: int = 32, n_time_bins: int = 20, discretization_scheme: str = 'number', discretization_grid: list[float] | None = None, max_time: float | None = None)
Bases: object
Configuration class for data modules.
- class discotime.datasets.utils.LabelDiscretizer(scheme: str | None = None, n_bins: int | None = None, *, cut_points: Iterable[int | int64 | float | float64] | None = None, max_time: int | int64 | float | float64 | None = None)
Bases: object
Discretize continuous time/event pairs.
The class can either learn a discretization grid from the training data using one of the built-in discretization schemes, or the user can supply an iterable with cut points.
Implementation heavily inspired by pycox.preprocessing.label_transforms [1].
[1]: Kvamme, Håvard, Ørnulf Borgan, and Ida Scheel. “Time-to-event prediction with neural networks and Cox regression.” arXiv preprint arXiv:1907.00825 (2019).
- fit_transform(time: Iterable[int | int64 | float | float64], event: Iterable[int | int64]) → tuple[numpy.ndarray[Any, numpy.dtype[numpy.integer]], numpy.ndarray[Any, numpy.dtype[numpy.integer]]]
- transform(time: Iterable[int | int64 | float | float64], event: Iterable[int | int64]) → tuple[numpy.ndarray[Any, numpy.dtype[numpy.integer]], numpy.ndarray[Any, numpy.dtype[numpy.integer]]]
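As a rough illustration of the fit/transform split, the sketch below learns a grid from training times in `fit_transform` and reuses it unchanged in `transform`. It is not discotime's implementation: the class name `EquidistantDiscretizer`, the equidistant scheme, the right-closed bins, and the rule that times past the last cut point are censored in the final bin are all assumptions made for illustration. Plain lists stand in for NumPy arrays to keep the example dependency-free.

```python
from bisect import bisect_left


class EquidistantDiscretizer:
    """Hypothetical stand-in for a label discretizer (not discotime's code)."""

    def __init__(self, n_bins, max_time=None):
        if n_bins < 1:
            raise ValueError("n_bins must be at least 1")
        self.n_bins = n_bins
        self.max_time = max_time
        self.cut_points = []

    def fit_transform(self, time, event):
        times = list(time)
        # Learn the grid from the training data (or the user-given max_time).
        upper = self.max_time if self.max_time is not None else max(times)
        width = upper / self.n_bins
        # Right edges of the bins: width, 2 * width, ..., upper.
        self.cut_points = [width * (k + 1) for k in range(self.n_bins)]
        return self.transform(times, event)

    def transform(self, time, event):
        if not self.cut_points:
            raise RuntimeError("call fit_transform before transform")
        last = len(self.cut_points) - 1
        bins, events = [], []
        for t, e in zip(time, event):
            # Index of the first cut point >= t, i.e. bins are right-closed.
            idx = bisect_left(self.cut_points, t)
            if idx > last:
                # Assumption: times beyond the grid are censored in the last bin.
                idx, e = last, 0
            bins.append(idx)
            events.append(int(e))
        return bins, events
```

With this sketch, a grid fitted on the training split can be applied to a validation split by calling `transform` only, so both splits share the same bin edges.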
- class discotime.datasets.utils.LitSurvDataModule
Bases: LightningDataModule
- property config: DataConfig
- property lab_transformer: LabelTransformer
- class discotime.datasets.utils.SurvDataset(features: ArrayLike, event_time: ArrayLike, event_status: ArrayLike, discretizer: LabelTransformer)
Bases: Dataset
Assemble a survival dataset for discrete-time survival analysis.
A discrete-time survival dataset \(\mathfrak{D}\) is a set of \(n\) tuples \((t_{i}, \delta_{i}, \mathbf{x}_{i})\), where \(t_i = \min \{T_i, C_i\}\) is the observed time (the earlier of the event time \(T_i\) and the censoring time \(C_i\)), \(\delta_{i} \in \{0, \dots, m\}\) is the event indicator (with \(\delta_i = 0\) denoting censoring), and \(\mathbf{x}_{i} \in \mathbb{R}^d\) is a \(d\)-dimensional vector of time-independent predictors or covariates.
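To make the notation concrete, the sketch below builds observed tuples \((t_i, \delta_i, \mathbf{x}_i)\) from latent event and censoring times. The `Observation` type, the `observe` helper, and the convention that a tie between event and censoring time counts as an event are hypothetical; they only illustrate the definition and are not part of discotime.

```python
from typing import NamedTuple, Sequence, Tuple


class Observation(NamedTuple):
    time: float                  # t_i = min(T_i, C_i)
    status: int                  # delta_i: 0 = censored, 1..m = event type
    features: Tuple[float, ...]  # x_i, time-independent covariates


def observe(
    event_time: float,
    censor_time: float,
    event_type: int,
    features: Sequence[float],
) -> Observation:
    """Collapse latent times (T_i, C_i) into the observed tuple (t_i, delta_i, x_i)."""
    if event_type < 1:
        raise ValueError("event_type must be a positive integer")
    # Assumption: ties are treated as events rather than censoring.
    if event_time <= censor_time:
        return Observation(event_time, event_type, tuple(features))
    # The event was not seen before censoring, so the status is 0.
    return Observation(censor_time, 0, tuple(features))
```

For instance, `observe(7.0, 4.5, 1, [0.3])` yields an observation censored at time 4.5, because the event would only have occurred after follow-up ended.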
- Parameters:
features – time-independent features.
event_time – follow-up time (continuous).
event_status – event indicator (0=censored, 1/2/…=competing risks).
discretizer – object following the LabelTransformer protocol that converts continuous time/event tuples to their discretized versions. Typically this is LabelDiscretizer, unless a custom discretization object is used.
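A custom discretization object could look roughly like the sketch below. The exact definition of `LabelTransformer` is not shown on this page, so the protocol here is assumed to mirror the `fit_transform` and `transform` methods of `LabelDiscretizer`. `LabelTransformerSketch` and `FixedGridDiscretizer` are hypothetical names, plain lists stand in for NumPy arrays, and censoring times beyond the last cut point is an illustrative choice.

```python
from bisect import bisect_left
from typing import Iterable, List, Protocol, Sequence, Tuple, runtime_checkable

Labels = Tuple[List[int], List[int]]


@runtime_checkable
class LabelTransformerSketch(Protocol):
    """Assumed shape of the protocol; the real definition may differ."""

    def fit_transform(self, time: Iterable[float], event: Iterable[int]) -> Labels: ...

    def transform(self, time: Iterable[float], event: Iterable[int]) -> Labels: ...


class FixedGridDiscretizer:
    """Hypothetical custom discretizer using user-supplied cut points."""

    def __init__(self, cut_points: Sequence[float]) -> None:
        if not cut_points:
            raise ValueError("at least one cut point is required")
        self.cut_points = sorted(cut_points)

    def fit_transform(self, time: Iterable[float], event: Iterable[int]) -> Labels:
        # Nothing to learn: the grid is fixed at construction time.
        return self.transform(time, event)

    def transform(self, time: Iterable[float], event: Iterable[int]) -> Labels:
        last = len(self.cut_points) - 1
        bins: List[int] = []
        events: List[int] = []
        for t, e in zip(time, event):
            idx = bisect_left(self.cut_points, t)
            if idx > last:
                # Assumption: times beyond the grid are censored in the last bin.
                idx, e = last, 0
            bins.append(idx)
            events.append(int(e))
        return bins, events
```

Because the protocol is structural, `FixedGridDiscretizer` does not need to inherit from anything; it only has to provide the two methods.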