sc_dataset module

class sc_dataset.SCDataset(path: str | bytes | PathLike)

Bases: Dataset

__init__(path: str | bytes | PathLike) → None

Create a dataset from preprocessed data stored in an h5ad file. Use the preprocessing/preprocess.py script to create the train, test, and validation h5ad files.

Parameters:

path (Union[str, bytes, os.PathLike]) – Path to the h5ad file.

__getitem__(index: int) → Tuple[Tensor, Tensor]

Parameters:

index (int) – Index of the sample (cell) to retrieve.

Returns:

Tuple of the gene expression tensor and the cluster label tensor for the requested cell.

Return type:

Tuple[torch.Tensor, torch.Tensor]

__len__() → int
Returns:

Number of samples (cells).

Return type:

int
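
The two methods above follow the usual map-style dataset protocol: `__len__` reports the number of cells and `__getitem__` returns one (gene expression, cluster label) pair. A minimal sketch of that contract is shown below. It is not the module's implementation: `ToyCellDataset` is a hypothetical stand-in that holds plain Python lists rather than tensors read from an h5ad file, so it runs without any dependencies.

```python
from typing import List, Sequence, Tuple


class ToyCellDataset:
    """Hypothetical stand-in for SCDataset, using lists instead of tensors."""

    def __init__(self, expression: Sequence[Sequence[float]], labels: Sequence[int]) -> None:
        if len(expression) != len(labels):
            raise ValueError("expression and labels must have one entry per cell")
        self._expression = [list(row) for row in expression]
        self._labels = list(labels)

    def __getitem__(self, index: int) -> Tuple[List[float], int]:
        # One sample is a (gene expression, cluster label) pair for one cell.
        return self._expression[index], self._labels[index]

    def __len__(self) -> int:
        # Number of samples (cells).
        return len(self._labels)


# Two cells, three genes each, with cluster labels 0 and 1.
toy = ToyCellDataset([[0.0, 1.5, 3.0], [2.0, 0.0, 0.5]], [0, 1])
n_cells = len(toy)
expression, label = toy[1]
```

With the real class, the same indexing and `len()` calls would presumably apply, with both tuple elements being `torch.Tensor` objects as documented above.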

sc_dataset.get_loader(file_path: str | bytes | PathLike, batch_size: int | None = None) → DataLoader

Provides an iterable DataLoader over a scRNA-seq dataset read from the given h5ad file.

Parameters:
  • file_path (Union[str, bytes, os.PathLike]) – Path to the h5ad file.

  • batch_size (Optional[int]) – Training batch size. If not specified, each iteration returns the entire dataset as a single batch.

Returns:

Iterable data loader over the dataset.

Return type:

DataLoader
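
The `batch_size` behaviour described above can be illustrated with a small plain-Python sketch. It only mirrors the documented semantics (a missing `batch_size` means one batch containing every sample) and is not how `get_loader` is implemented; `iter_batches` is a hypothetical helper, and the real function returns a `DataLoader` rather than a generator of lists.

```python
from typing import Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(samples: Sequence[T], batch_size: Optional[int] = None) -> Iterator[List[T]]:
    """Hypothetical helper: yield consecutive batches of samples."""
    if batch_size is not None and batch_size < 1:
        raise ValueError("batch_size must be a positive integer or None")
    if len(samples) == 0:
        return
    # With no batch size given, the whole dataset forms a single batch.
    size = len(samples) if batch_size is None else batch_size
    for start in range(0, len(samples), size):
        yield list(samples[start:start + size])


cells = list(range(5))
small_batches = list(iter_batches(cells, batch_size=2))
full_batch = list(iter_batches(cells))
```

Here `small_batches` holds three batches (the last one shorter than the others) and `full_batch` holds a single batch with all five samples.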