Release Notes

Version: v1.2.0

Release Date: April 10, 2025

Changelog

This version refactors the project structure and the cell analysis task pipelines to improve modularity, maintainability, and usability. Major updates include:

1. Data Processing Module Refactor

  • Added dedicated data processing scripts for each model.

  • Introduced a unified data handling base class DataHandler, which standardizes the data workflow:

    • read_h5ad(): Reads .h5ad files and performs preprocessing.

    • process(): Placeholder for model-specific data processing logic.

    • make_dataset(): Converts an AnnData object into a PyTorch-compatible Dataset.

    • make_dataloader(): Builds a PyTorch DataLoader with distributed training support.

  • Model-specific data handlers are organized under the dataset/ directory and inherit from DataHandler.
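
The workflow above can be sketched as a small base class with one model-specific subclass. This is only an illustration of the pattern, not the project's actual code: the method signatures, the ToyModelHandler subclass, and the in-memory records are assumptions, and plain Python lists stand in for the AnnData object, the PyTorch Dataset, and the DataLoader so the sketch runs without extra dependencies.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, List, Sequence


class DataHandler(ABC):
    """Hypothetical base class mirroring the four steps listed above."""

    def read_h5ad(self, path: str) -> Any:
        # In the project this step reads an .h5ad file and preprocesses it.
        # The sketch leaves the actual reading to subclasses.
        raise NotImplementedError

    @abstractmethod
    def process(self, adata: Any) -> Any:
        """Model-specific processing, supplied by each subclass."""

    def make_dataset(self, adata: Any) -> Sequence[Any]:
        # Stand-in for building a PyTorch Dataset: any indexable sequence.
        return list(adata)

    def make_dataloader(self, dataset: Sequence[Any],
                        batch_size: int = 2) -> Iterator[List[Any]]:
        # Stand-in for a PyTorch DataLoader: yields fixed-size batches in order.
        for start in range(0, len(dataset), batch_size):
            yield list(dataset[start:start + batch_size])


class ToyModelHandler(DataHandler):
    """Hypothetical model-specific handler, as one under dataset/ might look."""

    def read_h5ad(self, path: str) -> Any:
        # Fake records instead of a real file, to keep the sketch runnable.
        return [{"cell": f"c{i}", "counts": [i, i + 1]} for i in range(5)]

    def process(self, adata: Any) -> Any:
        # Toy "model-specific" step: reduce each cell to its total count.
        return [sum(record["counts"]) for record in adata]


handler = ToyModelHandler()
adata = handler.read_h5ad("example.h5ad")
dataset = handler.make_dataset(handler.process(adata))
batches = list(handler.make_dataloader(dataset, batch_size=2))
print(batches)
```

Because only process() is abstract, each model's handler overrides that one step and inherits the rest of the workflow from the base class.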

2. Loader Class Standardization

  • Unified structure, naming conventions, and output format across all model loader classes.

  • Narrowed the responsibility of loaders to model loading and embedding extraction only.

  • Moved all data processing logic from loaders to the corresponding DataHandler classes.

  • Defined a common interface in the base loader class to enforce consistent implementation across all subclasses.
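
One way a base class can enforce a common interface is with Python's abc module, sketched below. The names BaseLoader, load_model, get_embedding, MeanLoader, and IncompleteLoader are hypothetical and chosen for illustration; the release notes do not specify the real method names.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BaseLoader(ABC):
    """Hypothetical base loader: model loading and embedding extraction only."""

    @abstractmethod
    def load_model(self) -> Any:
        """Load the model and return it."""

    @abstractmethod
    def get_embedding(self, batch: List[List[float]]) -> List[List[float]]:
        """Return one embedding vector per input row."""


class MeanLoader(BaseLoader):
    """Toy loader whose 'model' embeds each row as its mean value."""

    def load_model(self) -> Any:
        self.model = lambda row: [sum(row) / len(row)]
        return self.model

    def get_embedding(self, batch: List[List[float]]) -> List[List[float]]:
        return [self.model(row) for row in batch]


class IncompleteLoader(BaseLoader):
    """Omits get_embedding, so it cannot be instantiated."""

    def load_model(self) -> Any:
        return None


loader = MeanLoader()
loader.load_model()
embeddings = loader.get_embedding([[1.0, 3.0], [2.0, 4.0]])

try:
    IncompleteLoader()
    rejected = False
except TypeError:
    # abc refuses to instantiate a subclass with missing abstract methods.
    rejected = True

print(embeddings, rejected)
```

With this approach a subclass that forgets part of the interface fails at instantiation time rather than later, in the middle of a task run.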

3. Task Module Refactor

  • Cell Annotation Task: Improved task execution logic and introduced a unified script interface to run different models consistently.

  • Cell Embedding Task: Added new analysis scripts for evaluating cell embeddings, with support for multiple evaluation metrics.

  • Gene Regulatory Task: Added dedicated analysis scripts and refactored the gene regulatory network evaluation logic for better performance and clarity.

  • Drug Sensitivity Task: Improved task execution logic and introduced a unified script interface to run different models consistently.
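
A unified script interface of the kind described for the Cell Annotation and Drug Sensitivity tasks could look roughly like the sketch below, where a single entry point selects the model by name. The --model and --data flags, the registry, and the model_a / model_b runners are assumptions made for illustration, not the project's actual command line.

```python
import argparse
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping a model name to the function that runs it.
MODEL_RUNNERS: Dict[str, Callable[[str], str]] = {}


def register(name: str) -> Callable:
    """Decorator that adds a runner function to the registry under `name`."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        MODEL_RUNNERS[name] = fn
        return fn
    return decorator


@register("model_a")
def run_model_a(data_path: str) -> str:
    # Placeholder for loading model_a and running the task on data_path.
    return f"model_a annotated {data_path}"


@register("model_b")
def run_model_b(data_path: str) -> str:
    # Placeholder for loading model_b and running the task on data_path.
    return f"model_b annotated {data_path}"


def main(argv: Optional[List[str]] = None) -> str:
    parser = argparse.ArgumentParser(description="Run a task with a chosen model.")
    parser.add_argument("--model", required=True, choices=sorted(MODEL_RUNNERS))
    parser.add_argument("--data", required=True, help="Path to the input .h5ad file")
    args = parser.parse_args(argv)
    # Every model is invoked the same way, regardless of which one is chosen.
    return MODEL_RUNNERS[args.model](args.data)


result = main(["--model", "model_a", "--data", "cells.h5ad"])
print(result)
```

Adding a model under this design means registering one more runner; the command-line interface itself does not change.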