In Crunch 3, your task is to design a gene panel that best distinguishes dysplasia regions from noncancerous mucosa regions in colon tissue affected by Inflammatory Bowel Disease (IBD). Using provided H&E images annotated by pathologists and single-cell RNA sequencing (scRNA-Seq) data, you will rank 18,615 protein-coding genes based on their ability to discriminate between these disease states.

If you participated in Crunch 1 or Crunch 2, you may leverage your previously developed models to make gene expression predictions on the annotated regions and design your gene panel based on these predictions. If not, you can design your gene panel from scratch using biological insights or other approaches.

Additionally, you are required to:

- Provide a justification for how you constructed your gene panel.
- Evaluate three submissions from other participants based on their justifications.

X (Input Data)

H&E Images and Annotations

First H&E Image: Includes only noncancerous mucosa (already provided in Crunch 1 and Crunch 2).
Second H&E Image: Entire colon tissue section including both dysplasia and noncancerous mucosa regions (UC9_I-crunch3-HE.tif).
Associated Files:
- Nucleus Segmentation Masks.
- Tissue Region Masks with Annotations (UC9_I-crunch3-HE-dysplasia-ROI.tif):
  - 0: Other tissue regions.
  - 1: Noncancerous mucosa.
  - 2: Dysplasia.
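
As a minimal sketch of how the label codes above might be used, the snippet below counts pixels per region and looks up the region code under a set of points such as nucleus centroids. It runs on a small toy array rather than the real file. The helper names `region_pixel_counts` and `label_points` are hypothetical, and the sketch assumes the mask has already been read into a 2-D integer array (for example with a TIFF reader) and that point coordinates are given in the mask's own pixel grid.

```python
import numpy as np

# Label codes as documented for the tissue region mask.
REGION_NAMES = {0: "other", 1: "noncancerous_mucosa", 2: "dysplasia"}


def region_pixel_counts(roi_mask):
    """Count pixels carrying each documented label in a 2-D integer mask."""
    roi_mask = np.asarray(roi_mask)
    return {name: int((roi_mask == code).sum()) for code, name in REGION_NAMES.items()}


def label_points(roi_mask, rows, cols):
    """Return the region code under each (row, col) point.

    Assumes the points are expressed in the same pixel grid as the mask.
    """
    roi_mask = np.asarray(roi_mask)
    rows = np.asarray(rows, dtype=int)
    cols = np.asarray(cols, dtype=int)
    return roi_mask[rows, cols]


# Toy stand-in for the real mask (hypothetical values, not challenge data).
toy_mask = np.array(
    [
        [0, 0, 1, 1],
        [0, 1, 1, 2],
        [2, 2, 2, 2],
    ]
)

counts = region_pixel_counts(toy_mask)
codes = label_points(toy_mask, rows=[0, 1, 2], cols=[0, 2, 3])
print(counts)
print(codes)
```

Points that fall on code 0 would typically be dropped before comparing dysplasia against noncancerous mucosa, since only codes 1 and 2 correspond to the two classes of interest.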

Single-cell RNA-Seq Data

Dataset: Crunch3_scRNAseq.h5ad.
Content: Gene expression data for 18,615 protein-coding genes from colon tissue samples with and without dysplasia.
Cell Metadata (adata.obs):
- Cell Type: adata.obs["annotation"].
- Individual: adata.obs["individual"].
- Disease Status: adata.obs["status"] (Normal, Unaffected tissue, Polyp, Adenocarcinoma).
- Dysplasia Status: adata.obs["dysplasia"] (y, n, or ND).
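
One possible way to use the dysplasia field is to build boolean masks for the two groups and leave out cells whose status is neither `y` nor `n`. The sketch below works on a plain list of labels so that it runs without the dataset. The function name `dysplasia_groups` is hypothetical, and treating `n` cells as the comparison group for noncancerous mucosa is an assumption, not something the task description states.

```python
import numpy as np


def dysplasia_groups(dysplasia_labels):
    """Return boolean masks (is_dysplasia, is_non_dysplasia) from y / n / ND labels.

    Cells labelled anything other than "y" or "n" (such as "ND") are False in both masks.
    """
    labels = np.asarray(dysplasia_labels).astype(str)
    is_dysplasia = labels == "y"
    is_non_dysplasia = labels == "n"
    return is_dysplasia, is_non_dysplasia


# Toy labels (hypothetical). With the real data, the labels could be taken
# from adata.obs["dysplasia"].to_numpy() after loading the .h5ad file.
toy_labels = ["y", "n", "ND", "n", "y", "ND"]
is_dys, is_non = dysplasia_groups(toy_labels)
print(is_dys.sum(), is_non.sum())
```

The same masks can then be used to subset cells before computing per-gene statistics.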

Expression Data

Normalized Counts: adata.X (log1p-normalized).
Raw Counts: adata.layers["counts"].

Y (Targets)

Gene Ranking

Rank all 18,615 protein-coding genes from 1 (best discriminator) to 18,615 (worst), based on their ability to distinguish between dysplasia and noncancerous mucosa regions.
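
To illustrate the expected output shape, here is a minimal sketch that turns per-gene scores into a complete ranking from 1 to N with no ties. It scores genes by the absolute Welch t-statistic between the two groups, which is only one simple choice made for this example and not a method prescribed by the challenge. The function name `rank_genes`, the toy matrix, and the gene names are all hypothetical, and the sketch assumes a dense cells-by-genes array of log1p-normalized values.

```python
import numpy as np


def rank_genes(expr, is_dysplasia, gene_names):
    """Rank genes from 1 (best) to N (worst) by absolute Welch t-statistic.

    expr: dense array of shape (cells, genes), assumed log1p-normalized.
    is_dysplasia: boolean array of shape (cells,); False marks the comparison group.
    """
    expr = np.asarray(expr, dtype=float)
    is_dysplasia = np.asarray(is_dysplasia, dtype=bool)
    group_a = expr[is_dysplasia]
    group_b = expr[~is_dysplasia]

    diff = group_a.mean(axis=0) - group_b.mean(axis=0)
    std_err = np.sqrt(
        group_a.var(axis=0, ddof=1) / len(group_a)
        + group_b.var(axis=0, ddof=1) / len(group_b)
    )
    # Genes with zero variance in both groups get a score of 0.
    t_stat = np.divide(diff, std_err, out=np.zeros_like(diff), where=std_err > 0)
    score = np.abs(t_stat)

    # Stable sort on descending score; earlier genes win ties, so every rank is unique.
    order = np.argsort(-score, kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return dict(zip(gene_names, ranks.tolist()))


# Toy data: 4 cells x 3 genes (hypothetical values, not challenge data).
toy_expr = np.array(
    [
        [5.0, 1.0, 2.0],
        [6.0, 1.2, 2.0],
        [1.0, 1.1, 2.0],
        [0.0, 0.9, 2.0],
    ]
)
toy_is_dysplasia = np.array([True, True, False, False])
ranking = rank_genes(toy_expr, toy_is_dysplasia, ["GENE_A", "GENE_B", "GENE_C"])
print(ranking)
```

On the real dataset the expression matrix may be stored in a sparse format, in which case the means and variances would need to be computed without densifying all 18,615 genes at once.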

Including genes associated with different biological functions can enhance your gene panel and will be considered in the

Submit Your First Model