---
library_name: peft
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
base_model: google/medgemma-4b-it
tags:
  - medgemma
  - lora
  - medical-ai
  - retinal-oct
  - ophthalmology
  - hai-def
  - macular-degeneration
datasets:
  - zacharielegault/Kermany2017-OCT
language:
  - en
pipeline_tag: image-text-to-text
---

# MedGemma Retinal OCT LoRA

**LoRA adapter for MedGemma 4B, fine-tuned on the Kermany 2018 OCT dataset for retinal disease classification.**

Classifies Optical Coherence Tomography (OCT) images into one of four categories: CNV, DME, Drusen, or Normal retina.

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it) |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Task** | Multi-class retinal disease classification (4 classes) |
| **Modality** | Optical Coherence Tomography (OCT) |
| **Framework** | PyTorch + HuggingFace Transformers + PEFT |

## Training Dataset

**[Kermany 2018 Retinal OCT](https://huggingface.co/datasets/zacharielegault/Kermany2017-OCT)** — 84K retinal OCT images across 4 classes.

Reference: Kermany et al. (2018), "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning," *Cell* 172(5), 1122-1131.

- **Train samples:** 10,000 (curated subset from 84K)
- **Validation samples:** 1,000

### Classes

| Label | Description |
|-------|-------------|
| CNV | Choroidal Neovascularization — abnormal blood vessel growth beneath the retina. Hallmark of wet AMD requiring anti-VEGF treatment. |
| DME | Diabetic Macular Edema — fluid accumulation in the macula from leaking retinal vessels. Shows retinal thickening and cystoid spaces. |
| DRUSEN | Drusen — yellow deposits beneath the RPE. Hallmark of dry age-related macular degeneration. |
| NORMAL | Normal retina — well-defined retinal layers, intact foveal contour, no fluid or pathology. |

## Training Configuration

### LoRA Parameters

| Parameter | Value |
|-----------|-------|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |
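
The table entries correspond to keyword arguments of PEFT's `LoraConfig`. The following is a minimal sketch, assuming the standard LoRA formulation in which the low-rank update is scaled by `alpha / r` and each adapted linear layer gains `r * (d_in + d_out)` parameters. The helper `lora_param_count` and the 2560 x 2560 layer shape are hypothetical and used only for illustration; they are not read from the model.

```python
# Keyword arguments mirroring the table above; in an environment with PEFT
# installed they could be passed as peft.LoraConfig(**lora_kwargs).
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "task_type": "CAUSAL_LM",
}

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical 2560 x 2560 projection, chosen only for illustration.
example_params = lora_param_count(2560, 2560, lora_kwargs["r"])
print(f"scaling = {scaling}, params for example layer = {example_params:,}")
```

With rank 16 and alpha 32, the adapter update is applied with a scaling factor of 2.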

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |
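
To make the schedule concrete, the sketch below works out the number of optimizer steps and warmup steps implied by these settings, and the learning rate at a given step. It assumes a single GPU, the 10,000 training samples listed under Training Dataset, and the usual linear-warmup, linear-decay rule; the function name `linear_lr` is illustrative and not part of any library.

```python
import math

train_samples = 10_000   # from the Training Dataset section
per_device_batch = 1
grad_accum_steps = 8
num_gpus = 1             # assumption: a single L4 GPU
epochs = 1
peak_lr = 2e-4
warmup_ratio = 0.03

effective_batch = per_device_batch * grad_accum_steps * num_gpus
total_steps = math.ceil(train_samples / effective_batch) * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(effective_batch, total_steps, warmup_steps)  # 8 1250 38
print(linear_lr(0), linear_lr(warmup_steps), linear_lr(total_steps))
```

Under these assumptions, one epoch is 1,250 optimizer steps, of which the first 38 are warmup.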

### Infrastructure

| Property | Value |
|----------|-------|
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | [Modal](https://modal.com) serverless GPU |
| Training Time | ~60-90 minutes |

## Prompt Format

**Input:**
> Analyze this retinal OCT scan and classify the finding.

**Output:**
> This retinal OCT scan shows **Diabetic Macular Edema**.
>
> Diabetic Macular Edema (DME). Fluid accumulation in the macula due to leaking retinal blood vessels in diabetic retinopathy. OCT shows retinal thickening and intraretinal cystoid spaces.
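
Because the response is free text, downstream code usually needs to map it back to one of the four class labels. The sketch below is one way to do that, assuming the disease name appears in the first bold span as in the example above. Only the DME wording is shown in this card; the other names in `LABEL_NAMES` are assumptions based on the class table, and `parse_label` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed mapping from class labels to the disease names used in responses.
# Only the DME wording is confirmed by the example output; the rest are guesses.
LABEL_NAMES = {
    "CNV": "Choroidal Neovascularization",
    "DME": "Diabetic Macular Edema",
    "DRUSEN": "Drusen",
    "NORMAL": "Normal",
}


def parse_label(response: str) -> Optional[str]:
    """Return the class label named in the first bold span, or None if unmatched."""
    match = re.search(r"\*\*(.+?)\*\*", response)
    # Fall back to the whole response when no bold span is present.
    candidate = (match.group(1) if match else response).lower()
    for label, name in LABEL_NAMES.items():
        # Word boundaries keep "abnormal" from matching "normal".
        if re.search(rf"\b{re.escape(name.lower())}\b", candidate):
            return label
    return None


example = "This retinal OCT scan shows **Diabetic Macular Edema**."
print(parse_label(example))  # DME
```

Returning `None` for unmatched text lets the caller flag responses that do not follow the expected format instead of silently assigning a class.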

## Usage

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-retinal-oct-lora"

# Load the base model in bfloat16, then attach the LoRA adapter.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

image = Image.open("retinal_oct.jpg").convert("RGB")
messages = [
    {"role": "user", "content": [
        # Pass the PIL image inside the message so the chat template can load it.
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this retinal OCT scan and classify the finding."}
    ]}
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

# Decode only the newly generated tokens, not the echoed prompt.
input_len = inputs["input_ids"].shape[-1]
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```

## Intended Use

This adapter is part of the **MedVision AI** platform built for the [MedGemma Impact Challenge](https://www.kaggle.com/competitions/med-gemma-impact-challenge). It is designed for:

- **Medical education**: Helping students learn OCT interpretation and retinal pathology recognition
- **Clinical decision support research**: Prototyping tools that could assist ophthalmologists with retinal disease screening, pending clinical validation
- **Research**: Exploring fine-tuned medical VLMs for ophthalmic imaging

## Limitations

- **Not for clinical diagnosis.** This model is for educational and research purposes only.
- **Limited pathologies:** Only 4 categories. Many retinal conditions (glaucoma, retinal detachment, vein occlusion) are not covered.
- **Curated subset:** Trained on 10K of 84K available images for training efficiency.
- **Single epoch:** Trained for 1 epoch; further training may improve performance.

## Citation

```bibtex
@article{kermany2018identifying,
  title={Identifying medical diagnoses and treatable diseases by image-based deep learning},
  author={Kermany, Daniel S and Goldbaum, Michael and Cai, Wenjia and others},
  journal={Cell},
  volume={172},
  number={5},
  pages={1122--1131},
  year={2018},
  publisher={Elsevier}
}
```

## Disclaimer

This model is for **educational and research purposes only**. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.