---
library_name: peft
license: other
license_name: health-ai-developer-foundations
license_link: https://developers.google.com/health-ai-developer-foundations/terms
base_model: google/medgemma-4b-it
tags:
- medgemma
- lora
- medical-ai
- retinal-oct
- ophthalmology
- hai-def
- macular-degeneration
datasets:
- zacharielegault/Kermany2017-OCT
language:
- en
pipeline_tag: image-text-to-text
---

# MedGemma Retinal OCT LoRA

**Retinal disease classification adapter fine-tuned on the Kermany 2018 OCT dataset using MedGemma 4B.**

Classifies Optical Coherence Tomography (OCT) images into 4 categories: CNV, DME, Drusen, or Normal retina.

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it) |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Task** | Multi-class retinal disease classification (4 classes) |
| **Modality** | Optical Coherence Tomography (OCT) |
| **Framework** | PyTorch + HuggingFace Transformers + PEFT |

## Training Dataset

**[Kermany 2018 Retinal OCT](https://huggingface.co/datasets/zacharielegault/Kermany2017-OCT)**: 84K retinal OCT images across 4 classes.

Reference: Kermany et al. 2018, *Cell*, "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning"

- **Train samples:** 10,000 (curated subset from 84K)
- **Validation samples:** 1,000

### Class Distribution

| Label | Description |
|-------|-------------|
| CNV | Choroidal Neovascularization: abnormal blood vessel growth beneath the retina. Hallmark of wet AMD requiring anti-VEGF treatment. |
| DME | Diabetic Macular Edema: fluid accumulation in the macula from leaking retinal vessels. Shows retinal thickening and cystoid spaces. |
| DRUSEN | Drusen: yellow deposits beneath the RPE. Hallmark of dry age-related macular degeneration. |
| NORMAL | Normal retina: well-defined retinal layers, intact foveal contour, no fluid or pathology. |

## Training Configuration

### LoRA Parameters

| Parameter | Value |
|-----------|-------|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |

### Infrastructure

| Property | Value |
|----------|-------|
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | [Modal](https://modal.com) serverless GPU |
| Training Time | ~60-90 minutes |

## Prompt Format

**Input:**

> Analyze this retinal OCT scan and classify the finding.

**Output:**

> This retinal OCT scan shows **Diabetic Macular Edema**.
>
> Diabetic Macular Edema (DME). Fluid accumulation in the macula due to leaking retinal blood vessels in diabetic retinopathy. OCT shows retinal thickening and intraretinal cystoid spaces.
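
Because the adapter answers in free text rather than with a bare label, downstream code usually needs to map the response back to one of the four dataset labels. The following is a minimal sketch of one way to do that. It assumes the first line of every response names the finding, as in the DME example above. `NAME_TO_LABEL` and `extract_label` are hypothetical helpers, and the phrasings for CNV, Drusen, and Normal are assumptions, since this card only shows the DME output.

```python
import re
from typing import Optional

# Assumed mapping from the disease name in the response to the dataset label.
# Only the DME phrasing is shown in this card; the other three are guesses.
NAME_TO_LABEL = {
    "choroidal neovascularization": "CNV",
    "diabetic macular edema": "DME",
    "drusen": "DRUSEN",
    "normal": "NORMAL",
}


def extract_label(response: str) -> Optional[str]:
    """Return the label named on the first line of the response, or None."""
    stripped = response.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0].lower()
    for name, label in NAME_TO_LABEL.items():
        # Word boundaries keep "abnormal" from matching "normal".
        if re.search(r"\b" + re.escape(name) + r"\b", first_line):
            return label
    return None


example = (
    "This retinal OCT scan shows **Diabetic Macular Edema**.\n\n"
    "Diabetic Macular Edema (DME). Fluid accumulation in the macula."
)
print(extract_label(example))
```

Returning `None` for an unrecognized response lets the caller decide how to treat answers that do not follow the expected format, rather than forcing them into a class.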
## Usage

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-retinal-oct-lora"

# Load the base model first, then attach the LoRA adapter on top of it.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

image = Image.open("retinal_oct.jpg").convert("RGB")

# Pass the image inside the message so the chat template can load and place it.
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this retinal OCT scan and classify the finding."}
    ]}
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

# Decode only the newly generated tokens, not the echoed prompt.
input_len = inputs["input_ids"].shape[-1]
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```

## Intended Use

This adapter is part of the **MedVision AI** platform built for the [MedGemma Impact Challenge](https://www.kaggle.com/competitions/med-gemma-impact-challenge). It is designed for:

- **Medical education**: Helping students learn OCT interpretation and retinal pathology recognition
- **Clinical decision support**: Assisting ophthalmologists with retinal disease screening
- **Research**: Exploring fine-tuned medical VLMs for ophthalmic imaging

## Limitations

- **Not for clinical diagnosis.** This model is for educational and research purposes only.
- **Limited pathologies:** Only 4 categories. Many retinal conditions (glaucoma, retinal detachment, vein occlusion) are not covered.
- **Curated subset:** Trained on 10K of 84K available images for training efficiency.
- **Single epoch:** Trained for 1 epoch; further training may improve performance.
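
To put the subset and single-epoch limitations in concrete terms, the short calculation below derives the training budget implied by the numbers in this card. It is a back-of-the-envelope sketch, not the training script. `num_devices = 1` is an assumption based on the single L4 listed under Infrastructure, and rounding the warmup step count up is an assumption about how the trainer treats fractional steps.

```python
import math

# Values taken from the Training Dataset and Hyperparameters sections.
train_samples = 10_000
full_dataset = 84_000
per_device_batch = 1
grad_accum_steps = 8
epochs = 1
warmup_ratio = 0.03
num_devices = 1  # assumption: a single L4 GPU

# Samples consumed per optimizer update.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Optimizer updates over the whole run.
optimizer_steps = math.ceil(train_samples / effective_batch) * epochs

# Linear warmup length; rounding up is an assumption.
warmup_steps = math.ceil(optimizer_steps * warmup_ratio)

# Share of parameters trained and share of the dataset used.
trainable_fraction = 1.38 / 5.68
subset_fraction = train_samples / full_dataset

print(f"effective batch:   {effective_batch}")
print(f"optimizer steps:   {optimizer_steps}")
print(f"warmup steps:      {warmup_steps}")
print(f"trainable params:  {trainable_fraction:.1%}")
print(f"dataset used:      {subset_fraction:.1%}")
```

Under these assumptions the run performs 1,250 optimizer updates, about 38 of them in warmup, and sees roughly 12% of the available images once each.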
## Citation

```bibtex
@article{kermany2018identifying,
  title={Identifying medical diagnoses and treatable diseases by image-based deep learning},
  author={Kermany, Daniel S and Goldbaum, Michael and Cai, Wenjia and others},
  journal={Cell},
  volume={172},
  number={5},
  pages={1122--1131},
  year={2018},
  publisher={Elsevier}
}
```

## Disclaimer

This model is for **educational and research purposes only**. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.