Fine-tuned BLIP on Chest X-rays (Indiana University)

This repository contains a fine-tuned BLIP (Bootstrapped Language-Image Pretraining) model trained on the Chest X-rays (Indiana University) dataset.
The model is adapted for vision–language tasks in the medical imaging domain, particularly chest X-ray understanding.


🧠 Model Description

  • Base model: BLIP (Bootstrapped Language-Image Pretraining)
  • Fine-tuning domain: Medical imaging
  • Modality: Vision–Language (Image + Text)
  • Target data: Chest X-ray images (frontal & lateral views)

The goal of fine-tuning is to adapt BLIP to better capture radiological visual patterns and associated semantic information from chest X-ray images.
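The fine-tuned checkpoint can be loaded like any BLIP captioning model via the Hugging Face `transformers` library. A minimal sketch follows; the repo id below is a placeholder (substitute this repository's id), and the gray demo image stands in for a real chest X-ray PNG:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Placeholder repo id -- replace with this repository's model id
model_id = "Salesforce/blip-image-captioning-base"

processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

# Synthetic stand-in image; in practice, open a preprocessed chest X-ray PNG
image = Image.new("RGB", (384, 384), "gray")

inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```

The processor handles resizing and normalization to the resolution BLIP expects, so raw PNGs of any size can be passed in directly.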


📊 Dataset Information

The model is fine-tuned using the Chest X-rays (Indiana University) dataset.

Dataset Source

Image Preprocessing Pipeline

Original images were provided in raw DICOM format. Each image was converted to PNG with the following preprocessing steps:

  1. Outlier clipping

    • The top and bottom 0.5% of DICOM pixel values were clipped
    • Purpose: eliminate extremely dark or bright pixel outliers
  2. Intensity normalization

    • DICOM pixel values were linearly scaled to the 0–255 range
  3. Resizing

    • Images were resized so that the shorter side is 2048 pixels
    • This was done to comply with Kaggle dataset size limits
  4. View classification

    • Each image was manually classified into:
      • Frontal chest X-ray
      • Lateral chest X-ray
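
Steps 1–3 above (the manual view labeling aside) can be sketched as follows. This is a hedged reconstruction, not the exact script used; in practice the input array would come from `pydicom`'s `dataset.pixel_array`, and here a synthetic 16-bit array stands in for it:

```python
import numpy as np
from PIL import Image

def preprocess_xray(pixels: np.ndarray, clip_pct: float = 0.5,
                    short_side: int = 2048) -> Image.Image:
    """Convert a raw X-ray pixel array to an 8-bit, PNG-ready image.

    Mirrors the pipeline described above:
      1. clip the top/bottom `clip_pct`% of pixel values,
      2. linearly rescale to the 0-255 range,
      3. resize so the shorter side equals `short_side`.
    """
    # 1. Outlier clipping at the 0.5th and 99.5th percentiles
    lo, hi = np.percentile(pixels, [clip_pct, 100.0 - clip_pct])
    clipped = np.clip(pixels, lo, hi)

    # 2. Linear intensity scaling to the full 8-bit range
    scaled = (clipped - lo) / max(hi - lo, 1e-8) * 255.0
    img = Image.fromarray(scaled.astype(np.uint8))

    # 3. Resize so the SHORTER side is `short_side`, preserving aspect ratio
    w, h = img.size
    scale = short_side / min(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)

# Demo with a synthetic array in place of dicom.pixel_array;
# a small short side keeps the example fast
rng = np.random.default_rng(0)
raw = rng.integers(0, 4096, size=(512, 640), dtype=np.uint16)
out = preprocess_xray(raw, short_side=256)
print(out.size, out.mode)  # PIL reports (width, height)
```

`img.save("out.png")` would then write the final PNG. Note that Kaggle's size limit motivated the 2048-pixel target, so a different `short_side` may suit other deployments.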
Model size: 0.2B parameters (F32, Safetensors format)