---
library_name: transformers
datasets:
- allenai/qasc
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
---

# DRM-DeBERTa-v3-Base-qasc

This model is a fine-tuned version of `microsoft/deberta-v3-base` trained on the QASC dataset.

This model is part of the artifact release for the research paper **Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking**.

**Paper:** [https://arxiv.org/abs/2505.23117](https://arxiv.org/abs/2505.23117)  \
**Repository:** [https://github.com/yophis/decom-renorm-merge](https://github.com/yophis/decom-renorm-merge)


## Uses

The model can be loaded as follows:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "yophis/DRM-DeBERTa-v3-Base-qasc"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the model
model = AutoModelForSequenceClassification.from_pretrained(model_id, device_map="auto")
model.config.pad_token_id = model.config.eos_token_id

# Input template
input_text = "Question: {formatted_question} Context: {combinedfact}"
```
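As a hedged sketch of how the input template above might be filled in and scored: the helper below formats one QASC-style record using the dataset's `formatted_question` and `combinedfact` fields, and a second helper returns the highest-scoring class index. The names `build_input` and `predict_label` are illustrative, and the example record is made up. How class indices map to answer choices is not documented in this card, so check `model.config.id2label` before interpreting the output.

```python
def build_input(example: dict) -> str:
    """Fill the input template from one QASC-style record."""
    return "Question: {formatted_question} Context: {combinedfact}".format(
        formatted_question=example["formatted_question"],
        combinedfact=example["combinedfact"],
    )


def predict_label(model, tokenizer, example: dict) -> int:
    """Return the argmax class index for one record (hypothetical helper)."""
    import torch  # imported lazily so build_input works without torch installed

    inputs = tokenizer(build_input(example), return_tensors="pt")
    # Move every input tensor to the device the model was placed on.
    inputs = {name: tensor.to(model.device) for name, tensor in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())


# Made-up record for illustration; real records come from allenai/qasc.
example = {
    "formatted_question": "What can magnets attract? (A) wood (B) iron (C) glass",
    "combinedfact": "Magnets attract iron.",
}
text = build_input(example)
```

With `model` and `tokenizer` loaded as in the snippet above, `predict_label(model, tokenizer, example)` returns an integer class index.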


## Training Details

### Training Data

We fine-tune the model on the [QASC](https://huggingface.co/datasets/allenai/qasc) dataset.

### Training Hyperparameters

- **Learning Rate:** 1e-4
- **Weight Decay:** 0.0
- **Training Steps:** 50000
- **Batch Size:** 1024
- **Precision:** bf16 mixed precision
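
For readers who want to reproduce a similar setup, the values above could be mapped onto Hugging Face `TrainingArguments` keyword names as sketched below. This assumes the `Trainer` API, which this card does not confirm. The per-device batch size and device count are placeholders, and the helper `accumulation_steps` is hypothetical. Optimizer, scheduler, and warmup settings are not stated in this card, so they are left out.

```python
GLOBAL_BATCH_SIZE = 1024  # from the hyperparameter list above


def accumulation_steps(global_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Gradient-accumulation steps needed to reach the global batch size."""
    per_step = per_device_batch * num_devices
    if global_batch % per_step != 0:
        raise ValueError("global batch must be divisible by per_device_batch * num_devices")
    return global_batch // per_step


# Assumed hardware split (not from the card): 8 devices, 32 examples each.
per_device_batch, num_devices = 32, 8

training_kwargs = {
    "learning_rate": 1e-4,
    "weight_decay": 0.0,
    "max_steps": 50000,
    "bf16": True,  # bf16 mixed precision
    "per_device_train_batch_size": per_device_batch,
    "gradient_accumulation_steps": accumulation_steps(
        GLOBAL_BATCH_SIZE, per_device_batch, num_devices
    ),
}
```

These keys can be passed to `TrainingArguments` together with an `output_dir` of your choice.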

## Citation

If you find this model useful, please consider citing our paper:

```bibtex
@article{chaichana2025decom,
  title={Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking},
  author={Chaichana, Yuatyong and Trachu, Thanapat and Limkonchotiwat, Peerat and Preechakul, Konpat and Khandhawit, Tirasan and Chuangsuwanich, Ekapol},
  journal={arXiv preprint arXiv:2505.23117},
  year={2025}
}
```

Please also cite QASC and the original DeBERTa model.