metadata
library_name: transformers
datasets:
- allenai/qasc
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
DRM-DeBERTa-v3-Base-qasc
This model is a fine-tuned version of microsoft/deberta-v3-base trained on the QASC dataset.
This model is part of the artifact release for the research paper Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking.
Paper: https://arxiv.org/abs/2505.23117
Repository: https://github.com/yophis/decom-renorm-merge
Uses
The model can be loaded as follows:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_id = "yophis/DRM-DeBERTa-v3-Base-qasc"
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
# Load the model
model = AutoModelForSequenceClassification.from_pretrained(model_id, device_map="auto")
model.config.pad_token_id = model.config.eos_token_id
# Input template; formatted_question and combinedfact are fields of each QASC example
input_text = "Question: {formatted_question} Context: {combinedfact}"
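To make the template concrete, the following is a minimal sketch of filling it from a QASC-style record and running one forward pass. The helper names `build_input` and `predict` and the sample record are hypothetical and are not part of the release. The card does not state how class indices map to answer choices, so the sketch only returns the raw index; check `model.config.id2label` before interpreting it.

```python
INPUT_TEMPLATE = "Question: {formatted_question} Context: {combinedfact}"


def build_input(example: dict) -> str:
    """Fill the model card's input template from a QASC-style record."""
    return INPUT_TEMPLATE.format(
        formatted_question=example["formatted_question"],
        combinedfact=example["combinedfact"],
    )


def predict(example: dict, model_id: str = "yophis/DRM-DeBERTa-v3-Base-qasc") -> int:
    """Return the highest-scoring class index for one example (hypothetical helper)."""
    # Imported lazily so build_input can be used without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # A single sequence needs no padding, so the pad-token setup is omitted here.
    inputs = tokenizer(build_input(example), return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())


# Made-up record for illustration only; real records come from allenai/qasc.
example = {
    "formatted_question": "What do plants need to grow? (A) sunlight (B) sand",
    "combinedfact": "Plants need sunlight to grow.",
}
print(build_input(example))
```

Calling `predict(example)` downloads the checkpoint on first use, so it is left out of the snippet's top-level code.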
Training Details
Training Data
We fine-tune the model on the QASC dataset (allenai/qasc).
Training Hyperparameters
- Learning Rate: 1e-4
- Weight Decay: 0.0
- Training Steps: 50000
- Batch Size: 1024
- Precision: bf16 mixed precision
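For readers who want to reproduce a similar setup, the hyperparameters above can be written down as a plain configuration. This is a sketch under stated assumptions, not the authors' training script (see the repository for that): it assumes the batch size of 1024 is the total across all devices, and the key names, the helper `grad_accumulation_steps`, and the 8-device, 32-per-device split are illustrative.

```python
# Values taken from the list above; key names are illustrative.
TRAINING_CONFIG = {
    "learning_rate": 1e-4,
    "weight_decay": 0.0,
    "max_steps": 50_000,
    "global_batch_size": 1024,  # assumption: total across all devices
    "bf16": True,               # bf16 mixed precision
}


def grad_accumulation_steps(global_batch_size: int,
                            per_device_batch_size: int,
                            num_devices: int) -> int:
    """Accumulation steps needed to reach the global batch size (hypothetical helper)."""
    per_step = per_device_batch_size * num_devices
    if global_batch_size % per_step != 0:
        raise ValueError("global batch size must be a multiple of the per-step batch")
    return global_batch_size // per_step


# Example split (an assumption): 8 devices with 32 sequences each.
steps = grad_accumulation_steps(TRAINING_CONFIG["global_batch_size"], 32, 8)
print(steps)  # 4
```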
Citation
If you find this model useful, please consider citing our paper:
@article{chaichana2025decom,
title={Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking},
author={Chaichana, Yuatyong and Trachu, Thanapat and Limkonchotiwat, Peerat and Preechakul, Konpat and Khandhawit, Tirasan and Chuangsuwanich, Ekapol},
journal={arXiv preprint arXiv:2505.23117},
year={2025}
}
Please also cite QASC and the original DeBERTa model.