---
library_name: transformers
datasets:
- allenai/qasc
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
---

# DRM-DeBERTa-v3-Base-qasc

This model is a fine-tuned version of `microsoft/deberta-v3-base` trained on the QASC dataset.

It is part of the artifact release for the research paper **Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking**.

**Paper:** [https://arxiv.org/abs/2505.23117](https://arxiv.org/abs/2505.23117) \
**Repository:** [https://github.com/yophis/decom-renorm-merge](https://github.com/yophis/decom-renorm-merge)

## Uses

The model can be loaded as follows:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "yophis/DRM-DeBERTa-v3-Base-qasc"

# Load the tokenizer and reuse the EOS token as the padding token
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the model (device_map="auto" requires the `accelerate` package)
model = AutoModelForSequenceClassification.from_pretrained(model_id, device_map="auto")
model.config.pad_token_id = model.config.eos_token_id

# Input template; the placeholders are the QASC fields of the same names
input_text = "Question: {formatted_question} Context: {combinedfact}"
```

## Training Details

### Training Data

The model is fine-tuned on the [QASC](https://huggingface.co/datasets/allenai/qasc) dataset.

### Training Hyperparameters

- **Learning Rate:** 1e-4
- **Weight Decay:** 0.0
- **Training Steps:** 50000
- **Batch Size:** 1024
- **Precision:** bf16 mixed precision

## Citation

If you find this model useful, please consider citing our paper:

```bibtex
@article{chaichana2025decom,
  title={Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking},
  author={Chaichana, Yuatyong and Trachu, Thanapat and Limkonchotiwat, Peerat and Preechakul, Konpat and Khandhawit, Tirasan and Chuangsuwanich, Ekapol},
  journal={arXiv preprint arXiv:2505.23117},
  year={2025}
}
```

Please also cite QASC and the original DeBERTa model.
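
## Example: Filling the Input Template

The input template in the Uses section has two placeholders, `formatted_question` and `combinedfact`, which match field names in QASC records. The following is a minimal sketch of filling the template for one record. The helper `build_input` and the sample record are hypothetical and only illustrate the string format; they are not taken from the dataset or from the authors' code.

```python
# Template string copied from the "Uses" section of this card.
INPUT_TEMPLATE = "Question: {formatted_question} Context: {combinedfact}"


def build_input(example: dict) -> str:
    """Fill the template from a QASC-style record (hypothetical helper)."""
    return INPUT_TEMPLATE.format(
        formatted_question=example["formatted_question"],
        combinedfact=example["combinedfact"],
    )


# Illustrative record shaped like a QASC row; the content is made up.
example = {
    "formatted_question": "What can clouds form? (A) rain (B) sand (C) rocks",
    "combinedfact": "Clouds can form rain.",
}

text = build_input(example)
print(text)
```

The resulting string can then be tokenized with `tokenizer(text, return_tensors="pt")` and passed to the model loaded above. This card does not document how output labels map to answer choices, so check the repository before interpreting the logits.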