KoBigBird-large: Transformation of Transformer for Korean Language Understanding
Paper: arXiv:2309.10339
This is the large-sized Korean BigBird model introduced in our paper. The model draws heavily on the parameters of klue/roberta-large to ensure high performance. By adopting the BigBird architecture and incorporating the newly proposed TAPER, it accommodates longer inputs than klue/roberta-large.
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
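Once the tokenizer and model load, a quick masked-token prediction is a simple way to check that the checkpoint works. The following is a minimal sketch, not an official usage recipe: `PLACEHOLDER`, `insert_mask`, and `fill_mask` are hypothetical names introduced here, and the code assumes that PyTorch is installed and that the tokenizer defines a mask token.

```python
MODEL_ID = "vaiv/kobigbird-roberta-large"  # checkpoint name from the snippet above
PLACEHOLDER = "{mask}"  # hypothetical placeholder, not part of the model or tokenizer


def insert_mask(template: str, mask_token: str) -> str:
    """Swap the single illustrative placeholder for the tokenizer's real mask token."""
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain exactly one placeholder")
    return template.replace(PLACEHOLDER, mask_token)


def fill_mask(template: str, top_k: int = 5):
    """Return the top_k candidate tokens for the masked position."""
    # Heavy dependencies are imported lazily so insert_mask stays usable without them.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
    model.eval()

    # Use the tokenizer's own mask token instead of hard-coding one.
    text = insert_mask(template, tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Locate the masked position in the encoded sequence.
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    top_ids = logits[0, mask_positions[0]].topk(top_k).indices.tolist()
    return tokenizer.convert_ids_to_tokens(top_ids)
```

For example, `fill_mask("대한민국의 수도는 {mask}입니다.")` asks the model to complete "The capital of South Korea is {mask}." The returned candidates are raw vocabulary tokens, so subword pieces may appear among them.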
Measurement on validation sets of the KLUE benchmark datasets
While our model achieves strong results even without additional pretraining, further pretraining could refine the positional representations.
@article{yang2023kobigbird,
title={KoBigBird-large: Transformation of Transformer for Korean Language Understanding},
author={Yang, Kisu and Jang, Yoonna and Lee, Taewoo and Seong, Jinwoo and Lee, Hyungjin and Jang, Hwanseok and Lim, Heuiseok},
journal={arXiv preprint arXiv:2309.10339},
year={2023}
}