# Beyond Document Page Classification
We release the benchmarking code together with the proposed datasets:
* https://huggingface.co/datasets/bdpc/rvl_cdip_mp
* https://huggingface.co/datasets/bdpc/rvl_cdip_n_mp

For consistency, we also provide the code as an anonymous model repository on Hugging Face, which can be cloned.
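
As a rough illustration of how the two datasets listed above might be loaded, here is a minimal sketch. It assumes the Hugging Face `datasets` library is installed and that the repositories load with their default configuration; neither is confirmed by this README, and a configuration name or split may be required. The helper names `dataset_url` and `load_benchmark` are hypothetical and not part of the released code.

```python
# Repository ids taken from the dataset URLs listed above.
DATASET_REPOS = ("bdpc/rvl_cdip_mp", "bdpc/rvl_cdip_n_mp")


def dataset_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page for a dataset repository id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_benchmark(repo_id: str, **kwargs):
    """Load one of the released datasets (hypothetical helper).

    Assumption: the repository loads via `datasets.load_dataset`; extra
    keyword arguments (e.g. a split) are passed through unchanged.
    """
    if repo_id not in DATASET_REPOS:
        raise ValueError(f"unknown dataset repository: {repo_id!r}")
    # Imported lazily so the module can be read without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, **kwargs)
```

Under these assumptions, calling `load_benchmark("bdpc/rvl_cdip_mp")` would download the dataset from the Hub on first use.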
## Installation
The scripts require [Python >= 3.8](https://www.python.org/downloads/release/python-380/) to run.
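
The version requirement can be checked before installing anything. The snippet below is a small sketch for that purpose; the function name `python_is_supported` is made up for illustration and is not part of the repository.

```python
import sys

# Minimum interpreter version stated in this README.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that runs this snippet.
status = "supported" if python_is_supported() else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```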
We create a fresh virtual environment in which to install all required packages. The `mkvirtualenv` command is provided by virtualenvwrapper, which must be installed first.
```sh
mkvirtualenv -p /usr/bin/python3 BYD
```
Using Poetry and the provided `pyproject.toml`, install all required packages inside the environment:
```sh
workon BYD
pip3 install poetry
poetry install
```
## Experiments
To replicate all experimental results from the paper, run `experiments.sh`:
```sh
./experiments.sh
```