Models: 975

Active filters: reward-trainer

Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1.41e-05_bs2_g4

Updated May 11, 2024 • 2

Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-06_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1e-06_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-05_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_contrast_loraR32_-1_gemma2b_lr5e-05_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-06_bs2_g4

Updated May 12, 2024 • 3

Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4

Updated May 12, 2024 • 3

Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1e-06_bs2_g4

Updated May 12, 2024 • 2

Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-05_bs2_g4

Updated May 12, 2024 • 2

thorirhrafn/gpt1B_reward_model3

Updated May 13, 2024 • 4

vwxyzjn/rm

Text Classification • 0.1B • Updated Jun 20, 2024 • 7

vwxyzjn/rm1

Text Classification • 0.1B • Updated May 21, 2024 • 2

calkp/reward_model

Text Classification • 8B • Updated May 22, 2024 • 4

ianmiller314/results

Text Classification • 82.1M • Updated May 24, 2024 • 2

mnoukhov/pythia410m-rm-tldr

Text Classification • 0.4B • Updated Jun 2, 2024 • 3

DownwardSpiral33/2c2-reward

Text Classification • 0.1B • Updated Jun 7, 2024 • 3

DownwardSpiral33/2c6-d6-reward

Text Classification • 0.1B • Updated Jun 7, 2024 • 2

DownwardSpiral33/2c2-reward-medium

Text Classification • 0.4B • Updated Jun 7, 2024 • 3

DownwardSpiral33/2c6-reward

Text Classification • 0.1B • Updated Jun 7, 2024 • 2

gsdas/temp_model

Text Classification • 0.4B • Updated Jun 8, 2024 • 2

SiMajid/working

0.3B • Updated Jul 21, 2024 • 7

RCODI/deberta-v3-large-reward-model

Text Classification • 0.4B • Updated Jun 12, 2024 • 6

santiviquez/reward_modeling_anthropic_hh

Text Classification • 0.3B • Updated Jun 13, 2024 • 41

mnoukhov/pythia160m-rm-tldr

Text Classification • 0.1B • Updated Jun 18, 2024 • 7

chandrasekhar319/reward_model_tinyllama_sql

Updated Jun 19, 2024 • 4

mnoukhov/pythia410m-rm-tldr6.9b

Text Classification • 0.4B • Updated Jun 20, 2024 • 286

vwxyzjn/rm_1b

Text Classification • 0.9B • Updated Jun 20, 2024 • 3

SiMajid/value_reward_modeling

Text Classification • 0.3B • Updated Jun 21, 2024 • 1

SiMajid/deberta_value

Text Classification • 0.2B • Updated Jun 22, 2024 • 4