Critique-out-Loud Reward Models

updated Sep 5, 2024

Paper: https://arxiv.org/abs/2408.11791 | Code: https://github.com/zankner/CLoud

  • ankner/Llama3-8B-CLoud-RM

    8B • Updated Oct 16, 2024 • 6 • 1

  • ankner/Llama3-8B-Classic-RM

    8B • Updated Oct 17, 2024 • 1

  • ankner/Llama3-70B-CLoud-RM

    71B • Updated Oct 18, 2024 • 4 • 1

  • ankner/Llama3-70B-Classic-RM

    71B • Updated Oct 18, 2024

  • ankner/Llama3-8b-ultra-oracle

    Viewer • Updated Sep 5, 2024 • 124k • 189

  • ankner/Llama3-8b-ultra-self-gen-8b

    Viewer • Updated Sep 5, 2024 • 124k • 56

  • ankner/Llama3-8b-ultra-self-gen-70b

    Viewer • Updated Sep 5, 2024 • 124k • 73