https://arxiv.org/abs/2501.05258
Daniele Cipollone
DanCip
AI & ML interests
None yet
Recent Activity
updated a dataset 1 day ago
DanCip/lca-StartingPoints-expanded
liked a Space about 1 month ago
HuggingFaceTB/smol-training-playbook
upvoted a paper about 1 month ago
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation