Paper page - Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning (huggingface.co)
cm0002@piefed.world to Artificial Intelligence@lemmy.world · English · 4 days ago · 0 comments