Something interesting: Happy to release Meta Code World Model (CWM), a… - @gonzo_ML

Something interesting:

Happy to release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models.
https://ai.meta.com/research/publications/cwm

When humans plan, we imagine the possible outcomes of different actions. When we reason about code, we simulate part of its execution in our head. The current generation of LLMs struggles to do this. What kind of research will an explicitly trained code world model enable? CWM allows us to study this question. Our model is trained on large amounts of coding data and bespoke Python + Bash world modeling data, allowing it to simulate Python function execution and agentic interactions in Bash environments.

The team and I can’t wait to see what new research will be enabled with a world model.

📊 Tech Report https://ai.meta.com/research/publications/cwm/
⚖️ Model weights https://ai.meta.com/resources/models-and-libraries/cwm-downloads/
🤗 On Hugging Face https://huggingface.co/facebook/cwm https://huggingface.co/facebook/cwm-sft https://huggingface.co/facebook/cwm-pretrain
🧑‍💻 Inference Code https://github.com/facebookresearch/cwm

We believe CWM provides a strong testbed for research on improving code generation with world models. We performed multi-task RL, and CWM has competitive performance for its size, with 68.6% on LiveCodeBench v5, 76% on AIME24, and 65.8% on SWE-bench Verified with test-time scaling.

I'm immensely proud of the work done by my cracked CodeGen team at Meta, with PhD students and veterans, for whom nothing is someone else's problem. The broader Meta AI community all pulled together for this. I'm very thankful for the unwavering support of our whole leadership.

https://www.facebook.com/share/p/1DEqPXYp1g/
