![Cover Image for [Paper Reading] ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates](https://images.lumacdn.com/cdn-cgi/image/format=auto,fit=cover,dpr=2,background=white,quality=75,width=400,height=400/event-covers/tf/ff3af89b-5c95-4388-b0e3-32df80cfde5c.jpg)
[Paper Reading] ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
This week, we will walk through and discuss the paper:
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
[https://arxiv.org/abs/2502.06772]
Abstract of the paper:
We present that hierarchical LLM reasoning via scaling thought templates can effectively optimize the reasoning search space and outperform the mathematical reasoning capabilities of powerful LLMs like OpenAI o1-preview and DeepSeek V3. We train our ReasonFlux-32B model with only 8 GPUs and introduce three innovations: (i) a structured and generic thought template library, containing around 500 high-level thought templates capable of generalizing to similar or relevant reasoning problems; (ii) performing hierarchical reinforcement learning on a sequence of thought templates instead of original long CoT data, optimizing a base LLM to plan out an optimal template trajectory for gradually handling complex problems; (iii) a brand new inference scaling system that enables hierarchical LLM reasoning by adaptively scaling thought templates at inference time. With a template trajectory containing more explainable reasoning structures than DeepSeek-R1 and o3-mini, our ReasonFlux-32B significantly advances math reasoning capabilities to state-of-the-art levels. Notably, on the MATH benchmark, it achieves an accuracy of 91.2% and surpasses o1-preview by 6.7%. On the USA Math Olympiad (AIME) benchmark, ReasonFlux-32B solves an average of 56.7% of problems, surpassing o1-preview and DeepSeek-V3 by 27% and 45%, respectively.
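To give a feel for the pipeline sketched in points (i)–(iii), here is a minimal, illustrative sketch of a hierarchical template-trajectory loop: a planner picks the next high-level template, a retriever pulls it from a small template library, and a solver instantiates it on the concrete problem. All names below (ThoughtTemplate, TemplateLibrary, planner, solver, solve_with_templates) are hypothetical stand-ins, not the paper's actual implementation or API.

```python
# Illustrative sketch only: hierarchical "template trajectory" inference,
# roughly in the spirit of the abstract. Names are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable


@dataclass
class ThoughtTemplate:
    name: str          # e.g. "complete the square"
    description: str   # problem-agnostic, high-level guidance
    steps: list[str]   # abstract solution skeleton


class TemplateLibrary:
    """Toy retrieval over a small library of high-level thought templates."""

    def __init__(self, templates: list[ThoughtTemplate]):
        self.templates = templates

    def retrieve(self, query: str, k: int = 1) -> list[ThoughtTemplate]:
        # Placeholder similarity: word overlap between the query and each
        # template description. A real system would use embedding retrieval.
        def score(t: ThoughtTemplate) -> int:
            return len(set(query.lower().split()) & set(t.description.lower().split()))

        return sorted(self.templates, key=score, reverse=True)[:k]


def solve_with_templates(
    problem: str,
    library: TemplateLibrary,
    planner: Callable[[str, str], str],                  # (problem, progress) -> next subgoal
    solver: Callable[[str, ThoughtTemplate, str], str],  # instantiates a template on the problem
    max_rounds: int = 3,
) -> str:
    """Hierarchical loop: plan a trajectory of abstract templates, then
    instantiate each one on the concrete problem until an answer emerges."""
    progress = ""
    for _ in range(max_rounds):
        # High-level step: choose the next abstract template to apply.
        subgoal = planner(problem, progress)
        template = library.retrieve(subgoal, k=1)[0]
        # Low-level step: expand the template into concrete reasoning.
        progress = solver(problem, template, progress)
        if "FINAL ANSWER" in progress:
            break
    return progress
```

The fixed `max_rounds` cap here is only a crude stand-in for the paper's adaptive inference-time scaling of how many templates to expand; the paper's planner is a reinforcement-learned LLM rather than a simple callable.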
--------
We are a group of applied AI practitioners and enthusiasts who have formed a collective learning community. Every Wednesday evening (Pacific Time), we hold our research paper reading seminar covering an AI topic. One member carefully explains the paper, making it accessible to a broader audience. We then follow the reading with a more informal discussion and socializing.
Speaker:
Asif Qamar
LinkedIn: https://www.linkedin.com/in/asifqamar/
Technology Leader | AI/Data Scientist | Computer Scientist | Educator | Theoretical Particle Physicist
Technical Leadership
Primarily interested in technical leadership positions that couple visionary leadership with high-octane technical involvement in applied AI/Machine Learning. What distinguishes me is a technical leadership that combines extensive, hands-on technical, AI, and architectural ability with the capacity to bring together a very productive, creative, passionate, and happy team across geographical boundaries.
Track record of consistently delivering more than a dozen successful products of enduring value: envisioning them, crafting their architecture, doing the early R&D and prototyping, and then building a dedicated, cohesive, and talented team to take the ideas to fruition through significant projects. Without fail, the products I have led the creation of remain in extensive deployment and healthy evolution many years later.
Teaching & Mentoring
Over 20 years of leading, teaching, and mentoring engineers through team-building around non-trivial projects, university classes, workshops, brown bags, and other informal gatherings. Currently running evening, off-work-hours workshops in AI/Data Science/Machine Learning and Cloud Computing.
You are welcome to join us in person or over Zoom (https://us02web.zoom.us/meeting/register/tZUvf-uvrTwvHdP9B-vE03j3BapgRypn64CS). SupportVectors is an AI training lab located in Fremont, CA, close to Tesla and easily accessible by road and BART. We follow the weekly sessions with snacks, soft drinks, and informal discussions.
Location
46540 Fremont Blvd
Fremont, CA 94538, USA
ZOOM Link: https://us02web.zoom.us/meeting/register/tZUvf-uvrTwvHdP9B-vE03j3BapgRypn64CS