Open R1: The Revolutionary Open-Source Framework That's Democratizing DeepSeek-R1 with 25k+ GitHub Stars

Tosin Akinosho

Dec 1, 2025 — 7 min read

In the rapidly evolving landscape of AI reasoning models, Open R1 by Hugging Face has emerged as a groundbreaking project that's making advanced AI reasoning capabilities accessible to everyone. With over 25,700 GitHub stars and 2,400 forks, this fully open-source reproduction of DeepSeek-R1 is transforming how developers and researchers approach AI model training and deployment.

🚀 What is Open R1?

Open R1 is an ambitious open-source project that aims to replicate and extend the DeepSeek-R1 pipeline, making it accessible for the entire AI community. Unlike proprietary solutions, Open R1 provides complete transparency and control over the entire AI reasoning pipeline, from data generation to model training and evaluation.

Key Features That Set Open R1 Apart

Complete R1 Pipeline: Comprehensive scripts for training, evaluation, and synthetic data generation
Multi-Stage Training: Support for both Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO)
Scalable Architecture: Built to scale from single GPU setups to multi-node clusters
Extensive Evaluation Suite: Reproduces DeepSeek's evaluation results on AIME 2024, MATH-500, GPQA Diamond, and LiveCodeBench
Data Generation Tools: Advanced synthetic data generation using Distilabel

🏗️ Project Architecture and Components

The Open R1 project follows a clean, modular design that makes it easy to understand and extend:

Core Components

src/open_r1/grpo.py: Trains models using GRPO on custom datasets
src/open_r1/sft.py: Performs supervised fine-tuning on datasets
src/open_r1/generate.py: Generates synthetic data using Distilabel
Makefile: Easy-to-run commands for each step in the R1 pipeline

Three-Phase Development Plan

The project follows DeepSeek-R1's tech report with a clear roadmap:

Step 1: Replicate the R1-Distill models by distilling a high-quality corpus from DeepSeek-R1
Step 2: Replicate the pure RL pipeline for R1-Zero with large-scale datasets
Step 3: Show that a base model can be taken to an RL-tuned model via multi-stage training

⚡ Installation and Setup

Getting started with Open R1 requires careful attention to dependencies, particularly CUDA 12.4 compatibility.

Quick Setup

# Quick installation using make
make install

# Manual setup
uv venv openr1 --python 3.11 && source openr1/bin/activate
uv pip install --upgrade pip

# Install vLLM and FlashAttention
uv pip install vllm==0.8.5.post1
uv pip install setuptools && uv pip install flash-attn --no-build-isolation

# Install development dependencies
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e ".[dev]"

Authentication Setup

# Login to required services
huggingface-cli login
wandb login

# Verify Git LFS installation
git-lfs --version

🎯 Training Models with Open R1

Open R1 supports two primary training approaches, each optimized for different use cases and hardware configurations.

Supervised Fine-Tuning (SFT)

SFT is perfect for distilling reasoning capabilities from larger models:

# Train via command line
accelerate launch --config_file=recipes/accelerate_configs/zero3.yaml src/open_r1/sft.py \
    --model_name_or_path open-r1/Qwen2.5-Math-7B-RoPE-300k \
    --dataset_name open-r1/Mixture-of-Thoughts \
    --dataset_config all \
    --eos_token '<|im_end|>' \
    --learning_rate 4.0e-5 \
    --num_train_epochs 5 \
    --max_seq_length 32768 \
    --per_device_train_batch_size 2 \
    --gradient_checkpointing \
    --bf16 \
    --use_liger_kernel \
    --output_dir data/OpenR1-Distill-7B

Group Relative Policy Optimization (GRPO)

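GRPO is the reinforcement learning stage: for each prompt the model samples a group of completions, scores them with reward functions, and is updated according to how each completion compares with the rest of its group. To launch a run: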

# Single-node training
ACCELERATE_LOG_LEVEL=info \
    accelerate launch --config_file recipes/accelerate_configs/zero3.yaml \
    src/open_r1/grpo.py --config recipes/DeepSeek-R1-Distill-Qwen-1.5B/grpo/config_demo.yaml \
    --vllm_mode colocate
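
To make the "group relative" part of GRPO more concrete, here is a minimal sketch of how advantages can be computed inside one group of completions sampled for the same prompt. The function name, the `eps` value, and the use of the population standard deviation are assumptions made for illustration; this is not the code in src/open_r1/grpo.py, and real implementations differ in such details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation (plus eps to avoid dividing by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored 1.0 if correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct completions end up with a positive advantage and incorrect ones with a negative advantage. If every completion in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal.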

Advanced Training Features

Code Interpreter Training

Open R1 supports training with code execution capabilities:

# Install code dependencies
uv pip install -e '.[code]'

# Setup E2B or Morph providers
echo 'E2B_API_KEY="e2b_xxx"' > .env
echo 'MORPH_API_KEY="your_key"' >> .env

# Train with code interpreter
CUDA_VISIBLE_DEVICES=1,2,3,4,5,6,7 ACCELERATE_LOG_LEVEL=info \
    accelerate launch --config_file recipes/accelerate_configs/zero2.yaml --num_processes=7 \
    src/open_r1/grpo.py --config recipes/Qwen2.5-1.5B-Instruct/grpo/config_demo_code.yaml
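
As a rough illustration of what an execution-based reward looks like, the sketch below runs a candidate program against a few input/output cases and returns the fraction that pass. Open R1 delegates execution to the E2B or Morph sandbox providers configured above; the local `subprocess` call here is a stand-in chosen only to keep the example self-contained, and it is not safe for untrusted code. The name `code_reward` and the test-case format are hypothetical.

```python
import subprocess
import sys


def code_reward(candidate_source, test_cases, timeout=5.0):
    """Return the fraction of (stdin, expected_stdout) cases the candidate passes."""
    if not test_cases:
        return 0.0
    passed = 0
    for stdin_text, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", candidate_source],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # a run that times out counts as a failed case
        if result.returncode == 0 and result.stdout.strip() == expected.strip():
            passed += 1
    return passed / len(test_cases)


# A candidate that sums the integers on one line of input.
candidate = "print(sum(int(x) for x in input().split()))"
cases = [("1 2 3\n", "6"), ("10 -4\n", "6"), ("7\n", "7")]
print(code_reward(candidate, cases))
```

A fractional reward gives partial credit when only some cases pass, which is one possible design choice; a stricter variant would return 1.0 only when every case passes.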

📊 Comprehensive Evaluation Suite

Open R1 includes an evaluation framework, driven by the lighteval CLI with a vLLM backend, for reproducing the benchmark results that DeepSeek reported.

Supported Benchmarks

AIME 2024: Advanced mathematics competition problems
MATH-500: Mathematical reasoning tasks
GPQA Diamond: Graduate-level science questions
LiveCodeBench: Real-world coding challenges

Running Evaluations

# Single GPU evaluation
export VLLM_WORKER_MULTIPROC_METHOD=spawn
MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
MODEL_ARGS="model_name=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}"
OUTPUT_DIR=data/evals/$MODEL

# AIME 2024 evaluation
lighteval vllm $MODEL_ARGS "lighteval|aime24|0|0" \
    --use-chat-template \
    --output-dir $OUTPUT_DIR

# Multi-GPU evaluation with data parallelism
NUM_GPUS=8
MODEL_ARGS="model_name=$MODEL,dtype=bfloat16,data_parallel_size=$NUM_GPUS,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}"

lighteval vllm $MODEL_ARGS "lighteval|aime24|0|0" \
    --use-chat-template \
    --output-dir $OUTPUT_DIR

Benchmark Results Comparison

Open R1 successfully reproduces DeepSeek's results within 1-3 standard deviations:

Model	AIME 2024 (Open R1)	AIME 2024 (DeepSeek)	MATH-500 (Open R1)	MATH-500 (DeepSeek)
DeepSeek-R1-Distill-Qwen-7B	50.8	55.5	94.5	92.8
DeepSeek-R1-Distill-Qwen-32B	69.7	72.6	95.6	94.3
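
To get a feel for why a gap of several points on AIME 2024 can still count as a reproduction, the following back-of-the-envelope sketch estimates the standard error of an accuracy score. It assumes each problem is an independent pass/fail trial scored from a single sample, with 30 problems in AIME 2024 and 500 in MATH-500. That is a simplification: if scores are averaged over several samples per problem, the spread would be smaller. The helper name is hypothetical.

```python
from math import sqrt


def binomial_std_error(accuracy_pct, n_problems):
    """Standard error, in percentage points, of an accuracy measured on n problems."""
    p = accuracy_pct / 100.0
    return 100.0 * sqrt(p * (1.0 - p) / n_problems)


# (benchmark, number of problems, Open R1 score, DeepSeek-reported score), 7B model
rows = [
    ("AIME 2024", 30, 50.8, 55.5),
    ("MATH-500", 500, 94.5, 92.8),
]

for name, n, open_r1, deepseek in rows:
    se = binomial_std_error(deepseek, n)
    gap = abs(open_r1 - deepseek)
    print(f"{name}: gap={gap:.1f} pts, std error={se:.1f} pts, ratio={gap / se:.2f}")
```

Under these assumptions, a benchmark with only 30 problems has a much wider noise band than one with 500, so the same absolute gap means less on AIME 2024 than on MATH-500.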

🔄 Data Generation and Distillation

One of Open R1's most powerful features is its ability to generate high-quality synthetic training data.

Generating Data from Distilled Models

# pipeline.py - Generate synthetic data
from datasets import load_dataset
from distilabel.models import vLLM
from distilabel.pipeline import Pipeline
from distilabel.steps.tasks import TextGeneration

prompt_template = """
You will be given a problem. Please reason step by step, and put your final answer within \\boxed{}:
{{ instruction }}"""

dataset = load_dataset("AI-MO/NuminaMath-TIR", split="train").select(range(10))
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

with Pipeline(
    name="distill-qwen-7b-r1",
    description="A pipeline to generate data from a distilled r1 model",
) as pipeline:
    llm = vLLM(
        model=model_id,
        tokenizer=model_id,
        extra_kwargs={
            "tensor_parallel_size": 1,
            "max_model_len": 8192,
        },
        generation_kwargs={
            "temperature": 0.6,
            "max_new_tokens": 8192,
        },
    )
    
    text_generation = TextGeneration(
        llm=llm,
        template=prompt_template,
        num_generations=4,
        input_mappings={"instruction": "problem"}
    )

if __name__ == "__main__":
    distiset = pipeline.run(dataset=dataset)
    distiset.push_to_hub(repo_id="username/numina-deepseek-r1-qwen-7b")
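
Because the prompt template asks for the final answer inside \boxed{}, a common follow-up step is to pull that answer out of each generation, for example to filter incorrect samples. The helper below is a minimal sketch of that idea using brace matching; `extract_boxed` is a hypothetical name and is not part of Open R1 or Distilabel.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never balanced, e.g. a truncated generation


generation = "We add the fractions, so the answer is \\boxed{\\frac{3}{4}}."
print(extract_boxed(generation))
```

Counting brace depth, rather than using a simple regular expression, keeps nested LaTeX such as \frac{3}{4} intact. Returning None for unbalanced braces makes it easy to discard generations that were cut off at the token limit.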

Large-Scale Data Generation

For production-scale data generation using DeepSeek-R1:

# Install required dependencies
pip install https://wheels.vllm.ai/221d388cc5a836fa189305785ed7e887cea8b510/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
uv pip install "distilabel[vllm,ray,openai]>=1.5.2"

# Launch multi-node generation
sbatch slurm/generate.slurm \
    --hf-dataset AI-MO/NuminaMath-TIR \
    --temperature 0.6 \
    --prompt-column problem \
    --model deepseek-ai/DeepSeek-R1 \
    --hf-output-dataset username/r1-dataset

🎯 Real-World Applications and Use Cases

1. Educational AI Tutoring Systems

Open R1's reasoning capabilities make it perfect for creating AI tutors that can explain complex mathematical and scientific concepts step-by-step.

2. Code Generation and Debugging

With built-in code interpreter support, Open R1 can generate, execute, and debug code across multiple programming languages.

3. Research and Development

Researchers can use Open R1 to:

Experiment with novel reasoning architectures
Generate synthetic datasets for specific domains
Benchmark new evaluation methodologies

4. Enterprise AI Solutions

Organizations can deploy Open R1 for:

Automated technical documentation
Complex problem-solving workflows
Training domain-specific reasoning models

🚀 Advanced Configuration and Optimization

Custom Dataset Mixtures

# config.yaml - Custom dataset mixture
dataset_mixture:
  datasets:
    - id: dataset_1
      config: config_name_1
      split: train
      columns:
        - problem
        - solution
      weight: 0.25
    - id: dataset_2
      config: config_name_2
      split: train
      columns:
        - question
        - answer
      weight: 0.75
  seed: 42
  test_split_size: 0.1
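
The sketch below shows one plausible reading of this configuration, using plain Python lists in place of real datasets. It assumes that `weight` is the fraction of each dataset to keep after shuffling and that `test_split_size` is the fraction of the combined mixture held out for testing. Both are assumptions about the semantics, and `build_mixture` is a hypothetical helper, so check the repository's data loading code before relying on this behavior.

```python
import random


def build_mixture(datasets, weights, seed=42, test_split_size=0.1):
    """Subsample each dataset by its weight, concatenate, shuffle, and split."""
    rng = random.Random(seed)
    mixed = []
    for name, rows in datasets.items():
        rows = list(rows)
        rng.shuffle(rows)
        keep = int(len(rows) * weights[name])  # assumed: weight = fraction kept
        mixed.extend(rows[:keep])
    rng.shuffle(mixed)
    n_test = int(len(mixed) * test_split_size)
    return {"train": mixed[n_test:], "test": mixed[:n_test]}


# Toy stand-ins for the two datasets named in the YAML above.
datasets = {
    "dataset_1": [f"d1-{i}" for i in range(100)],
    "dataset_2": [f"d2-{i}" for i in range(200)],
}
weights = {"dataset_1": 0.25, "dataset_2": 0.75}
splits = build_mixture(datasets, weights)
print(len(splits["train"]), len(splits["test"]))
```

Fixing the seed makes the subsampling and the train/test split reproducible across runs, which matters when comparing training runs on the same mixture.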

Slurm Cluster Deployment

# Single-node training
sbatch --job-name=open_r1 --nodes=1 slurm/train.slurm \
    --model OpenR1-Distill-7B \
    --task sft \
    --config distill \
    --accelerator zero3

# Multi-node GRPO training
sbatch --job-name=open_r1 --nodes=2 slurm/train.slurm \
    --model Qwen2.5-1.5B-Instruct \
    --task grpo \
    --config demo \
    --accelerator zero2 \
    --dp 4 --tp 2

📈 Performance Benchmarks and Results

Training Performance

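Open R1 is designed to run on a range of hardware setups:

Single GPU: Training on a single device, with smaller batch sizes and gradient checkpointing where memory is tight
Multi-GPU: Scales across up to 8 GPUs on one node using the DeepSpeed ZeRO configurations in recipes/accelerate_configs
Multi-Node: Supports distributed training across clusters via Slurm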

Model Quality

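The OpenR1-Distill-7B model is competitive with DeepSeek-R1-Distill-Qwen-7B: it scores slightly higher on AIME 2024, GPQA Diamond, and LiveCodeBench v5, and lower on MATH-500 (89.0 vs. 93.5):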

Benchmark	OpenR1-Distill-7B	DeepSeek-R1-Distill-Qwen-7B
AIME 2024	52.7	51.3
MATH-500	89.0	93.5
GPQA Diamond	52.8	52.4
LiveCodeBench v5	39.4	37.4

🔧 Troubleshooting and Best Practices

Common Issues and Solutions

CUDA Compatibility

Ensure you're using CUDA 12.4 to avoid segmentation faults:

nvcc --version  # Verify CUDA version

Memory Optimization

For limited GPU memory, adjust batch sizes and use gradient checkpointing:

--per_device_train_batch_size 1 \
--gradient_accumulation_steps 8 \
--gradient_checkpointing
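
Lowering the per-device batch size does not have to change the optimization setup, because gradient accumulation can make up the difference. The short calculation below illustrates this; the device count of 8 is an assumption chosen for the example, and the function name is hypothetical.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices):
    """Number of sequences that contribute to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


NUM_DEVICES = 8  # assumption: a single node with 8 GPUs

# Memory-saving settings from the flags above: batch size 1, 8 accumulation steps.
low_memory = effective_batch_size(1, 8, NUM_DEVICES)

# The same effective batch with no accumulation needs 8 sequences per device at once.
no_accumulation = effective_batch_size(8, 1, NUM_DEVICES)

print(low_memory, no_accumulation)
```

Both settings give the same number of sequences per optimizer step; the accumulated variant trades wall-clock time for lower peak memory.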

Chat Template Alignment

Always align EOS tokens with chat templates:

# For Qwen models
--eos_token '<|im_end|>'

# For Llama models
--eos_token '<|eot_id|>'

🌟 Community and Ecosystem

Active Development

Open R1 benefits from active community contributions:

44 Contributors: Diverse expertise from the AI community
Regular Updates: Continuous improvements and new features
Comprehensive Documentation: Detailed guides and examples

Integration Ecosystem

Open R1 integrates seamlessly with:

Hugging Face Hub: Model and dataset hosting
Weights & Biases: Experiment tracking
vLLM: High-performance inference
Distilabel: Synthetic data generation

🔮 Future Roadmap and Developments

Upcoming Features

Enhanced Multi-Modal Support: Vision and text reasoning
Improved Efficiency: Better memory utilization and speed
Extended Language Support: More programming languages for code execution
Advanced Evaluation Metrics: More comprehensive benchmarking

Research Directions

Novel reasoning architectures
Improved data synthesis techniques
Better alignment methods
Efficient scaling strategies

🎯 Getting Started: Your First Open R1 Project

Quick Start Tutorial

Clone and Setup:

git clone https://github.com/huggingface/open-r1.git
cd open-r1
make install

Train Your First Model:

accelerate launch --config_file recipes/accelerate_configs/zero3.yaml \
    src/open_r1/sft.py \
    --config recipes/OpenR1-Distill-7B/sft/config_distill.yaml

Evaluate Performance:

make evaluate MODEL=your-model-name TASK=aime24

Generate Synthetic Data:

python pipeline.py  # Using the example above

📚 Learning Resources and Documentation

Community Resources

GitHub Discussions for Q&A
Hugging Face Discord community
Regular community calls and updates

🏆 Conclusion: The Future of Open AI Reasoning

Open R1 represents a paradigm shift in AI development, democratizing access to state-of-the-art reasoning capabilities. With its comprehensive toolkit, excellent performance, and vibrant community, Open R1 is not just reproducing DeepSeek-R1—it's building the foundation for the next generation of AI reasoning systems.

Whether you're a researcher pushing the boundaries of AI, a developer building intelligent applications, or an organization looking to deploy advanced reasoning capabilities, Open R1 provides the tools, documentation, and community support you need to succeed.

The project's commitment to openness, transparency, and collaboration ensures that the benefits of advanced AI reasoning are accessible to everyone, not just those with access to proprietary systems. As we move forward, Open R1 will continue to evolve, driven by community contributions and the shared goal of advancing AI for the benefit of all.

Ready to get started? Clone the repository, follow the installation guide, and join the thousands of developers already building the future of AI reasoning with Open R1.

For more expert insights and tutorials on AI and automation, visit us at decisioncrafters.com.
