STORM: The Revolutionary AI Knowledge Curation System That's Transforming Research and Report Generation with 27k+ GitHub Stars
Discover STORM, Stanford's revolutionary LLM-powered knowledge curation system with 27k+ GitHub stars. Learn installation, implementation, and real-world applications for AI-powered research and report generation.
In the rapidly evolving landscape of AI-powered research tools, Stanford's OVAL Lab has created something truly groundbreaking: STORM (Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking). This innovative LLM-powered knowledge curation system has captured the attention of over 70,000 users and earned 27,600+ GitHub stars by revolutionizing how we approach research and report generation.
STORM represents a paradigm shift from traditional research methods by automating the entire process of creating Wikipedia-style articles from scratch, complete with citations and comprehensive coverage. But what makes STORM truly special is its sophisticated approach to information gathering and synthesis.
What Makes STORM Revolutionary?
Unlike simple question-answering systems, STORM tackles the complex challenge of generating long-form, well-researched articles by breaking the process into two distinct stages:
1. Pre-writing Stage: Intelligent Research
STORM conducts comprehensive Internet-based research to collect references and generates a detailed outline. This isn't just keyword searching – it's intelligent, perspective-driven research that considers multiple viewpoints on a topic.
2. Writing Stage: Synthesis and Citation
Using the outline and collected references, STORM generates full-length articles with proper citations, ensuring every claim is backed by credible sources.

The Secret Sauce: Advanced Question-Asking Strategies
What sets STORM apart is its sophisticated approach to generating research questions. The system employs two key strategies:
Perspective-Guided Question Asking
STORM doesn't just ask random questions. It discovers different perspectives by surveying existing articles from similar topics and uses these perspectives to guide the question-asking process, ensuring comprehensive coverage.
Simulated Conversation
Perhaps most innovatively, STORM simulates conversations between a Wikipedia writer and topic experts, grounded in Internet sources. This allows the language model to continuously update its understanding and ask increasingly sophisticated follow-up questions.
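To make that loop concrete, here is a purely illustrative sketch of the idea, not the knowledge-storm API: a writer agent asks a question, a grounded expert answers from retrieved sources, and that answer feeds the next, more specific question. The names simulated_conversation, retrieve, writer_lm, and expert_lm are hypothetical stand-ins.
# Purely illustrative sketch of a simulated writer-expert conversation;
# the function names here are hypothetical, not the knowledge-storm API.
def simulated_conversation(topic, perspective, retrieve, writer_lm, expert_lm, max_turns=4):
    history = []
    question = writer_lm(f"As a {perspective}, ask an opening question about {topic}.")
    for _ in range(max_turns):
        sources = retrieve(question)  # ground the expert's answer in retrieved passages
        answer = expert_lm(f"Using only these sources: {sources}\nAnswer: {question}")
        history.append((question, answer, sources))
        # the writer reads the answer and asks an increasingly specific follow-up
        question = writer_lm(f"Conversation so far: {history}\nAsk a deeper follow-up question about {topic}.")
    return history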
Co-STORM: Taking Collaboration to the Next Level
The latest evolution of STORM introduces Co-STORM, which enables human-AI collaborative knowledge curation. This system implements a collaborative discourse protocol supporting:
- Co-STORM LLM Experts: AI agents that generate grounded answers and raise follow-up questions
- Moderator: An AI agent that generates thought-provoking questions based on discovered information
- Human Users: Can observe or actively participate in conversations to steer discussion focus
Co-STORM also maintains a dynamic mind map that organizes information into hierarchical concept structures, creating a shared conceptual space between humans and AI.

Getting Started with STORM
Installation
Installing STORM is straightforward. You can use pip for the simplest installation:
pip install knowledge-storm
For development and customization, clone the repository:
git clone https://github.com/stanford-oval/storm.git
cd storm
conda create -n storm python=3.11
conda activate storm
pip install -r requirements.txt
Setting Up API Keys
Create a secrets.toml file in your root directory:
# ============ language model configurations ============
OPENAI_API_KEY="your_openai_api_key"
OPENAI_API_TYPE="openai"
# ============ retriever configurations ============
BING_SEARCH_API_KEY="your_bing_search_api_key"
# ============ encoder configurations ============
ENCODER_API_TYPE="openai"
Implementing STORM in Your Projects
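Before building the pipeline, load these keys so they are visible to the snippets below. The repository's example scripts do this with a small helper that reads secrets.toml and exports each entry as an environment variable; a minimal sketch, assuming secrets.toml sits in your working directory:
from knowledge_storm.utils import load_api_key

# Reads secrets.toml and exports each entry (e.g. OPENAI_API_KEY) so the
# code below can pick it up via os.getenv().
load_api_key(toml_file_path="secrets.toml")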
Basic STORM Implementation
Here's how to set up and run STORM with OpenAI models:
import os
from knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs
from knowledge_storm.lm import LitellmModel
from knowledge_storm.rm import YouRM
# Configure language models
lm_configs = STORMWikiLMConfigs()
openai_kwargs = {
    'api_key': os.getenv("OPENAI_API_KEY"),
    'temperature': 1.0,
    'top_p': 0.9,
}
# Use different models for different components
gpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)
gpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)
# Configure model assignments
lm_configs.set_conv_simulator_lm(gpt_35)
lm_configs.set_question_asker_lm(gpt_35)
lm_configs.set_outline_gen_lm(gpt_4)
lm_configs.set_article_gen_lm(gpt_4)
lm_configs.set_article_polish_lm(gpt_4)
# Configure pipeline arguments and the retrieval module
engine_args = STORMWikiRunnerArguments(output_dir='./results')  # holds intermediate results and the final article
rm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)
runner = STORMWikiRunner(engine_args, lm_configs, rm)
Running STORM
Execute the complete STORM pipeline:
topic = input('Topic: ')
runner.run(
    topic=topic,
    do_research=True,
    do_generate_outline=True,
    do_generate_article=True,
    do_polish_article=True,
)
runner.post_run()
runner.summary()
Implementing Co-STORM
For collaborative knowledge curation:
import os
from knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner
from knowledge_storm.lm import LitellmModel
from knowledge_storm.logging_wrapper import LoggingWrapper
from knowledge_storm.rm import BingSearch
# Configure Co-STORM language models
lm_config = CollaborativeStormLMConfigs()
openai_kwargs = {
    "api_key": os.getenv("OPENAI_API_KEY"),
    "api_provider": "openai",
    "temperature": 1.0,
    "top_p": 0.9,
}
# Set up different models for different functions
question_answering_lm = LitellmModel(model='gpt-4o', max_tokens=1000, **openai_kwargs)
discourse_manage_lm = LitellmModel(model='gpt-4o', max_tokens=500, **openai_kwargs)
lm_config.set_question_answering_lm(question_answering_lm)
lm_config.set_discourse_manage_lm(discourse_manage_lm)
# Initialize Co-STORM runner
topic = input('Topic: ')
runner_argument = RunnerArgument(topic=topic)
logging_wrapper = LoggingWrapper(lm_config)
bing_rm = BingSearch(bing_search_api_key=os.environ.get("BING_SEARCH_API_KEY"))
costorm_runner = CoStormRunner(lm_config=lm_config, runner_argument=runner_argument, logging_wrapper=logging_wrapper, rm=bing_rm)
Running Co-STORM Collaborative Sessions
# Warm start the system
costorm_runner.warm_start()
# Observe the conversation
conv_turn = costorm_runner.step()
# Or actively participate
costorm_runner.step(user_utterance="Can you explore the environmental impact?")
# Generate final report
costorm_runner.knowledge_base.reorganize()
article = costorm_runner.generate_report()
print(article)
Advanced Customization Options
Supported Integrations
STORM supports extensive integrations:
- Language Models: All models supported by LiteLLM
- Embedding Models: All embedding models supported by LiteLLM
- Retrieval Modules: YouRM, BingSearch, VectorRM, SerperRM, BraveRM, SearXNG, DuckDuckGoSearchRM, TavilySearchRM, GoogleSearch, and AzureAISearch (swapping one in is a one-line change, as sketched below)
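Swapping retrievers only changes the object handed to the runner. As a sketch, here is the basic STORM setup from above with BingSearch in place of YouRM; the k argument is assumed to behave like YouRM's, and BING_SEARCH_API_KEY must be set:
from knowledge_storm.rm import BingSearch

# Drop-in replacement for the YouRM retriever configured earlier;
# engine_args and lm_configs are reused unchanged.
rm = BingSearch(
    bing_search_api_key=os.getenv("BING_SEARCH_API_KEY"),
    k=engine_args.search_top_k,  # assumed: same top-k knob as YouRM
)
runner = STORMWikiRunner(engine_args, lm_configs, rm)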
Custom Pipeline Modules
STORM's modular architecture allows customization of four key components (a run-level sketch follows this list):
- Knowledge Curation Module: Customize information collection strategies
- Outline Generation Module: Modify how information is organized
- Article Generation Module: Customize content generation formats
- Article Polishing Module: Enhance presentation and refinement
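This modularity also shows up at run time: the four modules map onto the do_* flags used earlier, so you can execute only the stages you are customizing. A minimal sketch, reusing the runner and topic defined above, that stops after the pre-writing stages:
# Run only knowledge curation and outline generation while iterating on a
# customized pre-writing setup; article generation and polishing are skipped.
runner.run(
    topic=topic,
    do_research=True,
    do_generate_outline=True,
    do_generate_article=False,
    do_polish_article=False,
)
runner.post_run()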
Real-World Applications and Use Cases
Academic Research
STORM excels at creating comprehensive literature reviews and research summaries, helping researchers quickly understand complex topics with proper citations.
Content Creation
Content creators use STORM for in-depth article research and outline generation, significantly reducing the time spent in the pre-writing phase.
Educational Support
Educators leverage Co-STORM's collaborative features to guide students through research processes, teaching critical thinking and information synthesis skills.
Business Intelligence
Companies use STORM to generate comprehensive market research reports and competitive analysis documents.
Performance and Recognition
STORM's effectiveness has been validated through rigorous academic evaluation:
- EMNLP 2024 Recognition: Co-STORM was accepted to the main conference
- User Preference: 78% of human evaluators preferred Co-STORM over traditional RAG chatbots
- Community Adoption: Over 70,000 users have tried the research preview
- Academic Impact: Published in NAACL 2024 with significant research contributions
Quick Start Commands
For immediate experimentation:
# Run STORM with GPT models
python examples/storm_examples/run_storm_wiki_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing \
    --do-research \
    --do-generate-outline \
    --do-generate-article \
    --do-polish-article
# Run Co-STORM
python examples/costorm_examples/run_costorm_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing
The Future of AI-Powered Research
STORM represents more than just another AI tool – it's a glimpse into the future of research and knowledge curation. By combining sophisticated question-asking strategies with multi-perspective analysis and collaborative human-AI interaction, STORM is setting new standards for what's possible in automated research.
The system's modular architecture and extensive API support make it adaptable to virtually any research domain, while its collaborative features ensure that human expertise remains central to the knowledge curation process.
Getting Involved
The STORM project actively welcomes contributions from the community. Whether you're interested in:
- Adding new search engine integrations
- Improving the collaborative discourse protocol
- Enhancing information abstraction capabilities
- Developing new presentation formats
There are opportunities to contribute to this groundbreaking project.
Conclusion
STORM and Co-STORM represent a significant leap forward in AI-powered research and knowledge curation. By automating the complex process of research, outline generation, and article writing while maintaining high standards for citation and accuracy, these systems are transforming how we approach information synthesis.
Whether you're a researcher, content creator, educator, or business professional, STORM offers powerful capabilities that can significantly enhance your research and writing workflows. With its growing community, continuous development, and proven effectiveness, STORM is positioned to become an essential tool in the AI-powered research toolkit.
The combination of sophisticated AI techniques, collaborative features, and modular architecture makes STORM not just a tool for today, but a platform for the future of intelligent research and knowledge curation.
For more expert insights and tutorials on AI and automation, visit us at decisioncrafters.com.