STORM: The Revolutionary AI Knowledge Curation System That's Transforming Research and Report Generation with 27k+ GitHub Stars
Discover STORM, Stanford's revolutionary LLM-powered knowledge curation system with 27k+ GitHub stars. Learn installation, implementation, and real-world applications for AI-powered research and report generation.
In the rapidly evolving landscape of AI-powered research tools, Stanford's OVAL Lab has created something truly groundbreaking: STORM (Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking). This innovative LLM-powered knowledge curation system has captured the attention of over 70,000 users and earned 27,600+ GitHub stars by revolutionizing how we approach research and report generation.
STORM represents a paradigm shift from traditional research methods by automating the entire process of creating Wikipedia-style articles from scratch, complete with citations and comprehensive coverage. But what makes STORM truly special is its sophisticated approach to information gathering and synthesis.
What Makes STORM Revolutionary?
Unlike simple question-answering systems, STORM tackles the complex challenge of generating long-form, well-researched articles by breaking the process into two distinct stages:
1. Pre-writing Stage: Intelligent Research
STORM conducts comprehensive Internet-based research to collect references and generates a detailed outline. This isn't just keyword searching – it's intelligent, perspective-driven research that considers multiple viewpoints on a topic.
2. Writing Stage: Synthesis and Citation
Using the outline and collected references, STORM generates full-length articles with proper citations, ensuring every claim is backed by credible sources.

The Secret Sauce: Advanced Question-Asking Strategies
What sets STORM apart is its sophisticated approach to generating research questions. The system employs two key strategies:
Perspective-Guided Question Asking
STORM doesn't just ask random questions. It discovers different perspectives by surveying existing articles from similar topics and uses these perspectives to guide the question-asking process, ensuring comprehensive coverage.
Simulated Conversation
Perhaps most innovatively, STORM simulates conversations between a Wikipedia writer and topic experts, grounded in Internet sources. This allows the language model to continuously update its understanding and ask increasingly sophisticated follow-up questions.
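To make that loop concrete, here is a purely illustrative sketch of the idea, not the knowledge-storm API: a writer agent asks a question, a grounded expert answers from retrieved sources, and that answer feeds the next, more specific question. The names simulated_conversation, retrieve, writer_lm, and expert_lm are hypothetical stand-ins.
# Purely illustrative sketch of a simulated writer-expert conversation;
# the function names here are hypothetical, not the knowledge-storm API.
def simulated_conversation(topic, perspective, retrieve, writer_lm, expert_lm, max_turns=4):
    history = []
    question = writer_lm(f"As a {perspective}, ask an opening question about {topic}.")
    for _ in range(max_turns):
        sources = retrieve(question)  # ground the expert's answer in retrieved passages
        answer = expert_lm(f"Using only these sources: {sources}\nAnswer: {question}")
        history.append((question, answer, sources))
        # the writer reads the answer and asks an increasingly specific follow-up
        question = writer_lm(f"Conversation so far: {history}\nAsk a deeper follow-up question about {topic}.")
    return history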
Co-STORM: Taking Collaboration to the Next Level
The latest evolution of STORM introduces Co-STORM, which enables human-AI collaborative knowledge curation. This system implements a collaborative discourse protocol supporting:
- Co-STORM LLM Experts: AI agents that generate grounded answers and raise follow-up questions
- Moderator: An AI agent that generates thought-provoking questions based on discovered information
- Human Users: Can observe or actively participate in conversations to steer discussion focus
Co-STORM also maintains a dynamic mind map that organizes information into hierarchical concept structures, creating a shared conceptual space between humans and AI.

Getting Started with STORM
Installation
Installing STORM is straightforward. You can use pip for the simplest installation:
pip install knowledge-storm
For development and customization, clone the repository:
git clone https://github.com/stanford-oval/storm.git
cd storm
conda create -n storm python=3.11
conda activate storm
pip install -r requirements.txt
Setting Up API Keys
Create a secrets.toml file in your root directory:
# ============ language model configurations ============
OPENAI_API_KEY="your_openai_api_key"
OPENAI_API_TYPE="openai"
# ============ retriever configurations ============
BING_SEARCH_API_KEY="your_bing_search_api_key"
# ============ encoder configurations ============
ENCODER_API_TYPE="openai"
Implementing STORM in Your Projects
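Before building the pipeline, load these keys so they are visible to the snippets below. The repository's example scripts do this with a small helper that reads secrets.toml and exports each entry as an environment variable; a minimal sketch, assuming secrets.toml sits in your working directory:
from knowledge_storm.utils import load_api_key

# Reads secrets.toml and exports each entry (e.g. OPENAI_API_KEY) so the
# code below can pick it up via os.getenv().
load_api_key(toml_file_path="secrets.toml")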
Basic STORM Implementation
Here's how to set up and run STORM with OpenAI models:
import os
from knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs
from knowledge_storm.lm import LitellmModel
from knowledge_storm.rm import YouRM
# Configure language models
lm_configs = STORMWikiLMConfigs()
openai_kwargs = {
    'api_key': os.getenv("OPENAI_API_KEY"),
    'temperature': 1.0,
    'top_p': 0.9,
}
# Use different models for different components
gpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)
gpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)
# Configure model assignments
lm_configs.set_conv_simulator_lm(gpt_35)
lm_configs.set_question_asker_lm(gpt_35)
lm_configs.set_outline_gen_lm(gpt_4)
lm_configs.set_article_gen_lm(gpt_4)
lm_configs.set_article_polish_lm(gpt_4)
# Configure pipeline arguments and the retrieval module
engine_args = STORMWikiRunnerArguments(output_dir='./results')  # holds intermediate results and the final article
rm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)
runner = STORMWikiRunner(engine_args, lm_configs, rm)
Running STORM
Execute the complete STORM pipeline:
topic = input('Topic: ')
runner.run(
    topic=topic,
    do_research=True,
    do_generate_outline=True,
    do_generate_article=True,
    do_polish_article=True,
)
runner.post_run()
runner.summary()
Implementing Co-STORM
For collaborative knowledge curation:
import os
from knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner
from knowledge_storm.lm import LitellmModel
from knowledge_storm.logging_wrapper import LoggingWrapper
from knowledge_storm.rm import BingSearch
# Configure Co-STORM language models
lm_config = CollaborativeStormLMConfigs()
openai_kwargs = {
    "api_key": os.getenv("OPENAI_API_KEY"),
    "api_provider": "openai",
    "temperature": 1.0,
    "top_p": 0.9,
}
# Set up different models for different functions
question_answering_lm = LitellmModel(model='gpt-4o', max_tokens=1000, **openai_kwargs)
discourse_manage_lm = LitellmModel(model='gpt-4o', max_tokens=500, **openai_kwargs)
lm_config.set_question_answering_lm(question_answering_lm)
lm_config.set_discourse_manage_lm(discourse_manage_lm)
# Initialize Co-STORM runner
topic = input('Topic: ')
runner_argument = RunnerArgument(topic=topic)
logging_wrapper = LoggingWrapper(lm_config)
bing_rm = BingSearch(bing_search_api_key=os.environ.get("BING_SEARCH_API_KEY"))
costorm_runner = CoStormRunner(lm_config=lm_config, runner_argument=runner_argument, logging_wrapper=logging_wrapper, rm=bing_rm)
Running Co-STORM Collaborative Sessions
# Warm start the system
costorm_runner.warm_start()
# Observe the conversation
conv_turn = costorm_runner.step()
# Or actively participate
costorm_runner.step(user_utterance="Can you explore the environmental impact?")
# Generate final report
costorm_runner.knowledge_base.reorganize()
article = costorm_runner.generate_report()
print(article)
Advanced Customization Options
Supported Integrations
STORM supports extensive integrations:
- Language Models: All models supported by LiteLLM
- Embedding Models: All embedding models supported by LiteLLM
- Retrieval Modules: YouRM, BingSearch, VectorRM, SerperRM, BraveRM, SearXNG, DuckDuckGoSearchRM, TavilySearchRM, GoogleSearch, and AzureAISearch (swapping one in is a one-line change, as sketched below)
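Swapping retrievers only changes the object handed to the runner. As a sketch, here is the basic STORM setup from above with BingSearch in place of YouRM; the k argument is assumed to behave like YouRM's, and BING_SEARCH_API_KEY must be set:
from knowledge_storm.rm import BingSearch

# Drop-in replacement for the YouRM retriever configured earlier;
# engine_args and lm_configs are reused unchanged.
rm = BingSearch(
    bing_search_api_key=os.getenv("BING_SEARCH_API_KEY"),
    k=engine_args.search_top_k,  # assumed: same top-k knob as YouRM
)
runner = STORMWikiRunner(engine_args, lm_configs, rm)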
Custom Pipeline Modules
STORM's modular architecture allows customization of four key components (a run-level sketch follows this list):
- Knowledge Curation Module: Customize information collection strategies
- Outline Generation Module: Modify how information is organized
- Article Generation Module: Customize content generation formats
- Article Polishing Module: Enhance presentation and refinement
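This modularity also shows up at run time: the four modules map onto the do_* flags used earlier, so you can execute only the stages you are customizing. A minimal sketch, reusing the runner and topic defined above, that stops after the pre-writing stages:
# Run only knowledge curation and outline generation while iterating on a
# customized pre-writing setup; article generation and polishing are skipped.
runner.run(
    topic=topic,
    do_research=True,
    do_generate_outline=True,
    do_generate_article=False,
    do_polish_article=False,
)
runner.post_run()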
Real-World Applications and Use Cases
Academic Research
STORM excels at creating comprehensive literature reviews and research summaries, helping researchers quickly understand complex topics with proper citations.
Content Creation
Content creators use STORM for in-depth article research and outline generation, significantly reducing the time spent in the pre-writing phase.
Educational Support
Educators leverage Co-STORM's collaborative features to guide students through research processes, teaching critical thinking and information synthesis skills.
Business Intelligence
Companies use STORM to generate comprehensive market research reports and competitive analysis documents.
Performance and Recognition
STORM's effectiveness has been validated through rigorous academic evaluation:
- EMNLP 2024 Recognition: Co-STORM was accepted to the main conference
- User Preference: 78% of human evaluators preferred Co-STORM over traditional RAG chatbots
- Community Adoption: Over 70,000 users have tried the research preview
- Academic Impact: Published in NAACL 2024 with significant research contributions
Quick Start Commands
For immediate experimentation:
# Run STORM with GPT models
python examples/storm_examples/run_storm_wiki_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing \
    --do-research \
    --do-generate-outline \
    --do-generate-article \
    --do-polish-article
# Run Co-STORM
python examples/costorm_examples/run_costorm_gpt.py \
    --output-dir $OUTPUT_DIR \
    --retriever bing
The Future of AI-Powered Research
STORM represents more than just another AI tool – it's a glimpse into the future of research and knowledge curation. By combining sophisticated question-asking strategies with multi-perspective analysis and collaborative human-AI interaction, STORM is setting new standards for what's possible in automated research.
The system's modular architecture and extensive API support make it adaptable to virtually any research domain, while its collaborative features ensure that human expertise remains central to the knowledge curation process.
Getting Involved
The STORM project actively welcomes contributions from the community. Whether you're interested in:
- Adding new search engine integrations
- Improving the collaborative discourse protocol
- Enhancing information abstraction capabilities
- Developing new presentation formats
There are opportunities to contribute to this groundbreaking project.
Conclusion
STORM and Co-STORM represent a significant leap forward in AI-powered research and knowledge curation. By automating the complex process of research, outline generation, and article writing while maintaining high standards for citation and accuracy, these systems are transforming how we approach information synthesis.
Whether you're a researcher, content creator, educator, or business professional, STORM offers powerful capabilities that can significantly enhance your research and writing workflows. With its growing community, continuous development, and proven effectiveness, STORM is positioned to become an essential tool in the AI-powered research toolkit.
The combination of sophisticated AI techniques, collaborative features, and modular architecture makes STORM not just a tool for today, but a platform for the future of intelligent research and knowledge curation.
For more expert insights and tutorials on AI and automation, visit us at decisioncrafters.com.