MiniMax-M1 Model Details

Complete technical specifications, architecture details, and performance benchmarks of the world's first open-source hybrid attention reasoning model.

Model Overview

We introduce MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. MiniMax-M1 is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism.

The model is developed based on our previous MiniMax-Text-01 model, which contains a total of 456 billion parameters with 45.9 billion parameters activated per token. Consistent with MiniMax-Text-01, the M1 model natively supports a context length of 1 million tokens, 8x the context size of DeepSeek R1.

Furthermore, the lightning attention mechanism in MiniMax-M1 enables efficient scaling of test-time compute – For example, compared to DeepSeek R1, M1 consumes 25% of the FLOPs at a generation length of 100K tokens. These properties make M1 particularly suitable for complex tasks that require processing long inputs and thinking extensively.

Detailed Performance Benchmarks

Comprehensive evaluation results across multiple categories

Category Task MiniMax-M1-80K MiniMax-M1-40K DeepSeek-R1 Claude 4 Opus OpenAI-o3
Mathematics AIME 2024 86.0 83.3 79.8 76.0 91.6
AIME 2025 76.9 74.6 70.0 75.5 88.9
MATH-500 96.8 96.0 97.3 98.2 98.1
Coding LiveCodeBench 65.0 62.3 55.9 56.6 75.8
FullStackBench 68.3 67.6 70.1 70.3 69.3
Reasoning GPQA Diamond 70.0 69.2 71.5 79.6 83.3
ZebraLogic 86.8 80.1 78.7 95.1 95.8
MMLU-Pro 81.1 80.6 84.0 85.0 85.0
SWE-bench Verified 56.0 55.6 49.2 72.5 69.1
Long Context OpenAI-MRCR (128k) 73.4 76.1 35.8 48.9 56.5
OpenAI-MRCR (1M) 56.2 58.6 -- -- --
LongBench-v2 61.5 61.0 58.3 55.6 58.8

Usage Recommendations

Optimal settings for different scenarios

⚙️

Inference Parameters

Temperature: 1.0

Top_p: 0.95

Optimal for creativity and diversity while maintaining logical coherence.

💬

General Purpose

System Prompt:

"You are a helpful assistant."

For summarization, translation, Q&A, creative writing.

🔢

Mathematical Tasks

System Prompt:

"Please reason step by step, and put your final answer within \boxed{}."

For calculation and logical deduction problems.

Function Calling

Advanced capabilities for tool integration

Function Calling Support

The MiniMax-M1 model supports function calling capabilities, enabling the model to identify when external functions need to be called and output function call parameters in a structured format.

Tool Integration Structured Output Agentic Applications

API & Chatbot

For general use and evaluation, we provide online services and development tools.

Online Chat Developer API MCP Server

Citation

@misc{minimax2025minimaxm1scalingtesttimecompute,
      title={MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention}, 
      author={MiniMax},
      year={2025},
      eprint={2506.13585},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.13585}, 
}