Building JARVIS for Software Companies: A Methodology for AI-Native R&D
Abstract
This paper proposes a methodology for turning AI from an auxiliary tool into a core member of a software company’s R&D team. Through practice at a BI software company with a 10-year history and roughly 57,000 issues, we validated the feasibility of a “Product Knowledge Base” as the memory foundation for an AI R&D hub (JARVIS). The core finding: the bottleneck for AI participation in R&D is not model capability but how well organizational memory is structured. The paper explains, at a conceptual level, why this is needed, what it entails, and what value it delivers; it does not cover specific implementation steps.
1. The Problem: Why Copilot Is Not Enough
Most software companies still use AI at the “code completion” level: GitHub Copilot, Cursor, and assorted IDE plugins. These tools are tactically effective but share a fundamental flaw:
They have no memory.
An engineer facing a bug is not just “writing code.” They need to:
- Know why this module was designed this way three years ago (design decisions)
- Know whether similar bugs have appeared before (historical patterns)
- Know which other modules a change here will affect (cross-module dependencies)
- Know which solutions were previously rejected, and why (rejected features)
- Know where this feature’s test coverage blind spots are (quality map)
This knowledge lives in senior engineers’ heads, is scattered across GitLab comments, and is buried in Slack message history. When those engineers leave, the knowledge leaves with them.
Copilot can help you write a function, but it doesn’t know why that function shouldn’t be written.
2. The Thesis: AI R&D Hubs Need “Memory”
We propose a core thesis:
To upgrade AI from “tool” to “team member,” you must first solve the problem of structuring organizational memory. Model capability alone is not sufficient; structured memory is the necessary condition that is still missing.
By analogy: a newly hired genius engineer (the equivalent of the strongest LLM) who is never told the product’s history, design decisions, past pitfalls, and rejected proposals can only write code that is “technically correct but wrong for the business.”
JARVIS is not a better Copilot. JARVIS is an R&D team member with the company’s complete product memory.
3. The Framework: Three-Layer Temporal Architecture
The core architecture of the Product Knowledge Base is a three-layer structure organized by temporal state:
History: Everything That Has Happened
Covers all key facts from the product’s first line of code to today:
- Module deep documentation: the architecture, code paths, known issue patterns, design decisions, and test coverage of each functional module
- Cross-module dependency matrix: which changes in Module A affect Module B
- Breaking changes index: every breaking change in the product’s history
- Rejected features index: every rejected feature request and the reason it was rejected
The core question History answers: **“Has this happened before? What was the outcome?”**
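As an illustration of how the cross-module dependency matrix in the History layer might be queried, here is a minimal sketch. The module names and the `DEPENDENCIES` mapping are hypothetical sample data, and a breadth-first walk is only one plausible way to expand direct dependencies into a full impact list; the source does not specify how the matrix is stored.

```python
from collections import deque

# Hypothetical dependency matrix: key = module that changes,
# value = modules directly affected by that change.
DEPENDENCIES = {
    "filter": ["dashboard", "export"],
    "dashboard": ["scheduler"],
    "export": [],
    "scheduler": [],
}


def affected_modules(changed, deps):
    """Return every module reachable from `changed`, in breadth-first order."""
    seen = []
    queue = deque(deps.get(changed, []))
    while queue:
        module = queue.popleft()
        # Skip modules already recorded and guard against cycles back to the start.
        if module in seen or module == changed:
            continue
        seen.append(module)
        queue.extend(deps.get(module, []))
    return seen


print(affected_modules("filter", DEPENDENCIES))
```

Storing only direct edges and computing the transitive set on demand keeps the matrix small enough to review in a merge request.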
Present: Current State Snapshot
Reflects the product’s current state in real time (or near real time):
- Backlog snapshot: all unfinished issues, categorized by module, priority, and version
- Version plan: schedules for the current and upcoming versions
- Team configuration: who is responsible for which module
The core question Present answers: **“What is the current situation?”**
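A backlog snapshot of the kind described for the Present layer could be as simple as a count of open issues per (module, priority, version) bucket. The sketch below assumes a hypothetical `Issue` record with those fields; the real issue schema and version names are not given in the source, so the sample data is invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    # Hypothetical fields; a real tracker export would carry many more.
    id: int
    module: str
    priority: str
    version: str
    closed: bool


def backlog_snapshot(issues):
    """Count open issues per (module, priority, version) bucket."""
    return Counter(
        (issue.module, issue.priority, issue.version)
        for issue in issues
        if not issue.closed
    )


SAMPLE = [
    Issue(1, "filter", "P1", "v5.2", closed=False),
    Issue(2, "filter", "P1", "v5.2", closed=False),
    Issue(3, "export", "P2", "v5.3", closed=False),
    Issue(4, "export", "P2", "v5.3", closed=True),
]

for bucket, count in sorted(backlog_snapshot(SAMPLE).items()):
    print(bucket, count)
```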
Future: AI’s Judgment Output
This is the layer where JARVIS truly generates value, making product-manager-level judgments based on History and Present knowledge:
- Duplicate detection: Is this new issue a duplicate of a historical one?
- Root cause analysis: What is the underlying cause of this bug, a design flaw or an implementation mistake?
- Cross-module impact assessment: Which other modules will fixing this bug affect?
- Implementation complexity estimation: Extrapolated from the hours spent on similar historical fixes
- Historical lesson retrieval: Have similar decisions caused problems before?
The core question Future answers: **“What should we do?”**
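Two of the Future-layer judgments, duplicate detection and effort estimation, can be sketched in a few lines. This is a deliberately naive stand-in, not the method used in the practice described here: it compares English issue titles by Jaccard token overlap and takes the median of past fix hours. The threshold, issue IDs, titles, and hours are all assumptions made for illustration.

```python
import re
from statistics import median


def tokens(text):
    """Lowercase alphanumeric tokens; assumes English titles."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of tokens two titles have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_duplicates(new_title, history, threshold=0.5):
    """Return historical issue ids whose titles overlap enough, best match first."""
    scored = [(jaccard(new_title, title), issue_id) for issue_id, title in history.items()]
    return [issue_id for score, issue_id in sorted(scored, reverse=True) if score >= threshold]


def estimate_hours(similar_fix_hours):
    """Median hours of similar past fixes, or None when there is no history."""
    return median(similar_fix_hours) if similar_fix_hours else None


HISTORY = {
    101: "Filter snapshot fails after field rename",
    102: "Export to PDF cuts off wide tables",
    103: "Dashboard loads slowly with many charts",
}

print(find_duplicates("Filter snapshot breaks after field rename", HISTORY))
print(estimate_hours([4, 6, 20]))
```

The median is used instead of the mean so that one unusually long fix does not dominate the estimate.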
4. Key Insights
4.1 Rejected Features Are the Most Valuable Knowledge
In our practice, 7,311 of the roughly 57,000 issues are rejected features (closed as by design / wontfix / not a bug / duplicate).
These rejected features hold knowledge more important than the implemented features do: they record **why we decided not to do something**. An AI that doesn’t know the “why not” will keep proposing solutions that were already rejected, wasting everyone’s time.
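A rejected features index could be derived directly from the issue tracker by filtering on the four resolution labels named above. The sketch below assumes each issue is a dictionary with hypothetical `module`, `resolution`, and `reason` fields; the sample issues and their reasons are invented.

```python
# The four resolutions the text counts as "rejected".
REJECTION_LABELS = {"by design", "wontfix", "not a bug", "duplicate"}


def build_rejected_index(issues):
    """Group rejected issues by module, keeping the recorded reason."""
    index = {}
    for issue in issues:
        if issue["resolution"] in REJECTION_LABELS:
            index.setdefault(issue["module"], []).append(
                {"id": issue["id"], "reason": issue["reason"]}
            )
    return index


# Invented sample data for illustration only.
SAMPLE_ISSUES = [
    {"id": 11, "module": "filter", "resolution": "wontfix",
     "reason": "conflicts with snapshot semantics"},
    {"id": 12, "module": "filter", "resolution": "fixed", "reason": ""},
    {"id": 13, "module": "export", "resolution": "by design",
     "reason": "row limit protects server memory"},
]

for module, entries in sorted(build_rejected_index(SAMPLE_ISSUES).items()):
    print(module, entries)
```

Keeping the reason next to the issue id is the point of the index: the "why not" is what stops a rejected proposal from being raised again.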
4.2 94 Bugs Can Stem from One Design Flaw
We conducted a deep analysis of the “filter snapshot” feature: since its launch in 2022, it has accumulated 94 bugs. All 94 trace back to a single design flaw: the feature launched without clearly defined boundaries (which scenarios are supported, which are not, and how a snapshot is invalidated after a field is deleted or renamed).
An AI without a product knowledge base will fix these 94 bugs one by one. An AI with one will point out that this is a structural design flaw and that the feature’s boundaries need to be redefined from the ground up.
This is the difference between a “tool” and a “team member.”
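One way to surface this kind of pattern is to count bugs per (module, root cause) pair and flag any pair that crosses a threshold. The sketch assumes every bug already carries a `root_cause` tag assigned during triage, which is the hard part and is not shown; the tag names, the threshold of 5, and the sample bugs are hypothetical.

```python
from collections import Counter


def structural_flaws(bugs, min_cluster=5):
    """Return (module, root_cause) pairs shared by at least `min_cluster` bugs."""
    counts = Counter((bug["module"], bug["root_cause"]) for bug in bugs)
    return [(key, n) for key, n in counts.most_common() if n >= min_cluster]


# Invented sample: six bugs with one shared cause, plus one unrelated bug.
BUGS = [
    {"id": i, "module": "filter-snapshot", "root_cause": "undefined-boundary"}
    for i in range(1, 7)
] + [{"id": 7, "module": "export", "root_cause": "encoding"}]

print(structural_flaws(BUGS))
```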
4.3 The Knowledge Base Must Co-evolve with Code
The product knowledge base is not a one-time documentation project. It must:
- Be managed alongside the code in Git (version control, MR review, branching strategy)
- Auto-update known-issues after each bug fix
- Record decisions after each design choice
- Periodically sync backlog snapshots from the issue system
If the knowledge base falls out of sync with the code, the AI reads outdated information, and its judgments become more dangerous than they would be with no knowledge base at all.
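A simple guard against this drift is to compare, per module, when the code last changed with when its knowledge base entry was last updated. The sketch below takes both timestamps as plain dictionaries; in practice they would presumably come from Git history, but that wiring, the seven-day tolerance, and the sample dates are assumptions.

```python
from datetime import datetime, timedelta


def stale_modules(code_changed_at, docs_updated_at, tolerance=timedelta(days=7)):
    """List modules whose docs are missing or lag the code by more than `tolerance`."""
    stale = []
    for module, code_time in sorted(code_changed_at.items()):
        doc_time = docs_updated_at.get(module)
        if doc_time is None or code_time - doc_time > tolerance:
            stale.append(module)
    return stale


# Invented timestamps for illustration only.
CODE = {
    "filter": datetime(2026, 3, 20),
    "export": datetime(2026, 3, 1),
    "scheduler": datetime(2026, 3, 10),
}
DOCS = {
    "filter": datetime(2026, 2, 1),   # lags the code by 47 days
    "export": datetime(2026, 3, 2),   # newer than the code
    # "scheduler" has no knowledge base entry at all
}

print(stale_modules(CODE, DOCS))
```

Run as a scheduled job or a CI step, a check like this turns "the docs are probably out of date" into a concrete list someone can act on.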
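5. Value Model
For Engineers
| Scenario | Without JARVIS | With JARVIS |
|---|---|---|
| Locating a bug | Read code + ask colleagues + search issues (30-120 min) | Search the knowledge base + match historical patterns (5-15 min) |
| Assessing impact scope | Guess from experience | Flagged automatically by the cross-module dependency matrix |
| Avoiding repeated proposals | Unaware that someone proposed it before | Automatic alert from the rejected features index |
| Onboarding new hires | 3-6 months | 1-2 months (with complete module docs and design decisions) |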
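For Product Managers
| Scenario | Without JARVIS | With JARVIS |
|---|---|---|
| Backlog prioritization | Gut feeling + customer pressure | Multi-dimensional scoring backed by historical data |
| Requirement deduplication | Manual comparison | Automatic cross-search against roughly 57,000 historical issues |
| Version planning | Estimates in Excel | Implementation complexity forecasts based on historical fix effort |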
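The multi-dimensional scoring mentioned for backlog prioritization could take the form of a weighted sum over normalized signals. The signal names and weights below are invented purely for illustration; the source does not say which dimensions or weights are used, and the sample items are hypothetical.

```python
# Hypothetical weights; effort is negative so costlier items rank lower.
WEIGHTS = {
    "customer_impact": 0.4,
    "recurrence": 0.3,
    "cross_module_risk": 0.2,
    "effort": -0.1,
}


def priority_score(signals, weights=WEIGHTS):
    """Weighted sum of signals, each expected in the range 0.0 to 1.0."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def rank_backlog(items):
    """Sort backlog items from highest to lowest priority score."""
    return sorted(items, key=lambda item: priority_score(item["signals"]), reverse=True)


# Invented sample items.
ITEMS = [
    {"id": "A", "signals": {"customer_impact": 0.9, "recurrence": 0.8,
                            "cross_module_risk": 0.5, "effort": 0.3}},
    {"id": "B", "signals": {"customer_impact": 0.2, "recurrence": 0.1,
                            "cross_module_risk": 0.1, "effort": 0.9}},
    {"id": "C", "signals": {"customer_impact": 0.6, "recurrence": 0.5,
                            "cross_module_risk": 0.9, "effort": 0.5}},
]

print([item["id"] for item in rank_backlog(ITEMS)])
```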
For the Organization
- **Knowledge doesn’t leave when people leave**: all design decisions and historical lessons are stored in structured form
- **AI capability grows with the knowledge base**: it depends not on a stronger model but on richer memory
- **R&D decisions shift from “experience-driven” to “data-driven”**: every decision has historical backing
6. Prerequisites
This methodology is not a silver bullet for every company. It has clear prerequisites:
- The product must have sufficient history: at least 3-5 years of issue tracking data. If your product has just started, get issue management right first.
- Issue management must be structured: a label taxonomy, module classification, and milestone management. Garbage in, garbage out.
- Someone must understand both product and technology: building the initial knowledge base requires a person who understands both the product logic and the technical implementation to guide the AI. A knowledge base generated purely by AI is garbage.
- The organization must be willing to change its workflow: knowledge base maintenance has to be part of the daily R&D process, not an extra burden.
7. Levels of the Methodology
We abstract this methodology into three progressive levels, suited to companies at different stages:
Level 1: Passive Memory
- Organize existing docs, issues, and code comments into a structured knowledge base
- AI can search and cite it, but does not participate proactively
- Value: faster information retrieval, less redundant work
Level 2: Active Participation
- AI automatically classifies new issues, deduplicates them, and assesses their impact
- AI participates in Code Review, offering review comments informed by historical knowledge
- The knowledge base is integrated with CI/CD and updates automatically
- Value: AI becomes part of the R&D workflow
Level 3: Autonomous Evolution
- AI autonomously maintains and extends the knowledge base
- AI discovers knowledge gaps and fills them proactively
- AI offers data-driven suggestions on product direction
- Value: AI becomes a core participant in product evolution
Our practice is in the transition from Level 1 to Level 2. Level 3 is a long-term goal.
8. How This Differs from Existing Paradigms
| Paradigm | Characteristics | Limitations |
|---|---|---|
| RAG (Retrieval-Augmented Generation) | Load documents into a vector store and retrieve at query time | No temporal structure; does not distinguish history / present / future |
| Code Agent | Reads code + writes code | Looks only at code, not product history; doesn’t know the “why” |
| Knowledge graph | Models entity relationships as triples | Too rigid, costly to maintain, hard to express “why something was rejected” |
| This methodology | Three temporal layers × modular × co-evolves with code | Requires upfront investment and ongoing maintenance |
Key difference: **this methodology emphasizes “memory” over “retrieval.”** Memory is organized, causal, and self-updating. Retrieval merely finds similar fragments in a pile of text.
9. Conclusion
The software industry is moving from “AI-assisted coding” to “AI-participated R&D.” But most companies are stuck in between: they have powerful LLMs, yet lack the structured memory that would let those LLMs truly understand their products.
JARVIS is not a product, not a tool, not a SaaS. It is an organizational capability. Its core is not which LLM to choose or which framework to use, but how to turn a decade of organizational memory into structured knowledge that AI can understand and reason about.
Models iterate and frameworks come and go, but structured product memory only becomes more valuable over time.
This paper is based on the author’s hands-on practice at a BI software company with 10 years of history, using OpenClaw as the runtime for the AI R&D hub.
Author: Thomas Chan
Date: 2026-03-23
Tags: AI Native, JARVIS, Product Knowledge, Software Engineering, OpenClaw
