The Last Human-Written Paper

Agent-Native Research Artifacts

Posted on April 24, 2026 by Amber Liu

In the near future, most CS papers will be written by AI, and most will be read by AI.

When neither the author nor the audience is human, the three-century-old paper format stops making sense. Papers flatten a branching research process into a clean story, and that flattening imposes two taxes.

The Storytelling Tax

Research is inherently branching and exploratory. Scientists try dozens of approaches, hit dead ends, pivot, and iterate. But papers collapse this rich process into a single winning narrative, discarding every failed attempt, rejected hypothesis, and negative result.

[Figure: The Real Research Process vs. What Gets Published]

Research explores many branches, but papers only report the winning path. The map of where not to go, often the most expensive knowledge a project produces, never leaves the lab.

The Engineering Tax

Papers describe methods at the precision needed to convince a reviewer, not at the precision needed to reproduce the work. Hyperparameters are underspecified. Warmup schedules live in someone's head. Numerical stability fixes exist in no document. The gap between "sufficient to believe" and "sufficient to execute" is where reproduction breaks down.
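To make the gap concrete, here is a minimal sketch. It assumes a hypothetical `Hyperparam` record and made-up values; none of these names or numbers come from the paper. It contrasts a paper-level description, where some settings are simply absent, with a spec complete enough to execute.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hyperparam:
    """One training setting; every field here is an illustrative assumption."""
    name: str
    value: Optional[float]  # None means the document never states it
    search_range: Optional[Tuple[float, float]] = None  # range explored, if recorded


def underspecified(params: List[Hyperparam]) -> List[str]:
    """Return the names of settings a reader would have to rediscover."""
    return [p.name for p in params if p.value is None]


# "Sufficient to believe": enough detail to convince a reviewer.
paper_level = [
    Hyperparam("learning_rate", 3e-4),
    Hyperparam("warmup_steps", None),
    Hyperparam("grad_clip_norm", None),
]

# "Sufficient to execute": every value pinned, with the range that was searched.
executable = [
    Hyperparam("learning_rate", 3e-4, search_range=(1e-5, 1e-3)),
    Hyperparam("warmup_steps", 2000, search_range=(0, 10000)),
    Hyperparam("grad_clip_norm", 1.0, search_range=(0.5, 5.0)),
]

missing_in_paper = underspecified(paper_level)
missing_in_spec = underspecified(executable)
print(missing_in_paper)  # ['warmup_steps', 'grad_clip_norm']
print(missing_in_spec)   # []
```

Every name in `missing_in_paper` is a value that a reproducing agent has to guess or search for again.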

Reproduction Information Gap

8,921 expert-annotated reproduction requirements across 23 ICML papers (PaperBench)

Fully specified in PDF: 45.4%
Missing hyperparameters: 26.2%
Vague description: 21.9%
Cross-reference only: 13.4%
Missing code / baseline detail: 21.7%

Less than half of what an agent needs to reproduce a paper is actually in the PDF.

The information exists somewhere (a lab notebook, a Slack thread, the original author's muscle memory), but not in any document an AI agent can access. Every reproduction attempt pays the full cost of rediscovering it.
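A quick check of the figures reported above: the five shares sum to more than 100%, so they are presumably not mutually exclusive. That reading is an assumption, since the source does not say how the categories overlap.

```python
# Shares reported above, in percent of annotated reproduction requirements.
shares = {
    "Fully specified in PDF": 45.4,
    "Missing hyperparameters": 26.2,
    "Vague description": 21.9,
    "Cross-reference only": 13.4,
    "Missing code / baseline detail": 21.7,
}

# Share of requirements that are NOT fully specified in the PDF.
gap = round(100.0 - shares["Fully specified in PDF"], 1)

# Sum of all reported shares; a value above 100 suggests overlapping labels
# (an inference, not something the source states).
total = round(sum(shares.values()), 1)

print(gap)    # 54.6
print(total)  # 128.6
```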

The Solution: Four Interlocking Layers

An Agent-Native Research Artifact (ARA) restructures a paper into four machine-native layers. Together they form a single executable knowledge package: the organized, evolving knowledge produced during research, not the narrative compiled afterward.

PAPER.md                      # Human-readable overview & entry point
│
├── logic/                    # Cognitive Layer
│   ├── claims.yaml           # Falsifiable claims with epistemic status
│   ├── concepts/             # Formal concept definitions
│   ├── experiments/          # Declarative experiment plans
│   └── problem_spec.md       # The "what and why" of the research
│
├── src/                      # Physical Layer
│   ├── kernel/               # Novel algorithm core
│   ├── configs/              # Annotated with search ranges & sensitivity
│   └── environment.yaml      # Exact reproducibility spec
│
├── trace/                    # Exploration Graph
│   ├── graph.json            # Full branching research DAG
│   ├── dead_ends/            # Every failed attempt preserved
│   └── pivots/               # Decision points & lessons learned
│
└── evidence/                 # Evidence Layer
    ├── results/              # Machine-readable quantitative outputs
    ├── logs/                 # Raw experiment logs
    └── curves/               # Training curves & metrics
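As an illustration of what `trace/graph.json` might hold, the sketch below parses a small exploration graph and separates the path that reached the final result from the dead ends. The schema (`nodes`, `edges`, `status`) and the node names are assumptions made for this example; the source does not specify the file format.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical contents of trace/graph.json (schema assumed, not from the source).
GRAPH_JSON = """
{
  "nodes": [
    {"id": "baseline",     "status": "kept"},
    {"id": "bigger_model", "status": "dead_end"},
    {"id": "lr_warmup",    "status": "kept"},
    {"id": "alt_loss",     "status": "dead_end"},
    {"id": "final_method", "status": "kept"}
  ],
  "edges": [
    ["baseline", "bigger_model"],
    ["baseline", "lr_warmup"],
    ["lr_warmup", "alt_loss"],
    ["lr_warmup", "final_method"]
  ]
}
"""


def load_graph(text: str) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Parse the JSON into a status map and a parent -> children adjacency map."""
    data = json.loads(text)
    status = {node["id"]: node["status"] for node in data["nodes"]}
    children: Dict[str, List[str]] = {node_id: [] for node_id in status}
    for parent, child in data["edges"]:
        children[parent].append(child)
    return status, children


def winning_path(status: Dict[str, str], children: Dict[str, List[str]], root: str) -> List[str]:
    """Walk from the root, following the first 'kept' child at each step."""
    path = [root]
    while True:
        kept = [c for c in children[path[-1]] if status[c] == "kept"]
        if not kept:
            return path
        path.append(kept[0])


def dead_ends(status: Dict[str, str]) -> List[str]:
    """Return every abandoned node, sorted by id for stable output."""
    return sorted(node_id for node_id, s in status.items() if s == "dead_end")


status, children = load_graph(GRAPH_JSON)
published = winning_path(status, children, "baseline")
abandoned = dead_ends(status)
print(published)  # ['baseline', 'lr_warmup', 'final_method']
print(abandoned)  # ['alt_loss', 'bigger_model']
```

In this toy graph, a narrative paper corresponds to `published` alone, while the artifact also keeps `abandoned`, so a later agent can skip branches that have already failed.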

Live Research Manager

ARA doesn't require researchers to manually package their work. The Live Research Manager silently captures the research trajectory during AI-human collaboration: no interruptions, no extra effort. The artifact builds itself in the background.

Collaborate with AI on research, and the trajectory is automatically captured with epistemic provenance: every claim tagged with who proposed it, who verified it, and how strongly the evidence supports it.
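A provenance-tagged claim could be represented roughly as follows. This is a sketch under stated assumptions: the `Claim` fields, the `Evidence` levels, the example statements, and the rule that self-verified claims still need review are all hypothetical, not the format used by ARA.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Evidence(Enum):
    """Hypothetical three-level scale for how strongly evidence supports a claim."""
    WEAK = 1
    MODERATE = 2
    STRONG = 3


@dataclass(frozen=True)
class Claim:
    statement: str
    proposed_by: str            # "human" or "ai" (hypothetical labels)
    verified_by: Optional[str]  # None until someone checks the claim
    evidence: Evidence


def needs_review(claims: List[Claim]) -> List[Claim]:
    """Claims that are unverified, or verified only by whoever proposed them."""
    return [
        c for c in claims
        if c.verified_by is None or c.verified_by == c.proposed_by
    ]


# Made-up example claims, for illustration only.
claims = [
    Claim("Warmup removes early loss spikes", "ai", "human", Evidence.STRONG),
    Claim("Larger batch size hurts final accuracy", "human", None, Evidence.WEAK),
    Claim("Gradient clipping is required for stability", "ai", "ai", Evidence.MODERATE),
]

flagged = [c.statement for c in needs_review(claims)]
print(flagged)
```

Keeping proposer and verifier as separate fields lets a reader, human or agent, tell independently checked claims apart from assertions that only their author has vouched for.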

Silent integration · Epistemic objectivity · Framework independence · Comprehensive capture · Faithful translation

Results

A paper shows the path taken. ARA remembers the paths abandoned and the choices that made the road. We evaluate at three levels: understanding, reproduction, and extension.

Understanding: +21.3pp
93.7% vs 72.4% across 450 questions
PaperBench + RE-Bench · wins every category

Reproduction: +7.0pp
64.4% vs 57.4%; advantage grows with difficulty
150 subtasks · 15 PaperBench papers

Extension: 3/5
Tasks where ARA wins on best score; reaches a useful first move earlier on all 5
5 RE-Bench tasks · MALT failure traces

Knowledge over Narrative

The organized, evolving knowledge produced during research is the primary scientific object. The narrative paper is a compiled view.

Cite

@article{liu2026ara,
  title   = {The Last Human-Written Paper: Agent-Native Research Artifacts},
  author  = {Liu, Jiachen and Pei, Jiaxin and Huang, Jintao and Si, Chenglei and Qu, Ao and Tang, Xiangru and Lu, Runyu and Chen, Lichang and Bai, Xiaoyan and Zheng, Haizhong and Chen, Carl and Chen, Zhiyang and Ye, Haojie and Fu, Yujuan and He, Zexue and Jin, Zijian and Zhang, Zhenyu and Sun, Shangquan and Harmon, Maestro and Wang, Dianzhuo and Zeng, Jianqiao and Sun, Jiachen and Wu, Mingyuan and Zhou, Baoyu and You, Chenyu and Lu, Shijian and Qiu, Yiming and Lai, Fan and Yuan, Yuan and Li, Yao and Hong, Junyuan and Zhu, Ruihao and Chen, Beidi and Pentland, Alex and Chen, Ang and Chowdhury, Mosharaf and Zhang, Zechen},
  year    = {2026},
  journal = {arXiv preprint arXiv:2604.24658},
  url     = {https://arxiv.org/abs/2604.24658}
}