Flamehaven Labs' evidence-based alignment with the Leiden Declaration on Artificial Intelligence and Mathematics.
2 June 2026 · DOI: 10.5281/zenodo.20302944
Every claim on this page is backed by a public, reproducible verification record — not a pledge.
The Declaration identifies five core values of mathematical research — proof-based certainty, attributable authorship, transparent verifiability, shared evaluation standards, and disciplinary autonomy — and documents how current AI deployments threaten each one. It calls on individual mathematicians, institutions, funders, and AI companies to act.
Below we map each Declaration recommendation to a concrete Flamehaven artifact or practice. "Implemented" means the practice exists and is verifiable from public records today.
Every EQA verification record explicitly classifies AI tool involvement. The highest-profile example: EQA-TEST-0057 is formally labeled a "High-Formality, Fake Physics Slop Artifact" — the AI-generated codebase passed execution checks but rested on physically ungrounded assumptions. This classification is not buried in a footnote; it is the primary verdict surfaced on the card and in the Inspector.
All verification results are published under CC BY-NC 4.0. Every experiment with an external artifact carries a Zenodo DOI for citable, archival reference. The bioscience compliance scanner (STEM-BIO-AI) is fully open-source under the MIT license; to reproduce any BSC audit, run pip install stem-ai followed by stem scan --level 3.
Flamehaven does not conflate execution success with physical validity. EQA-TEST-0058 reports DEGRADED_PASS (48 / 0 / 1 / 1) — surfacing two specific failures rather than issuing a clean PASS. The result record states: "Verified for model-output consistency only, not external physical validity." Responsibility for the correctness boundary is explicitly retained by the human reviewer, not delegated to the pipeline.
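The separation between a verdict and the scope it covers can be sketched as follows. This is a minimal illustration and not Flamehaven's harness: the Verdict type, the summarize function, the check names, and the rule that any failed check downgrades PASS to DEGRADED_PASS are all assumptions. The sketch does not attempt to interpret the four counts reported for EQA-TEST-0058.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Verdict:
    status: str                # "PASS" or "DEGRADED_PASS" (assumed two-level scheme)
    failures: Tuple[str, ...]  # names of the checks that did not pass
    scope: str                 # what the verdict does and does not cover


def summarize(check_results: Dict[str, bool], scope: str) -> Verdict:
    """Assumed rule: any failed check prevents a clean PASS."""
    failures = tuple(name for name, ok in check_results.items() if not ok)
    status = "PASS" if not failures else "DEGRADED_PASS"
    return Verdict(status=status, failures=failures, scope=scope)


# Hypothetical check names; the real harness checks are not listed in the text.
checks = {
    "runs_to_completion": True,
    "outputs_match_reference": True,
    "units_consistent": False,
}

verdict = summarize(
    checks,
    scope="Verified for model-output consistency only, not external physical validity.",
)
```

Keeping the scope statement as a field of the verdict, rather than in surrounding prose, means a reader of the record cannot see the status without also seeing what it was verified against.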
All experiment records attribute authorship to the human reviewer (Flamehaven Labs, ORCID 0009-0009-2641-4280). No AI system is listed as author or credited with verdicts. The governance pipeline (LOGOS → LawBinder → SPAR) acts as a structured constraint layer; human sign-off is required before any result is published to the ledger.
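The sign-off requirement can be pictured as a guard in front of the ledger. The sketch below is an assumption-laden illustration: only the stage names LOGOS, LawBinder, and SPAR and the reviewer ORCID come from the text, while the Result type, the publish function, the list-based ledger, and the record ID are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Stage names come from the text; everything else here is hypothetical.
PIPELINE_STAGES = ("LOGOS", "LawBinder", "SPAR")


@dataclass
class Result:
    record_id: str
    stages_passed: List[str] = field(default_factory=list)
    signed_off_by: Optional[str] = None  # identifier of the human reviewer, e.g. an ORCID


def publish(result: Result, ledger: List[str]) -> None:
    """Append a record to the ledger only after all stages and a human sign-off."""
    if tuple(result.stages_passed) != PIPELINE_STAGES:
        raise PermissionError(f"{result.record_id}: pipeline stages incomplete")
    if result.signed_off_by is None:
        raise PermissionError(f"{result.record_id}: human sign-off missing")
    ledger.append(result.record_id)


ledger: List[str] = []
result = Result("EQA-TEST-EXAMPLE", stages_passed=list(PIPELINE_STAGES))

rejected = False
try:
    publish(result, ledger)  # rejected: no reviewer has signed off yet
except PermissionError:
    rejected = True

result.signed_off_by = "0009-0009-2641-4280"
publish(result, ledger)      # accepted once a human reviewer is recorded
```

Note that the pipeline stages never set signed_off_by themselves; in this sketch that field can only be filled from outside the pipeline, which mirrors the point that the pipeline constrains results but does not author them.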
Every EQA record cites the source paper (title, authors, DOI), the specific artifact version under test (repository commit or Zenodo release), and the verification harness version. The OpenAI Erdős Conjecture record (EQA-TEST-0056) credits the original proof authors and prior human work, rather than the AI model alone, with the decisive intellectual contribution.
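The citation fields listed above map naturally onto a small record type with a completeness check. This is a sketch under stated assumptions: the Provenance type, its field names, and the missing_fields helper are hypothetical and do not reflect the actual EQA record schema, and the values are placeholders rather than data from any real record.

```python
from dataclasses import dataclass, fields
from typing import Tuple


@dataclass(frozen=True)
class Provenance:
    paper_title: str
    paper_authors: Tuple[str, ...]
    paper_doi: str
    artifact_version: str  # repository commit or Zenodo release
    harness_version: str


def missing_fields(p: Provenance) -> Tuple[str, ...]:
    """Return the names of provenance fields that are empty."""
    return tuple(f.name for f in fields(p) if not getattr(p, f.name))


# Placeholder values only; none of these refer to a real record.
draft = Provenance(
    paper_title="Example paper",
    paper_authors=("A. Author",),
    paper_doi="",
    artifact_version="<commit-or-release>",
    harness_version="<harness-version>",
)

gaps = missing_fields(draft)  # a record with gaps would not be ready to publish
```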
Flamehaven's governance pipeline enforces hard boundaries. The LawBinder gate issues INHIBIT when a candidate lacks a concrete algebraic model (EQA-TEST-0054: contract score 0.625, dangerous-pass risk 1.0). No result proceeds to SPAR review without passing the intake gate. These gate decisions are documented in the public record, not in an internal checklist.
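The gate rule described here can be sketched as a single predicate. The following is an illustration only: INHIBIT and the EQA-TEST-0054 figures come from the text, while the PROCEED outcome, the Candidate type, and the function names are assumptions. The text does not say how the contract score or dangerous-pass risk feed into the decision, so the sketch carries them as metadata and gates only on the stated condition.

```python
from dataclasses import dataclass
from enum import Enum


class GateDecision(Enum):
    PROCEED = "PROCEED"  # hypothetical name for the non-blocking outcome
    INHIBIT = "INHIBIT"  # outcome named in the text


@dataclass(frozen=True)
class Candidate:
    record_id: str
    has_algebraic_model: bool
    contract_score: float       # recorded, but not used by this sketch
    dangerous_pass_risk: float  # recorded, but not used by this sketch


def lawbinder_gate(candidate: Candidate) -> GateDecision:
    """Block any candidate that lacks a concrete algebraic model."""
    if not candidate.has_algebraic_model:
        return GateDecision.INHIBIT
    return GateDecision.PROCEED


def eligible_for_spar(candidate: Candidate) -> bool:
    """Only candidates that clear the gate may move on to SPAR review."""
    return lawbinder_gate(candidate) is GateDecision.PROCEED


candidate = Candidate(
    record_id="EQA-TEST-0054",
    has_algebraic_model=False,
    contract_score=0.625,
    dangerous_pass_risk=1.0,
)
decision = lawbinder_gate(candidate)
```

Returning an explicit decision value, rather than raising an error, lets the inhibited outcome itself be written to the public record.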
Flamehaven does not announce results through press releases. Every published result on this ledger has a Zenodo DOI or a citable peer-reviewed paper as its anchor. The verification record is the publication — not a blog post summary of a private run.
The STEM-BIO-AI bioscience compliance scanner is fully open-source (MIT license) and self-contained — no cloud API, no proprietary dependency. Any researcher can reproduce any BSC audit in this ledger from the committed repository state. The ledger itself is a static GitHub Pages site: zero vendor lock-in, zero runtime cost.