STEM BIO-AI Local Audit
60 / 100
STEM BIO-AI Local Audit  |  2026-05-21

Runchuan-BU/BioClaw

T2 Caution  |  Deterministic local scan  |  No LLM / no network / no runtime execution

Research reference and supervised non-clinical technical review only.

Executive Summary

Final Score: 60
S1 Intent: 70
S2 Repo: 50
S3 Code/Bio: 54
S4 Replication: 35

TL;DR

Decision memo

This repository lands at T2 Caution with a final score of 60/100. The result is driven more by boundary, workflow-support, and governance weaknesses than by classic code-pattern failures.

Policy: default  |  Status: authoritative_release  |  Mode: mirror_only
Primary Risks

What pushed the review down

  • Clinical-adjacent surfaces exist without an explicit non-diagnostic/non-clinical boundary.
  • C5_compliance_boundary_integrity: WARN
Positive Evidence

What still supports reviewability

  • Package metadata was available for repo-local consistency checks.
  • CI workflow files were detected.
  • Documentation files were detected.
Freshness

When to re-check

Review after 45 days. Expires on 2026-07-05.

Change-triggered re-audit now: False
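
The expiry date follows from the report date and the 45-day review window. A minimal sketch, assuming the window is counted in calendar days from 2026-05-21 and that the report only becomes stale after the expiry date has passed; the helper name `needs_recheck` is hypothetical and not part of the tool:

```python
from datetime import date, timedelta

SCAN_DATE = date(2026, 5, 21)        # report date shown in the header
REVIEW_WINDOW = timedelta(days=45)   # "Review after 45 days"

expires_on = SCAN_DATE + REVIEW_WINDOW
print(expires_on.isoformat())        # 2026-07-05


def needs_recheck(today: date, change_triggered: bool = False) -> bool:
    """Hypothetical helper: stale after expiry, or when a change trigger fires."""
    return change_triggered or today > expires_on
```

With "Change-triggered re-audit now: False", this sketch would only ask for a re-check from 2026-07-06 onward.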

Stage 2R Focus

Repo-local contradictions

  • R2R_D2_missing_clinical_use_boundary: -20
    R2R_D2_missing_clinical_use_boundary: clinical_adjacent=True and explicit non-clinical boundary was not detected
  • R2R_3_readme_test_ci_alignment: +10
    R2R_3_readme_test_ci_alignment: workflow/test support terms found across README and local support surfaces
Stage 3 Focus

Accountability surfaces

  • T1_CI_CD: +15
    S3_T1_workflow_files: workflow files present under .github/workflows/
  • B1_data_provenance_controls: +15
    S3_B1_dependency_manifest: dependency or lock manifest presence plus data-source, dataset, or IRB language review
  • B2_bias_limitations: +8
    S3_B2_bias_limitations: bias/limitations vocabulary with optional measurement-evidence escalation
Policy Boundary

How to read this artifact

Mirror-only policy surface: selected profile metadata is shown in this report, but authoritative scan scoring still follows deterministic runtime constants. Preview-only posture changes, including Stage 4 replication emphasis, do not change the formal score until a future read-through phase.

Decision Path

Final = 0.4 × S1 + 0.2 × S2R + 0.4 × S3 − C1_penalty  |  Stage 4 remains a separate replication lane.
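
The formula can be checked against the stage scores in the Executive Summary. A minimal sketch, assuming a C1 penalty of 0 for this scan (the only C1 finding was an ignored placeholder) and assuming the displayed score is the raw value rounded to the nearest integer; the function name `final_score` is illustrative, not the tool's API:

```python
def final_score(s1: float, s2r: float, s3: float, c1_penalty: float = 0.0) -> float:
    """Weighted formal score; Stage 4 is deliberately not an input."""
    return 0.4 * s1 + 0.2 * s2r + 0.4 * s3 - c1_penalty


# Stage scores reported above: S1 = 70, S2R = 50, S3 = 54.
raw = final_score(s1=70, s2r=50, s3=54)
print(round(raw, 1))  # 59.6
print(round(raw))     # 60
```

Stage 4 (35) never enters the calculation, which matches the note that replication is a separate lane.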
Configured, Not Rewritten

Changing review posture does not require touching the score core

Use stem policy simulate with a governed profile file when you want to preview a different review posture. The authoritative score path stays deterministic; the profile is surfaced as metadata and preview-only interpretation.

If you only need the default posture, you do not need a profile file at all.

profile.json
{
  "profile_name": "strict_clinical_adjacency",
  "profile_read_mode": "mirror_only"
}
command
stem policy simulate /path/to/repo --profile-file profile.json
artifact note
Calibration Effect: mirror-only
Policy metadata surfaced
Formal score unchanged
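
For scripted use, the profile file and the command line can be generated together. A minimal sketch that writes the two profile fields shown above and assembles the argument list without executing it; the helper `build_simulate_command` is hypothetical, and only the subcommand and `--profile-file` flag shown in this report are used:

```python
import json
import shlex
import tempfile
from pathlib import Path

# Profile fields copied from the profile.json shown above.
profile = {
    "profile_name": "strict_clinical_adjacency",
    "profile_read_mode": "mirror_only",
}


def build_simulate_command(repo_path: str, profile_file: str) -> list[str]:
    """Hypothetical helper: assemble the argument list without running it."""
    return ["stem", "policy", "simulate", repo_path, "--profile-file", profile_file]


# Write the profile to a scratch directory and read it back as a sanity check.
with tempfile.TemporaryDirectory() as tmp:
    profile_path = Path(tmp) / "profile.json"
    profile_path.write_text(json.dumps(profile, indent=2) + "\n", encoding="utf-8")
    written = json.loads(profile_path.read_text(encoding="utf-8"))

command = build_simulate_command("/path/to/repo", "profile.json")
print(shlex.join(command))
```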
Stage 1: README Intent
Score: 70

Claim language, limitation posture, and clinical boundary wording.

  • R2_regulatory_framework: -5
    CA-INDIRECT surface lacks regulatory or governance framework language.
  • R3_clinical_disclaimer: -5
    CA-INDIRECT surface lacks explicit non-clinical or non-diagnostic boundary.
  • S1_domain_readme: +10
    README exposes bio/medical domain vocabulary.
  • R4_demographic_bias_boundary: +10
    Demographic, subgroup, fairness, bias, or validation-cohort language detected.
Stage 2R: Repo Consistency
Score: 50

Internal contradictions between README, workflow claims, and support surfaces.

  • R2R_D2_missing_clinical_use_boundary: -20
    R2R_D2_missing_clinical_use_boundary: clinical_adjacent=True and explicit non-clinical boundary was not detected
  • R2R_3_readme_test_ci_alignment: +10
    R2R_3_readme_test_ci_alignment: workflow/test support terms found across README and local support surfaces
Stage 3: Code / Bio Responsibility
Score: 54

Engineering accountability, provenance, and reviewable responsibility surfaces.

  • T1_CI_CD: +15
    S3_T1_workflow_files: workflow files present under .github/workflows/
  • B1_data_provenance_controls: +15
    S3_B1_dependency_manifest: dependency or lock manifest presence plus data-source, dataset, or IRB language review
  • B2_bias_limitations: +8
    S3_B2_bias_limitations: bias/limitations vocabulary with optional measurement-evidence escalation
  • B3_coi_funding: +5
    S3_B3_coi_funding: COI/funding/sponsor language review across README, docs, FUNDING, CITATION, and AUTHORS surfaces
Stage 4: Replication
Score: 35

Reproducibility evidence is reported separately and does not alter the formal tier.

  • S4_environment_lock_evidence: +10
    Environment, dependency, or lock manifest detected.
  • S4_exact_dependency_pins_or_hashes: +10
    Exact dependency pin or hash evidence detected.
  • S4_model_weight_url_or_checksum: +10
    Model artifact URL or checksum evidence detected.
  • S4_cli_entrypoint: +5
    CLI entry point or argparse interface detected.
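
The Stage 4 score of 35 equals the plain sum of the four evidence items listed above. A minimal sketch of that arithmetic, assuming Stage 4 starts from zero with no baseline; this assumption is only checked for Stage 4, and the other stages may well be scored differently:

```python
# Point values copied from the Stage 4 evidence list in this report.
stage4_evidence = {
    "S4_environment_lock_evidence": 10,
    "S4_exact_dependency_pins_or_hashes": 10,
    "S4_model_weight_url_or_checksum": 10,
    "S4_cli_entrypoint": 5,
}

# Assumption: no baseline, so the stage score is just the sum of the items.
stage4_score = sum(stage4_evidence.values())
print(stage4_score)  # 35
```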

Code Integrity & Contract

Warnings First

Mapped risk lanes that fired

Clear Lanes

What stayed quiet in the current rule scope

Why can Code Integrity contain PASS while the overall score is still low?

Because Code Integrity is a narrow detector family. The formal score is still driven mainly by Stage 1, Stage 2R, and Stage 3 evidence posture.

What changed in the C4 / C5 / C6 split?

C4 is now reserved for executable fail-open exception behavior, C5 for unsupported compliance or boundary integrity claims, and C6 for mock-auth or no-auth trust-boundary signals.

MIT AI Risk Repository Coverage  |  V4_03  |  airisk.mit.edu

Feature Explainer

What this section is doing

AIRI is used here as a bounded risk-vocabulary layer around deterministic repository findings. The report uses the curated runtime bundle, not the full upstream AIRI universe.

6%
2 / 32 risks in detector scope
Bundle scope: curated_medical_clinical_subset
Snapshot: 2026-04-23 | License: MIT

Derived from The AI Risk Repository V4_03. Original source remains MIT-licensed and must be attributed in README, docs, runtime artifacts, and local registry metadata.

What does 2 / 32 mean?

It means two AIRI risk IDs are currently reached by active local detector mappings, out of thirty-two AIRI risk IDs in the current detector scope.
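
The 6% figure shown with "2 / 32 risks in detector scope" is this ratio expressed as a whole-number percentage. A short sketch, assuming the report rounds to zero decimal places:

```python
covered_ids = 2     # AIRI risk IDs reached by active detector mappings
in_scope_ids = 32   # AIRI risk IDs in the current detector scope

coverage_pct = 100 * covered_ids / in_scope_ids
print(coverage_pct)            # 6.25
print(f"{coverage_pct:.0f}%")  # 6%
```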

What does “why mapped” mean?

Each covered AIRI row carries a bounded explanation built from the triggered detector, the local mapping justification, and the trigger reason surfaced by the scan.

What does AIRI not prove here?

AIRI does not independently verify harm, causality, clinical failure, or legal noncompliance. It is a risk-vocabulary layer around local findings.

Mapped, Not Guessed

AIRI rows light up through active detector mappings

The report does not infer AIRI coverage from prose alone. Coverage appears when a local detector fires and a governed mapping exists in the current AIRI runtime bundle.

trigger
C6_mock_auth_or_fail_open_boundary
status: detected
mapping
R2R_D5_single_external_service_dependency
→ 72.04.02 Market Concentration
report surface
covered_by: detector id
why: bounded mapping reason
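
The rule described above (a row is covered only when a detector has fired and a governed mapping exists for it) can be sketched as a lookup. This is an illustration under stated assumptions, not the tool's implementation: the `MAPPINGS` table, the `Coverage` record, the `"not_detected"` status string, and the detector name `EXAMPLE_unmapped_detector` are hypothetical, and only the one mapping shown on the card is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    airi_id: str
    covered_by: str  # detector id
    why: str         # bounded mapping reason


# Hypothetical governed mapping table: detector id -> [(AIRI id, risk name)].
MAPPINGS = {
    "R2R_D5_single_external_service_dependency": [("72.04.02", "Market Concentration")],
}


def covered_rows(detector_status: dict[str, str]) -> list[Coverage]:
    """Return coverage rows only for detectors that fired AND have a mapping."""
    rows = []
    for detector, status in detector_status.items():
        if status != "detected":
            continue  # prose or silence never creates coverage
        for airi_id, risk in MAPPINGS.get(detector, []):
            rows.append(Coverage(airi_id, detector, f"{detector} maps to {risk}"))
    return rows


# A mapped detector that did not fire yields no coverage.
print(covered_rows({"R2R_D5_single_external_service_dependency": "not_detected"}))
# A detector that fired but has no governed mapping also yields no coverage.
print(covered_rows({"EXAMPLE_unmapped_detector": "detected"}))
# Only a fired, mapped detector lights up a row.
print(covered_rows({"R2R_D5_single_external_service_dependency": "detected"}))
```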
Coverage Explorer

Covered and gap rows

Counts are shown as covered / gaps.

Domain | Covered / Gaps
All Domains | 2 / 5
1 Discrimination & Toxicity | 0 / 1
2 Privacy & Security | 0 / 1
3 Misinformation | 1 / 1
4 Malicious Actors & Misuse | 0 / 0
5 Human-Computer Interaction | 0 / 0
6 Socioeconomic & Environmental | 0 / 1
7 AI System Safety, Failures & Limitations | 1 / 1

ID | Risk | Domain | Covered by / Note
24.01.03 | Safe exploration problem with widely deployed AI assist | Lack of capability or robustness | C5_compliance_boundary_integrity: Clinical-adjacent surfaces exis
69.01.00 | False information | False or misleading information | C5_compliance_boundary_integrity: Clinical-adjacent surfaces exis
65.03.03 | Reidentification | 2.1 | CC-3 catches shallow validators; dedicated reidentify() API expos
70.02.02 | Misinformation - hallucination of clinical knowledge | 3.1 | CC-1 catches threshold=0.0 default; actual output-level hallucina
39.25.00 | Verifiability - black-box AI in medical healthcare | 7.4 | B2 detects surface language only; Model Card / interpretability a
11.02.00 | Allocative Harms - withheld resources in healthcare | 1.1 | Subgroup performance disparities require dynamic evaluation; outs
72.04.02 | Market Concentration - healthcare single-point failures | 6.1 | Systemic risk beyond single-repository scope.
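
The covered / gaps counts on the domain cards can be recomputed from the seven rows above. A minimal sketch, assuming the top-level domain of each gap row is the leading digit of its domain code (for example 3.1 belongs to domain 3). For the two covered rows the report gives only subdomain names, so placing them in domains 7 and 3 is an inference from the domain cards, not something the table states:

```python
from collections import defaultdict

# (AIRI id, top-level domain number, covered?)
rows = [
    ("24.01.03", 7, True),   # covered; domain inferred from the cards
    ("69.01.00", 3, True),   # covered; domain inferred from the cards
    ("65.03.03", 2, False),
    ("70.02.02", 3, False),
    ("39.25.00", 7, False),
    ("11.02.00", 1, False),
    ("72.04.02", 6, False),
]

counts = defaultdict(lambda: [0, 0])  # domain -> [covered, gaps]
for _airi_id, domain, is_covered in rows:
    counts[domain][0 if is_covered else 1] += 1

total_covered = sum(c for c, _ in counts.values())
total_gaps = sum(g for _, g in counts.values())

for domain in sorted(counts):
    c, g = counts[domain]
    print(f"domain {domain}: {c}/{g}")
print(f"all domains: {total_covered}/{total_gaps}")  # all domains: 2/5
```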

Evidence Detail

All findings (62)
SEV | Detector | Finding | File
INFO | S1_readme_bio_terms | README exposes bio/medical vocabulary. | README.md
INFO | S1_readme_bio_terms | README exposes bio/medical vocabulary. | README.md
INFO | S1_clinical_boundary | No evidence detected for S1_clinical_boundary. | .
INFO | S1_H1_clinical_certainty_hype | No evidence detected for S1_H1_clinical_certainty_hype. | .
INFO | S1_H2_regulatory_approval_hype | No evidence detected for S1_H2_regulatory_approval_hype. | .
INFO | S1_H3_autonomous_replacement_hyp | No evidence detected for S1_H3_autonomous_replacement_hype. | .
INFO | S1_H4_breakthrough_marketing_hyp | No evidence detected for S1_H4_breakthrough_marketing_hype. | .
INFO | S1_H5_universal_generalization_h | No evidence detected for S1_H5_universal_generalization_hype. | .
INFO | S1_H6_perfect_accuracy_hype | No evidence detected for S1_H6_perfect_accuracy_hype. | .
INFO | S1_R1_limitations_section | No evidence detected for S1_R1_limitations_section. | .
INFO | S1_R2_regulatory_framework | No evidence detected for S1_R2_regulatory_framework. | .
INFO | S1_R2_weak_regulatory_self_asser | No evidence detected for S1_R2_weak_regulatory_self_assertion. | .
INFO | S1_R2_unsupported_legal_or_compl | No unsupported legal or compliance claim pattern was detected. | .
INFO | S1_R4_demographic_bias_boundary | Demographic, subgroup, fairness, bias, or validation-cohort language detected. | docs/SDK_DEEP_DIVE.md
INFO | S1_R5_reproducibility_provisions | No evidence detected for S1_R5_reproducibility_provisions. | .
INFO | S3_T1_workflow_files | Workflow file exists. | .github/workflows/skills-only.yml
INFO | S3_T1_workflow_files | Workflow file exists. | .github/workflows/test.yml
INFO | S3_T2_domain_tests | No evidence detected for S3_T2_domain_tests. | .
INFO | S3_T3_changelog_release_hygiene | No evidence detected for S3_T3_changelog_release_hygiene. | .
INFO | S3_T3_changelog_bugfix_evidence | No evidence detected for S3_T3_changelog_bugfix_evidence. | .
INFO | S3_B1_dependency_manifest | Dependency or environment manifest exists. | package-lock.json
INFO | S3_B1_dependency_manifest | Dependency or environment manifest exists. | package.json
INFO | S3_B1_data_source_language | Data source, dataset citation, IRB, or provenance language detected. | README.md
INFO | S3_B1_data_source_language | Data source, dataset citation, IRB, or provenance language detected. | README.md
INFO | S3_B1_data_source_language | Data source, dataset citation, IRB, or provenance language detected. | docs/BEGINNER_GUIDE.md
INFO | S3_B1_data_source_language | Data source, dataset citation, IRB, or provenance language detected. | docs/BEGINNER_GUIDE.zh-CN.md
INFO | S3_B2_bias_limitations | Bias, limitation, or validation-boundary language detected. | docs/CHANNELS.md
INFO | S3_B2_bias_limitations | Bias, limitation, or validation-boundary language detected. | docs/CHANNELS.md
INFO | S3_B2_measurement_evidence | No evidence detected for S3_B2_measurement_evidence. | .
INFO | S3_B3_coi_funding | COI, funding, sponsor, or acknowledgement language detected. | docs/CHANNELS.md
INFO | S2_package_bio_terms | No evidence detected for S2_package_bio_terms. | .
INFO | R2R_D5_single_external_service_d | No named required external service dependency pattern was detected. | .
INFO | C6_mock_auth_or_fail_open_bounda | No mock-auth or fail-open local-boundary pattern was detected. | .
INFO | C1_hardcoded_credentials | Credential-like placeholder or test/example fixture ignored for C1 penalty. | scripts/setup.sh
INFO | C2_dependency_pinning | No loose dependency evidence detected. | .
INFO | C3_dead_or_deprecated_patient_ad | No evidence detected for C3_dead_or_deprecated_patient_adjacent_paths. | .
INFO | C4_exception_handling_clinical_a | No fail-open exception handler detected in executable Python code. | .
INFO | BIO_smiles_surface_integrity | No malformed or suspicious SMILES-like strings detected by conservative surface checks. | .
INFO | BIO_smiles_rdkit_validation | RDKit optional validation lane not exercised because no SMILES-like candidates were detect | .
INFO | BIO_smiles_parser_guard | No missing None/invalid guards detected after SMILES parser calls. | .
INFO | BIO_silent_mock_fallback | No silent mock or simulated-data fallback patterns detected in production code paths. | .
INFO | BIO_trace_manifest | No traceability manifest or runtime audit-log schema surface detected. | .
INFO | BIO_run_trace | No risky subprocess or os.system bio-tool execution patterns detected. | .
INFO | AST_argparse_cli | argparse CLI interface detected. | container/skills/bio-tools/templates/pymol_render_
INFO | AST_argparse_cli | argparse CLI interface detected. | container/skills/bio-tools/templates/qc_summary_pl
INFO | AST_argparse_cli | argparse CLI interface detected. | container/skills/bio-tools/templates/volcano_plot_
INFO | AST_argparse_cli | argparse CLI interface detected. | container/skills/sec-report/sec_pipeline.py
INFO | AST_argparse_cli | argparse CLI interface detected. | container/skills/sec-report/tests/generate_test_da
INFO | S4_container_environment | No evidence detected for S4_container_environment. | .
INFO | S4_make_reproduce_target | No Makefile detected. | .
INFO | S4_environment_lock_evidence | Environment, dependency, or lock manifest detected. | package-lock.json
INFO | S4_exact_dependency_pins_or_hash | Lock manifest with exact dependency resolution detected. | package-lock.json
INFO | S4_readme_reproducibility_sectio | README exists but no reproducibility or replication section heading was detected. | .
INFO | S4_checksum_files | No evidence detected for S4_checksum_files. | .
INFO | S4_dataset_url | Documentation exists but no dataset URL or data source URL was detected. | .
INFO | S4_model_weight_url_or_checksum | Model weight, checkpoint, or model artifact URL detected. | README.md
INFO | S4_model_weight_url_or_checksum | Model weight, checkpoint, or model artifact URL detected. | docs/BEGINNER_GUIDE.md
INFO | S4_citation_cff | No evidence detected for S4_citation_cff. | .
INFO | S4_license_restriction | License/readme/docs surfaces exist but no restriction language was detected. | .
INFO | S4_cli_entrypoint | argparse CLI evidence detected by AST summary. | .
INFO | S4_seed_setting | No deterministic seed setting evidence detected by AST summary. | .
INFO | S4_runnable_examples | No evidence detected for S4_runnable_examples. | .