STEM BIO-AI — Runchuan-BU/BioClaw

Executive Summary

TL;DR

Decision memo

This repository lands at T2 Caution with a final score of 60/100. The result is driven more by boundary, workflow-support, and governance weaknesses than by classic code-pattern failures.

Policy: default | Status: authoritative_release | Mode: mirror_only

Primary Risks

What pushed the review down

Clinical-adjacent surfaces exist without an explicit non-diagnostic/non-clinical boundary.
C5_compliance_boundary_integrity: WARN

Positive Evidence

What still supports reviewability

Package metadata was available for repo-local consistency checks.
CI workflow files were detected.
Documentation files were detected.

Freshness

When to re-check

Review after 45 days. Expires on 2026-07-05.

Change-triggered re-audit now: False
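The freshness rule above can be sketched as a small check. This is only an illustration: the function name `reaudit_due` is hypothetical, and it assumes the report stays valid through the expiry date itself, which the report does not state.

```python
from datetime import date

# Values copied from the report's Freshness block.
EXPIRES_ON = date(2026, 7, 5)   # "Expires on 2026-07-05"
CHANGE_TRIGGERED = False        # "Change-triggered re-audit now: False"


def reaudit_due(today: date, expires_on: date, change_triggered: bool) -> bool:
    """Return True when a new review is due.

    Assumption (not stated in the report): the artifact is still valid
    on the expiry date and becomes stale the day after.
    """
    return change_triggered or today > expires_on


if __name__ == "__main__":
    print(reaudit_due(date(2026, 6, 1), EXPIRES_ON, CHANGE_TRIGGERED))  # False
    print(reaudit_due(date(2026, 7, 6), EXPIRES_ON, CHANGE_TRIGGERED))  # True
```

A change-triggered flag of True would make the review due immediately, regardless of the date.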

Stage 2R Focus

Repo-local contradictions

R2R_D2_missing_clinical_use_boundary: -20
R2R_D2_missing_clinical_use_boundary: clinical_adjacent=True and explicit non-clinical boundary was not detected
R2R_3_readme_test_ci_alignment: +10
R2R_3_readme_test_ci_alignment: workflow/test support terms found across README and local support surfaces

Stage 3 Focus

Accountability surfaces

T1_CI_CD: +15
S3_T1_workflow_files: workflow files present under .github/workflows/
B1_data_provenance_controls: +15
S3_B1_dependency_manifest: dependency or lock manifest presence plus data-source, dataset, or IRB language review
B2_bias_limitations: +8
S3_B2_bias_limitations: bias/limitations vocabulary with optional measurement-evidence escalation

Policy Boundary

How to read this artifact

Mirror-only policy surface: selected profile metadata is shown in this report, but authoritative scan scoring still follows deterministic runtime constants. Preview-only posture changes, including Stage 4 replication emphasis, do not change the formal score until a future read-through phase.

Decision Path

Configured, Not Rewritten

Changing review posture does not require touching the score core

Use stem policy simulate with a governed profile file when you want to preview a different review posture. The authoritative score path stays deterministic; the profile is surfaced as metadata and preview-only interpretation.

If you only need the default posture, you do not need a profile file at all.

profile.json

{
  "profile_name": "strict_clinical_adjacency",
  "profile_read_mode": "mirror_only"
}

command

stem policy simulate /path/to/repo --profile-file profile.json

artifact note

Calibration Effect: mirror-only
Policy metadata surfaced
Formal score unchanged
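As a rough illustration of the workflow above, the following sketch writes the profile file shown earlier and assembles the same command line. The helper names `write_profile` and `simulate_argv` are hypothetical, and the sketch only prints the command rather than running it, since nothing here confirms how the `stem` CLI is installed.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_profile(directory: Path) -> Path:
    """Write the two-key profile from the report to profile.json."""
    profile = {
        "profile_name": "strict_clinical_adjacency",
        "profile_read_mode": "mirror_only",
    }
    path = directory / "profile.json"
    path.write_text(json.dumps(profile, indent=2) + "\n", encoding="utf-8")
    return path


def simulate_argv(repo: str, profile_path: Path) -> list[str]:
    """Build the argument list for the command shown in the report."""
    return ["stem", "policy", "simulate", repo, "--profile-file", str(profile_path)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        profile_path = write_profile(Path(tmp))
        # Printed for inspection only; the command is not executed here.
        print(shlex.join(simulate_argv("/path/to/repo", profile_path)))
```

Building the argument list separately from any execution keeps the preview step easy to inspect before it is handed to a shell or a process runner.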

Stage 1 — README Intent

Claim language, limitation posture, and clinical boundary wording.

R2_regulatory_framework: -5
CA-INDIRECT surface lacks regulatory or governance framework language.
R3_clinical_disclaimer: -5
CA-INDIRECT surface lacks explicit non-clinical or non-diagnostic boundary.
S1_domain_readme: +10
README exposes bio/medical domain vocabulary.
R4_demographic_bias_boundary: +10
Demographic, subgroup, fairness, bias, or validation-cohort language detected.

Stage 2R — Repo Consistency

Internal contradictions between README, workflow claims, and support surfaces.

R2R_D2_missing_clinical_use_boundary: -20
R2R_D2_missing_clinical_use_boundary: clinical_adjacent=True and explicit non-clinical boundary was not detected
R2R_3_readme_test_ci_alignment: +10
R2R_3_readme_test_ci_alignment: workflow/test support terms found across README and local support surfaces

Stage 3 — Code / Bio Responsibility

Engineering accountability, provenance, and reviewable responsibility surfaces.

T1_CI_CD: +15
S3_T1_workflow_files: workflow files present under .github/workflows/
B1_data_provenance_controls: +15
S3_B1_dependency_manifest: dependency or lock manifest presence plus data-source, dataset, or IRB language review
B2_bias_limitations: +8
S3_B2_bias_limitations: bias/limitations vocabulary with optional measurement-evidence escalation
B3_coi_funding: +5
S3_B3_coi_funding: COI/funding/sponsor language review across README, docs, FUNDING, CITATION, and AUTHORS surfaces

Stage 4 — Replication

Reproducibility evidence is reported separately and does not alter the formal tier.

S4_environment_lock_evidence: +10
Environment, dependency, or lock manifest detected.
S4_exact_dependency_pins_or_hashes: +10
Exact dependency pin or hash evidence detected.
S4_model_weight_url_or_checksum: +10
Model artifact URL or checksum evidence detected.
S4_cli_entrypoint: +5
CLI entry point or argparse interface detected.
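For readers who want to tally the per-stage deltas listed in Stages 1 through 4, a minimal sketch follows. The deltas are copied from this report; the dictionary layout and the `subtotal` helper are hypothetical. The report does not state how these subtotals are weighted or combined into the final 60/100, so the sketch deliberately stops at per-stage sums and keeps Stage 4 separate, as the report does.

```python
# Deltas copied from the stage listings in this report.
STAGE_DELTAS = {
    "Stage 1": {
        "R2_regulatory_framework": -5,
        "R3_clinical_disclaimer": -5,
        "S1_domain_readme": 10,
        "R4_demographic_bias_boundary": 10,
    },
    "Stage 2R": {
        "R2R_D2_missing_clinical_use_boundary": -20,
        "R2R_3_readme_test_ci_alignment": 10,
    },
    "Stage 3": {
        "T1_CI_CD": 15,
        "B1_data_provenance_controls": 15,
        "B2_bias_limitations": 8,
        "B3_coi_funding": 5,
    },
    # Reported separately; does not alter the formal tier.
    "Stage 4": {
        "S4_environment_lock_evidence": 10,
        "S4_exact_dependency_pins_or_hashes": 10,
        "S4_model_weight_url_or_checksum": 10,
        "S4_cli_entrypoint": 5,
    },
}


def subtotal(stage: str) -> int:
    """Sum the listed deltas for one stage (no weighting is applied)."""
    return sum(STAGE_DELTAS[stage].values())


if __name__ == "__main__":
    for stage in STAGE_DELTAS:
        print(f"{stage}: {subtotal(stage):+d}")
    # Stage 1: +10
    # Stage 2R: -10
    # Stage 3: +43
    # Stage 4: +35
```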

Code Integrity & Contract

Warnings First

Mapped risk lanes that fired

C5 Compliance Boundary Integrity: WARN
Clinical-adjacent surfaces exist without an explicit non-diagnostic/non-clinical boundary.

Clear Lanes

What stayed quiet in the current rule scope

C1 Hardcoded Credentials: PASS
No direct credential patterns detected by local CLI scan.

C2 Dependency Pinning: PASS
Dependency manifest appears pinned or not present.

C3 Deprecated Patient Paths: PASS
No deprecated patient-adjacent metadata patterns detected.

C4 Fail-Open Exceptions: PASS
No executable fail-open exception handler detected.

C6 Mock Auth / Fail-Open Boundary: PASS
No mock-auth or fail-open local-boundary warning detected in reviewed sources.

CC1 Clinical Zero Default: PASS
count=0

CC2 API Contract: PASS
count=0

CC3 Shallow Validator: PASS
count=0

Why can Code Integrity contain PASS while the overall score is still low?

Because Code Integrity is a narrow detector family. The formal score is still driven mainly by Stage 1, Stage 2R, and Stage 3 evidence posture.

What changed in the C4 / C5 / C6 split?

C4 is now reserved for executable fail-open exception behavior, C5 for unsupported compliance or boundary integrity claims, and C6 for mock-auth or no-auth trust-boundary signals.

MIT AI Risk Repository Coverage

V4_03 | airisk.mit.edu

Feature Explainer

What this section is doing

AIRI is used here as a bounded risk-vocabulary layer around deterministic repository findings. The report uses the curated runtime bundle, not the full upstream AIRI universe.

2 / 32 risks in detector scope

Bundle scope: curated_medical_clinical_subset

Snapshot: 2026-04-23 | License: MIT

Derived from The AI Risk Repository V4_03. Original source remains MIT-licensed and must be attributed in README, docs, runtime artifacts, and local registry metadata.

What does 2 / 32 mean?

It means two AIRI risk IDs are currently reached by active local detector mappings, out of thirty-two AIRI risk IDs in the current detector scope.

What does “why mapped” mean?

Each covered AIRI row carries a bounded explanation built from the triggered detector, the local mapping justification, and the trigger reason surfaced by the scan.

What does AIRI not prove here?

AIRI does not independently verify harm, causality, clinical failure, or legal noncompliance. It is a risk-vocabulary layer around local findings.

Mapped, Not Guessed

AIRI rows light up through active detector mappings

The report does not infer AIRI coverage from prose alone. Coverage appears when a local detector fires and a governed mapping exists in the current AIRI runtime bundle.

trigger

C6_mock_auth_or_fail_open_boundary
status: detected

mapping

R2R_D5_single_external_service_dependency
→ 72.04.02 Market Concentration

report surface

covered_by: detector id
why: bounded mapping reason

Coverage Explorer

Covered and gap rows

Counts are shown as covered / gaps.

#	Domain	Covered / Gaps
All	All Domains	2 / 5
1	Discrimination & Toxicity	0 / 1
2	Privacy & Security	0 / 1
3	Misinformation	1 / 1
4	Malicious Actors & Misuse	0 / 0
5	Human-Computer Interaction	0 / 0
6	Socioeconomic & Environmental	0 / 1
7	AI System Safety, Failures & Limitations	1 / 1

ID	Risk	Domain	Covered by / Note
24.01.03	Safe exploration problem with widely deployed AI assist	Lack of capability or robustness	C5_compliance_boundary_integrity: Clinical-adjacent surfaces exis
69.01.00	False information	False or misleading information	C5_compliance_boundary_integrity: Clinical-adjacent surfaces exis
65.03.03	Reidentification	2.1	CC-3 catches shallow validators; dedicated reidentify() API expos
70.02.02	Misinformation — hallucination of clinical knowledge	3.1	CC-1 catches threshold=0.0 default; actual output-level hallucina
39.25.00	Verifiability — black-box AI in medical healthcare	7.4	B2 detects surface language only; Model Card / interpretability a
11.02.00	Allocative Harms — withheld resources in healthcare	1.1	Subgroup performance disparities require dynamic evaluation; outs
72.04.02	Market Concentration — healthcare single-point failures	6.1	Systemic risk beyond single-repository scope.
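The covered / gap counts per domain can be reproduced from the seven rows above with a short tally. This is a sketch under one assumption: the first two rows list a subdomain name instead of a code, and they are assigned here to domains 7 and 3 respectively, which is consistent with the per-domain counts but is not spelled out in the table. The name `domain_counts` is hypothetical.

```python
from collections import Counter

# (risk_id, domain number, covered?) taken from the table rows.
# Domain numbers for the first two rows are an assumption (see lead-in).
ROWS = [
    ("24.01.03", 7, True),
    ("69.01.00", 3, True),
    ("65.03.03", 2, False),
    ("70.02.02", 3, False),
    ("39.25.00", 7, False),
    ("11.02.00", 1, False),
    ("72.04.02", 6, False),
]

covered = Counter(domain for _, domain, is_covered in ROWS if is_covered)
gaps = Counter(domain for _, domain, is_covered in ROWS if not is_covered)


def domain_counts(domain: int) -> tuple[int, int]:
    """Return (covered, gaps) for one domain; missing domains give (0, 0)."""
    return covered[domain], gaps[domain]


if __name__ == "__main__":
    print("All:", (sum(covered.values()), sum(gaps.values())))  # (2, 5)
    for domain in range(1, 8):
        print(domain, domain_counts(domain))
```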

Evidence Detail

All (62)

SEV	Detector	Finding	File
INFO	S1_readme_bio_terms	README exposes bio/medical vocabulary.	README.md
INFO	S1_readme_bio_terms	README exposes bio/medical vocabulary.	README.md
INFO	S1_clinical_boundary	No evidence detected for S1_clinical_boundary.	.
INFO	S1_H1_clinical_certainty_hype	No evidence detected for S1_H1_clinical_certainty_hype.	.
INFO	S1_H2_regulatory_approval_hype	No evidence detected for S1_H2_regulatory_approval_hype.	.
INFO	S1_H3_autonomous_replacement_hyp	No evidence detected for S1_H3_autonomous_replacement_hype.	.
INFO	S1_H4_breakthrough_marketing_hyp	No evidence detected for S1_H4_breakthrough_marketing_hype.	.
INFO	S1_H5_universal_generalization_h	No evidence detected for S1_H5_universal_generalization_hype.	.
INFO	S1_H6_perfect_accuracy_hype	No evidence detected for S1_H6_perfect_accuracy_hype.	.
INFO	S1_R1_limitations_section	No evidence detected for S1_R1_limitations_section.	.
INFO	S1_R2_regulatory_framework	No evidence detected for S1_R2_regulatory_framework.	.
INFO	S1_R2_weak_regulatory_self_asser	No evidence detected for S1_R2_weak_regulatory_self_assertion.	.
INFO	S1_R2_unsupported_legal_or_compl	No unsupported legal or compliance claim pattern was detected.	.
INFO	S1_R4_demographic_bias_boundary	Demographic, subgroup, fairness, bias, or validation-cohort language detected.	docs/SDK_DEEP_DIVE.md
INFO	S1_R5_reproducibility_provisions	No evidence detected for S1_R5_reproducibility_provisions.	.
INFO	S3_T1_workflow_files	Workflow file exists.	.github/workflows/skills-only.yml
INFO	S3_T1_workflow_files	Workflow file exists.	.github/workflows/test.yml
INFO	S3_T2_domain_tests	No evidence detected for S3_T2_domain_tests.	.
INFO	S3_T3_changelog_release_hygiene	No evidence detected for S3_T3_changelog_release_hygiene.	.
INFO	S3_T3_changelog_bugfix_evidence	No evidence detected for S3_T3_changelog_bugfix_evidence.	.
INFO	S3_B1_dependency_manifest	Dependency or environment manifest exists.	package-lock.json
INFO	S3_B1_dependency_manifest	Dependency or environment manifest exists.	package.json
INFO	S3_B1_data_source_language	Data source, dataset citation, IRB, or provenance language detected.	README.md
INFO	S3_B1_data_source_language	Data source, dataset citation, IRB, or provenance language detected.	README.md
INFO	S3_B1_data_source_language	Data source, dataset citation, IRB, or provenance language detected.	docs/BEGINNER_GUIDE.md
INFO	S3_B1_data_source_language	Data source, dataset citation, IRB, or provenance language detected.	docs/BEGINNER_GUIDE.zh-CN.md
INFO	S3_B2_bias_limitations	Bias, limitation, or validation-boundary language detected.	docs/CHANNELS.md
INFO	S3_B2_bias_limitations	Bias, limitation, or validation-boundary language detected.	docs/CHANNELS.md
INFO	S3_B2_measurement_evidence	No evidence detected for S3_B2_measurement_evidence.	.
INFO	S3_B3_coi_funding	COI, funding, sponsor, or acknowledgement language detected.	docs/CHANNELS.md
INFO	S2_package_bio_terms	No evidence detected for S2_package_bio_terms.	.
INFO	R2R_D5_single_external_service_d	No named required external service dependency pattern was detected.	.
INFO	C6_mock_auth_or_fail_open_bounda	No mock-auth or fail-open local-boundary pattern was detected.	.
INFO	C1_hardcoded_credentials	Credential-like placeholder or test/example fixture ignored for C1 penalty.	scripts/setup.sh
INFO	C2_dependency_pinning	No loose dependency evidence detected.	.
INFO	C3_dead_or_deprecated_patient_ad	No evidence detected for C3_dead_or_deprecated_patient_adjacent_paths.	.
INFO	C4_exception_handling_clinical_a	No fail-open exception handler detected in executable Python code.	.
INFO	BIO_smiles_surface_integrity	No malformed or suspicious SMILES-like strings detected by conservative surface checks.	.
INFO	BIO_smiles_rdkit_validation	RDKit optional validation lane not exercised because no SMILES-like candidates were detect	.
INFO	BIO_smiles_parser_guard	No missing None/invalid guards detected after SMILES parser calls.	.
INFO	BIO_silent_mock_fallback	No silent mock or simulated-data fallback patterns detected in production code paths.	.
INFO	BIO_trace_manifest	No traceability manifest or runtime audit-log schema surface detected.	.
INFO	BIO_run_trace	No risky subprocess or os.system bio-tool execution patterns detected.	.
INFO	AST_argparse_cli	argparse CLI interface detected.	container/skills/bio-tools/templates/pymol_render_
INFO	AST_argparse_cli	argparse CLI interface detected.	container/skills/bio-tools/templates/qc_summary_pl
INFO	AST_argparse_cli	argparse CLI interface detected.	container/skills/bio-tools/templates/volcano_plot_
INFO	AST_argparse_cli	argparse CLI interface detected.	container/skills/sec-report/sec_pipeline.py
INFO	AST_argparse_cli	argparse CLI interface detected.	container/skills/sec-report/tests/generate_test_da
INFO	S4_container_environment	No evidence detected for S4_container_environment.	.
INFO	S4_make_reproduce_target	No Makefile detected.	.
INFO	S4_environment_lock_evidence	Environment, dependency, or lock manifest detected.	package-lock.json
INFO	S4_exact_dependency_pins_or_hash	Lock manifest with exact dependency resolution detected.	package-lock.json
INFO	S4_readme_reproducibility_sectio	README exists but no reproducibility or replication section heading was detected.	.
INFO	S4_checksum_files	No evidence detected for S4_checksum_files.	.
INFO	S4_dataset_url	Documentation exists but no dataset URL or data source URL was detected.	.
INFO	S4_model_weight_url_or_checksum	Model weight, checkpoint, or model artifact URL detected.	README.md
INFO	S4_model_weight_url_or_checksum	Model weight, checkpoint, or model artifact URL detected.	docs/BEGINNER_GUIDE.md
INFO	S4_citation_cff	No evidence detected for S4_citation_cff.	.
INFO	S4_license_restriction	License/readme/docs surfaces exist but no restriction language was detected.	.
INFO	S4_cli_entrypoint	argparse CLI evidence detected by AST summary.	.
INFO	S4_seed_setting	No deterministic seed setting evidence detected by AST summary.	.
INFO	S4_runnable_examples	No evidence detected for S4_runnable_examples.	.