
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Intermediate
Shiyang Feng, Runmin Ma, Xiangchao Yan et al. Ā· 2/9/2026
arXiv

Key Summary

  • InternAgent-1.5 is a single AI system that can read papers, plan experiments, run code or lab steps, check results, and keep improving over time.
  • It is built from three teams that work in a loop: Generation (makes ideas), Verification (tests them), and Evolution (learns from the results).
  • A big knowledge map connects facts, methods, data, and experiments across fields like biology, chemistry, physics, and earth science.
  • A special planning graph breaks big questions into smaller steps, runs them in the right order, and stitches the answers back together.
  • Its optimization engine explores many solution ideas in parallel and mixes the best parts together to make stronger new ideas.
  • A three-part memory helps it remember strategies, past trials, and long-term lessons so it can operate for weeks or months without forgetting.
  • On tough science tests (GAIA, HLE, GPQA, FrontierScience), it scores at or near the top, showing strong reasoning and research skills.
  • It discovers algorithms (like better forecasting or molecular models) and also completes empirical science (like climate diagnostics and protein design).
  • This unified approach reduces repeated work, speeds up discovery, and helps scientists focus on the most promising directions.

Why This Research Matters

InternAgent-1.5 helps scientists move from questions to trustworthy results faster by uniting reading, planning, experimentation, and learning in one loop. Its cross-disciplinary map reveals connections that people might miss, speeding up insights in medicine, climate, and materials. The optimization engine saves time and resources by trying many ideas in parallel and mixing the best parts. Its memory means progress compounds over weeks or months instead of resetting each day. This can shorten the path to new therapies, sharper climate tools, safer chemistry, and stronger AI methods. It also improves transparency by organizing evidence and steps so experts can audit decisions. Overall, it turns AI from a chatty helper into a reliable research partner.

Detailed Explanation


01Background & Problem Definition

šŸž Hook: You know how a great science fair project needs more than one skill—you have to read about the topic, plan what to try, run the tests, check your results, and then try again? Imagine an AI that can do that whole loop, not just one part.

🄬 The Concept (AI Scientist systems): An AI Scientist system is a group of smart tools that help do real scientific research from reading to experimenting to learning.

  • How it works: 1) Read and organize papers, 2) Form a hypothesis, 3) Plan methods, 4) Run code or lab steps, 5) Check results, 6) Improve the plan.
  • Why it matters: Without this, AI tends to either talk about science or code a little—but it can’t carry research forward over many rounds.

šŸž Anchor: Think of a robot teammate who can find good sources, design the test, run it on a computer or robot arm, and learn which ideas to try next.

The World Before: AI tools were strong but scattered. Some were good at reading papers, others at writing code, and a few could help in the lab. Most systems were built for one field (like only chemistry) and one kind of task. They rarely worked well across different sciences or for long stretches without a human fixing things.

šŸž Hook: Imagine if each sports team only practiced one move. Great at that one move—but weak in a real game that needs passing, defense, and strategy.

🄬 The Concept (Domain-specific designs): Many earlier systems were built for a single domain (like just medicine) with special rules baked in.

  • How it works: They choose tools and playbooks only for that field, so they work well there but struggle elsewhere.
  • Why it matters: If you want to study a biology problem that needs physics clues and earth data, a single-domain tool gets lost.

šŸž Anchor: A top chess player is amazing at chess—but won’t beat a soccer team on the field. You need flexible players.

The Problem: Real scientific discovery needs cross-field thinking (biology + chemistry + computation), strong optimization (try lots of ideas, keep the winners), and memory that lasts for many weeks. Most systems did short, straight-line searches, forgot old lessons, and couldn’t easily reuse good ideas from one branch of exploration to another.

šŸž Hook: You know how your best school projects happen when you remember past mistakes, borrow tricks that worked in other classes, and test several drafts?

🄬 The Concept (Long-horizon autonomous operation): This means an AI can work for a long time, keep track of what happened, and get better as it goes.

  • How it works: It stores what it did, why, and what happened, and uses that to guide the next round.
  • Why it matters: Without it, the AI keeps repeating the same errors and stalls.

šŸž Anchor: Like keeping a science journal that your future self actually reads before the next experiment.

Failed Attempts: Systems tried ā€œone-path-at-a-timeā€ optimization, where each search branch didn’t share discoveries with others. They used simple memory (or none), and their knowledge bases were flat lists instead of connected maps. That led to repeated dead ends, half-baked plans, and shallow conclusions.

The Gap: We needed a unified framework that:

  • Works across computational and wet-lab worlds
  • Remembers strategies and results over many cycles
  • Plans research like a roadmap, not a straight line
  • Refines solutions by mixing the best parts from many tries
  • Runs autonomously while staying organized and checkable

Real Stakes: Faster drug leads, better climate tools, safer chemistry, smarter AI algorithms—all depend on connecting ideas across fields and improving them steadily. For people’s lives, this means earlier disease insights, clearer climate signals, and technology that arrives faster and safer.

šŸž Hook: Imagine a team captain that never gets tired, keeps perfect notes, remembers every play that worked, and can switch sports mid-game.

🄬 The Concept (Unified Agentic Framework): A unified agentic framework is one smart system that handles the whole research loop—idea → test → learn—across many sciences.

  • How it works: It organizes three subsystems—Generation (make plans), Verification (test plans), Evolution (learn and improve)—and powers them with deep research, solution refinement, and long-horizon memory.
  • Why it matters: Without a single organized loop, the system cannot keep momentum or improve over time.

šŸž Anchor: Think of an orchestra with three sections (strings, winds, percussion) playing different parts but following the same conductor and score.

02Core Idea

šŸž Hook: Picture a puzzle table where many small groups try different clusters, but every time someone finds a good match, the whole room learns from it and gets faster.

🄬 The Concept (Key insight): The big idea is to run science as a single, connected loop that remembers and reuses good ideas across branches, not just within one path.

  • How it works: Build a graph of knowledge across fields; plan research as a flow of linked steps; test many ideas in parallel; mix their best parts; store what worked in a structured memory; repeat.
  • Why it matters: Without sharing and memory, you waste time, repeat mistakes, and never reach the best designs.

šŸž Anchor: Like class projects where teams share tricks on a whiteboard so the next draft is everyone’s best.

Multiple analogies:

  1. City map analogy šŸž Hook: You know how a good city map shows roads, bridges, and train lines so you can choose the fastest route? 🄬 The Concept (Cross-Disciplinary Knowledge Graph): It’s a giant map of science that connects papers, methods, datasets, and experiments across fields.
  • How it works: Nodes are things (methods, data, tasks); edges are relationships (cites, depends on, supports). The AI follows paths to find relevant tools and ideas.
  • Why it matters: Without a map, you wander and miss shortcuts. šŸž Anchor: Need a biology method involving physics? The map shows the bridge from protein folding to energy models.
  2. Recipe kitchen analogy šŸž Hook: Chefs try many recipe tweaks at once, then combine the winning parts. 🄬 The Concept (Solution Refinement with Graph-Augmented Search): The system generates many candidate methods, tests them, and then mixes the best components.
  • How it works: It explores in parallel, references good ideas across branches, and aggregates top pieces to form stronger hybrids.
  • Why it matters: Without mixing, you might improve a so-so idea instead of building the best one from great pieces. šŸž Anchor: Take the best crust from one pie, the best filling from another, and the smartest bake time from a third.
  3. Memory backpack analogy šŸž Hook: Imagine carrying a backpack with three pouches: one for strategies, one for detailed tries, and one for big lessons. 🄬 The Concept (Structured Cognitive Memory: SPM, TEM, SKM): The AI saves reusable strategies (SPM), specific trial episodes (TEM), and long-term concepts and directions (SKM).
  • How it works: It retrieves what’s relevant each round and avoids old mistakes while aiming at fresh but related goals.
  • Why it matters: Without memory, you redo bad ideas and forget what works. šŸž Anchor: Like using last semester’s best study guide, this week’s quiz notes, and your overall plan for the school year—together.

Before vs After:

  • Before: Single-domain tools, linear plans, little memory, ideas stuck in one branch.
  • After: Cross-domain map, graph-shaped plans, parallel tests, idea-mixing, strong memory that improves over time.

Why it works (intuition):

  • Graphs capture how science ideas connect, so the system can hop between fields logically.
  • Parallel exploration finds diverse good parts; aggregation builds better solutions faster.
  • Memory ensures progress compounds rather than resets.
  • A flow graph keeps the research steps organized, verifiable, and adaptable.

Building blocks introduced with the sandwich pattern:

  1. Generation–Verification–Evolution (G–V–E) šŸž Hook: Think of a 3-step study routine: plan, try, and learn. 🄬 The Concept: G–V–E is the loop that makes ideas, tests them, and updates knowledge.
  • How it works: Generation drafts hypotheses and methods; Verification runs code/lab checks; Evolution updates memory and strategy.
  • Why it matters: Without a loop, you can’t improve. šŸž Anchor: Homework plan → attempt → review mistakes → better plan.
  2. Deep Research šŸž Hook: When you research a topic, you don’t just read one page—you find sources, compare them, and connect themes. 🄬 The Concept: Deep research is the system’s skill for searching, retrieving, and organizing cross-field evidence.
  • How it works: It builds a knowledge graph and a flow of tasks to pull in the right data and tools.
  • Why it matters: Without good sources, your ideas wobble. šŸž Anchor: A well-annotated bibliography that points straight to the needed facts.
  3. Cross-Disciplinary Knowledge Graph šŸž Hook: A subway map helps you switch lines to reach new neighborhoods. 🄬 The Concept: A graph that links concepts, methods, data, and experiments across domains.
  • How it works: Nodes and typed edges form evidence paths; graph + dense retrieval finds what matters.
  • Why it matters: Without paths, you miss connections. šŸž Anchor: Chemistry node → ā€œby-productā€ edge → Earth-science data node for atmospheric reactions.
  4. Dynamic Structured Knowledge Flow (Flow Graph) šŸž Hook: Large tasks are easier when you split them into steps and draw arrows for what depends on what. 🄬 The Concept: A directed graph of subtasks (search, solve, answer) with clear dependencies.
  • How it works: The planner expands nodes, updates edges, executes ready nodes, and propagates context.
  • Why it matters: Without structure, you either stall or go in circles. šŸž Anchor: For an AMOC question: define, review consensus, compare models, analyze biases, then synthesize.
  5. Solution Refinement šŸž Hook: Great projects happen after many drafts. 🄬 The Concept: Systematically improve methods by generating, testing, and mixing better parts.
  • How it works: Tries variants in parallel, learns from scores, reuses best components.
  • Why it matters: Without refinement, first drafts stick. šŸž Anchor: From baseline code to faster, more accurate models in a few rounds.
  6. Graph-Augmented Monte Carlo Search šŸž Hook: A detective uses clues from all cases, not just one, to solve a tricky mystery. 🄬 The Concept: A search that explores many branches and also shares wins across branches using a solution graph.
  • How it works: Primary expansion, intra-branch evolution, cross-branch reference, and multi-branch aggregation.
  • Why it matters: Without cross-branch sharing, you reinvent the wheel. šŸž Anchor: Mix the best reinforcement-learning tricks found by different attempts into a stronger algorithm.
  7. Structured Cognitive Memory (SPM, TEM, SKM) šŸž Hook: Keep three notebooks—strategies, experiment logs, and big-picture lessons. 🄬 The Concept: A memory system that supports short-term tweaks, mid-term adaptation, and long-term evolution.
  • How it works: SPM stores reusable plans; TEM stores trial episodes; SKM stores high-level insights and novelty guidance.
  • Why it matters: Without memory, progress fades. šŸž Anchor: Next week’s plan is guided by last week’s wins, yesterday’s mistakes, and the semester goals.

03Methodology

At a high level: Question → Generation (plan ideas and methods) → Verification (test by code or lab) → Evolution (update memory and strategy) → Better Question or Plan.
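This high-level loop can be sketched in a few lines of Python. This is a minimal toy, not the paper's implementation: the "plans" are just numbers, and the scoring is a stand-in for real experiments.

```python
import random

def generation(memory, rng):
    """Draft candidate plans, biased by the best score remembered so far."""
    base = max(memory, default=0.0)
    return [base + rng.uniform(-0.1, 0.3) for _ in range(4)]  # toy "plans"

def verification(candidates):
    """Score each candidate; here the plan value itself stands in for a metric."""
    return [(plan, plan) for plan in candidates]  # (plan, score) pairs

def evolution(memory, results):
    """Keep what worked so the next round starts from stronger ground."""
    memory.extend(score for _, score in results)
    return memory

rng = random.Random(0)
memory = []
for _ in range(5):                            # the G-V-E loop, repeated
    candidates = generation(memory, rng)      # Generation: make ideas
    results = verification(candidates)        # Verification: test them
    memory = evolution(memory, results)       # Evolution: learn from results

best = max(memory)
```

Because each round's drafts start from the best remembered score, progress compounds across iterations instead of resetting.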

  1. Generation subsystem šŸž Hook: Before you bake, you choose a recipe and make a shopping list. 🄬 The Concept: Generation creates hypotheses and step-by-step methods using deep research.
  • How it works: It queries the knowledge graph, builds a flow graph of subtasks, and drafts multiple method variants. It records reasoning traces for later use.
  • Why it matters: Without a clear plan, experiments waste time and miss key checks. šŸž Anchor: For climate downscaling, it proposes a deep model pipeline, specifies datasets, metrics, and baselines.

Detailed steps:

  • Input: A research query and any prior memory.
  • Build a Flow Graph: Break the problem into nodes: search (gather evidence), solve (design method), answer (synthesize results).
  • Knowledge collection: Use tools (e.g., scientific APIs, simulators) and RAG over the cross-disciplinary knowledge graph.
  • Produce candidate methods: Draft several structured plans with parameters and evaluation metrics.
  2. Verification subsystem šŸž Hook: Baking time—try the recipes, taste, and score each. 🄬 The Concept: Verification runs computational experiments or wet-lab protocols and scores the results, guiding refinement.
  • How it works: Uses Graph-Augmented Monte Carlo Search to explore many candidates, backpropagate scores, and combine top pieces.
  • Why it matters: Without careful testing and scoring, you can’t tell which idea is truly better. šŸž Anchor: Train model variants on benchmark data, log accuracy/RMSE/F1, and pick better components.

Detailed steps:

  • Parallel execution: Spin up runs in simulators, compute clusters, or lab robots via a protocol layer.
  • Scoring: Collect quantitative metrics (accuracy, R², RMSE, MAE, F1, resource cost) and qualitative flags (stability, reproducibility).
  • Graph-augmented operators: • Primary Expansion: Small tweaks from a parent plan. • Intra-branch Evolution: Learn from the branch’s own history to avoid repeated mistakes. • Cross-branch Reference: Borrow a great trick from another branch. • Multi-branch Aggregation: Merge the best components from multiple winners.
  • Backpropagation: Feed scores up the solution graph to shape the next explorations.
  • Safety and constraints: For chemistry, enforce atom balance; for bio, respect lab constraints and controls.
  3. Evolution subsystem šŸž Hook: After tasting, update your master recipe book so your next cake is even better. 🄬 The Concept: Evolution updates the three-part memory and adjusts future goals and strategies.
  • How it works: Stores procedural strategies (SPM), episodic trials (TEM), and long-term lessons and novelty guidance (SKM). It selects the next objectives that are promising yet not redundant.
  • Why it matters: Without evolution, each round starts from scratch. šŸž Anchor: Keep the learning that ā€˜lower learning rate + data augmentation’ worked; avoid the failed optimizer setting; try a new architecture family next.
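The four graph-augmented operators listed under Verification can be illustrated with toy solutions represented as dicts of named components, where a higher component value is simply assumed to be better. This sketch is illustrative only; the real system operates on full method descriptions, not numbers.

```python
def score(sol):
    """Toy objective: the sum of all component values."""
    return sum(sol.values())

def primary_expansion(parent, component, delta):
    """Small tweak of one component of a parent solution."""
    child = dict(parent)
    child[component] += delta
    return child

def intra_branch_evolution(branch_history):
    """Continue from the branch's own best attempt so far."""
    return dict(max(branch_history, key=score))

def cross_branch_reference(sol, other_best, component):
    """Borrow one strong component found by a different branch."""
    merged = dict(sol)
    merged[component] = max(merged[component], other_best[component])
    return merged

def multi_branch_aggregation(winners):
    """Merge the best value of every component across several winners."""
    return {k: max(w[k] for w in winners) for k in winners[0]}

branch_a = [{"model": 1, "loss": 2}, {"model": 3, "loss": 1}]
branch_b = [{"model": 2, "loss": 4}]

base = intra_branch_evolution(branch_a)                          # branch A's best
tweaked = primary_expansion(base, "loss", 1)                     # local tweak
borrowed = cross_branch_reference(tweaked, branch_b[0], "loss")  # steal B's loss trick
hybrid = multi_branch_aggregation([borrowed, branch_b[0]])       # component-wise best
```

The hybrid outscores every individual attempt, which is the intuition behind cross-branch sharing: the best final solution is assembled from pieces no single branch found alone.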

Secret sauce pieces (with examples):

  • Cross-Disciplinary Knowledge Graph šŸž Hook: A map reveals shortcuts you didn’t know. 🄬 The Concept: A rich graph linking documents, methods, datasets, settings, and problems.

  • How it works: Schema-guided extraction; combine graph search with dense retrieval; return evidence chains.

  • Why it matters: It finds relevant but non-obvious connections. šŸž Anchor: From ā€˜esterification’ text to a structured reaction node with reactants, products, and by-products.
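A tiny sketch of the "evidence chain" idea: nodes connected by typed edges, with a breadth-first search returning the chain that links two concepts. The graph contents here are hypothetical stand-ins for the real cross-disciplinary graph.

```python
from collections import deque

# Toy graph: nodes are concepts/methods/datasets; typed edges record relations.
edges = {
    "protein folding":  [("depends_on", "energy models")],
    "energy models":    [("supported_by", "molecular dynamics data")],
    "esterification":   [("produces", "water by-product")],
    "water by-product": [("measured_in", "atmospheric datasets")],
}

def evidence_path(graph, start, goal):
    """BFS that returns the chain of typed edges linking start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None

path = evidence_path(edges, "esterification", "atmospheric datasets")
```

Returning the edges, not just the endpoint, is what makes the result an auditable evidence chain rather than a bare answer.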

  • Dynamic Structured Knowledge Flow šŸž Hook: A to-do list with arrows makes big jobs doable. 🄬 The Concept: A DAG of subtasks that evolves as new evidence appears.

  • How it works: Incremental planning, execute-ready nodes, propagate context.

  • Why it matters: Prevents wandering and supports verification. šŸž Anchor: For AMOC: define, review consensus, analyze model biases, compare to proxies, synthesize timelines.
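The flow-graph idea of "execute ready nodes, propagate context" maps naturally onto topological ordering of a DAG. A minimal sketch using Python's standard library, with the AMOC subtasks as hypothetical node names:

```python
import graphlib  # stdlib topological sorting (Python 3.9+)

# Each subtask maps to the set of subtasks it depends on.
dependencies = {
    "define question":   set(),
    "review consensus":  {"define question"},
    "compare models":    {"review consensus"},
    "analyze biases":    {"compare models"},
    "synthesize answer": {"analyze biases", "review consensus"},
}

order = list(graphlib.TopologicalSorter(dependencies).static_order())

context = {}
for task in order:
    # A node runs only once all of its dependencies have produced output.
    inputs = {dep: context[dep] for dep in dependencies[task]}
    context[task] = f"result({task})"  # placeholder for the real subtask output
```

In the real system the graph also grows as new evidence appears; here the structure is fixed, but the ordering guarantee is the same: no step runs before its inputs exist.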

  • Reasoning enhancement via multiple pathways šŸž Hook: Don’t trust one guess—cross-check. 🄬 The Concept: Produce a direct answer, a search-augmented answer, and a self-retrieval-refined answer; then combine.

  • How it works: Ensemble the three for completeness and factuality.

  • Why it matters: Reduces overreliance on any single path. šŸž Anchor: A chemistry prediction is validated both by text evidence and tool-calculated descriptors.
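The three-pathway ensemble can be approximated, at its simplest, as a vote across independent answers. This is a deliberately reduced sketch; the paper's combination step is richer than a majority vote, and the answer strings are invented.

```python
from collections import Counter

def combine(answers):
    """Majority vote across independent reasoning pathways."""
    return Counter(answers).most_common(1)[0][0]

direct = "ester + water"            # answer from the model alone
search_augmented = "ester + water"  # answer grounded in retrieved text
self_refined = "ester only"         # answer after self-retrieval refinement

final = combine([direct, search_augmented, self_refined])
```

The point of the design: a single pathway can fail quietly, but agreement between a direct answer and an evidence-grounded one is a much stronger signal.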

  • Graph-Augmented Monte Carlo Search šŸž Hook: Share clues across detectives. 🄬 The Concept: Classic exploration strengthened by a solution graph that lets branches learn from each other.

  • How it works: Four operators (primary, intra-branch, cross-branch, multi-branch) and backpropagated scores.

  • Why it matters: Faster convergence and better solutions. šŸž Anchor: Time-series model gains by borrowing graph-attention from another winning branch.

  • Structured Cognitive Memory (SPM, TEM, SKM) šŸž Hook: Three notebooks beat one sticky note. 🄬 The Concept: Store strategies, episodes, and high-level knowledge with novelty pressure.

  • How it works: Retrieve by similarity to guide planning, avoid repeats, and push fresh directions.

  • Why it matters: Supports long-horizon autonomy. šŸž Anchor: In week 5, it recalls what beat Kriging last month and targets a new deep-architecture family.
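A minimal sketch of the three-pouch memory, with invented field names: trials land in TEM, reusable strategies in SPM, and distilled lessons in SKM, and retrieval filters for episodes strong enough to guide the next round.

```python
# Toy three-part memory: strategies (SPM), trial episodes (TEM), lessons (SKM).
memory = {"SPM": [], "TEM": [], "SKM": []}

def record_trial(memory, strategy, config, score, lesson=None):
    """File each trial into the right pouch of the memory."""
    memory["TEM"].append({"config": config, "score": score})
    if strategy not in memory["SPM"]:
        memory["SPM"].append(strategy)
    if lesson:
        memory["SKM"].append(lesson)

def retrieve(memory, min_score):
    """Pull back only the episodes strong enough to guide the next round."""
    return [t for t in memory["TEM"] if t["score"] >= min_score]

record_trial(memory, "lower lr + augmentation", {"lr": 1e-4}, 0.91,
             lesson="augmentation helps small datasets")
record_trial(memory, "lower lr + augmentation", {"lr": 1e-3}, 0.62)

good = retrieve(memory, min_score=0.8)
```

Note that the failed trial is still stored: remembering what did not work is exactly what stops the system from redoing bad ideas.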

Inputs and outputs example:

  • Input: ā€œImprove climate downscaling from 2° to 0.25° vs ERA5.ā€
  • Output: Candidate deep models + configs, execution logs, metrics, error maps; updated memory entries; next-step objective (e.g., try spatial transformers with physics-informed loss).
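The input/output pattern above amounts to logging candidate runs with their metrics and ranking them so the winner seeds the next cycle. A toy version, with made-up config names and RMSE values:

```python
# Hypothetical run log for three downscaling candidates; lower RMSE is better.
runs = [
    {"config": "cnn-baseline",        "rmse": 1.42},
    {"config": "spatial-transformer", "rmse": 1.05},
    {"config": "physics-informed",    "rmse": 1.18},
]

def rank(runs, metric, lower_is_better=True):
    """Order candidate runs by a chosen metric so the best feeds the next round."""
    return sorted(runs, key=lambda r: r[metric], reverse=not lower_is_better)

leaderboard = rank(runs, "rmse")
best = leaderboard[0]["config"]  # becomes the objective for the next cycle
```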

04Experiments & Results

šŸž Hook: Report cards are clearer when you compare to the class average and explain what the grades mean.

🄬 The Concept (Meaningful evaluation): The system was tested on tough science exams and real discovery tasks to see if it truly reasons, plans, experiments, and improves.

  • How it works: Benchmarks for reasoning (GAIA, HLE, GPQA, FrontierScience, SGI), algorithm discovery tasks (chemistry, bio, physics, time series), and empirical tasks (earth, life, biological, physical sciences).
  • Why it matters: Without clear tests, we can’t trust the system for real research.

šŸž Anchor: It’s like taking math, science, and reading tests plus doing a real lab project—and then comparing to the best students.

Key benchmark highlights (contextualized):

  • GAIA and HLE: InternAgent-1.5 scores at or near the top, especially on harder problems. Think of getting an A when many strong models are getting B’s.
  • GPQA-diamond: Expert-level science questions—InternAgent-1.5 achieves leading accuracy across biology, chemistry, and physics. Like scoring top marks on an advanced placement exam.
  • FrontierScience: On both Olympiad-style and research-style tasks, it leads overall, particularly in chemistry and physics—indicating both structured problem-solving and open-ended reasoning strength.
  • SGI-bench (Deep Research, Idea Generation): Substantial improvements suggest the planning + retrieval + refinement loop makes the AI more evidence-driven and creative in a grounded way.

Algorithm discovery (numbers with meaning):

  • Chemistry (AutoRYP): Higher R² means better reaction-yield predictions—InternAgent-1.5 beats strong baselines, indicating it learns useful chemical patterns.
  • Molecular dynamics (AutoMD): Lower energy error (MAE) shows better physics modeling—InternAgent-1.5 cuts error notably, like shrinking a ruler’s marking size so you can measure more precisely.
  • Power flow (AutoPower): Lower RMSE boosts reliability in power-grid estimates—valuable for engineering safety.
  • Time series (AutoTSF): Lower MAE across long horizons means more accurate forecasts—handy for energy, weather, and logistics.
  • Genomics (AutoTPPR, AutoEAP): Big jumps (e.g., enhancer-activity correlations) show it captures complex biology signals—important for understanding gene regulation.
  • AI methods (AutoTTS, AutoMem, AutoLM, AutoTTRL): Gains over recent strong baselines suggest the optimization and memory design are generally useful for AI research itself.

Empirical discovery case studies:

  • Earth science: It automated climate diagnostics across 20 models and improved downscaling beyond standard baselines, reproducing fine spatial details closer to ERA5—like upgrading from a blurry to a sharp weather map.
  • Life science: It reconstructed evidence chains to identify targets like GPR160 (HCC) and ARG2 (CRC), matched literature findings, and proposed experiment-ready protocols.
  • Biological science: In fluorescent protein design, it selected sfGFP variants using dry+wet loops and produced a structured report that matched known best practices.
  • Physical science: In reaction prediction, it used chemistry tools and atom-balance logic to beat reasoning-focused models; in drug design, it performed scaffold hopping and hit-to-lead refinements aligned with medicinal chemistry rules.

Surprising findings:

  • The memory system doesn’t just help recall; it reduces unnecessary tool calls and keeps plans tighter and more coherent.
  • Cross-branch mixing in search reliably outperforms single-branch polishing, confirming that sharing wins multiplies progress.
  • The flow graph planning made long, messy questions manageable and verifiable—improving both accuracy and transparency.

Scoreboard in plain terms:

  • Where others were strong, InternAgent-1.5 was often stronger; where tasks were especially tricky or long-horizon, its advantage grew—consistent with a system built to plan, remember, and aggregate the best ideas.

05Discussion & Limitations

šŸž Hook: Even a championship team has limits, needs good equipment, and picks its battles.

🄬 The Concept (Honest assessment): Know where the system shines, where it struggles, and what’s needed to use it well.

  • How it works: List limitations, resources, cautions, and open questions.
  • Why it matters: Clear expectations make the tool safer and more effective.

šŸž Anchor: Like reading the user manual before flying a drone.

Limitations:

  • Tool and data dependency: Results depend on the quality and correctness of external tools (e.g., chemistry toolkits) and datasets (e.g., biases in training or reanalysis data).
  • Wet-lab variability: Real-world lab execution introduces noise (equipment drift, batch effects) that the system must account for; some conditions still require human oversight.
  • Compute cost: Parallel exploration and large retrieval graphs need significant compute and storage.
  • Evaluation complexity: Open-ended science can be hard to grade objectively; some successes are qualitative and need expert review.
  • Novelty vs safety: Pushing into new areas must balance innovation with constraints (safety, ethics, feasibility).

Required resources:

  • Access to scientific APIs, databases, and domain tools (e.g., RDKit, bioinformatics suites, climate data archives).
  • Compute for parallel search, model training, and simulations.
  • If doing wet lab: robotics or lab-on-instrument interfaces, plus safety protocols via a control layer.
  • Storage and indexing for the knowledge graph and multi-part memory.

When NOT to use:

  • High-stakes experiments without human safety checks (e.g., dangerous chemistry or biohazards).
  • Settings with extremely scarce or proprietary data where retrieval is blocked—reduces deep-research power.
  • One-off, simple tasks where the setup overhead outweighs benefits.

Open questions:

  • Stronger theory coupling: How best to encode physics/biology priors so models obey constraints yet stay flexible?
  • Causality: Can the system tease apart cause and effect more reliably in observational settings?
  • Trust and provenance: What are the best ways to show sources, decisions, and evidence so experts can audit quickly?
  • Human–AI teaming: What division of labor yields the fastest, safest progress in each domain?
  • Efficiency: How to keep the benefits of parallel exploration while lowering compute and memory costs?

06Conclusion & Future Work

Three-sentence summary:

  • InternAgent-1.5 turns scientific discovery into a single, memory-powered loop that plans ideas, tests them in code or labs, and learns from results.
  • It uses a cross-disciplinary knowledge map, a flow-graph planner, a graph-augmented optimization engine, and a three-part memory to run long research cycles autonomously.
  • Across benchmarks and real tasks, it reaches top-tier performance and delivers practical discoveries in algorithms and empirical science.

Main achievement:

  • A unified agentic framework that coordinates Generation, Verification, and Evolution with deep research, solution refinement, and long-horizon memory—scaling from neat benchmarks to messy, real-world science.

Future directions:

  • Tighter links between theory, simulation, and experiment; richer safety and provenance layers; more efficient search and memory; broader tool ecosystems across domains.

Why remember this:

  • It shows that sharing wins across branches, planning as a graph, and keeping structured memory can transform AI from a smart assistant into a sustained scientific partner—one that helps us move faster and more reliably from questions to trustworthy discoveries.

Practical Applications

  • Automate literature reviews that synthesize multi-field evidence into a clear, cited research brief.
  • Design, execute, and iterate computational experiments (e.g., forecasting, molecular modeling) with automatic scoring and comparison.
  • Plan and coordinate wet-lab workflows through lab-control protocols, including data capture and analysis.
  • Discover and refine machine learning algorithms by exploring architectures, losses, and training tricks in parallel.
  • Improve climate downscaling and diagnostics by integrating physical insights with data-driven models.
  • Prioritize drug targets by constructing multi-omics evidence graphs and suggesting validation assays.
  • Engineer protein variants by combining sequence analysis, structure prediction, and iterative evaluation.
  • Predict chemical reaction outcomes and by-products with tool-assisted, conservation-aware reasoning.
  • Run long-horizon research projects that remember past strategies and avoid repeating dead ends.
  • Produce audit-ready reports that trace decisions from sources and tools to final conclusions.
#AI for Science Ā· #Autonomous Scientific Discovery Ā· #Agentic AI Ā· #Knowledge Graph Ā· #Flow Graph Ā· #Graph-Augmented Search Ā· #Monte Carlo Search Ā· #Structured Cognitive Memory Ā· #Deep Research Ā· #Solution Refinement Ā· #Algorithm Discovery Ā· #Empirical Discovery Ā· #Long-Horizon Memory Ā· #Cross-Disciplinary Reasoning Ā· #Wet-Lab Automation