Alpha v0.6.0: Forced vs Auto Tool Mode

Left to its own devices, a language model asked to process a document will often respond with a question: "What would you like me to do with this?" That response is useless in a workflow context — the user already told the system what to do when they selected the workflow. The tool mode switch is the mechanism that prevents this behavior while still allowing the agent the flexibility to ask meaningful questions once it has actually analyzed the file.

Two Modes of Operation

The reason node invokes the language model in one of two configurations:

# Forced mode: the LLM must call a tool and cannot respond with text only.
# Simplified here; the forced-mode tool list is narrowed later in this post.
llm_forced = llm.bind_tools(tools, tool_choice="any")

# Auto mode: no tool_choice is set, so the LLM decides on its own whether to
# call a tool or respond with text.
llm_auto = llm.bind_tools(tools)

In forced mode, the model cannot produce a text-only response. It must select at least one tool from the available set and invoke it. In auto mode, the model can either call tools or produce a final answer — it decides based on its assessment of what the workflow requires.

The distinction matters because workflow completion looks like a text response. When the agent has finished all required phases, stored the artifact, and called complete_workflow, the graph routes to the finish node, which only fires if the model's last response was text with no pending tool calls. Forced mode can never produce such a response, so it must end at some point or the workflow can never complete.
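The routing rule described above can be sketched in a few lines. This is a minimal illustration, not the project's actual graph code: FakeAIMessage, route_after_reason, and the "tools" node name are hypothetical stand-ins, and the only behavior taken from the post is that the finish node is reached when the last response carries no pending tool calls.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for an LLM response message."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def route_after_reason(last_message) -> str:
    """Pick the next node from the model's last response (node names assumed)."""
    # Pending tool calls mean the agent is still working, so run the tools.
    if getattr(last_message, "tool_calls", None):
        return "tools"
    # A text-only response is the only way to reach the finish node.
    return "finish"


forced_turn = FakeAIMessage(
    tool_calls=[{"name": "doc-parser-mcp:extract_text", "args": {}}]
)
final_turn = FakeAIMessage(content="The workflow is complete.")

print(route_after_reason(forced_turn))  # tools
print(route_after_reason(final_turn))   # finish
```

Every forced-mode response looks like forced_turn, so under this rule the finish branch is unreachable until the agent is in auto mode.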

The Threshold Formula

The switch from forced to auto happens when the agent has called enough distinct tools to demonstrate that it has actually engaged with the workflow. The threshold is computed dynamically at each turn:

unique_tools = _count_unique_tool_calls(state["messages"])
min_threshold = state.get("min_tools_for_auto", _MIN_TOOLS_FOR_AUTO_MODE)
# _MIN_TOOLS_FOR_AUTO_MODE = 5 (default constant)

dynamic_threshold = max(2, min(min_threshold, len(available_tools)))

if unique_tools >= dynamic_threshold:
    llm_to_use = llm_auto   # agent can now respond or call tools
else:
    llm_to_use = llm_forced  # agent must call a tool

Three things to note about this formula:

Floor of 2. Even if the spec sets min_tools_for_auto = 1, the agent must call at least 2 distinct tools before switching. A single tool call is not sufficient evidence that the file has been processed meaningfully — one call could be a lookup, not an analysis.

Cap at available tool count. If the workflow only has 3 tools registered, requiring 5 distinct calls before switching is impossible. The cap prevents the agent from being stuck in forced mode indefinitely when the tool set is small.

Per-workflow configurability. The min_tools_for_auto field in the workflow spec overrides the default of 5. A simple 2-phase workflow can set this to 2; a complex 8-phase workflow might set it to 7. The threshold is a spec-level policy, not a system-level constant.
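To make the three properties concrete, the formula can be evaluated for a few spec values and tool counts. The function name compute_dynamic_threshold is an assumption introduced for illustration; the expression inside it is the one from the post.

```python
_MIN_TOOLS_FOR_AUTO_MODE = 5  # default constant quoted in the post


def compute_dynamic_threshold(min_threshold: int, available_tool_count: int) -> int:
    """Apply the cap (available tool count) first, then the floor of 2."""
    return max(2, min(min_threshold, available_tool_count))


# (min_tools_for_auto, number of registered tools)
print(compute_dynamic_threshold(_MIN_TOOLS_FOR_AUTO_MODE, 8))  # 5: default applies
print(compute_dynamic_threshold(_MIN_TOOLS_FOR_AUTO_MODE, 3))  # 3: capped by tool count
print(compute_dynamic_threshold(1, 8))                         # 2: floor overrides the spec
print(compute_dynamic_threshold(7, 10))                        # 7: spec override
print(compute_dynamic_threshold(5, 1))                         # 2: floor wins over the cap
```

The last case is worth noticing: because the floor is applied after the cap, a workflow with fewer than two registered tools still gets a threshold of 2.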

What Counts — and What Doesn't

The count is computed from distinct tool names in the message history, with important exclusions:

_CONTROL_FLOW_TOOLS = frozenset({"request_user_input", "complete_workflow"})

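def _count_unique_tool_calls(messages: list) -> int:
    """Count distinct non-control-flow tool names called anywhere in the history."""
    tool_names: set[str] = set()
    for msg in messages:
        # Messages without tool calls (user, system, tool results) contribute nothing.
        for tc in getattr(msg, "tool_calls", None) or []:
            name = tc.get("name", "")
            # Skip unnamed calls and routing signals.
            if name and name not in _CONTROL_FLOW_TOOLS:
                tool_names.add(name)
    return len(tool_names)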

request_user_input and complete_workflow are excluded because they are routing signals, not workflow-progress indicators. Asking the user a question doesn't demonstrate that the agent has analyzed the document. Calling complete_workflow is the end signal: if it counted toward the threshold, the agent could reach auto mode partly by declaring itself done, and then finish with a text response before doing the work the threshold is designed to guarantee.

The count is also by unique name, not by call count. Calling resume-profile-mcp:parse_document five times counts the same as calling it once. This prevents gaming the threshold by repeating the same tool call. Only calling different tools — demonstrating engagement across multiple workflow phases — advances the count.

Why store_artifact Is Excluded from Forced Mode

There is one additional subtlety: store_artifact is excluded from the forced-mode tool list but not from auto mode:

_ARTIFACT_TOOLS = frozenset({"store_artifact"})

# Forced mode: exclude store_artifact to prevent premature uploads.
# The `or tools` fallback keeps the list non-empty if every tool is an artifact tool.
forced_tools = [t for t in tools if t.name not in _ARTIFACT_TOOLS] or tools
llm_forced = llm.bind_tools(forced_tools, tool_choice="any")

# Auto mode: all tools available including store_artifact
llm_auto = llm.bind_tools(tools)

In forced mode, the agent must call a tool — but it shouldn't be able to prematurely store an artifact when it hasn't finished the workflow. If store_artifact were available in forced mode, the agent could technically satisfy the "must call a tool" constraint by uploading an empty or incomplete result and then immediately calling complete_workflow.

Because store_artifact is excluded from forced mode, the agent can store an artifact only once it has reached auto mode, which happens only after it has called enough distinct data tools to satisfy the threshold. The artifact upload is gated behind genuine workflow progress.
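The filter and its fallback can be exercised without a model. In this sketch, select_forced_tools and the tool helper are hypothetical names, and the tool objects are plain stand-ins with only a name attribute; the list expression is the one from the post.

```python
from types import SimpleNamespace

_ARTIFACT_TOOLS = frozenset({"store_artifact"})


def select_forced_tools(tools: list) -> list:
    """Drop artifact tools; fall back to the full list if nothing would remain."""
    return [t for t in tools if t.name not in _ARTIFACT_TOOLS] or tools


def tool(name: str) -> SimpleNamespace:
    """Hypothetical stand-in for a registered tool object."""
    return SimpleNamespace(name=name)


all_tools = [
    tool("doc-parser-mcp:extract_text"),
    tool("store_artifact"),
    tool("complete_workflow"),
]
forced_names = [t.name for t in select_forced_tools(all_tools)]
print(forced_names)  # ['doc-parser-mcp:extract_text', 'complete_workflow']

# Edge case: if only artifact tools are registered, the fallback returns them unchanged.
fallback_names = [t.name for t in select_forced_tools([tool("store_artifact")])]
print(fallback_names)  # ['store_artifact']
```

The fallback presumably exists so that forced mode is never bound to an empty tool list, which a "must call a tool" setting could not satisfy. In that edge case the gating described above no longer applies.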

Effect of the Transition

When the agent crosses the threshold, behavior changes noticeably:

# Before threshold (forced mode):
Turn 3: [Tool call: doc-parser-mcp:extract_text]
Turn 4: [Tool call: text-analysis-mcp:extract_keywords]
Turn 5: [Tool call: resume-profile-mcp:extract_profile]
# Agent cannot respond with text regardless of what it "wants" to say

# After threshold (auto mode, unique_tools ≥ dynamic_threshold):
Turn 6: "Based on my analysis of the document, I have one question
         before proceeding: would you prefer your resume tailored
         for the engineering path or the product path?"
         [Tool call: request_user_input]
# Or simply:
Turn 6: [Tool call: store_artifact, complete_workflow]
# If no clarification is needed

The transition from forced to auto is a one-way gate within a single workflow execution. Once the threshold is crossed, the agent operates in auto mode for the remainder of the workflow. There is no mechanism to switch back to forced mode: an agent that has called enough distinct tools to cross the threshold has demonstrably engaged with the content.
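One way to see why the gate is one-way: the mode is recomputed every turn from the message history, and a count of distinct names over a history that only grows can never decrease. The simulation below is a sketch under two assumptions that the post does not state explicitly: the history is append-only (never trimmed), and the threshold stays fixed during the run. It also simplifies the history to a list of tool names; simulate_modes and count_unique are hypothetical names.

```python
_CONTROL_FLOW_TOOLS = frozenset({"request_user_input", "complete_workflow"})


def count_unique(tool_call_history: list[str]) -> int:
    """Distinct non-control-flow tool names seen so far."""
    return len({n for n in tool_call_history if n not in _CONTROL_FLOW_TOOLS})


def simulate_modes(planned_calls: list[str], threshold: int) -> list[str]:
    """Record the mode chosen before each turn as the history grows."""
    history: list[str] = []
    modes: list[str] = []
    for name in planned_calls:
        modes.append("auto" if count_unique(history) >= threshold else "forced")
        history.append(name)  # append-only: the count can only stay or rise
    return modes


planned = [
    "doc-parser-mcp:extract_text",
    "text-analysis-mcp:extract_keywords",
    "resume-profile-mcp:extract_profile",
    "request_user_input",
    "store_artifact",
]
modes = simulate_modes(planned, threshold=3)
print(modes)  # ['forced', 'forced', 'forced', 'auto', 'auto']
```

If either assumption were broken, for example by trimming old messages out of the history, the recomputed count could drop and the agent could fall back into forced mode.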

Interaction with the request_user_input Guardrail

The tool mode switch interacts with the request_user_input guardrail described in the MCP architecture post. That guardrail requires at least 2 data tools to be called before the agent is allowed to ask the user anything. The threshold formula with its floor of 2 means that by the time the agent reaches auto mode, the user-input guardrail condition is already satisfied.

These are not independent constraints — they work in concert. Forced mode pushes the agent toward data tools. The user-input guardrail prevents asking questions before data tools have been called. The threshold formula ensures forced mode persists until enough data has been gathered to make any question the agent asks informed rather than reflexive.
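The relationship between the two constraints can be checked mechanically. The guardrail's real implementation lives in the other post and is not shown here, so may_request_user_input below is a hypothetical reconstruction from the one-sentence description above, and it assumes a "data tool" is any tool outside the control-flow set.

```python
_CONTROL_FLOW_TOOLS = frozenset({"request_user_input", "complete_workflow"})
_MIN_DATA_TOOLS_BEFORE_QUESTION = 2  # value from the post; constant name is assumed


def dynamic_threshold(min_threshold: int, available_tool_count: int) -> int:
    return max(2, min(min_threshold, available_tool_count))


def may_request_user_input(called_tool_names: set[str]) -> bool:
    """Hypothetical guardrail: enough distinct data tools have been called."""
    data_tools = called_tool_names - _CONTROL_FLOW_TOOLS
    return len(data_tools) >= _MIN_DATA_TOOLS_BEFORE_QUESTION


# Whatever the spec value and tool count, the threshold never drops below 2.
lowest_threshold = min(
    dynamic_threshold(spec_value, tool_count)
    for spec_value in range(0, 10)
    for tool_count in range(0, 10)
)
print(lowest_threshold)  # 2

# An agent sitting exactly at the lowest threshold already passes the guardrail.
called = {"doc-parser-mcp:extract_text", "text-analysis-mcp:extract_keywords"}
print(may_request_user_input(called))  # True
```

Under these assumptions, reaching auto mode implies the guardrail condition, while the reverse does not hold when the threshold is above 2.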

The tool mode switch is not a performance optimization — it's a behavioral contract. Forced mode guarantees the agent reads before it speaks. Auto mode grants it the judgment to decide what to do next. The transition moment is the boundary between "gathering information" and "reasoning about information."