MCP Path

Crawl Start SiteUrlIndex

Crawl Start SiteUrlIndex is a public reference for bounded website review runs and discoverable result data. It names the signal, policy, or flow an agent should understand before choosing a concrete tool.

Catalog path

Reference page for a documented MCP capability path.

Type: MCP path
Family: Crawler & Site Discovery
Effect: changes state
Status: Reference
Path: 9.23

Purpose

What this entry explains

What it does

This reference explains Crawl Start SiteUrlIndex for bounded website review runs and discoverable result data. It is kept as a named reference so agents can cite the flow without inventing a tool name.

Use when

Use this entry when an agent needs to carry out the step "Crawl Start SiteUrlIndex" as part of a bounded website review run with discoverable result data.
Use it as a reference path when the catalog describes a capability but no single public tool name is explicit.
Use it before chaining follow-up tools so the next step is based on current evidence.

Reference Use

How agents should cite and apply this area

Examples are maintained at family level and use only public tool names or reference paths already present in the catalog.

Signal, gate, behavior, boundary

Crawl Start SiteUrlIndex describes a behavior for bounded website review runs and discoverable result data. The path shows which signal, gate, behavior, or boundary must be checked before choosing a concrete tool.

When agents cite it

An agent cites this path when it needs Crawl Start SiteUrlIndex as context for a decision, block, target check, or follow-up tool choice.

Why no callable name

The public source does not name one callable tool for this path. The documentation therefore keeps it as a reference path and does not invent a callable name.

Signals and rule

Relevant response signals: `siteUrlIndex`, `siteUrlIndexAdvisory`.
Safety axes: Browser state, Automation.
The reference path alone is not permission to execute. Before acting, check current MCP discovery, the visible target, the scope, and the actual response.
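The rule above can be sketched as a small pre-execution gate. This is only an illustration: the `Preflight` record, its field names, and both helper functions are hypothetical and are not part of any public schema or tool. The sketch shows that citing the path never counts as a passed check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preflight:
    """Hypothetical record of the checks named in the rule above."""
    tool_in_current_discovery: bool  # tool name appears in current MCP discovery
    target_visible: bool             # target was confirmed visibly
    scope_confirmed: bool            # scope of the action is agreed


def missing_checks(p: Preflight) -> list[str]:
    """Return the checks that have not passed yet, in a fixed order."""
    checks = [
        ("current MCP discovery", p.tool_in_current_discovery),
        ("visible target", p.target_visible),
        ("scope", p.scope_confirmed),
    ]
    return [name for name, ok in checks if not ok]


def may_execute(p: Preflight) -> bool:
    """Every check must pass; the reference path alone grants nothing."""
    return not missing_checks(p)
```

Returning the list of missing checks, rather than a bare boolean, lets the agent report exactly what still needs confirmation. The fourth item in the rule, the actual response, can only be checked after a call has returned.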

Family example

A task in this family (bounded website review runs and discoverable result data) can trigger powerful execution. It therefore needs a confirmed target, approval, and a planned result check before the step runs.

The agent starts with nova.crawl_start, reads the current response or reference, and only then chooses the concrete next tool.

nova.crawl_start, nova.crawl_status, nova.crawl_results

Current discovery, target, user control, warning signals, and result check come before execution.
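A minimal sketch of the family example follows, assuming the three tools are used in the order listed. Only the tool names come from the catalog. The `call_tool` callable, the argument mappings, and the `followup_args_from` and `is_terminal` helpers are hypothetical and must be supplied by the caller from current MCP discovery, because no stable input or status schema is documented here.

```python
from typing import Any, Callable, Mapping, Optional

Response = Mapping[str, Any]
# Hypothetical transport: (tool name, arguments) -> response mapping.
ToolCall = Callable[[str, Mapping[str, Any]], Response]


def crawl_flow(
    call_tool: ToolCall,
    start_args: Mapping[str, Any],
    followup_args_from: Callable[[Response], Mapping[str, Any]],
    is_terminal: Callable[[Response], bool],
) -> dict[str, Optional[Response]]:
    """Start a crawl, read the response, and only then choose the next tool."""
    start = call_tool("nova.crawl_start", start_args)

    # The follow-up arguments are derived from the actual start response,
    # not guessed in advance.
    followup_args = followup_args_from(start)
    status = call_tool("nova.crawl_status", followup_args)

    # Do not chain into results while the run is not in a terminal state.
    if not is_terminal(status):
        return {"start": start, "status": status, "results": None}

    results = call_tool("nova.crawl_results", followup_args)
    return {"start": start, "status": status, "results": results}
```

Every intermediate response is returned, so each decision stays reviewable. When the status is not terminal, the function stops instead of fetching results.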

Contract

Inputs and important response fields

This page is a public reference. Agents and integrators should still read current MCP tool discovery before execution, because schemas can be gated by settings or version.

Inputs

No stable public input field is derived from the catalog source for this path. Read current MCP discovery before execution.

Response field	Explanation
`siteUrlIndex`	Response field named by the catalog source. Treat it as current evidence for the next decision.
`siteUrlIndexAdvisory`	Response field named by the catalog source. Treat it as current evidence for the next decision.
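As a hedged illustration of treating these fields as current evidence, the sketch below collects whichever of the two named fields are present in a response. It assumes the fields appear at the top level of a mapping-like response, which the catalog source does not confirm. The function name and the constant are hypothetical.

```python
from typing import Any, Mapping

# Field names taken from the table above; their value shapes are not documented here.
SIGNAL_FIELDS = ("siteUrlIndex", "siteUrlIndexAdvisory")


def read_signals(response: Mapping[str, Any]) -> dict[str, Any]:
    """Return only the named signal fields that the response actually contains."""
    return {name: response[name] for name in SIGNAL_FIELDS if name in response}
```

An empty result means that neither signal was returned. That is itself a reason to recheck before choosing a follow-up tool, not a reason to assume defaults.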

Safety

Boundary before execution

Effect

Can change browser, page, or workflow state. The target and expected result must be clear before execution.

Agent rule

Confirm the current target first, perform only the intended bounded action, and verify the resulting page or workflow state afterwards.

Human control

For humans, this entry names the browser or workflow state that may change during a bounded website review run, so the action can be reviewed before and after execution.

High-Impact Review

Execution boundary and recheck hints

Review category: Scheduler/tasks/automation

Execution boundary

Runs need scope, budget, progress, stop condition, and reviewable terminal status before they start or continue.

Typical false assumption

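False assumption: once started, a run may continue until it succeeds. A run must stay within its scope and budget and stop at its stop condition, whether or not it has succeeded.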

Visible user control

Task, schedule, variables, workspace, and run status must remain reviewable by the user.

Agent rule

Bound automations, poll progress, check terminal status, and avoid chaining when results are unclear.

Abort or recheck

Abort or recheck when budget, target set, run ID, workspace, or result status becomes unclear.
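The boundary and abort rules in this section can be sketched as a bounded polling loop. This is an assumption-laden illustration: `get_status` and `classify` are hypothetical callables supplied by the caller, and the verdict strings "running" and "terminal" are invented for this sketch and are not documented status values. A real loop would also wait between polls.

```python
from typing import Any, Callable, Optional, Tuple


def poll_bounded(
    get_status: Callable[[], Any],
    classify: Callable[[Any], str],
    max_polls: int,
) -> Tuple[str, int, Optional[Any]]:
    """Poll at most max_polls times and stop on terminal or unclear status.

    classify is expected to return "running" or "terminal"; any other
    value is treated as unclear and ends the loop with "recheck".
    """
    last: Optional[Any] = None
    for attempt in range(1, max_polls + 1):
        last = get_status()
        verdict = classify(last)
        if verdict == "terminal":
            return ("terminal", attempt, last)
        if verdict != "running":
            # Unclear status: hand control back instead of continuing.
            return ("recheck", attempt, last)
    # The poll budget is spent and the run never reached a terminal status.
    return ("budget_exhausted", max(max_polls, 0), last)
```

The loop never continues "until success". It ends on a terminal status, on an unclear status, or when the poll budget runs out, and it reports which of the three happened.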

Safety Axes

How this path can affect work

Axes are stable catalog signals for humans, agents, and LLM discovery. One path can carry several axes.

Browser state (`browser_state_change`)

Changes tab, navigation, focus, claim, scroll position, window state, or browser environment.

Confirm the target context visibly before execution and verify that the expected browser state was reached afterwards.

Automation (`automation_run`)

Starts or monitors crawls, sequences, schedulers, tasks, batches, or longer runs.

Keep scope, budget, progress, stop condition, and terminal status visible before and during the run.