MCP Path
Crawl Start SiteUrlIndex
Crawl Start SiteUrlIndex is a public reference for bounded website review runs and discoverable result data. It names the signal, policy, or flow an agent should understand before choosing a concrete tool.
Reference page for a documented MCP capability path.
- Type: MCP path
- Family: Crawler & Site Discovery
- Effect: changes state
- Status: Reference
- Path: 9.23
Purpose
What this entry explains
What it does
This reference explains Crawl Start SiteUrlIndex for bounded website review runs and discoverable result data. It is kept as a named reference so agents can cite the flow without inventing a tool name.
Use when
- Use this entry when an agent needs to carry out the step "Crawl Start SiteUrlIndex" within bounded website review runs and discoverable result data.
- Use it as a reference path when the catalog describes a capability but no single public tool name is explicit.
- Use it before chaining follow-up tools so the next step is based on current evidence.
Reference Use
How agents should cite and apply this area
Examples are maintained at family level and use only public tool names or reference paths already present in the catalog.
Crawl Start SiteUrlIndex describes a behavior for bounded website review runs and discoverable result data. The path shows which signal, gate, behavior, or boundary must be checked before choosing a concrete tool.
An agent cites this path when it needs Crawl Start SiteUrlIndex as context for a decision, block, target check, or follow-up tool choice.
The public source does not name one callable tool for this path. The documentation therefore keeps it as a reference path and does not invent a callable name.
Relevant response signals: siteUrlIndex, siteUrlIndexAdvisory. Safety axes: Browser state, Automation. The reference path alone is not permission to execute. Before acting, check current MCP discovery, visible target, scope, and the actual response.
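The four checks named above can be recorded explicitly so that a skipped check is visible before any call is made. The following is a minimal sketch, not part of the catalog: `PreflightChecks`, `missing_checks`, and `may_proceed` are hypothetical names introduced for illustration, and how each check is actually established depends on the current MCP discovery and client.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PreflightChecks:
    """Hypothetical record of the four checks named in this reference."""

    discovery_is_current: bool  # current MCP discovery was read
    target_is_visible: bool     # the target was confirmed visibly
    scope_is_defined: bool      # the run has an explicit scope
    response_was_read: bool     # the actual response was inspected


def missing_checks(checks: PreflightChecks) -> list[str]:
    """Return the names of checks that have not passed, in declaration order."""
    return [f.name for f in fields(checks) if not getattr(checks, f.name)]


def may_proceed(checks: PreflightChecks) -> bool:
    """Allow the step only when every check has passed."""
    return not missing_checks(checks)
```

An agent could log the result of `missing_checks` as the reason for a block, which keeps the decision reviewable.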
Family example
A task in bounded website review runs and discoverable result data can trigger powerful execution. It therefore needs a confirmed target, approval, and a defined result check before the step runs.
The agent starts with nova.crawl_start, reads the current response or reference, and only then chooses the concrete next tool.
Current discovery, target, user control, warning signals, and result check come before execution.
Contract
Inputs and important response fields
This page is a public reference. Agents and integrators should still read current MCP tool discovery before execution, because schemas can be gated by settings or version.
Inputs
No stable public input field is derived from the catalog source for this path. Read current MCP discovery before execution.
| Response field | Explanation |
|---|---|
| siteUrlIndex | Response field named by the catalog source. Treat it as current evidence for the next decision. |
| siteUrlIndexAdvisory | Response field named by the catalog source. Treat it as current evidence for the next decision. |
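Because the catalog names these two fields but does not describe their shape, a consumer can treat their values as opaque and only record whether each one is present. The sketch below assumes the response has already been parsed into a mapping with the fields at the top level; that layout and the helper name `read_signals` are assumptions, not something the catalog states.

```python
from typing import Any, Mapping

# Field names come from the response-field table; everything else is assumed.
SIGNAL_FIELDS = ("siteUrlIndex", "siteUrlIndexAdvisory")


def read_signals(response: Mapping[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Split the named signals into those present and those absent.

    Values are passed through untouched because their structure is not
    documented for this path.
    """
    present = {name: response[name] for name in SIGNAL_FIELDS if name in response}
    absent = [name for name in SIGNAL_FIELDS if name not in response]
    return present, absent
```

An absent field is a reason to re-read current discovery rather than to guess a default value.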
Safety
Boundary before execution
This path can change browser, page, or workflow state. The target and expected result must be clear before execution.
Confirm the current target first, perform only the intended bounded action, and verify the resulting page or workflow state afterwards.
For humans, this entry names the browser or workflow state in bounded website review runs and discoverable result data that may change, so the action can be reviewed before and after execution.
High-Impact Review
Execution boundary and recheck hints
Review category: Scheduler/tasks/automation
Runs need scope, budget, progress, stop condition, and reviewable terminal status before they start or continue.
False assumption: once started, a run may continue until success.
Task, schedule, variables, workspace, and run status must remain reviewable by the user.
Bound automations, poll progress, check terminal status, and avoid chaining when results are unclear.
Abort or recheck when budget, target set, run ID, workspace, or result status becomes unclear.
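The review hints above (bounded budget, polling, terminal status, abort on unclear state) can be sketched as a small loop. This is an illustration under stated assumptions: `fetch_status` is a hypothetical callable supplied by the caller, the `run_id` and `status` keys and the status names are invented for the example, and none of them are defined by the catalog.

```python
from typing import Any, Callable, Mapping

# Hypothetical status names; the catalog does not define them.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def poll_until_terminal(
    fetch_status: Callable[[], Mapping[str, Any]],
    expected_run_id: str,
    max_polls: int,
) -> str:
    """Poll a run within a fixed budget and return a reviewable outcome."""
    for _ in range(max_polls):
        snapshot = fetch_status()
        # Abort as soon as the run can no longer be identified.
        if snapshot.get("run_id") != expected_run_id:
            return "abort: run ID unclear"
        status = snapshot.get("status")
        if status in TERMINAL_STATUSES:
            return f"terminal: {status}"
        # Anything other than a known in-progress status is treated as unclear.
        if status != "running":
            return "abort: result status unclear"
    # The budget is a hard stop, not a hint to keep trying until success.
    return "abort: budget exhausted"
```

A real loop would normally wait between polls; the delay is left out here so the control flow stays easy to follow. Returning a plain outcome string instead of raising keeps every exit path, including the budget stop, visible to the user.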
Safety Axes
How this path can affect work
Axes are stable catalog signals for humans, agents, and LLM discovery. One path can carry several axes.
browser_state_change
Changes tab, navigation, focus, claim, scroll position, window state, or browser environment.
Confirm the target context visibly before execution and verify that the expected browser state was reached afterwards.
automation_run
Starts or monitors crawls, sequences, schedulers, tasks, batches, or longer runs.
Keep scope, budget, progress, stop condition, and terminal status visible before and during the run.