Skill+CLI
This page currently applies to `@xcrawl/cli` version 0.2.7.
XCrawl Skill+CLI lets you run scrape, search, map, crawl, account, auth, and config workflows directly from the terminal.
It can also be used as an AI-callable tool layer in agent workflows, not only as a human-facing terminal command.
This page is based on the current local CLI source, package metadata, changelog, integration tests, and command help output.
Install
Node.js requirement:
>=18
Run with npx:
npx -y @xcrawl/cli@0.2.7 --help
Install globally:
npm install -g @xcrawl/cli
xcrawl --help
Authenticate
Browser login
xcrawl init -y --browser
xcrawl login --browser
Behavior in 0.2.7:
- `xcrawl init -y --browser` is the non-interactive setup path for install scripts and first-run onboarding
- Browser auth opens https://dash.xcrawl.com/cli-auth; if the browser cannot be opened, the CLI prints the URL for manual access
- The browser flow polls every `3000ms` and times out after `60` attempts
- `Ctrl+C` cancels an in-progress browser auth session
Save API key locally
xcrawl login --api-key <your_api_key>
xcrawl init --api-key <your_api_key>
This stores your API key in:
~/.xcrawl/config.json
Interactive fallback
When an interactive terminal has no configured API key, these entry points prompt for authentication instead of failing immediately:
- `xcrawl`
- `xcrawl login`
- `xcrawl init`
- Authenticated commands such as `status`, `scrape`, `search`, `map`, and `crawl`
Prompt text:
XCrawl CLI
Turn websites into LLM-ready data
Welcome! To get started, authenticate with your XCrawl account.
1. Login with browser (recommended)
2. Enter API key manually
Tip: You can also set XCRAWL_API_KEY environment variable
When `--json` is enabled on an authenticated command, auth prompts and browser-status messages are written to stderr so stdout remains valid JSON.
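The stderr/stdout split means JSON pipelines stay clean even when the CLI prints status messages. A minimal illustrative sketch, not the CLI's code — `fake_cli` is a made-up stand-in for an authenticated `xcrawl … --json` call:

```shell
#!/bin/sh
# fake_cli stands in for an authenticated command run with --json:
# human-facing messages go to stderr, JSON goes to stdout.
fake_cli() {
  echo "Opening browser for authentication..." >&2
  echo '{"ok":true,"credits":100}'
}

# stdout stays valid JSON even though a status message was printed
json=$(fake_cli 2>/dev/null)
echo "$json"
```

Dropping stderr with `2>/dev/null` (or piping stdout straight into a JSON consumer) is safe because only the JSON payload travels on stdout.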
Use environment variables
export XCRAWL_API_KEY=<your_api_key>
The CLI resolves runtime config in this order:
- CLI flags
- Environment variables
- Local config file (~/.xcrawl/config.json)
- Built-in defaults
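The precedence above can be sketched as nested fallbacks. This is an illustrative shell sketch, not the CLI's implementation; `resolve_timeout` and its arguments are made-up names standing in for a flag value and a config-file value:

```shell
#!/bin/sh
# Resolve the request timeout using the documented precedence:
# CLI flag > environment variable > config-file value > built-in default (30000 ms).
resolve_timeout() {
  flag="$1"; file_value="$2"
  echo "${flag:-${XCRAWL_TIMEOUT_MS:-${file_value:-30000}}}"
}

XCRAWL_TIMEOUT_MS=15000
resolve_timeout 5000 10000   # flag wins -> 5000
resolve_timeout "" 10000     # env wins  -> 15000
unset XCRAWL_TIMEOUT_MS
resolve_timeout "" 10000     # config file wins -> 10000
resolve_timeout "" ""        # built-in default -> 30000
```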
Shortcuts
- `xcrawl https://example.com` is treated as `xcrawl scrape https://example.com`
- `xcrawl crawl https://example.com` is treated as `xcrawl crawl start https://example.com`
- `xcrawl` with a configured API key prints help; without one in an interactive terminal it starts the auth flow
- `xcrawl -V` or `xcrawl --version` prints the current CLI version
- `xcrawl help <command>` shows command-specific help
Defaults
| Key | Default |
|---|---|
| API base URL | https://run.xcrawl.com |
| Default output format | markdown |
| Default batch output directory | .xcrawl |
| Request timeout | 30000 ms |
| Debug | false |
Environment Variables
| Variable | Description |
|---|---|
| XCRAWL_API_KEY | API key |
| XCRAWL_API_BASE_URL | Override API base URL |
| XCRAWL_DEFAULT_FORMAT | Default scrape format |
| XCRAWL_OUTPUT_DIR | Default batch output directory |
| XCRAWL_TIMEOUT_MS | Request timeout in milliseconds |
| XCRAWL_DEBUG | Debug mode; accepts values like 1, true, yes, on, 0, false, no, off |
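XCRAWL_DEBUG accepts several spellings of true and false. A sketch of the kind of normalization the table describes (`is_truthy` is an illustrative name, not a CLI function):

```shell
#!/bin/sh
# Normalize a debug-flag string to "true" or "false", matching the
# accepted values listed above: 1/true/yes/on vs 0/false/no/off.
is_truthy() {
  case $(echo "$1" | tr '[:upper:]' '[:lower:]') in
    1|true|yes|on) echo "true" ;;
    *)             echo "false" ;;
  esac
}

is_truthy yes    # -> true
is_truthy ON     # -> true
is_truthy off    # -> false
```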
Commands
login
Authenticate and save the XCrawl API key to local config.
xcrawl login [--browser | --api-key <your_api_key>]
Options:
| Option | Description |
|---|---|
| --api-key <key> | XCrawl API key to save directly |
| --browser | Authenticate with the XCrawl browser flow |
| --json | Output machine-readable JSON |
Current behavior in 0.2.7:
- Running `xcrawl login` with no flags in an interactive terminal opens the same browser-vs-manual selection prompt shown above
- `--api-key` and `--browser` are mutually exclusive
- In non-interactive mode, `xcrawl login` without an explicit auth method returns an auth error with next steps
logout
Clear the locally saved API key.
xcrawl logout
Options:
| Option | Description |
|---|---|
--json | Output machine-readable JSON |
status
Show account and credit package status.
xcrawl status
Options:
| Option | Description |
|---|---|
| --api-key <key> | Override API key |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file |
Current behavior in 0.2.7:
- `status` always calls https://api.xcrawl.com/web_v1/user/credit-user-info
- Authentication is sent as query parameter `app_key=<your_api_key>`
- `Username` is no longer returned in the command output
- `--api-base-url` is intentionally not available on this command
- If the API key is missing and the terminal is interactive, `status` triggers the auth prompt before the request is sent
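Because authentication travels as a query parameter, the request is easy to reproduce outside the CLI. A hedged sketch of the equivalent URL construction (the endpoint and `app_key` parameter come from the behavior notes above; the response shape is not shown, and the key below is a placeholder):

```shell
#!/bin/sh
# Build the credit-info request URL the status command uses.
# Authentication travels as the app_key query parameter, not a header.
XCRAWL_API_KEY="<your_api_key>"
url="https://api.xcrawl.com/web_v1/user/credit-user-info?app_key=${XCRAWL_API_KEY}"
echo "$url"
# curl -s "$url"   # equivalent request (commented out: needs a real key)
```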
scrape
Scrape one or more URLs.
xcrawl scrape [url...] [options]
Arguments:
| Argument | Description |
|---|---|
| [url...] | One or more http or https URLs |
Options:
| Option | Description |
|---|---|
| --api-key <key> | Override API key |
| --api-base-url <url> | Override API base URL |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file; for multiple URLs this is treated as a directory |
| --format <format> | Output format shown in help: markdown, json, html, screenshot |
| --wait-for <selector> | Wait-for value accepted by the CLI |
| --headers <k:v,k2:v2> | Additional request headers |
| --cookies <cookies> | Cookie string such as a=1; b=2 |
| --proxy <proxy> | Proxy value |
| --input <path> | Read URLs from a newline-delimited file |
| --concurrency <n> | Concurrent scrape workers; default is 3 in batch mode |
Examples:
xcrawl scrape https://example.com --format markdown
xcrawl https://example.com
xcrawl scrape --input ./urls.txt --concurrency 3 --json
xcrawl scrape https://example.com --headers "Accept-Language:en-US,X-Test:1"
Batch behavior:
- At least one URL is required, either from positional arguments or `--input`
- Input files must be newline-delimited; empty lines and lines starting with `#` are ignored
- Multiple URLs with `--json` and no `--output` print a JSON array to stdout
- Multiple URLs without `--json` write one file per URL
- When scraping multiple URLs without `--output`, files are written to `.xcrawl/`
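The input-file rules can be previewed before a run. A sketch using standard tools; the grep pipeline mimics the documented filtering (skip blank lines and `#` comments), it is not the CLI's code:

```shell
#!/bin/sh
# Build a sample URL list and show which lines a batch scrape would use:
# blank lines and lines starting with "#" are skipped.
cat > urls.txt <<'EOF'
# staging targets
https://example.com

https://example.org
EOF

grep -v '^#' urls.txt | grep -v '^$'
```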
Current behavior in 0.2.7:
- `text` is also accepted by the implementation as a scrape format, even though the help text does not list it
- `text` falls back to text-style rendering of markdown-like content
- `--wait-for` is accepted by the command layer but is not currently forwarded to the XCrawl scrape request
- `--proxy` is forwarded as `proxy.location`; region-style values such as `US` match the current request shape better than a literal proxy URL
search
Run a web search.
xcrawl search <query...> [options]
Arguments:
| Argument | Description |
|---|---|
| <query...> | Search query; all positional parts are joined with spaces |
Options:
| Option | Description |
|---|---|
| --api-key <key> | Override API key |
| --api-base-url <url> | Override API base URL |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file |
| --limit <n> | Result limit; default is 10 |
| --country <country> | Country code, for example US |
| --language <language> | Language code, for example en |
Examples:
xcrawl search "xcrawl cli" --limit 10
xcrawl search "site:docs.xcrawl.com CLI" --country US --language en
Current behavior in 0.2.7:
- `--country` is forwarded to the XCrawl search API as `location`
map
Generate site map links for a URL.
xcrawl map <url> [options]
Arguments:
| Argument | Description |
|---|---|
| <url> | Target http or https URL |
Options:
| Option | Description |
|---|---|
| --api-key <key> | Override API key |
| --api-base-url <url> | Override API base URL |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file |
| --max-depth <n> | Maximum traversal depth |
| --limit <n> | Maximum number of links |
Example:
xcrawl map https://example.com --limit 100
Current behavior in 0.2.7:
- The command accepts `--max-depth`, but the current request payload only forwards `url` and `limit`
crawl
Manage crawl jobs.
Root command:
xcrawl crawl [command]
Available subcommands:
- `xcrawl crawl start <url>`
- `xcrawl crawl status <job-id>`
crawl start
Start a crawl job for a target URL.
xcrawl crawl start <url> [options]
xcrawl crawl <url> [options]
Arguments:
| Argument | Description |
|---|---|
| <url> | Target http or https URL |
Options:
| Option | Description |
|---|---|
| --wait | Poll until the job reaches completed or failed |
| --interval <ms> | Polling interval; default is 2000 |
| --wait-timeout <ms> | Polling timeout; default is 60000 |
| --max-pages <n> | Maximum pages to crawl |
| --api-key <key> | Override API key |
| --api-base-url <url> | Override API base URL |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file |
Example:
xcrawl crawl https://example.com --wait --interval 2000 --wait-timeout 60000
crawl status
Fetch crawl job status by job ID.
xcrawl crawl status <job-id> [options]
Arguments:
| Argument | Description |
|---|---|
| <job-id> | Crawl job ID |
Options:
| Option | Description |
|---|---|
| --api-key <key> | Override API key |
| --api-base-url <url> | Override API base URL |
| --timeout <ms> | Request timeout in milliseconds |
| --debug | Enable debug output |
| --json | Output machine-readable JSON |
| --output <path> | Save output to a file |
Current behavior in 0.2.7:
- Crawl statuses `queued` and `running` are normalized to `pending` and `crawling`
- `completedPages` is derived from the returned page array length
- `failedPages` is currently always reported as `0`
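The `--wait` behavior on `crawl start` amounts to polling status until a terminal state. An illustrative loop, not the CLI's code — `fake_status` is a made-up stand-in that reaches `completed` on the third poll, where a real script would call `xcrawl crawl status <job-id> --json`:

```shell
#!/bin/sh
# Poll until the job reaches a terminal state, mirroring --wait:
# check status, wait for the interval, give up after max_attempts.
POLLS=0
fake_status() {              # stand-in for: xcrawl crawl status <job-id> --json
  POLLS=$((POLLS + 1))
  if [ "$POLLS" -ge 3 ]; then state="completed"; else state="crawling"; fi
}

attempt=0
max_attempts=10
state="pending"
while [ "$state" != "completed" ] && [ "$state" != "failed" ]; do
  attempt=$((attempt + 1))
  [ "$attempt" -gt "$max_attempts" ] && break
  fake_status
  # sleep 2   # --interval 2000 equivalent; skipped in this sketch
done
echo "$state after $attempt polls"
```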
config
Read and update local CLI config.
xcrawl config [command]
Available subcommands:
- `xcrawl config get <key>`
- `xcrawl config set <key> <value>`
- `xcrawl config keys`
Supported config keys:
| Key | Type | Notes |
|---|---|---|
| api-key | string | Saved API key |
| api-base-url | string | Default base URL for API commands |
| default-format | string | Allowed values: markdown, json, html, screenshot, text |
| output-dir | string | Default batch scrape directory |
| timeout-ms | integer | Must be greater than 0 |
| debug | boolean | true / false or 1 / 0 when using config set |
config get
xcrawl config get <key> [options]
Options:
| Option | Description |
|---|---|
| --json | Output machine-readable JSON |
config set
xcrawl config set <key> <value> [options]
Options:
| Option | Description |
|---|---|
| --json | Output machine-readable JSON |
config keys
xcrawl config keys [options]
Options:
| Option | Description |
|---|---|
| --json | Output machine-readable JSON |
init
Initialize XCrawl CLI authentication for this machine.
xcrawl init [options]
Examples:
xcrawl init -y --browser
xcrawl init --api-key <your_api_key>
xcrawl init
Options:
| Option | Description |
|---|---|
| -y, --yes | Skip interactive selection and require an explicit auth method |
| --browser | Authenticate with the XCrawl browser flow |
| --api-key <key> | Save the XCrawl API key directly |
| --json | Output machine-readable JSON |
Current behavior in 0.2.7:
- `init` is now a real authentication bootstrap command, not a placeholder
- Running `xcrawl init` in an interactive terminal uses the same browser-vs-manual auth prompt as `login`
- `-y` requires either `--browser` or `--api-key`
- `--browser` and `--api-key` are mutually exclusive
- Successful init saves `~/.xcrawl/config.json` and prints suggested next steps
Output Handling
- Default output is human-readable text
- `--json` returns machine-readable JSON
- `--output <path>` writes the result to a file and prints `Saved output: <path>`
- Batch scrape output filenames are derived from sanitized URLs
- Auth prompts are printed to stderr when a command also needs to keep stdout JSON clean
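The exact filename-sanitization scheme is an implementation detail of the CLI. Purely as an illustration of the idea (this mapping is hypothetical, not the CLI's actual rule), a sanitizer might strip the scheme and fold unsafe characters into underscores:

```shell
#!/bin/sh
# Hypothetical URL-to-filename sanitizer: strip the scheme and replace
# characters that are unsafe in filenames with underscores.
sanitize() {
  echo "$1" | sed -e 's|^https\{0,1\}://||' -e 's|[^A-Za-z0-9._-]|_|g'
}

sanitize "https://example.com/docs/page?id=1"   # -> example.com_docs_page_id_1
```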
Validation Rules
- URLs must use `http://` or `https://`
- Positive integer flags such as `--timeout`, `--limit`, `--concurrency`, `--interval`, `--wait-timeout`, and `--max-pages` must be greater than `0`
- Header strings must use the format `Key:Value,Key2:Value2`
