Skill+CLI
This page currently applies to `@xcrawl/cli` version 0.2.5.
XCrawl Skill+CLI lets you run scrape, search, map, crawl, account, and config workflows directly from the terminal.
It can also be used as an AI-callable tool layer in agent workflows, not only as a human-facing terminal command.
This page is based on the current local CLI source, package metadata, and command help output.
Install
Node.js requirement: `>=18`
Run with npx:

```shell
npx -y @xcrawl/cli@0.2.5 doctor
```

Install globally:

```shell
npm install -g @xcrawl/cli
xcrawl --help
```

Authenticate
Save API key locally
```shell
xcrawl login --api-key <your_api_key>
```

This stores your API key in `~/.xcrawl/config.json`.

Use environment variables

```shell
export XCRAWL_API_KEY=<your_api_key>
```

The CLI resolves runtime config in this order:
- CLI flags
- Environment variables
- Local config file `~/.xcrawl/config.json`
- Built-in defaults
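The precedence can be sketched as a plain-shell lookup; `resolve_api_key` and its first argument (standing in for a parsed `--api-key` flag) are illustrative names, and the `sed` line is a stand-in for real JSON parsing:

```shell
# Resolve the effective API key: flag > environment > config file > default.
# resolve_api_key and its argument are hypothetical; this is not the CLI's code.
resolve_api_key() {
  flag_api_key="$1"
  config_key=""
  if [ -f "$HOME/.xcrawl/config.json" ]; then
    # Crude extraction for illustration; the real CLI parses the JSON properly.
    config_key=$(sed -n 's/.*"api-key" *: *"\([^"]*\)".*/\1/p' "$HOME/.xcrawl/config.json")
  fi
  # First non-empty value wins; an empty result means no key is configured.
  printf '%s\n' "${flag_api_key:-${XCRAWL_API_KEY:-$config_key}}"
}
```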
Shortcuts
- `xcrawl https://example.com` is treated as `xcrawl scrape https://example.com`
- `xcrawl crawl https://example.com` is treated as `xcrawl crawl start https://example.com`
- `xcrawl -V` or `xcrawl --version` prints the current CLI version
- `xcrawl help <command>` shows command-specific help
Defaults
| Key | Default |
|---|---|
| API base URL | https://run.xcrawl.com |
| Default output format | markdown |
| Default batch output directory | .xcrawl |
| Request timeout | 30000 ms |
| Debug | false |
Environment Variables
| Variable | Description |
|---|---|
| `XCRAWL_API_KEY` | API key |
| `XCRAWL_API_BASE_URL` | Override API base URL |
| `XCRAWL_DEFAULT_FORMAT` | Default scrape format |
| `XCRAWL_OUTPUT_DIR` | Default batch output directory |
| `XCRAWL_TIMEOUT_MS` | Request timeout in milliseconds |
| `XCRAWL_DEBUG` | Debug mode; accepts values like 1, true, yes, on, 0, false, no, off |
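The truthy and falsy forms accepted for `XCRAWL_DEBUG` can be illustrated with a small normalizer (a sketch of the table above, not the CLI's actual parser; treating unrecognized values as false is an assumption):

```shell
# Normalize an XCRAWL_DEBUG-style value to "true" or "false".
parse_debug() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    1|true|yes|on) printf 'true' ;;
    *) printf 'false' ;;   # 0, false, no, off, empty, anything else
  esac
}
```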
Commands
login
Save the XCrawl API key to local config.
```shell
xcrawl login --api-key <your_api_key>
```

Options:
| Option | Description |
|---|---|
| `--api-key <key>` | XCrawl API key to save |
| `--json` | Output machine-readable JSON |
logout
Clear the locally saved API key.
```shell
xcrawl logout
```

Options:
| Option | Description |
|---|---|
| `--json` | Output machine-readable JSON |
status
Show account profile and credit package status.
```shell
xcrawl status
```

Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
Current behavior in 0.2.5:
- `status` always calls `https://api.xcrawl.com/web_v1/user/credit-user-info`
- Authentication is sent as the query parameter `app_key=<your_api_key>`
- `--api-base-url` is intentionally not available on this command
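Given that behavior, the request is roughly what the helper below assembles; fetching the resulting URL yourself with `curl` should be equivalent (the helper name is illustrative):

```shell
# Build the account status URL; the key travels as the app_key query
# parameter rather than in an Authorization header.
status_url() {
  printf 'https://api.xcrawl.com/web_v1/user/credit-user-info?app_key=%s' "$1"
}
# e.g. curl "$(status_url "$XCRAWL_API_KEY")"
```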
doctor
Run local diagnostics and connectivity checks.
```shell
xcrawl doctor
```

Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
Checks performed:
- Node.js version
- Read/write access to `~/.xcrawl/config.json`
- API connectivity when an API key is available
Current behavior in 0.2.5:
- When the base URL is `https://run.xcrawl.com`, a `404` from the account status endpoint is treated as a successful connectivity check for the public API
scrape
Scrape one or more URLs.
```shell
xcrawl scrape [url...] [options]
```

Arguments:
| Argument | Description |
|---|---|
| `[url...]` | One or more http or https URLs |
Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file; for multiple URLs this is treated as a directory |
| `--format <format>` | Output format shown in help: markdown, json, html, screenshot |
| `--wait-for <selector>` | Wait-for value accepted by the CLI |
| `--headers <k:v,k2:v2>` | Additional request headers |
| `--cookies <cookies>` | Cookie string such as `a=1; b=2` |
| `--proxy <proxy>` | Proxy value |
| `--input <path>` | Read URLs from a newline-delimited file |
| `--concurrency <n>` | Concurrent scrape workers; default is 3 in batch mode |
Examples:

```shell
xcrawl scrape https://example.com --format markdown
xcrawl https://example.com
xcrawl scrape --input ./urls.txt --concurrency 3 --json
xcrawl scrape https://example.com --headers "Accept-Language:en-US,X-Test:1"
```

Batch behavior:
- At least one URL is required, either from positional arguments or `--input`
- Input files must be newline-delimited; empty lines and lines starting with `#` are ignored
- Multiple URLs with `--json` and no `--output` print a JSON array to stdout
- Multiple URLs without `--json` write one file per URL
- When scraping multiple URLs without `--output`, files are written to `.xcrawl/`
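The input-file rules can be approximated with standard tools (a sketch, assuming whitespace-only lines also count as empty):

```shell
# Emit usable URLs from a newline-delimited input file, skipping blank
# and whitespace-only lines as well as '#' comment lines.
read_urls() {
  grep -v -e '^[[:space:]]*$' -e '^#' "$1"
}
```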
Current behavior in 0.2.5:
- `text` is also accepted by the implementation as a scrape format, even though the help text does not list it
- `text` falls back to text-style rendering of markdown-like content
- `--wait-for` is accepted by the command layer but is not currently forwarded to the XCrawl scrape request
- `--proxy` is documented as a proxy value, but the current implementation forwards it as `proxy.location`; region-style values such as `US` align better with the current request shape than a literal proxy URL
search
Run a web search.
```shell
xcrawl search <query...> [options]
```

Arguments:
| Argument | Description |
|---|---|
| `<query...>` | Search query; all positional parts are joined with spaces |
Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
| `--limit <n>` | Result limit; default is 10 |
| `--country <country>` | Country code, for example US |
| `--language <language>` | Language code, for example en |
Examples:
```shell
xcrawl search "xcrawl cli" --limit 10
xcrawl search "site:docs.xcrawl.com CLI" --country US --language en
```

Current behavior in 0.2.5:

- `--country` is forwarded to the XCrawl search API as `location`
map
Generate site map links for a URL.
```shell
xcrawl map <url> [options]
```

Arguments:
| Argument | Description |
|---|---|
| `<url>` | Target http or https URL |
Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
| `--max-depth <n>` | Maximum traversal depth |
| `--limit <n>` | Maximum number of links |
Example:
```shell
xcrawl map https://example.com --limit 100
```

Current behavior in 0.2.5:

- The command accepts `--max-depth`, but the current request payload only forwards `url` and `limit`
crawl
Manage crawl jobs.
Root command:
```shell
xcrawl crawl [command]
```

Available subcommands:

- `xcrawl crawl start <url>`
- `xcrawl crawl status <job-id>`
crawl start
Start a crawl job for a target URL.
```shell
xcrawl crawl start <url> [options]
xcrawl crawl <url> [options]
```

Arguments:
| Argument | Description |
|---|---|
| `<url>` | Target http or https URL |
Options:
| Option | Description |
|---|---|
| `--wait` | Poll until the job reaches completed or failed |
| `--interval <ms>` | Polling interval; default is 2000 |
| `--wait-timeout <ms>` | Polling timeout; default is 60000 |
| `--max-pages <n>` | Maximum pages to crawl |
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
Example:
```shell
xcrawl crawl https://example.com --wait --interval 2000 --wait-timeout 60000
```

crawl status
Fetch crawl job status by job ID.
```shell
xcrawl crawl status <job-id> [options]
```

Arguments:
| Argument | Description |
|---|---|
| `<job-id>` | Crawl job ID |
Options:
| Option | Description |
|---|---|
| `--api-key <key>` | Override API key |
| `--api-base-url <url>` | Override API base URL |
| `--timeout <ms>` | Request timeout in milliseconds |
| `--debug` | Enable debug output |
| `--json` | Output machine-readable JSON |
| `--output <path>` | Save output to a file |
Current behavior in 0.2.5:
- Crawl statuses `queued` and `running` are normalized to `pending` and `crawling`
- `completedPages` is derived from the returned page array length
- `failedPages` is currently always reported as `0`
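The `--wait` flow in `crawl start` is effectively a poll loop over this status; a generic sketch (with a pluggable status command, and seconds rather than the CLI's milliseconds):

```shell
# Poll a status command every <interval> seconds until it prints a terminal
# state, giving up after <timeout> seconds; mirrors --wait / --interval /
# --wait-timeout in spirit, not in implementation.
poll_until_done() {
  cmd="$1"; interval="$2"; timeout="$3"; elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    status=$($cmd)   # e.g. a wrapper around: xcrawl crawl status <job-id>
    case "$status" in
      completed|failed) printf '%s' "$status"; return 0 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1   # timed out before reaching a terminal state
}
```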
config
Read and update local CLI config.
```shell
xcrawl config [command]
```

Available subcommands:

- `xcrawl config get <key>`
- `xcrawl config set <key> <value>`
- `xcrawl config keys`
Supported config keys:
| Key | Type | Notes |
|---|---|---|
| `api-key` | string | Saved API key |
| `api-base-url` | string | Default base URL for API commands |
| `default-format` | string | Allowed values: markdown, json, html, screenshot, text |
| `output-dir` | string | Default batch scrape directory |
| `timeout-ms` | integer | Must be greater than 0 |
| `debug` | boolean | `true` / `false` or `1` / `0` when using `config set` |
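Assuming the keys are stored under the same names, a fully populated `~/.xcrawl/config.json` might look like this (a sketch built from the defaults above; the CLI's exact serialization may differ):

```json
{
  "api-key": "<your_api_key>",
  "api-base-url": "https://run.xcrawl.com",
  "default-format": "markdown",
  "output-dir": ".xcrawl",
  "timeout-ms": 30000,
  "debug": false
}
```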
config get
```shell
xcrawl config get <key> [options]
```

Options:
| Option | Description |
|---|---|
| `--json` | Output machine-readable JSON |
config set
```shell
xcrawl config set <key> <value> [options]
```

Options:
| Option | Description |
|---|---|
| `--json` | Output machine-readable JSON |
config keys
```shell
xcrawl config keys [options]
```

Options:
| Option | Description |
|---|---|
| `--json` | Output machine-readable JSON |
init
Project initialization placeholder.
```shell
xcrawl init
```

Current behavior in 0.2.5:

- The command exists, but only prints that `init` will be implemented in a later phase
Output Handling
- Default output is human-readable text
- `--json` returns machine-readable JSON
- `--output <path>` writes the result to a file and prints `Saved output: <path>`
- Batch scrape output filenames are derived from sanitized URLs
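One plausible filename derivation (an assumption; the CLI's exact sanitization scheme is not documented here) strips the scheme and replaces unsafe characters:

```shell
# Turn a URL into a filesystem-safe base name: drop the scheme, then
# replace anything outside [A-Za-z0-9._-] with an underscore.
sanitize_url() {
  printf '%s' "$1" | sed -e 's|^[a-z][a-z]*://||' -e 's|[^A-Za-z0-9._-]|_|g'
}
```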
Validation Rules
- URLs must use `http://` or `https://`
- Positive integer flags such as `--timeout`, `--limit`, `--concurrency`, `--interval`, `--wait-timeout`, and `--max-pages` must be greater than `0`
- Header strings must use the format `Key:Value,Key2:Value2`
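The rules above can be sketched as small predicate functions (approximations for illustration, not the CLI's code):

```shell
# True when the argument starts with http:// or https:// and has a host part.
is_http_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# True for strings of digits whose value is strictly greater than zero.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) [ "$1" -gt 0 ] ;;
  esac
}
```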
