Crawl Result API
Overview
GET /v1/crawl/{crawl_id} returns the status and results of a batch crawl task.
Request
Endpoint
GET https://run.xcrawl.com/v1/crawl/{crawl_id}
Headers
Authorization: Bearer <api_key>
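As a rough illustration of the endpoint and header above, the request could be assembled in Python with only the standard library. The helper name build_crawl_result_request and the placeholder key "demo-key" are assumptions made for this sketch, not part of the API; the code only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://run.xcrawl.com/v1/crawl/"  # endpoint documented on this page


def build_crawl_result_request(crawl_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one crawl task.

    Hypothetical helper for illustration only.
    """
    # Percent-encode the ID so it is always a single, safe path segment.
    url = BASE_URL + urllib.parse.quote(crawl_id, safe="")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# "demo-key" is a placeholder, not a real credential.
req = build_crawl_result_request("01KKE8BNNVQH9PCYEEKJGXKE07", "demo-key")
print(req.get_method(), req.full_url)
```

To actually send it, the request object can be passed to urllib.request.urlopen and the body decoded with json.load.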
Response
| Field | Type | Description |
|---|---|---|
| crawl_id | string | Task ID |
| endpoint | string | Fixed value: crawl |
| version | string | Version identifier |
| status | string | pending / crawling / completed / failed |
| completed | integer | Number of pages completed so far |
| total | integer | Total number of pages scheduled so far |
| url | string | Entry URL of this task |
| data | object[] | Array of crawl results |
| started_at | string | Task start time (ISO 8601) |
| ended_at | string | Task end time (ISO 8601) |
| total_credits_used | integer | Total credits consumed |
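Because status moves through several values and completed/total report progress, a client will typically poll this endpoint. The following is a minimal sketch of such a loop. It assumes that completed and failed are the terminal states (inferred from their names, not stated on this page), and the function name, polling interval, and poll budget are all hypothetical. The fetch argument stands in for whatever code performs the HTTP call, and the canned responses are made-up data so the sketch runs without network access.

```python
import time
from typing import Callable

# Assumption: these two statuses are final; pending and crawling are not.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_crawl(fetch: Callable[[], dict], interval: float = 2.0, max_polls: int = 30) -> dict:
    """Call fetch() until the task reaches a terminal status or the poll budget runs out."""
    for attempt in range(max_polls):
        result = fetch()
        if result["status"] in TERMINAL_STATUSES:
            return result
        # total counts pages scheduled so far, so it may still grow between polls.
        print(f"{result['status']}: {result['completed']}/{result['total']} pages")
        if attempt < max_polls - 1:
            time.sleep(interval)
    raise TimeoutError("crawl did not finish within the polling budget")


# Canned responses (illustrative only) in place of real HTTP calls.
responses = iter([
    {"status": "pending", "completed": 0, "total": 0},
    {"status": "crawling", "completed": 1, "total": 3},
    {"status": "completed", "completed": 3, "total": 3},
])
final = wait_for_crawl(lambda: next(responses), interval=0)
print(final["status"])
```

Injecting fetch as a callable keeps the loop independent of any particular HTTP client and makes it easy to test.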
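data is an array in which each element represents one crawled page. Its fields are listed below (content is returned according to output.formats):
- html: body HTML with <head>, <script>, and similar elements removed
- raw_html: original HTML
- markdown: page content converted to Markdown
- links: list of links parsed from the page
- metadata: page metadata (includes title, statusCode, contentType, proxy_location, proxy_sticky_session, sourceURL, and others; extended dynamically based on the response headers and the HTML <head> content)
- screenshot: download URL of the screenshot
- summary: AI-generated summary of the page
- json: AI structured extraction result (JSON)
- traffic_bytes: traffic consumed by this fetch
- credits_used: credits consumed by this fetch
- credits_detail: breakdown of the credits consumed by this fetch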
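Since the content fields present on each page depend on output.formats, client code is safer reading them defensively. Here is a small sketch of that idea; summarize_page and CONTENT_FIELDS are names invented for this example, and the sample page is a trimmed copy of the response example on this page.

```python
# Content fields that appear only when requested via output.formats.
CONTENT_FIELDS = ("html", "raw_html", "markdown", "links", "screenshot", "summary", "json")


def summarize_page(page: dict) -> dict:
    """Collect a few commonly used values from one element of data.

    Hypothetical helper: every lookup uses .get() because a field may be absent.
    """
    metadata = page.get("metadata") or {}
    return {
        "source_url": metadata.get("sourceURL"),
        "status_code": metadata.get("statusCode"),
        "title": metadata.get("title"),
        "formats": sorted(name for name in CONTENT_FIELDS if name in page),
        "credits_used": page.get("credits_used", 0),
    }


# Trimmed from the response example on this page.
page = {
    "markdown": "# Proxy Setup\n...",
    "metadata": {
        "sourceURL": "https://docs.xcrawl.com/doc/developer-guides/proxies",
        "statusCode": 200,
        "title": "Proxy Setup",
    },
    "traffic_bytes": 1129,
    "credits_used": 1,
}
summary = summarize_page(page)
print(summary["formats"], summary["title"])
```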
Examples
Request example
curl -s -X GET 'https://run.xcrawl.com/v1/crawl/01KKE8BNNVQH9PCYEEKJGXKE07' \
  -H "Authorization: Bearer $XCRAWL_API_KEY"
Response example
{
  "crawl_id": "01KKE8BNNVQH9PCYEEKJGXKE07",
  "endpoint": "crawl",
  "version": "dca0d4b3bff035e4",
  "status": "completed",
  "completed": 1,
  "total": 1,
  "url": "https://docs.xcrawl.com/doc/",
  "data": [
    {
      "markdown": "[ Skip to content ](https://docs.xcrawl.com/doc/developer-guides/proxies/#VPContent)\n# Proxy Setup\nXCrawl supports proxy configuration to choose an exit region or reuse sticky sessions.\n...",
      "metadata": {
        "contentType": "text/html",
        "description": "Proxy Setup XCrawl supports proxy configuration to choose an exit region or reuse sticky sessions. ...",
        "favicon": "https://www.xcrawl.com/favicon.ico",
        "keywords": null,
        "og:title": "Proxy Setup",
        "og:type": "article",
        "og:url": "https://docs.xcrawl.com/doc/developer-guides/proxies/",
        "proxy_location": "US",
        "proxy_sticky_session": "sticky-e7ad98da",
        "sourceURL": "https://docs.xcrawl.com/doc/developer-guides/proxies",
        "statusCode": 200,
        "timezone": "Wed, 11 Mar 2026 10:51:30 GMT",
        "title": "Proxy Setup",
        "url": "https://docs.xcrawl.com/doc/developer-guides/proxies/"
      },
      "traffic_bytes": 1129,
      "credits_used": 1,
      "credits_detail": {
        "base_cost": 1,
        "traffic_cost": 0,
        "json_extract_cost": 0
      }
    }
  ],
  "started_at": "2026-03-11T10:51:24Z",
  "ended_at": "2026-03-11T10:51:37Z",
  "total_credits_used": 1
}
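The timestamps and credit counters in a response like this one can be post-processed with the standard library. The sketch below computes the task duration and sums per-page credits, using values copied from the example. parse_utc is a hypothetical helper, and whether the per-page credits_used values always add up to total_credits_used is not stated on this page; they happen to match in this example.

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-03-11T10:51:24Z.

    datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    so it is rewritten to an explicit UTC offset first.
    """
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


# Values copied from the response example; unrelated fields are omitted.
result = {
    "started_at": "2026-03-11T10:51:24Z",
    "ended_at": "2026-03-11T10:51:37Z",
    "total_credits_used": 1,
    "data": [{"credits_used": 1}],
}

elapsed = (parse_utc(result["ended_at"]) - parse_utc(result["started_at"])).total_seconds()
per_page_credits = sum(page.get("credits_used", 0) for page in result["data"])

print(elapsed)           # 13.0
print(per_page_credits)  # 1
```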