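Automates visits to competitor sites, data sources, and procurement sources, crawling into their pages to collect content and take screenshots. It then tags and categorizes each site, summarizes competitor features, assigns buyer persona tags, and writes the results to the knowledge base.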
Files and directories:
- tools/auto_research/research.py
- automation/research_targets.json
- automation/research_status.json
- automation/research_pending_registrations.json
- automation/research_logs/
- automation/research_runs/
- /root/ca_v3/cps/input/auto_research.txt

Environment variables (.env):
- AUTO_RESEARCH_TARGETS_PATH
- AUTO_RESEARCH_STATE_PATH
- AUTO_RESEARCH_STATUS_PATH
- AUTO_RESEARCH_LOG_DIR
- AUTO_RESEARCH_RUN_DIR
- AUTO_RESEARCH_OUTPUT_PATH
- AUTO_RESEARCH_INDEX_CMD
- AUTO_RESEARCH_MAX_PAGES
- AUTO_RESEARCH_MAX_DEPTH
- AUTO_RESEARCH_MAX_LINKS_PER_PAGE
- AUTO_RESEARCH_SCREENSHOT
- AUTO_RESEARCH_CAPTURE_HTML
- AUTO_RESEARCH_SUMMARIZE
- AUTO_RESEARCH_HEADLESS
- AUTO_RESEARCH_SITE_TAGS (optional, separated by |)
- AUTO_RESEARCH_PERSONA_TAGS (optional, separated by |)

Target configuration (automation/research_targets.json), example structure:
{
  "defaults": {
    "type": "competitor",
    "crawl": {
      "max_pages": 8,
      "max_depth": 2,
      "seed_paths": ["/", "/product", "/pricing"]
    }
  },
  "sites": [
    {
      "id": "example-competitor",
      "name": "Example Competitor",
      "type": "competitor",
      "base_url": "https://example.com",
      "tags": ["竞品"],
      "categories": ["招投标平台"],
      "login": {
        "enabled": false,
        "credentials": { "username": "", "password": "" },
        "steps": []
      },
      "register": {
        "enabled": false,
        "steps": []
      }
    }
  ]
}
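
The structure above could be loaded along the following lines. This is a minimal sketch, not the logic in research.py: the helper names deep_merge and load_sites are hypothetical, and the rule that each site inherits from "defaults" with its own values taking precedence is an assumption, as is falling back to the default path when AUTO_RESEARCH_TARGETS_PATH is unset.

```python
import copy
import json
import os
from pathlib import Path


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict; values in override win, nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_sites(path=None) -> list:
    """Read the targets file and apply "defaults" to every entry in "sites"."""
    # Assumption: the environment variable overrides the default location.
    path = path or os.environ.get(
        "AUTO_RESEARCH_TARGETS_PATH", "automation/research_targets.json"
    )
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    defaults = config.get("defaults", {})
    return [deep_merge(defaults, site) for site in config.get("sites", [])]
```

With this merge rule, a site that sets only "crawl": {"max_pages": 3} would still pick up max_depth and seed_paths from the defaults.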
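login.steps and register.steps support the following actions:
- goto: navigate to a page
- fill: fill in a field
- click: click a button
- wait_for_selector: wait for an element
- wait_for_url: wait for a redirect
- press: send a key press
- sleep: pause, in milliseconds
Available variables: ${USERNAME} ${PASSWORD} ${TIMESTAMP} ${base_url}.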
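
One way the actions and variables could map onto Playwright's sync Page API is sketched below. The step field names ("action", "selector", "value", "url", "key", "ms") are hypothetical, since the example config only shows empty steps lists, and the TIMESTAMP format (Unix seconds) is also an assumption. The functions render, build_variables, and run_step are illustrative names, not part of research.py.

```python
import time
from string import Template


def render(value, variables):
    """Expand ${NAME} placeholders in strings; unknown placeholders stay as-is."""
    if isinstance(value, str):
        return Template(value).safe_substitute(variables)
    return value


def build_variables(site):
    """Collect the four documented variables from one site entry."""
    credentials = site.get("login", {}).get("credentials", {})
    return {
        "USERNAME": credentials.get("username", ""),
        "PASSWORD": credentials.get("password", ""),
        "TIMESTAMP": str(int(time.time())),  # assumed format: Unix seconds
        "base_url": site.get("base_url", ""),
    }


def run_step(page, step, variables):
    """Run one step against a Playwright sync Page (field names are assumed)."""
    s = {key: render(value, variables) for key, value in step.items()}
    action = s["action"]
    if action == "goto":
        page.goto(s["url"])
    elif action == "fill":
        page.fill(s["selector"], s["value"])
    elif action == "click":
        page.click(s["selector"])
    elif action == "wait_for_selector":
        page.wait_for_selector(s["selector"])
    elif action == "wait_for_url":
        page.wait_for_url(s["url"])
    elif action == "press":
        page.press(s["selector"], s["key"])
    elif action == "sleep":
        page.wait_for_timeout(int(s["ms"]))  # milliseconds
    else:
        raise ValueError(f"unsupported action: {action}")
```

Under these assumptions, a step such as {"action": "goto", "url": "${base_url}/login"} would resolve to https://example.com/login for the example site.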
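Each run generates the following under automation/research_runs/run-YYYYmmdd-HHMMSS/:
- report.json (page list, tags, summaries)

Install Playwright first:
python3 -m pip install playwright
python3 -m playwright install chromium

Run the script for all configured sites, or pass --site to select one site by its id:
python3 tools/auto_research/research.py
python3 tools/auto_research/research.py --site example-competitor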
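
After a run, the newest report can be located with a small helper like the one below. This is a sketch under stated assumptions: latest_report is a hypothetical name, the default run root is the documented automation/research_runs/ (AUTO_RESEARCH_RUN_DIR may point elsewhere), and nothing is assumed about the keys inside report.json.

```python
import json
from pathlib import Path


def latest_report(run_root="automation/research_runs"):
    """Return (run_dir, parsed report.json) for the newest run, or None."""
    # run-YYYYmmdd-HHMMSS names sort chronologically as plain strings,
    # so a reverse lexicographic sort puts the newest run first.
    for run_dir in sorted(Path(run_root).glob("run-*"), reverse=True):
        report = run_dir / "report.json"
        if report.is_file():
            return run_dir, json.loads(report.read_text(encoding="utf-8"))
    return None
```

Runs that did not produce a report.json are skipped, so the helper falls back to the most recent run that did.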
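If an account is missing or registration fails, the site is recorded in
automation/research_pending_registrations.json; once the account details are filled in, automated research can resume.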
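
The text does not say where the missing account details should be entered. Assuming they go into login.credentials of the matching site in the targets config, a helper along these lines could fill them in; set_credentials is a hypothetical name, and the format of the pending registrations file is deliberately not touched here because it is not documented.

```python
import copy


def set_credentials(config, site_id, username, password):
    """Return a copy of the targets config with credentials set for one site."""
    updated = copy.deepcopy(config)  # leave the caller's config unchanged
    for site in updated.get("sites", []):
        if site.get("id") == site_id:
            login = site.setdefault("login", {})
            login["credentials"] = {"username": username, "password": password}
            return updated
    raise KeyError(f"no site with id {site_id!r}")
```

Whether login.enabled must also be switched to true for the credentials to be used is not stated, so the sketch leaves that flag alone.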