Lets users subscribe to the latest arXiv papers in one or more research fields. Papers are ranked by importance and pushed as bilingual cards (English title / Chinese title / English abstract / Chinese abstract / arXiv link). Supports a per-field recommendation limit (5-20), keyword highlighting, NEW/UPDATED version tags, Markdown archiving, and both scheduled and instant push paths. On first use, complete the subscription setup before anything else...
Environment bootstrap:

```bash
python scripts/bootstrap_env.py --run-doctor
python scripts/sync_arxiv_taxonomy.py --output data/arxiv_taxonomy.json
```

Push-time semantics: `push_time` must be interpreted as local time in the user's `timezone`, never as UTC:

- 12:00 + Asia/Shanghai -> `0 12 * * *` (Asia/Shanghai)
- 08:30 + Asia/Shanghai -> `30 8 * * *` (Asia/Shanghai)

The scheduled job runs in `/home/USER_HOME/.openclaw/workspace/agent-daily-paper`:

```bash
export PATH="/home/USER_HOME/miniconda3/bin:/home/USER_HOME/.nvm/versions/node/NODE_VERSION/bin:/usr/local/bin:/home/USER_HOME/.local/bin:/home/USER_HOME/.bun/bin:/usr/bin:/bin:/home/USER_HOME/.nvm/current/bin:/home/USER_HOME/.npm-global/bin:/home/USER_HOME/bin:/home/USER_HOME/.volta/bin:/home/USER_HOME/.asdf/shims:/home/USER_HOME/.fnm/current/bin:/home/USER_HOME/.local/share/pnpm" && conda run -n arxiv-digest-lab python scripts/run_digest.py --only-due-now --due-window-minutes 15 --emit-markdown
```

`USER_HOME` and `NODE_VERSION` must be replaced with the real paths on the current machine.

Delivery settings:

- `delivery.mode`: announce
- `delivery.channel`: feishu
- `delivery.to`: user:FEISHU_USER_ID
- cron: `0 12 * * *`
- timezone: Asia/Shanghai

Skip reasons:

- `reason=already_pushed_today` -> this field has already been pushed today
- the field has no new papers for the day

While `setup_required=true` in `config/subscriptions.json`, first collect the configuration from the user and write the subscription; pushing directly with the sample config is forbidden.

Daily archives are written to `output/daily/*.md`.

Subscription fields:

- `field_settings[].name`: research field name(s) (one or more)
- `field_settings[].limit`: recommendations per field (5-20)
- `push_time`: daily push time (HH:MM, local time)
- `timezone`: time zone (default Asia/Shanghai)
- `keywords` / `exclude_keywords`
- `time_window_hours`
- `query_strategy` (recommended: `category_keyword_union`)
- `require_primary_category` (recommended: `true`)
- `history_scope` (recommended: `subscription`, to avoid mistaken cross-subscription deduplication)
- `category_expand_mode` (`off`/`conservative`/`balanced`/`broad`)
- `agent-categories-only` (use only Agent-supplied categories; error out if categories are missing)
- `taxonomy-json` (default `data/arxiv_taxonomy.json`; used to validate and complete categories)
- `embedding_filter.model` / `embedding_filter.threshold` / `embedding_filter.top_k`
- `agent_rerank.model` (default `BAAI/bge-reranker-v2-m3`) / `agent_rerank.top_k`
- `highlight.title_keywords` / `highlight.authors` / `highlight.venues`
- `TRANSLATE_PROVIDER`: `openai` / `argos` / `auto` / `none`

Priority:
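The push_time-to-cron mapping can be sketched as below; `push_time_to_cron` is a hypothetical helper for illustration, not a function in this repo:

```python
def push_time_to_cron(push_time: str) -> str:
    """Map a local "HH:MM" push time to crontab minute/hour fields.

    The timezone is not encoded in the cron expression itself; it must be
    carried alongside the entry (e.g. `0 12 * * *` plus `Asia/Shanghai`).
    """
    hour, minute = push_time.split(":")
    return f"{int(minute)} {int(hour)} * * *"
```

For example, `push_time_to_cron("08:30")` returns `30 8 * * *`.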
1. `config/agent_field_profiles.json` (default path; used first whenever it exists)
2. `data/arxiv_taxonomy.json`

Supported field-profile JSON structure:
- `canonical_en`
- `categories`
- `keywords`
- `title_keywords`
- `venues`

These profiles feed the retrieval pipeline: candidate query (`query_strategy=category_keyword_union`), primary-category filtering (`require_primary_category=true`), semantic filtering (`embedding_filter`), and reranking (`agent_rerank`).
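A minimal field-profile entry under this structure might look like the following sketch; every value is an invented example, and only the key names come from the structure above:

```python
import json

# All values are illustrative; only the key names follow the documented schema.
profile = {
    "数据库优化器": {
        "canonical_en": "Database Query Optimization",
        "categories": ["cs.DB", "cs.DC"],
        "keywords": ["query optimizer", "cardinality estimation"],
        "title_keywords": ["optimizer"],
        "venues": ["SIGMOD", "VLDB"],
    }
}
print(json.dumps(profile, ensure_ascii=False, indent=2))
```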
Each paper's card includes a version tag (`NEW` / `UPDATED (vX->vY)`) plus highlight labels.

Archive naming rule: `<Field1>_<Field2>_<YYYY-MM-DD>.md`
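The naming rule can be sketched as a small helper; `archive_filename` is hypothetical, not a function in this repo:

```python
from datetime import date

def archive_filename(fields: list[str], day: date) -> str:
    """Join field names with underscores and append the ISO date,
    per the rule <Field1>_<Field2>_<YYYY-MM-DD>.md."""
    return "_".join(fields) + f"_{day.isoformat()}.md"
```

For example, two fields on 2025-01-31 yield `数据库优化器_推荐系统_2025-01-31.md`.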
Field Profiles: for each field, provide:

- Canonical EN (English field name)
- Keywords (search keywords)
- Venues/Journals (related conferences or journals)
- `primary_categories`: the primary categories actually used for retrieval and filtering
- `categories`: extended reference categories (display only)

Common commands:

```bash
python scripts/doctor.py
python scripts/run_digest.py --emit-markdown
python scripts/run_digest.py --config config/subscriptions.json --emit-markdown
```

Scheduled runs:

- `0 12 * * *` (Asia/Shanghai) -> `cd <repo> && conda run -n arxiv-digest-lab python scripts/run_digest.py --config config/subscriptions.json --emit-markdown`
- `0 12 * * *` (Asia/Shanghai) + `export PATH="..." && conda run -n arxiv-digest-lab python scripts/run_digest.py --only-due-now --due-window-minutes 15 --emit-markdown`
- `python scripts/run_digest.py --config config/subscriptions.json --only-due-now --due-window-minutes 15 --emit-markdown`

Instant push:

```bash
python scripts/instant_digest.py --fields "数据库优化器,推荐系统" --limit 20 --time-window-hours 72
```

Recommended Conda environment: `arxiv-digest-lab`
```bash
# create and activate the environment
conda create -n arxiv-digest-lab python=3.10 -y
conda activate arxiv-digest-lab
# offline translation support (Argos)
pip install argostranslate
python scripts/install_argos_model.py
# embedding-filter support
pip install sentence-transformers
python scripts/install_embedding_model.py --model BAAI/bge-m3
```
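One way the `--only-due-now --due-window-minutes 15` gate could work is sketched below, under the assumption that it runs only when the current time, in the subscription's timezone, falls within the window after `push_time`; the function name and exact semantics are illustrative, not taken from `scripts/run_digest.py`:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def is_due_now(push_time: str, tz: str, window_minutes: int, now: datetime) -> bool:
    """True if `now` falls in [push_time, push_time + window), local to `tz`.

    Sketch only: ignores edge cases such as windows crossing midnight.
    """
    local_now = now.astimezone(ZoneInfo(tz))
    hour, minute = map(int, push_time.split(":"))
    due = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return due <= local_now < due + timedelta(minutes=window_minutes)
```

With a 15-minute window, a 12:05 Asia/Shanghai invocation of a 12:00 subscription is due; a 12:20 invocation is not.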
If translation fails, the Chinese fields fall back to the `[待翻译]` placeholder and the main flow is not interrupted.
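That degradation path can be sketched as follows; the helper name and the injected `translate` callable are illustrative stand-ins for whichever provider `TRANSLATE_PROVIDER` selects:

```python
def translate_or_placeholder(text: str, translate) -> str:
    """Return the translation, or the [待翻译] placeholder when the
    provider raises, so a translation error never aborts the digest run."""
    try:
        return translate(text)
    except Exception:
        return "[待翻译]"
```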