智能助手网
Tag aggregation: env


hnrss.org · 2026-04-15 08:42:30+08:00 · tech

I want to share a new dataset of 331 reward-hackable environments. These are real environments used in Terminal Bench and adjacent benchmarks. I first got interested in this because, as a reviewer of Terminal Bench, I noticed that a lot of our tasks were hackable. I also noticed that many people contribute to the benchmark because it lends them credibility when selling environments to labs. Hence, TBench tasks are, in my opinion, held to a higher quality standard than the tasks being used for RL today. No one is spending hours manually reviewing the $1B in tasks being purchased by major labs. As far as I understand, while everyone knows environments are hackable, nobody has released hundreds of "realistic" hackable environments.
Comments URL: https://news.ycombinator.com/item?id=47773298 · Points: 6 · Comments: 1

linux.do · 2026-04-14 11:08:28+08:00 · tech

As the title says: in the config file, the reasoning level is given in parentheses after the model name:
"env": {
  "ANTHROPIC_AUTH_TOKEN": "sk-iVBCgVi6KlDV3aQeX",
  "ANTHROPIC_BASE_URL": "http://localhost:8085",
  "ANTHROPIC_REASONING_MODEL": "gpt-5.4(high)",
  "ANTHROPIC_DEFAULT_OPUS_MODEL": "gpt-5.3-codex(high)",
  "ANTHROPIC_DEFAULT_SONNET_MODEL": "gpt-5.3-codex(medium)",
  "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-5.4(auto)",
  "ENABLE_TOOL_SEARCH": "true"
},
But Claude can also adjust the effort level on its own. Won't the two settings conflict? Does one override the other, or does something else happen?
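To make the question concrete, here is a minimal sketch of how a value such as "gpt-5.4(high)" could be split into a model name and a reasoning level. It only illustrates the naming convention in the config above. The function name `split_model_and_level` and the regular expression are hypothetical, and the post does not say how the proxy at localhost:8085 or Claude itself actually parses these values, or which setting takes precedence.

```python
import re

# Hypothetical pattern for values shaped like "<model>(<level>)".
# This is an assumption for illustration, not the proxy's real parser.
_MODEL_WITH_LEVEL = re.compile(r"^(?P<model>[^()]+)\((?P<level>[^()]+)\)$")


def split_model_and_level(value):
    """Return (model, level); level is None when no "(level)" suffix is present."""
    text = value.strip()
    match = _MODEL_WITH_LEVEL.match(text)
    if match is None:
        return text, None
    return match.group("model"), match.group("level")


# Model entries copied from the config in the post.
env = {
    "ANTHROPIC_REASONING_MODEL": "gpt-5.4(high)",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "gpt-5.3-codex(high)",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "gpt-5.3-codex(medium)",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-5.4(auto)",
}

for key, value in env.items():
    model, level = split_model_and_level(value)
    print(f"{key}: model={model}, level={level}")
```

Under this reading, the level in parentheses is just part of the model string sent to the proxy, so whether it overrides a client-side effort setting depends entirely on how the proxy treats it.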

www.solidot.org · 2026-04-13 16:38:12+08:00 · tech

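According to a study published in the journal Environmental Research Letters, summers in the mid-latitudes between the tropics and the polar regions lengthened by an average of about 6 days per decade between 1990 and 2023. The change in cities is even more striking: summer in Sydney, Australia now lasts 130 days, up from just 80 days in 1990, equivalent to an increase of about 15 days per decade. Summer in Toronto, Canada lengthened by 8 days per decade. The study also found that seasonal transitions have become more abrupt: rather than warming gradually into summer, spring now switches suddenly to summer heat. Such abrupt shifts could disrupt systems that depend on seasonal change. For example, flowers may bloom before pollinating insects become active, crops may need to be sown earlier, and rapid spring warming may melt snowpack faster, increasing flood risk.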
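The Sydney figure can be checked with a quick calculation. This is a rough sketch that assumes "now" means 2023, the end of the study period, and that the change was linear. The helper name `days_per_decade` is made up for illustration.

```python
def days_per_decade(start_days, end_days, start_year, end_year):
    """Average change in summer length per decade, assuming a linear trend."""
    years = end_year - start_year
    return (end_days - start_days) / years * 10


# Sydney: 80 days in 1990 and 130 days now (assumed here to mean 2023).
sydney = days_per_decade(80, 130, 1990, 2023)
print(round(sydney, 1))  # 15.2
```

The result, roughly 15 days per decade, matches the figure quoted in the summary.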

hnrss.org · 2026-04-12 22:08:46+08:00 · tech

I built a tool to run multiple AI coding agents in parallel across projects. NeZha is a character from Chinese mythology, famous for having three heads and six arms, which feels like the perfect metaphor for running AI coding agents across multiple projects at once. Claude Code + Codex + Git + Editor, all in one place. Managing sessions in the terminal was getting painful, especially across projects.
Comments URL: https://news.ycombinator.com/item?id=47739860 · Points: 2 · Comments: 0