智能助手网 (AI Assistant Network)
Tag aggregation: Browser

/tag/Browser

hnrss.org · 2026-04-18 14:49:26+08:00 · tech

Hi HN, I'm the builder. I realized that using cloud AI APIs for sensitive workflows—like transcribing board meetings, OCRing employment contracts, or cleaning up ID photos—is a massive privacy liability. So I built a client-side workspace using transformers.js, Whisper, and WebGPU. Everything runs locally. You can turn on Airplane Mode after the initial model load, and it still transcribes and extracts text perfectly. To keep myself honest, I wrote a technical audit of how the data flows (or rather, doesn't flow). My only backend is a tiny 2-core node in Singapore running self-hosted Plausible analytics: [ https://gist.github.com/ygx2378/3275b333504c6a9def50ef531b54... ] I'm still learning the ropes of browser-based memory management, so I'd love your feedback on how the models load on your specific GPUs! Comments URL: https://news.ycombinator.com/item?id=47813703 Points: 1 # Comments: 0

hnrss.org · 2026-04-18 05:03:18+08:00 · tech

We built AI Subroutines in rtrvr.ai. Record a browser task once, save it as a callable tool, and replay it with zero token cost, zero LLM inference delay, and zero mistakes. The subroutine itself is a deterministic script composed of discovered network calls hitting the site's backend, plus page interactions like click/type/find. The key architectural decision: the script executes inside the webpage itself, not through a proxy, not in a headless worker, not out of process. The script dispatches requests from the tab's execution context, so auth, CSRF, TLS session, and signed headers get added to all requests and propagate for free. No certificate installation, no TLS fingerprint modification, no separate auth stack to maintain. During recording, the extension intercepts network requests (MAIN-world fetch/XHR patch + webRequest fallback). We score and trim ~300 requests down to ~5 based on method, timing relative to DOM events, and origin. Volatile GraphQL operation IDs are detected and force a DOM-only fallback before they break silently on the next run. The generated code combines network calls with DOM actions (click, type, find) in the same function via an rtrvr.* helper namespace. Point the agent at a spreadsheet of 500 rows, and with a single LLM call parameters are assigned and 500 subroutines are kicked off. Key use cases:
- record sending an IG DM, then keep a reusable, callable routine to send DMs at zero token cost
- create a routine that fetches the latest products in a site catalog, then call it to pull thousands of products via direct GraphQL queries
- set up a routine that files an EHR form from parameters passed to the tool; the AI infers the parameters from the current page context and calls the tool
- reuse a routine daily to sync outbound messages on LinkedIn/Slack/Gmail to a CRM via an MCP server
We see the fundamental reason browser agents haven't taken off as this: for repetitive tasks, going through the inference loop is unnecessary. Better to record once and have the LLM generate a script that leverages all the possible ways to interact with a site and the wider web: directly calling backend APIs, interacting with the DOM, and calling third-party tools/APIs/MCP servers. Comments URL: https://news.ycombinator.com/item?id=47810533 Points: 5 # Comments: 1
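The score-and-trim step described above (ranking ~300 recorded requests down to ~5 by method, timing relative to DOM events, and origin) could be sketched roughly like this. All names and weights here are illustrative assumptions, not rtrvr.ai's actual code:

```typescript
// Hypothetical sketch: score each recorded request and keep the few that
// look like real backend API calls rather than analytics/CDN noise.

interface RecordedRequest {
  url: string;
  method: string;      // "GET", "POST", ...
  timestamp: number;   // ms since recording started
  origin: string;      // origin the request was sent to
}

function scoreRequest(
  req: RecordedRequest,
  pageOrigin: string,
  domEventTimes: number[],
): number {
  let score = 0;
  // Mutating methods are more likely to be the "real" backend call.
  if (["POST", "PUT", "PATCH", "DELETE"].includes(req.method)) score += 3;
  // Same-origin requests beat third-party trackers and CDN chatter.
  if (req.origin === pageOrigin) score += 2;
  // Requests fired shortly after a user interaction were likely caused by it.
  const nearEvent = domEventTimes.some(
    (t) => req.timestamp >= t && req.timestamp - t < 2000,
  );
  if (nearEvent) score += 2;
  return score;
}

function trimRequests(
  requests: RecordedRequest[],
  pageOrigin: string,
  domEventTimes: number[],
  keep = 5,
): RecordedRequest[] {
  return [...requests]
    .sort(
      (a, b) =>
        scoreRequest(b, pageOrigin, domEventTimes) -
        scoreRequest(a, pageOrigin, domEventTimes),
    )
    .slice(0, keep);
}
```

A real implementation would also need the volatile-GraphQL-ID detection the post mentions; that part is omitted here.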

hnrss.org · 2026-04-18 01:46:22+08:00 · tech

Waputer is an operating system that runs entirely in the browser. When you visit the website at https://waputer.app , a kernel written in JavaScript sets up a filesystem and launches a WebAssembly program, which in turn talks to the kernel to handle the display and input. A purely terminal-based version is at https://waputer.dev . My original intention was to create programs that run in the browser that have a lot more in common with the desktop. The traditional "hello world" program is not really suited for the web. Waputer changes that. The GitHub repo at https://github.com/waputer/docs gives a very brief overview of compiling a C program and running it on Waputer. There is a blog available from the main site that has a long-form explanation of Waputer and my motivations if you want some additional reading. Comments URL: https://news.ycombinator.com/item?id=47808554 Points: 2 # Comments: 0
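The architecture described above (a JS kernel owning the filesystem and display, with guest programs talking to it through a narrow syscall boundary) can be illustrated with a toy sketch. This is an assumed shape, not Waputer's actual interface, and a plain function stands in for the WebAssembly program:

```typescript
// Toy kernel: owns a filesystem and a "display" buffer; programs may only
// interact with it through syscall().

type Syscall =
  | { op: "open"; path: string }
  | { op: "write"; fd: number; data: string };

class Kernel {
  private files = new Map<string, string>([
    ["/etc/motd", "hello from the kernel\n"],
  ]);
  private fds: string[] = [];
  output = ""; // stands in for the display

  syscall(call: Syscall): number {
    switch (call.op) {
      case "open": {
        const contents = this.files.get(call.path);
        if (contents === undefined) return -1; // ENOENT-style error
        this.fds.push(contents);
        return this.fds.length - 1;            // file descriptor
      }
      case "write":
        this.output += call.data;              // render to the "display"
        return call.data.length;
    }
  }
}

// A "program" (where Waputer would run a compiled WASM module) that only
// sees the kernel through syscalls.
function helloProgram(kernel: Kernel): void {
  kernel.syscall({ op: "write", fd: 1, data: "hello, world\n" });
}
```

The point of the pattern is that the program has no direct access to the DOM or filesystem; everything funnels through the kernel, which is what makes a desktop-style "hello world" possible in a browser tab.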

linux.do · 2026-04-18 00:15:57+08:00 · tech

I saw a promo post for NexBrowser on the L forum (linux.do), downloaded it and registered, wanting to see what this fingerprint browser is like. Then something dumbfounding happened. Following the official instructions, I registered at https://user.nexip.net/ and added their Telegram contact to get a trial of 2 static IPs. The page advertised unlimited sessions, up to 24 hours each. Can you believe it: I clicked "check" on their page, OK, check passed; then I imported the IPs into the browser and not a single webpage would load. Back at https://user.nexip.net/ , the page showed the IPs as valid, yet every check failed. I asked support, and the reply was "the trial only provides IPs for testing; most resources are reserved for paying users." Great, so they don't even want prospective customers. A trial experience this bad leaves me speechless; you could at least offer 5 minutes or half an hour of real use. 5 posts - 2 participants. Read the full topic

linux.do · 2026-04-17 22:57:50+08:00 · tech

The Roxychrome 147 kernel is now officially live. What are the highlights of the 147 kernel?
A Rust-based XML parser for better memory safety: anyone running long-lived automation or large-scale scraping knows that, because of C++'s legacy memory-management problems, browsers easily hit out-of-memory (OOM) crashes when parsing complex DOM and XML. The 147 kernel introduces an XML parser written in Rust at the lowest layer; Rust's ownership model cuts off memory leaks, so environment stability during heavy unattended runs is qualitatively better and processes no longer hang as easily.
Tighter local-network access restrictions, hardening WebSocket and WebTransport: against increasingly sophisticated risk-control systems, we further locked down low-level local network access and rewrote the WebSocket and WebTransport interception policies, cutting off the path by which malicious sites scan local ports to learn real device characteristics. One more layer of armor.
New CSS contrast-color() and View Transitions support, better UI rendering: anti-association also depends on rendering realism. The new kernel fully supports the latest CSS contrast-color() and View Transitions APIs, so visual details are smoother and, in front of the major platforms' front-end fingerprint probes, the browser renders like a genuine, up-to-date physical device, improving pass rates for simulated browsing and screenshot verification.
Core security fixes, a stable base for your workflows: as always, this release fixes a number of low-level security issues. With risk-control models relying ever more heavily on browser fingerprints, keeping the kernel clean and current is the core precondition for avoiding account "feature flagging".
Other recent RoxyBrowser updates focused on automation and productivity:
1. Roxy AI Agent: natural-language conversations drive AI across multiple windows for login, browsing, liking, and similar flows, with built-in human behavioral noise and headless-mode support. See details here.
2. RoxyClaw (control the browser from your phone): mobile remote control is now supported, including invoking the Roxy AI Agent from inside Lark (Feishu), Telegram, and other apps, for more flexible cross-device workflows.
3. Proxy IP referral rebates (up to 30%): the referral program has been upgraded; when users you invite spend in the proxy IP store, you earn up to 30% of the revenue. Try it now.
4. Roxy's global proxy IP store keeps expanding: quality native IPs in France, Hong Kong, Japan, and other regions are now available, for more stable cross-border access, higher success rates, and a much lower chance of bans.
5. One-click simulated typing + Cookie bot: simulated typing solves the "pasting trips risk control, but typing by hand is too slow" dilemma, and the Cookie bot automatically generates realistic browsing environments, noticeably improving account survival rates.
Feature requests and complaints: what we most want to hear on the L forum is honest usage reports and technical feedback; our biggest fear is building in a vacuum, detached from real workloads. With the 147 kernel launch we'd like to hear from you: which pain points does Roxy still not cover, and what about the UX deserves criticism? Post your requests below; reasonable pain points and requests go straight into the version development schedule! 3 posts - 3 participants. Read the full topic

hnrss.org · 2026-04-17 22:45:32+08:00 · tech

Hi everyone, I built a small tool called GitShrink to solve a simple problem: making videos small enough (<10MB) to upload to GitHub. It runs entirely in the browser, so nothing is uploaded anywhere. Website: https://igtumt.github.io/gitshrink/ GitHub: https://github.com/igtumt/gitshrink It’s a small, local-first tool with no accounts, no tracking, and no backend. Use cases: README demo videos, small product demos, screen recordings for GitHub. Feedback is welcome. Comments URL: https://news.ycombinator.com/item?id=47806514 Points: 2 # Comments: 1

hnrss.org · 2026-04-17 20:38:48+08:00 · tech

Hey HN, recently I wrote an open-source Z-machine ( https://github.com/techbelly/elm-zmachine ) to support a course I'm teaching about interpreters and functional programming. Once I'd done that, I just had to make my own client. Partly, I wanted to enjoy playing the games I played when I was a kid. Partly, I just wanted to give my Z-machine a real test and see what kind of things I could build with access to the internals of the VM. Those old games could be super-frustrating. Especially the ones that teach you how to play by killing you over and over again - looking at you, Infidel. And while I used to sit and play for hours at a time, these days I only have a few minutes here and there. So, in Planedrift, every time you move, the full transcript and game state are snapshotted to localStorage. You can close the tab mid-game and come back to exactly where you were or use the history list to jump back in time. The idea is to make it easy to pick up a game for ten minutes and then put it down again. I'm no designer, and I've done my best to make it pleasant to look at. Behind the scenes it's written in Elm - which I know is not everyone's first choice, but it works for me! It only supports .z3 files at the minute, and .z5 is in progress. I’ve bundled the three publicly available Zorks, but you can bring your own .z3 file from one of the online archives. I'm thinking of adding more comprehensive note taking, maybe auto-mapping, transcript search and I'm playing with some plug-in ideas, and of course, dark mode! What do you think? What features should I prioritize? Ultimately, I hope you play some old Infocom games with Planedrift and enjoy it. Comments URL: https://news.ycombinator.com/item?id=47805289 Points: 2 # Comments: 0
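The snapshot-per-move persistence described above can be sketched as follows. A Map stands in for localStorage here so the sketch is self-contained; in the browser you would JSON.stringify into localStorage.setItem instead. The names are illustrative, not Planedrift's actual code:

```typescript
// Every move records a full snapshot under its own key, so you can close
// the tab mid-game and resume, or use the history list to jump back in time.

interface Snapshot {
  move: number;
  transcript: string[]; // everything printed so far
  state: string;        // serialized Z-machine state (opaque here)
}

class SaveHistory {
  private store = new Map<string, string>(); // localStorage stand-in

  record(snap: Snapshot): void {
    this.store.set(`save:${snap.move}`, JSON.stringify(snap));
    this.store.set("save:latest", String(snap.move));
  }

  // Resume exactly where the player left off.
  latest(): Snapshot | undefined {
    const move = this.store.get("save:latest");
    return move === undefined ? undefined : this.jumpTo(Number(move));
  }

  // Jump to any earlier point in the history list.
  jumpTo(move: number): Snapshot | undefined {
    const raw = this.store.get(`save:${move}`);
    return raw === undefined ? undefined : (JSON.parse(raw) as Snapshot);
  }
}
```

Storing each move under its own key (rather than one growing blob) is what makes the jump-back-in-time feature cheap: restoring is a single read and deserialize.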

hnrss.org · 2026-04-17 20:07:36+08:00 · tech

We all know the feeling: you can’t remember the word, but you know what it means. That’s why I built WordFor, a reverse dictionary where you describe a concept and it suggests the word you’re looking for. It runs entirely in the browser (no server calls), with no ads, tracking, or accounts, and low latency (results show up immediately as you type). Curious how well it works for other people, and happy to answer questions about the approach or tradeoffs. Technical blog explaining the workings: https://zshn25.github.io/wordfor-reverse-dictionary/ Comments URL: https://news.ycombinator.com/item?id=47805014 Points: 3 # Comments: 2
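The post defers the details to the linked blog, but a common way to build a fully client-side reverse dictionary is nearest-neighbor search over precomputed word embeddings. The toy 3-dimensional vectors below are made up for illustration and are not WordFor's actual approach or data:

```typescript
// Rank vocabulary words by cosine similarity to the embedding of the
// user's description; the top hits are the suggested words.

type Embedding = number[];

function cosine(a: Embedding, b: Embedding): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function lookup(
  queryVec: Embedding,           // embedding of the typed description
  vocab: Map<string, Embedding>, // word -> precomputed embedding
  topK = 3,
): string[] {
  return [...vocab.entries()]
    .map(([word, vec]) => ({ word, sim: cosine(queryVec, vec) }))
    .sort((a, b) => b.sim - a.sim)
    .slice(0, topK)
    .map((e) => e.word);
}
```

Because the vocabulary embeddings ship with the page, every keystroke is just a local vector scan, which is consistent with the "no server calls, results as you type" claims in the post.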

hnrss.org · 2026-04-17 00:58:29+08:00 · tech

Mulligan Labs is a browser-based playtester for Magic: The Gathering. No account or install needed. Just create a room, share the link, import a decklist from Archidekt or Moxfield, and play with mouse and keyboard (mobile support is not great right now). Stack: SvelteKit on Cloudflare Workers, PartyKit (Durable Objects) for the authoritative game server. Clients propose actions over WebSocket; the server validates and broadcasts state. My background is networking and my cofounder's is industrial design. Neither of us had shipped a codebase like this before. We built it over the last 5 months with heavy Claude assistance. Happy to get into what that actually looked like in the comments. It's rough in places (the deck builder is just ok right now) but the core multiplayer loop is solid and we have played a ton of games on it with our Commander pod. We'd love feedback, especially from anyone who's played Cockatrice/XMage/Untap and has opinions on what a browser-native version should feel like. Comments URL: https://news.ycombinator.com/item?id=47796266 Points: 3 # Comments: 4
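The authoritative-server pattern described above (clients propose actions over WebSocket; the server validates and broadcasts state) can be sketched with a tiny reducer. This is simplified far below a real MTG rules engine, and every name is illustrative rather than Mulligan Labs' actual code:

```typescript
// The server holds the authoritative GameState; clients only send
// proposals, which the server validates before applying.

interface GameState {
  turnPlayer: string;
  hands: Record<string, string[]>;
}

type Action = { kind: "playCard"; player: string; card: string };

function applyAction(
  state: GameState,
  action: Action,
): { ok: boolean; state: GameState } {
  // Validation: it must be your turn and the card must be in your hand.
  if (action.player !== state.turnPlayer) return { ok: false, state };
  const hand = state.hands[action.player] ?? [];
  if (!hand.includes(action.card)) return { ok: false, state };

  // Apply immutably; the real server would then broadcast the new state
  // to every connected client over WebSocket.
  const next: GameState = {
    ...state,
    hands: {
      ...state.hands,
      [action.player]: hand.filter((c) => c !== action.card),
    },
  };
  return { ok: true, state: next };
}
```

Keeping validation server-side (in a Durable Object, per the post's stack) is what prevents a modified client from cheating: rejected proposals simply never reach the other players.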

hnrss.org · 2026-04-16 20:48:21+08:00 · tech

I built https://securelocalpdf.com/ , a set of 25 PDF tools that run entirely in your browser: no uploads, no servers, no tracking. Most online PDF tools require you to send sensitive files and promise they’ll delete them “after 2 hours.” That includes sensitive, even password-protected documents, which still get processed server-side. That model fundamentally relies on trust. With SecureLocalPDF, there’s no upload step at all; everything happens locally in your browser, so your files never leave your device. It handles PDFs up to 20MB instantly, 50MB PDFs with 2-3 seconds of latency, and 100MB PDFs with 5-6 seconds, depending on the browser and the device's performance. Check out https://securelocalpdf.com/ and please leave feedback on anything to improve. (There are a couple of known UI bugs to iron out.) Comments URL: https://news.ycombinator.com/item?id=47792231 Points: 3 # Comments: 2

linux.do · 2026-04-16 17:47:05+08:00 · tech

The NexBrowser fingerprint browser is now in public beta; everyone is welcome to try it and to share suggestions. We studied the user experience of the many fingerprint browsers on the market and distilled that into the current NexBrowser, aiming for simpler, smoother workflows and a smaller memory footprint. Going forward our focus is large-scale automation and AI batch operations, wired up to services like NexIP and NexSMS, so you can fully automate account registration from a single window. Direct link: NexBrowser. We won't oversell it this time; the main goal of the beta is to gather feedback from real usage and see where the product still needs polish. To make it easy to get started, the promotion again runs on the L forum (linux.do): register now for 5 free windows, plus support for 2 team members.
A quick rundown of what NexBrowser already offers:
- Team collaboration: member assignment, permission management, and multi-user per-environment management
- Full extension center: install Google Store plugins, plus NexBrowser-recommended extensions
- Social account management: bulk import and unified management of accounts on TikTok, Facebook, X, and more
- Bulk proxy management: unified import, batch configuration, multi-environment support
- One-click fingerprint switching: customize fingerprint parameters, or randomly generate a new fingerprint
- Multi-account management: create and maintain accounts centrally, for more efficient day-to-day management
- Browser environment isolation: independent isolated environments that reduce interference between accounts
- Parallel projects: run multiple projects side by side, with less cost from constant switching
- Multi-profile creation and management: create differently configured environments on demand and manage them in one place
Bulk platform account import: supports bulk import of accounts from dozens of platforms, including TikTok, Facebook, and X. Proxy IP management: bulk import of proxy IPs, with liveness checks. Extension center: install any plugin from the Google Store, or the ones recommended in the NexBrowser extension center. Fingerprint creation: when creating a browser environment you can customize the fingerprint parameters or randomly generate a fresh fingerprint. Environment testing: once configured, run a check directly to see how the environment performs.
Beyond the basics of environment isolation, account management, and proxy configuration, we will keep building toward AI-assisted management, MCP integration, batch operations, and automated workflow chaining. In this beta we mainly want to collect two things: feedback from real usage, and valid bug reports. We've seen that there's real demand for this kind of tool, so we're opening it up for everyone to try. If anything feels awkward, broken, or worth improving, just say so. During the beta, suggestions are welcome, problem reports are welcome, bug hunting is welcome, and valid bugs earn rewards. If you're interested, register and try it: NexBrowser. The signup page on the official site is being reworked; if you can't register there, download and install the app first, then register inside it. 103 posts - 82 participants. Read the full topic

hnrss.org · 2026-04-16 03:21:57+08:00 · tech

You run an AI research lab. Manage compute, train models, balance overfitting vs underfitting, deploy to users. I'm a professor in AI and wanted to make model training feel intuitive for people without a technical background. The loss curve graph, the overfitting trap, the compute/quality tradeoff - all mechanics grounded in how training actually works. Phase 1 is live (~20 min). AI safety mechanics will come in Phase 2 with rival labs and regulatory pressure. Built with React + Vite. Curious what HN thinks, especially anyone in AI who can tell me what I got wrong. Comments URL: https://news.ycombinator.com/item?id=47783938 Points: 1 # Comments: 0
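The overfitting trap and loss-curve mechanics mentioned above can be modeled with a toy pair of curves: training loss keeps falling, while validation loss falls and then rises once the model overfits. The formulas below are illustrative assumptions, not the game's actual mechanics:

```typescript
// Toy loss curves: train loss decays monotonically; validation loss adds
// a slowly growing overfitting penalty, giving it a U shape.

function trainLoss(epoch: number): number {
  return Math.exp(-epoch / 10); // always improving on the training set
}

function valLoss(epoch: number): number {
  return Math.exp(-epoch / 10) + 0.01 * epoch; // bottoms out, then worsens
}

// Early stopping: pick the epoch where validation loss is lowest.
function bestEpoch(maxEpochs: number): number {
  let best = 0;
  for (let e = 1; e <= maxEpochs; e++) {
    if (valLoss(e) < valLoss(best)) best = e;
  }
  return best;
}
```

The gap between the two curves past the best epoch is exactly the trap the game teaches: spending more compute keeps improving the training number while making the deployed model worse.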

hnrss.org · 2026-04-15 23:57:54+08:00 · tech

Libretto ( https://libretto.sh ) is a Skill+CLI that makes it easy for your coding agent to generate deterministic browser automations and debug existing ones. The key shift is going from “give an agent a prompt at runtime and hope it figures things out” to “use coding agents to generate real scripts you can inspect, run, and debug”. Here’s a demo: https://www.youtube.com/watch?v=0cDpIntmHAM . Docs start at https://libretto.sh/docs/get-started/introduction . We spent a year building and maintaining browser automations for EHR and payer portal integrations at our healthcare startup. Building these automations and debugging failed ones was incredibly time-consuming. There are lots of tools that use runtime AI, like Browseruse and Stagehand, which we tried, but: (1) they're reliant on custom DOM parsing that's unreliable on older and complicated websites (including all of healthcare); using a website’s internal network calls is faster and more reliable when possible. (2) They can be expensive, since they rely on lots of AI calls, and for workflows with complicated logic you can’t always rely on caching actions to make sure they will work. (3) They run at runtime, so it’s not interpretable what the agent is going to do; you kind of hope you prompted it correctly to do the right thing, but legacy workflows are often unintuitive and inconsistent across sites, so you can’t trust an agent to just figure it out at runtime. (4) They don’t really help you generate new automations or debug automation failures. We wanted a way to reliably generate and maintain browser automations in messy, high-stakes environments, without relying on fragile runtime agents. Libretto is different because instead of runtime agents it uses “development-time AI”: scripts are generated ahead of time as actual code you can read and control, not opaque agent behavior at runtime. Instead of a black box, you own the code and can inspect, modify, version, and debug everything. 
Rather than relying on runtime DOM parsing, Libretto takes a hybrid approach, combining Playwright UI automation with direct network/API requests within the browser session for better reliability and bot-detection evasion. It records manual user actions to help agents generate and update scripts, supports step-through debugging, has an optional read-only mode to prevent agents from accidentally submitting or modifying data, and generates code that follows the abstractions and conventions you already have in your coding repo. Would love to hear how others are building and maintaining browser automations in practice, and any feedback on the approach we’ve taken here. Comments URL: https://news.ycombinator.com/item?id=47780971 Points: 12 # Comments: 2
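One feature mentioned above, the read-only mode that stops a generated script from accidentally submitting or modifying data, could plausibly be a guard over outgoing requests. This is an assumed design for illustration, not Libretto's actual implementation:

```typescript
// Block mutating HTTP methods when read-only mode is on; reads pass through.

const MUTATING = new Set(["POST", "PUT", "PATCH", "DELETE"]);

function guardRequest(
  method: string,
  readOnly: boolean,
): { allowed: boolean; reason?: string } {
  if (readOnly && MUTATING.has(method.toUpperCase())) {
    return { allowed: false, reason: `read-only mode blocks ${method}` };
  }
  return { allowed: true };
}
```

In practice such a guard would wrap both the UI layer (blocking form submits and destructive clicks) and the network layer; the method check shown here covers only the latter.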

hnrss.org · 2026-04-15 22:34:54+08:00 · tech

Glance is a browser extension that fact-checks posts in your X/Twitter feed as you scroll. A small icon sits next to each post, reads the content, and surfaces missing context, disputed claims, and things that are being pushed back on in the replies. It works on any Chromium-based browser. We waited 6 months before building this because the economics looked impossible: a decent-quality AI fact-checking analysis costs $0.05-0.15 per post, and a typical user scrolls through hundreds of posts per session. The only way to make the math work was a pipeline that leverages the comment section to triage posts and analyzes in depth only when necessary. It works more or less like this:
1. Local filter in-browser (free): short posts, already-seen content.
2. Small-model triage: does this post even make a factual claim worth checking?
3. Comment analysis (main path): pull the replies, analyze them alongside the post.
4. Full web-search analysis: only when steps 1-3 can't decide.
Average cost landed at ~$0.0015 per post, which looks sustainable with a subscription model, and can definitely be optimized. Comments URL: https://news.ycombinator.com/item?id=47779599 Points: 1 # Comments: 0
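The four-stage funnel above can be sketched as a simple cost model: each stage resolves some fraction of posts at some unit cost, and only the remainder falls through to the next, more expensive stage. The per-stage costs and resolution fractions below are invented for illustration (the post only gives the $0.05-0.15 full-analysis cost and the ~$0.0015 blended result); they are chosen so the blended average lands near the reported figure:

```typescript
// Average cost per post through a triage cascade: every post still
// unresolved when it reaches a stage pays that stage's unit cost.

interface Stage {
  name: string;
  costPerPost: number; // dollars
  resolves: number;    // fraction of incoming posts settled at this stage
}

function averageCost(stages: Stage[]): number {
  let remaining = 1; // fraction of posts still unresolved
  let cost = 0;
  for (const s of stages) {
    cost += remaining * s.costPerPost;
    remaining *= 1 - s.resolves;
  }
  return cost;
}
```

The structure makes the economics visible: as long as the cheap stages resolve the bulk of posts, the expensive web-search stage only sees a percent or two of traffic, pulling the blended cost orders of magnitude below the naive per-post price.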