智能助手网
Tag aggregation: embedding

hnrss.org · 2026-04-15 23:05:25+08:00 · tech

For some reason, embedding a Jupyter notebook in a documentation site still means doing one of these: screenshotting every cell, running nbconvert and cleaning up the HTML, copy-pasting the code into fenced blocks (and losing the output), or iframing in nbviewer. Every time you update the notebook, you redo the whole dance.

notebook-mdx lets you just link the `.ipynb` file and it renders inline in your MDX. No export step, no cleanup, no re-pasting. Update the notebook, rebuild the docs, done.

:::notebook{file="./analysis.ipynb"} :::

That's the whole API: just link the file. You get authentic Jupyter styling (In/Out prompts, syntax highlighting, rich outputs including images and HTML), build-time rendering so there's zero client-side JS, and multi-language support (Python, R, JS, SQL). It works with Next.js, Docusaurus, Fumadocs, and any MDX framework.

Example output:
- Example 1: https://notebook-mdx.vercel.app/docs/examples/notebook-demo
- Example 2: https://notebook-mdx.vercel.app/docs/examples/directive-exam...

Repo: https://github.com/abhay-ramesh/notebook-mdx
Docs: https://notebook-mdx.vercel.app

Comments URL: https://news.ycombinator.com/item?id=47780138
Points: 2
# Comments: 0
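To make "build-time rendering" of a linked `.ipynb` concrete, here is a minimal Python sketch of the general idea: read the notebook's JSON (nbformat 4) and emit each cell with Jupyter-style In/Out prompts. This is not notebook-mdx's implementation, which is not shown in the post; `render_notebook`, `_text`, and the in-memory `demo` notebook are hypothetical names used only for illustration, and the sketch handles only text outputs.

```python
import json


def _text(value):
    """nbformat stores text either as one string or as a list of lines."""
    return value if isinstance(value, str) else "".join(value)


def render_notebook(nb: dict) -> str:
    """Render markdown and code cells of an nbformat-4 notebook as plain text."""
    parts = []
    for cell in nb.get("cells", []):
        source = _text(cell.get("source", ""))
        if cell.get("cell_type") == "markdown":
            parts.append(source)
            continue
        if cell.get("cell_type") != "code":
            continue  # raw cells are skipped in this sketch
        count = cell.get("execution_count")
        label = " " if count is None else count
        parts.append(f"In [{label}]:\n{source}")
        for out in cell.get("outputs", []):
            kind = out.get("output_type")
            if kind == "stream":
                # stdout/stderr text captured while the cell ran
                parts.append(_text(out.get("text", "")).rstrip("\n"))
            elif kind in ("execute_result", "display_data"):
                # only the text/plain representation is used here;
                # images and HTML would need their own handling
                plain = _text(out.get("data", {}).get("text/plain", ""))
                prefix = f"Out[{count}]: " if kind == "execute_result" else ""
                parts.append(prefix + plain)
    return "\n\n".join(parts)


# A tiny in-memory notebook standing in for ./analysis.ipynb (hypothetical content).
demo = {
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "source": ["1 + 1"],
         "outputs": [{"output_type": "execute_result", "execution_count": 1,
                      "metadata": {}, "data": {"text/plain": ["2"]}}]},
    ],
}

# Round-trip through JSON to mimic loading the file from disk.
rendered = render_notebook(json.loads(json.dumps(demo)))
print(rendered)
```

Because everything happens while the docs are built, the page that ships contains only the rendered result, which is consistent with the post's claim of zero client-side JS.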

hnrss.org · 2026-04-13 23:07:09+08:00 · tech

I've been building an AI system to automate parts of the NRC Combined Operational License process: gap analysis against the Standard Review Plan, FSAR strength scoring, and RAI prediction using vector similarity to historical NRC requests. I intended this as a SaaS business, but was ultimately beaten to market.

What I think is the most interesting artifact is the dataset: 37,734 chunks of NRC regulatory documents (NUREG-0800, 10 CFR Parts 20/50/51/52/72/73/100, and Regulatory Guides) embedded with OpenAI text-embedding-3-small. It covers the full regulatory corpus an applicant would need for a COL submission. I'm not aware of anything like this being publicly available before.

The embeddings are ready to load directly into ChromaDB, Pinecone, or any other vector store. If you're doing nuclear AI or regulatory NLP, or just want a large real-world RAG dataset to experiment with, it should be useful.

Here's the full codebase if you're interested: https://github.com/Davenporten/nrc-licensing-rag

Comments URL: https://news.ycombinator.com/item?id=47753102
Points: 1
# Comments: 0
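The "vector similarity" retrieval mentioned in the post can be sketched in a few lines of Python. This is a generic cosine-similarity top-k lookup, not code from the linked repository; the chunk IDs and the 3-dimensional vectors are made-up stand-ins for the real, much higher-dimensional text-embedding-3-small vectors, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k (score, chunk_id) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), chunk_id) for chunk_id, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy data: (chunk_id, embedding) pairs.
chunks = [
    ("chunk-a", [1.0, 0.0, 0.0]),
    ("chunk-b", [0.0, 1.0, 0.0]),
    ("chunk-c", [0.9, 0.1, 0.0]),
]
query = [1.0, 0.0, 0.0]  # stands in for an embedded question

results = top_k(query, chunks, k=2)
for score, chunk_id in results:
    print(f"{chunk_id}: {score:.3f}")
```

A vector store such as ChromaDB or Pinecone does the same ranking with an index instead of a linear scan, which matters at the scale of tens of thousands of chunks.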

imjuya.github.io · 2026-03-11 09:36:24+08:00 · tech

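AI Morning Report, 2026-03-11

Video version: Bilibili | YouTube

Overview

Top stories
- Google releases Gemini Embedding 2, a natively multimodal embedding model (#1)
- Google upgrades Gemini features in Workspace (#2)

Model releases
- Tencent AI Lab open-sources the LeVo 2 music model, which supports generating full songs of four and a half minutes (#3)
- Fish Audio open-sources the S2 text-to-speech model (#4)
- Hume AI open-sources TADA, a unified speech-language model built on Llama 3.2 (#5)

Developer ecosystem
- OpenAI adjusts the Codex service, removing gpt-5.4 model access from the free tier (#6)
- Claude Code introduces the /btw command, supporting conversations during background tasks (#7)
- JetBrains launches Air, supporting parallel execution of multiple agents…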