Open Notebook: The Self-Hosted Open Source NotebookLM That Doesn't Lock You In

Your NotebookLM research sessions live on Google's servers. Your documents, your audio overviews, your AI-generated insights — all processed by Google's models, stored under Google's terms, with no option to run it elsewhere.
Open Notebook is what several developers built instead.
Started by Luis Novo and now carrying 32k+ GitHub stars — gaining 3,891 of them in a single week — Open Notebook is a self-hosted, MIT-licensed open source NotebookLM implementation that swaps Google's closed stack for 18+ AI providers, local model support, and full data ownership. It's not a clone. It extends the concept in directions Google structurally can't go.
What NotebookLM Gets Right (and Why the Gap Still Matters)
Google's NotebookLM solved a real research problem. Before it, researchers were manually copy-pasting sources into chat windows, losing context, and starting fresh every session. NotebookLM introduced persistent research workspaces: load PDFs, YouTube videos, and web pages into a single grounded context, then query against exactly those sources — not the entire training distribution of a general-purpose model.
The audio overview feature was the viral moment. Two AI voices synthesizing your sources into a podcast-style conversation sounds like a toy until you're reviewing 40 pages of technical documentation at the gym. It worked. People used it constantly.
But the model is fixed. The data is Google's. The system prompts are locked. And if your research involves sensitive customer data, unpublished work, or anything subject to data residency requirements — healthcare records, legal documents, pre-publication research — "put it in NotebookLM" isn't an option.
That's the gap Open Notebook fills.
The Architecture: How It Actually Works
Open Notebook runs as Docker containers — a Next.js/React frontend, a Python/FastAPI backend, and SurrealDB handling storage, vector indexes, and full-text search in a single database.
The SurrealDB choice is worth noting. Rather than bolting a vector database alongside a traditional relational store (the typical pattern for AI research tools), SurrealDB handles everything: document metadata, embeddings for semantic search, and full-text indexing. One database, one query interface, no sync complexity between two systems.
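To make the single-store idea concrete, here is a minimal in-memory sketch in Python. It is not SurrealDB's query interface and not Open Notebook's code; `Chunk`, `SingleStore`, and the two-dimensional embeddings are hypothetical stand-ins chosen only to show metadata, text, and vectors living in one place and being queried two ways.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source_id: str          # metadata: which source the chunk came from
    text: str               # raw text, used for keyword search
    embedding: list[float]  # vector, used for semantic search


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SingleStore:
    """Toy stand-in (hypothetical) for one database holding everything."""

    def __init__(self) -> None:
        self.chunks: list[Chunk] = []

    def add(self, chunk: Chunk) -> None:
        self.chunks.append(chunk)

    def vector_search(self, query_vec: list[float], k: int = 3) -> list[Chunk]:
        # Rank every chunk by similarity to the query vector, best first.
        ranked = sorted(
            self.chunks,
            key=lambda c: cosine(c.embedding, query_vec),
            reverse=True,
        )
        return ranked[:k]

    def keyword_search(self, term: str) -> list[Chunk]:
        # Case-insensitive substring match over the stored text.
        needle = term.lower()
        return [c for c in self.chunks if needle in c.text.lower()]


store = SingleStore()
store.add(Chunk("paper.pdf", "Vector indexes speed up semantic search.", [0.9, 0.1]))
store.add(Chunk("notes.txt", "Docker compose starts the whole stack.", [0.1, 0.9]))

top = store.vector_search([1.0, 0.0], k=1)   # semantic lookup
hits = store.keyword_search("docker")        # keyword lookup
```

Both lookups read the same records, so there is no second system to keep in sync, which is the property the paragraph above describes.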
The AI layer runs through LangChain and the Esperanto library, which abstracts across providers so you can swap models without rewriting integration code. Supported providers: OpenAI, Anthropic, Groq, Google GenAI, Vertex AI, Ollama, Perplexity, ElevenLabs, Deepgram, Azure OpenAI, Mistral, DeepSeek, Voyage, xAI, OpenRouter, DashScope, MiniMax, LM Studio. Eighteen-plus, covering every tier from frontier closed models to fully local inference.
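The provider abstraction can be pictured with a short Python sketch. This is not Esperanto's or LangChain's API; `ChatProvider`, `EchoProvider`, `UppercaseProvider`, and `ask` are hypothetical names used only to show how calling code can stay the same while the provider behind it changes.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """Hypothetical interface: anything with complete() counts as a provider."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Hypothetical local provider; a real one would call a local model."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class UppercaseProvider:
    """Hypothetical hosted provider; a real one would call a remote API."""

    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt.upper()}"


# Registry keyed by a configuration name; swapping models is a config change.
PROVIDERS: dict[str, ChatProvider] = {
    "local": EchoProvider(),
    "hosted": UppercaseProvider(),
}


def ask(provider_name: str, question: str) -> str:
    """Route a question to the configured provider."""
    try:
        provider = PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"unknown provider: {provider_name}") from None
    return provider.complete(question)
```

Calling `ask("local", ...)` or `ask("hosted", ...)` exercises different backends without touching the caller, which is the kind of swap the abstraction layer is meant to allow.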
What You Can Feed It
Open Notebook accepts PDFs, Office documents, YouTube videos (auto-transcribed), audio files, web pages, and raw text. Each source gets chunked, embedded, and indexed. Retrieval is semantic (vector search) or keyword-based (full-text), depending on the query. Per-source visibility controls let you mark specific sources as private or restricted — available to some AI interactions but not others. Google doesn't offer anything comparable.
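As a rough illustration of the chunking step and the visibility controls, here is a Python sketch under stated assumptions. The fixed-size character chunker with overlap is one common approach, not necessarily the one Open Notebook uses, and `Source`, `visible_to_ai`, and `build_context` are hypothetical names.

```python
from dataclasses import dataclass


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


@dataclass
class Source:
    name: str
    text: str
    visible_to_ai: bool = True  # hypothetical flag for per-source visibility


def build_context(
    sources: list[Source], size: int = 200, overlap: int = 50
) -> list[tuple[str, str]]:
    """Return (source name, chunk) pairs, skipping sources hidden from the AI."""
    return [
        (s.name, chunk)
        for s in sources
        if s.visible_to_ai
        for chunk in chunk_text(s.text, size, overlap)
    ]


sources = [
    Source("public.txt", "abcdefghij"),
    Source("private.txt", "secret notes", visible_to_ai=False),
]
context = build_context(sources, size=4, overlap=2)
```

Filtering before chunking means a hidden source never reaches the embedding or prompt-building stages at all in this sketch.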
The Podcast Feature, Taken Further
NotebookLM made the AI audio overview famous. Open Notebook extends it in ways the Google version structurally can't. Where Google gives you two fixed speakers, Open Notebook supports 1–4 speakers with fully configurable profiles — custom personalities, voice characteristics, conversational tendencies. You write what those speakers are like; the generation follows your spec.
Episode Profiles control structure: debate, lecture, interview, explainer for experts versus newcomers. The generation prompts are fully editable. If Google's audio overview sounds slightly off for your subject matter, you can't fix it. In Open Notebook, you can change every part of the pipeline.
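One way to picture speaker and episode profiles is as plain configuration objects feeding an editable prompt template. The Python sketch below is an assumption-laden illustration, not Open Notebook's schema: `Speaker`, `EpisodeProfile`, `prompt_template`, and `render_prompt` are hypothetical names, and only the 1–4 speaker limit comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Speaker:
    name: str
    personality: str  # free-text description written by the user


@dataclass
class EpisodeProfile:
    style: str               # e.g. "debate", "lecture", "interview"
    speakers: list[Speaker]
    # Hypothetical default; the point is that the template is user-editable.
    prompt_template: str = "Write a {style} about {topic} between: {cast}."

    def __post_init__(self) -> None:
        # Enforce the 1-4 speaker range described in the article.
        if not 1 <= len(self.speakers) <= 4:
            raise ValueError("an episode needs between 1 and 4 speakers")

    def render_prompt(self, topic: str) -> str:
        """Fill the template with the style, topic, and speaker descriptions."""
        cast = "; ".join(f"{s.name} ({s.personality})" for s in self.speakers)
        return self.prompt_template.format(style=self.style, topic=topic, cast=cast)


profile = EpisodeProfile(
    style="debate",
    speakers=[
        Speaker("Ana", "skeptical engineer"),
        Speaker("Ben", "optimistic researcher"),
    ],
)
prompt = profile.render_prompt("vector databases")
```

Changing `prompt_template` or the speaker list changes the generated script without touching any other code, which mirrors the editable-pipeline idea described above.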
Audio runs through external providers: ElevenLabs Flash v2 for text-to-speech and Groq-hosted Whisper for transcription are common choices. Audio quality therefore scales with what you're willing to pay for, not what Google provisions.
Setup in Three Commands
```shell
# Download the compose file
curl -O https://raw.githubusercontent.com/lfnovo/open-notebook/main/docker-compose.yml

# Set encryption key
export SECRET_KEY=your_secret_key_here

# Start
docker compose up -d
```
Access the UI at http://localhost:8502 after 15–20 seconds. Configure your AI provider, add API keys, assign defaults for chat, embedding, and speech, and it's live. Full setup takes under five minutes for anyone comfortable with Docker.
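Because the UI takes a little while to come up, a small readiness poll can save some manual refreshing. The Python sketch below uses only the standard library; `http_ok` and `wait_until_ready` are hypothetical helper names, and the only detail taken from the article is the URL `http://localhost:8502`.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
        return False


def wait_until_ready(
    url: str,
    attempts: int = 15,
    delay: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try, but not after the last one
    return False


# Example call after `docker compose up -d`:
# wait_until_ready("http://localhost:8502")
```

The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running container.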
At least 4GB of RAM is recommended. With less than 2GB, SurrealDB can run out of memory once a workspace grows to around 500 chunks. For a typical research workload on a modern laptop or a standard VPS, it runs comfortably. Want zero external API costs? Ollama is a supported provider: run it alongside Open Notebook and you have a fully local RAG workspace that can operate air-gapped, with no API bills.
Open Notebook vs. Google NotebookLM
| Feature | Open Notebook | Google NotebookLM |
|---|---|---|
| Deployment | Self-hosted (Docker) | Cloud-only |
| Data ownership | Fully local | Google's infrastructure |
| AI models | 18+ providers, local AI | Fixed to Google's models |
| Podcast speakers | 1–4, custom profiles | 2, fixed |
| System prompts | Fully editable | Locked |
| Source visibility control | Per-source granularity | Not available |
| Reasoning model support | DeepSeek-R1, Qwen3 | Not available |
| REST API | Full API included | No public API |
| Cost structure | Free + your API usage | Free tier; paid plans for higher limits |
| License | MIT | Proprietary |
One area where Google still leads: inline citation highlighting. NotebookLM's click-a-claim, see-the-source-chunk experience is genuinely polished. Open Notebook's citation system is functional but more basic. That gap is acknowledged in the project issues and on the roadmap.
Where Open Notebook pulls ahead is anywhere "send your data to Google" creates a problem. Enterprise research teams, security researchers, legal teams working with privileged documents, regulated healthcare, financial compliance — for all of these, the self-hosted option isn't a consolation prize. It's the baseline requirement.
List Your AI Research Tool on SaaSCity
Building a research tool, RAG product, or knowledge management application? Your users are actively looking for alternatives — they're just not finding yours yet.
SaaSCity.io is the premier directory for AI tools and open-source software. Every listing appears as a building in an interactive 3D city map — not another row in a static table.
- Free to list: Submit in under 2 minutes. No credit card required.
- Earn dofollow backlinks: Every listing earns high-quality backlinks that compound over time. Our guide to domain rating covers why this matters for long-term SEO.
- 3D map visibility: Your product gets placed in the SaaSCity engine — a genuinely distinctive directory experience.
- Reach the right audience: Founders, technical buyers, and early adopters actively looking for what you ship.
What the Numbers Say
Open Notebook has 32k+ GitHub stars, 3.6k+ forks, 60+ contributors, and 38 releases across the project's history. A single-week spike of 3,891 stars placed it in the top-15 trending Python repositories — a signal that the privacy-and-flexibility problem resonates well beyond the self-hosting enthusiast crowd.
The project is at v1.10.0 as of June 2026. Shipping velocity has been consistent, not spiked-then-abandoned. The Discord community is active. Issues are triaged. The gap between "interesting GitHub project" and "thing I'd actually deploy" is one this project has clearly crossed for a meaningful number of teams.
Community coverage has been broad: KDnuggets ran a technical comparison against Google's product, XDA-Developers published two pieces on real-world switching experience, and the DEV Community has seen architectural deep-dives from developers running it in production.
The consistent thread across reviews: it works, setup takes more effort than a cloud tool (expected), and the flexibility-versus-simplicity trade-off resolves clearly toward Open Notebook for anyone who actually needs that flexibility.
What This Means for SaaS Builders
Three things worth thinking through if you're building AI-powered products.
Local RAG Architecture Has a Reference Implementation Now
Open Notebook's stack — FastAPI, SurrealDB, LangChain, Docker — is clean enough to study directly. If you're building any kind of document intelligence product, look at how they've structured source ingestion, the chunking pipeline, and the provider abstraction layer. It's not a novel architecture, but it's a well-executed, MIT-licensed one you can adapt freely.
For context on how this fits the broader open-source AI tooling ecosystem, the comparison of open-source AI SaaS boilerplates covers similar ground at the infrastructure layer — useful if you're deciding how much to build versus assemble.
The Privacy Pitch Is a Real Differentiator for Enterprise
"We process all your documents locally, with no external calls unless you configure external providers" is a direct sales argument against NotebookLM for any enterprise buyer with a data protection obligation. Self-hosting isn't a premium feature for regulated buyers — it's a procurement requirement. This is the same dynamic driving uptake for fully open models like Apertus: when a compliance audit asks "where is this data processed and by whom," closed cloud tools give you an uncomfortable answer. Open Notebook gives you a technically verifiable one.
AI Podcast Generation Is an Underexplored Product Surface
Most SaaS tools treat audio as a format. NotebookLM showed that synthesized audio can be a mode of comprehension — people genuinely process long-form technical content better when it sounds like a conversation. Open Notebook extends that with configurable speakers, episode profiles, and editable generation prompts. If you're building for researchers, analysts, or knowledge workers, that's a product direction worth evaluating seriously — and with a self-hosted stack, you're not constrained by whatever two-speaker format a cloud vendor decides to ship.
The Honest Trade-Off
Open Notebook is not a drop-in replacement for NotebookLM's user experience. Cloud tools start faster, require no infrastructure decisions, and come with a polished UI that needs no setup on your part. NotebookLM's citation highlighting is better. Multi-user support in Open Notebook is basic: single-password authentication, no role-based access control (RBAC), and not yet built for team deployments.
What Open Notebook trades for all of that is control: over the data, the models, the prompts, the cost structure, and the pipeline itself. You pay for exactly what you use, with the providers you choose, running on hardware you own. For individual researchers, small technical teams, and anyone with a data sensitivity requirement, that trade-off resolves clearly.
Thirty-two thousand people starred the repo. Most of them already knew they needed exactly this.
Get started:
- Source code: github.com/lfnovo/open-notebook
- Official site: open-notebook.ai
- Demo video: YouTube walkthrough
- Community: Discord
SaaSCity.io covers AI tools and open-source research products. Explore the SaaSCity directory to discover what's shipping right now — or list your own product.


