What is Atlas?
Atlas is a production-grade memory backend for AI agents. Instead of losing context between conversations or cramming everything into a single prompt, Atlas gives your agents three distinct, interconnected memory stores that mirror how human memory actually works.
Episodic Memory
Stores raw conversation chunks as vector embeddings in Qdrant. Enables fast similarity search over past experiences.
Semantic Memory
Extracts entities and relationships into a Neo4j knowledge graph. Supports multi-hop reasoning across facts.
Working Memory
A per-session sliding context cache in Redis. Tracks rolling topic vectors, recent entities, and hot facts.
How it works
When you call brain.add("Sarah is the Lead Engineer at Acme Corp."), Atlas:
- Chunks the text using a semantic chunker (embedding-aware sentence grouping)
- Extracts named entities and relationships via LLMGraphTransformer or spaCy
- Stores vector embeddings in Qdrant (episodic) and graph triples in Neo4j (semantic)
- Updates the session topic vector in Redis (working memory)
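The last step above can be pictured with a small in-process model. This is only a sketch: the `SessionWorkingMemory` class, the exponential-moving-average update, and the `alpha` and `max_entities` parameters are assumptions made for illustration, and a plain Python object stands in for Redis. Atlas's actual update rule and key layout are not described here.

```python
from collections import deque


class SessionWorkingMemory:
    """Toy stand-in for a per-session working-memory cache (hypothetical)."""

    def __init__(self, dim, alpha=0.3, max_entities=5):
        self.alpha = alpha                 # weight given to the newest embedding
        self.topic = [0.0] * dim           # rolling topic vector
        self.initialised = False
        # Sliding window: the oldest entity falls out once maxlen is reached.
        self.recent_entities = deque(maxlen=max_entities)

    def update(self, embedding, entities):
        if not self.initialised:
            # The first chunk of a session defines the initial topic.
            self.topic = list(embedding)
            self.initialised = True
        else:
            # Exponential moving average: old topics fade, new ones blend in.
            self.topic = [
                (1 - self.alpha) * t + self.alpha * e
                for t, e in zip(self.topic, embedding)
            ]
        for name in entities:
            # Re-mentioned entities move to the most-recent end of the window.
            if name in self.recent_entities:
                self.recent_entities.remove(name)
            self.recent_entities.append(name)


wm = SessionWorkingMemory(dim=2, alpha=0.5, max_entities=2)
wm.update([1.0, 0.0], ["Sarah", "Acme Corp"])
wm.update([0.0, 1.0], ["Neo4j"])
print(wm.topic, list(wm.recent_entities))
```

With a window of two entities, the oldest mention ("Sarah") is evicted when a third entity arrives, which is the sliding behaviour the working-memory store is described as having.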
When you call brain.search("Who manages engineering at Acme?"), Atlas:
- Runs hybrid retrieval across all three stores simultaneously
- Scores results by vector similarity, recency, frequency, and graph strength
- Returns ranked, de-duplicated facts ready to inject into your LLM system prompt
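The fan-out and de-duplication steps above can be sketched as follows. The `search_all` function, the shape of the store callables, and the stub data are all hypothetical; real stores would be Qdrant, Neo4j, and Redis clients, and the exact-match de-duplication here is a deliberately simple stand-in for whatever Atlas does internally.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(query, stores):
    """Query every store in parallel, then merge, de-duplicate, and rank.

    `stores` maps a store name to a callable(query) -> list of (fact, score).
    """
    with ThreadPoolExecutor(max_workers=len(stores)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in stores.items()}
        results = {name: future.result() for name, future in futures.items()}

    best = {}
    for name, hits in results.items():
        for fact, score in hits:
            # Normalise case and whitespace so near-identical facts collapse.
            key = " ".join(fact.lower().split())
            if key not in best or score > best[key][1]:
                best[key] = (fact, score, name)
    # Highest score first.
    return sorted(best.values(), key=lambda hit: hit[1], reverse=True)


# Stub stores with made-up scores, standing in for the three backends.
def episodic(query):
    return [("Sarah is the Lead Engineer at Acme Corp.", 0.82)]


def semantic(query):
    return [("sarah is the lead engineer at acme corp.", 0.91),
            ("Acme Corp employs Sarah.", 0.40)]


def working(query):
    return []


ranked = search_all(
    "Who manages engineering at Acme?",
    {"episodic": episodic, "semantic": semantic, "working": working},
)
for fact, score, source in ranked:
    print(f"{score:.2f} [{source}] {fact}")
```

The returned facts could then be joined into a system prompt; how Atlas itself formats them is not specified here.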
Key features
Hybrid scoring — not just vector search
Atlas scores memories on four axes: Vector similarity, temporal Recency (Ebbinghaus decay), access Frequency, and graph Association strength. This means recently-reinforced, highly-related facts always surface above stale noise.
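One way to combine the four axes is a weighted sum, sketched below. The weights, the `stability_days` and `saturation` parameters, and the log-scaled frequency term are assumptions chosen for illustration; Atlas's actual formula and weighting are not given here. The recency term uses the standard exponential form of the Ebbinghaus curve, R = exp(-t / S).

```python
import math

# Hypothetical weights; they sum to 1 so the final score stays in [0, 1].
WEIGHTS = {"vector": 0.4, "recency": 0.3, "frequency": 0.15, "graph": 0.15}


def recency(age_days, stability_days=7.0):
    """Ebbinghaus-style retention: 1.0 when fresh, decaying towards 0."""
    return math.exp(-age_days / stability_days)


def frequency(access_count, saturation=50):
    """Log-scaled access count, capped at 1.0 once `saturation` is reached."""
    return min(1.0, math.log1p(access_count) / math.log1p(saturation))


def hybrid_score(similarity, age_days, access_count, graph_strength):
    """Weighted sum of the four axes; every input axis is expected in [0, 1]."""
    parts = {
        "vector": similarity,
        "recency": recency(age_days),
        "frequency": frequency(access_count),
        "graph": graph_strength,
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


fresh = hybrid_score(similarity=0.8, age_days=0, access_count=10, graph_strength=0.5)
stale = hybrid_score(similarity=0.8, age_days=60, access_count=10, graph_strength=0.5)
print(f"fresh={fresh:.3f} stale={stale:.3f}")
```

Two memories with identical vector similarity end up ranked differently once age is taken into account, which is the behaviour the paragraph above describes.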
Multi-hop graph reasoning
Ask complex relational questions across up to 5 hops in the knowledge graph. Atlas traverses entity relationships and grounds LLM answers in retrieved facts, which reduces the risk of hallucination.
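To illustrate what a hop-limited traversal looks like, here is a breadth-first search over a tiny in-memory graph. The `find_path` function, the adjacency-list format, and the sample entities and relations are made up for this example; in Atlas the graph lives in Neo4j, and its actual query strategy is not documented here.

```python
from collections import deque


def find_path(graph, start, goal, max_hops=5):
    """Breadth-first search for the shortest chain of triples from start to goal.

    `graph` maps an entity to a list of (relation, neighbour) pairs.
    Returns a list of (subject, relation, object) triples, or None if the
    goal is not reachable within `max_hops` hops.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) == max_hops:
            continue  # hop budget exhausted on this branch
        for relation, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None


# Toy graph; the entities and relations are illustrative only.
graph = {
    "Sarah": [("WORKS_AT", "Acme Corp"), ("LEADS", "Engineering")],
    "Engineering": [("PART_OF", "Acme Corp")],
    "Acme Corp": [("LOCATED_IN", "Berlin")],
}

print(find_path(graph, "Sarah", "Berlin"))
print(find_path(graph, "Sarah", "Berlin", max_hops=1))
```

The chain of triples returned by such a traversal is the kind of retrieved evidence an LLM answer can be grounded in.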
Memory lifecycle management
Memories decay over time via the Ebbinghaus forgetting curve. Consolidation compresses related clusters into abstractions. Pruning removes nodes whose confidence falls below a configurable threshold.
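Decay and pruning can be sketched together in a few lines. The record fields (`confidence`, `age_days`, `stability_days`), the default threshold of 0.2, and the idea of multiplying confidence by retention are assumptions for illustration; consolidation is left out, and Atlas's real lifecycle logic may differ.

```python
import math


def retention(age_days, stability_days):
    """Ebbinghaus forgetting curve in its exponential form: R = exp(-t / S)."""
    return math.exp(-age_days / stability_days)


def prune(memories, threshold=0.2):
    """Return the memories whose decayed confidence is still >= threshold."""
    kept = []
    for memory in memories:
        decayed = memory["confidence"] * retention(
            memory["age_days"], memory["stability_days"]
        )
        if decayed >= threshold:
            kept.append({**memory, "decayed_confidence": decayed})
    return kept


# Made-up records: same starting confidence, very different ages.
memories = [
    {"id": "a", "confidence": 0.9, "age_days": 1, "stability_days": 10},
    {"id": "b", "confidence": 0.9, "age_days": 60, "stability_days": 10},
]

for memory in prune(memories):
    print(memory["id"], round(memory["decayed_confidence"], 3))
```

In this model a larger `stability_days` makes a memory fade more slowly, so a reinforced memory could be represented by increasing its stability on each access.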
Tenant isolation built-in
Every memory write and read is scoped to a user_id derived server-side from your API key. No cross-tenant leakage is possible, even if a client sends the wrong user_id.
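The server-side scoping described above follows a common pattern: resolve the tenant from the credential and ignore any tenant identifier in the request body. The sketch below is a hypothetical model of that pattern, not Atlas's server code; the `API_KEYS` table, `resolve_user_id`, `handle_search`, and the sample records are invented for illustration.

```python
# Hypothetical key table and memory store, standing in for real infrastructure.
API_KEYS = {"key-alice": "user-alice", "key-bob": "user-bob"}

MEMORIES = [
    {"user_id": "user-alice", "text": "Alice prefers Rust."},
    {"user_id": "user-bob", "text": "Bob prefers Python."},
]


def resolve_user_id(api_key):
    """Map an API key to its tenant; unknown keys are rejected."""
    try:
        return API_KEYS[api_key]
    except KeyError:
        raise PermissionError("unknown API key") from None


def handle_search(api_key, payload):
    """Return only the caller's memories, whatever user_id the payload claims."""
    user_id = resolve_user_id(api_key)  # payload.get("user_id") is never consulted
    return [m["text"] for m in MEMORIES if m["user_id"] == user_id]


# Bob's key with Alice's user_id in the body still yields only Bob's data.
print(handle_search("key-bob", {"user_id": "user-alice"}))
```

Because the tenant filter is applied from the resolved key on every read and write, a mistaken or malicious `user_id` in the request has no effect in this model.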
Zero-dependency Python SDK
The atlas-mem package requires only requests. Async support via httpx is available as an optional extra.
Pricing & tiers
| Tier | Monthly Ops | Batch Size | Graph QA | Prune |
|---|---|---|---|---|
| Free | 1,000 | 5 | ✗ | ✗ |
| Starter | 50,000 | 20 | ✓ | ✗ |
| Pro | 500,000 | 50 | ✓ | ✓ |
| Scale | 5,000,000 | 100 | ✓ | ✓ |
| Enterprise | Unlimited | 100 | ✓ | ✓ |
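As a rough way to size a tier, the monthly limits in the table can be combined with the per-operation costs listed in this section. The helper names `monthly_ops` and `cheapest_tier` and the sample workload are hypothetical, and the sketch looks only at operation counts: it ignores batch-size limits and the Graph QA and Prune feature gates.

```python
# Per-operation costs and monthly limits, copied from this page.
OP_COSTS = {
    "ingest": 5, "retrieve": 2, "graph_qa": 10,
    "consolidate": 3, "prune": 1, "stats": 0, "health": 0,
}
TIER_LIMITS = {  # ordered from smallest to largest; Enterprise is unlimited
    "Free": 1_000, "Starter": 50_000, "Pro": 500_000, "Scale": 5_000_000,
}


def monthly_ops(usage):
    """Total ops consumed by a dict of {operation: call_count}."""
    return sum(OP_COSTS[op] * count for op, count in usage.items())


def cheapest_tier(usage):
    """Smallest tier whose monthly limit covers the usage (ops only)."""
    needed = monthly_ops(usage)
    for tier, limit in TIER_LIMITS.items():
        if needed <= limit:
            return tier
    return "Enterprise"


# Example workload (made-up numbers): 2,000 ingests, 10,000 retrievals,
# and 500 graph QA calls per month.
usage = {"ingest": 2_000, "retrieve": 10_000, "graph_qa": 500}
print(monthly_ops(usage), cheapest_tier(usage))
```

For this workload the total is 2,000 × 5 + 10,000 × 2 + 500 × 10 = 35,000 ops, which fits within the Starter limit of 50,000.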
Operation costs: ingest = 5 ops, retrieve = 2 ops, graph_qa = 10 ops, consolidate = 3 ops, prune = 1 op, stats/health = free.
Next steps
Quickstart
Be up and running in under 5 minutes.
API Reference
Full endpoint documentation with request/response schemas.