Cost: 2 operations per call

POST /brain/retrieve

Runs the full query-adaptive retrieval pipeline:
  1. Session context — loads rolling topic vector from Redis (if session_id provided)
  2. Episodic search — similarity search in Qdrant (text chunks)
  3. Semantic search — vector lookup in Neo4j entity graph
  4. Graph traversal — multi-hop BFS from seed entities (if max_hops > 1)
  5. Working memory injection — appends hot-cached facts from the current session
  6. Hybrid scoring — across all candidates
  7. Conflict resolution — deduplicates facts by slot, preferring user-sourced and newer
  8. LongContextReorder — reorders results for optimal LLM prompt position
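
Steps 7 and 8 can be sketched in a few lines. This is an illustrative approximation, not the service's actual implementation: the `slot` key is a hypothetical field name, "user-sourced" is assumed to be detectable as `confidence == 1.0`, and the reorder follows the general "strongest items at the edges, weakest in the middle" idea rather than any specific library's code.

```python
from typing import Any

Fact = dict[str, Any]


def resolve_conflicts(candidates: list[Fact]) -> list[Fact]:
    """Keep one fact per slot, preferring user-sourced facts, then newer ones."""
    best: dict[str, Fact] = {}
    for fact in candidates:
        slot = fact["slot"]  # hypothetical key, e.g. "acme_corp:database"
        # Assumption: confidence == 1.0 marks a user-sourced fact.
        rank = (fact["confidence"] == 1.0, fact["created_at"])
        kept = best.get(slot)
        if kept is None or rank > (kept["confidence"] == 1.0, kept["created_at"]):
            best[slot] = fact
    # Return survivors ranked by hybrid score, highest first.
    return sorted(best.values(), key=lambda f: f["score"], reverse=True)


def long_context_reorder(ranked: list[Fact]) -> list[Fact]:
    """Put the strongest facts at both ends of the list, the weakest in the middle."""
    front: list[Fact] = []
    back: list[Fact] = []
    for i, fact in enumerate(ranked):
        (front if i % 2 == 0 else back).append(fact)
    return front + back[::-1]


candidates = [
    {"slot": "acme:db", "fact": "acme_corp USE mysql", "score": 0.81,
     "confidence": 0.9, "created_at": 100.0},
    {"slot": "acme:db", "fact": "acme_corp USE postgres", "score": 0.79,
     "confidence": 1.0, "created_at": 50.0},
    {"slot": "acme:lead", "fact": "Sarah Jenkins is the Lead Software Engineer.",
     "score": 0.84, "confidence": 0.9, "created_at": 10.0},
]
resolved = resolve_conflicts(candidates)
reordered = long_context_reorder([{"score": s} for s in (0.9, 0.8, 0.7, 0.6, 0.5)])
```

In this sketch the user-sourced Postgres fact wins its slot even though the MySQL fact is newer, and the reorder leaves the two highest-scoring items at the start and end of the list.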

Request body

{
  "query": "What database does Acme Corp use and who chose it?",
  "k": 5,
  "persona": "shared",
  "session_id": "session-abc",
  "include_episodic": true,
  "include_semantic": true,
  "include_working": true,
  "max_hops": 2,
  "min_score": 0.2,
  "use_compression": false
}
| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | | Natural language query. |
| k | int | No | 5 | Number of results to return (1–20). |
| persona | string | No | null | Filter results to a specific persona. Results tagged "shared" are always included. |
| session_id | string | No | null | Enables working memory context blending. |
| include_episodic | bool | No | true | Search Qdrant vector store. |
| include_semantic | bool | No | true | Search Neo4j entity graph. |
| include_working | bool | No | true | Inject hot-cached session facts. |
| max_hops | int | No | null | Graph traversal depth (1–5). Set null to skip traversal. |
| min_score | float | No | null | Minimum hybrid score threshold (0.0–1.0). |
| use_compression | bool | No | false | LLM-based contextual compression (requires OpenAI). |
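
The documented ranges can be checked on the client before a request is sent. The helper below is a hypothetical sketch (`build_retrieve_payload` is not part of the SDK) that mirrors the defaults and limits listed for the request body. It only builds the JSON body and does not send it, because the authentication header format is not described in this section.

```python
import json
from typing import Any, Optional


def build_retrieve_payload(
    query: str,
    k: int = 5,
    persona: Optional[str] = None,
    session_id: Optional[str] = None,
    include_episodic: bool = True,
    include_semantic: bool = True,
    include_working: bool = True,
    max_hops: Optional[int] = None,
    min_score: Optional[float] = None,
    use_compression: bool = False,
) -> dict[str, Any]:
    """Validate arguments against the documented ranges and return the body."""
    if not query.strip():
        raise ValueError("query must be a non-empty string")
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")
    if max_hops is not None and not 1 <= max_hops <= 5:
        raise ValueError("max_hops must be between 1 and 5, or None")
    if min_score is not None and not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0, or None")
    return {
        "query": query,
        "k": k,
        "persona": persona,
        "session_id": session_id,
        "include_episodic": include_episodic,
        "include_semantic": include_semantic,
        "include_working": include_working,
        "max_hops": max_hops,
        "min_score": min_score,
        "use_compression": use_compression,
    }


payload = build_retrieve_payload(
    "What database does Acme Corp use and who chose it?",
    persona="shared",
    session_id="session-abc",
    max_hops=2,
    min_score=0.2,
)
body = json.dumps(payload)  # JSON string for POST /brain/retrieve
```

The server may apply its own validation. This check only catches out-of-range values early.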

Response

{
  "context": "- Sarah Jenkins is the Lead Software Engineer at Acme Corp. (score: 0.84, via: episodic)\n- acme_corp USE postgres (score: 0.79, via: semantic)\n- Sarah Jenkins joined in 2019 and proposed the Postgres migration. (score: 0.71, via: episodic)",
  "facts": [
    {
      "fact": "Sarah Jenkins is the Lead Software Engineer at Acme Corp.",
      "score": 0.84,
      "source_type": "episodic",
      "created_at": 1719859200.0,
      "access_count": 4,
      "confidence": 1.0,
      "metadata": {}
    },
    {
      "fact": "acme_corp USE postgres",
      "score": 0.79,
      "source_type": "semantic",
      "created_at": 1719859200.0,
      "access_count": 1,
      "confidence": 0.9,
      "metadata": null
    }
  ],
  "episodic_count": 2,
  "semantic_count": 1,
  "working_count": 0,
  "reasoning_chain": null,
  "latency_ms": 187.3
}
| Field | Type | Description |
| --- | --- | --- |
| context | string | Pre-formatted memory context string ready to inject into an LLM system prompt. |
| facts | array | Ranked list of memory facts. |
| facts[].fact | string | Human-readable fact text. |
| facts[].score | float | Hybrid score (0.0–1.0). Higher is more relevant. |
| facts[].source_type | string | "episodic", "semantic", or "working". |
| facts[].created_at | float | Unix timestamp when this memory was created. |
| facts[].confidence | float | Extraction confidence (1.0 for user-sourced). |
| episodic_count | int | Facts from Qdrant vector store. |
| semantic_count | int | Facts from Neo4j graph. |
| reasoning_chain | array\|null | Traversal steps when max_hops > 1. |
| latency_ms | float | Total retrieval time in milliseconds. |
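
When the pre-built context string is not enough, the facts array can be processed directly. The sketch below works on a plain dict shaped like the sample response. `format_context` and `count_by_source` are hypothetical helpers, and the line format is inferred from the sample `context` value rather than from a documented specification.

```python
from collections import Counter
from typing import Any


def format_context(facts: list[dict[str, Any]], min_score: float = 0.0) -> str:
    """Rebuild a context string, dropping facts below min_score."""
    lines = [
        f"- {f['fact']} (score: {f['score']:.2f}, via: {f['source_type']})"
        for f in facts
        if f["score"] >= min_score
    ]
    return "\n".join(lines)


def count_by_source(facts: list[dict[str, Any]]) -> dict[str, int]:
    """Count facts per source_type ("episodic", "semantic", "working")."""
    return dict(Counter(f["source_type"] for f in facts))


# Trimmed copy of the sample response.
response = {
    "facts": [
        {"fact": "Sarah Jenkins is the Lead Software Engineer at Acme Corp.",
         "score": 0.84, "source_type": "episodic"},
        {"fact": "acme_corp USE postgres",
         "score": 0.79, "source_type": "semantic"},
    ],
    "reasoning_chain": None,
}

context = format_context(response["facts"], min_score=0.8)
counts = count_by_source(response["facts"])
```

Filtering on the client like this is an alternative to passing `min_score` in the request when the threshold varies per prompt.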

Code examples

from atlas_mem import AtlasMem

brain = AtlasMem(api_key="atlas_...", base_url="https://api.bsyncs.com", user_id="user-123")

# Basic search
results = brain.search("Who is the lead engineer at Acme?", k=5)

# Inject into system prompt
print(results.format(max_facts=5))

# Iterate over individual facts
for fact in results:
    print(f"[{fact['source_type']}] {fact['fact']} ({fact['score']:.2f})")

# With multi-hop graph traversal
results = brain.search(
    "What infrastructure choices did the engineering team make?",
    k=10,
    max_hops=3,
)
print(results.reasoning_chain)

Using context in your LLM prompt

The context field is a pre-formatted string designed to drop directly into an LLM system prompt:

import openai
from atlas_mem import AtlasMem

brain = AtlasMem(api_key="atlas_...", base_url="https://api.bsyncs.com", user_id="user-123")

def answer(user_question: str) -> str:
    memory = brain.search(user_question, k=5)

    messages = [
        {
            "role": "system",
            "content": f"""You are a helpful assistant. Use the memory context below to personalise your answer.

MEMORY:
{memory.context}

If memory is not relevant, answer from general knowledge.""",
        },
        {"role": "user", "content": user_question},
    ]

    resp = openai.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

Score interpretation

| Score range | Meaning |
| --- | --- |
| 0.7 – 1.0 | High confidence: directly relevant |
| 0.4 – 0.7 | Medium confidence: probably relevant |
| 0.2 – 0.4 | Low confidence: tangentially related |
| < 0.2 | Filtered out by v_floor (not returned by default) |
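
The bands above translate into a small lookup. In this sketch `score_band` is a hypothetical helper, and because the listed ranges share their endpoints (0.7 and 0.4 each appear in two rows), it assumes the lower bound of each band is inclusive.

```python
def score_band(score: float) -> str:
    """Map a hybrid score to the band names used in the score table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.2:
        return "low"
    return "filtered"  # below the 0.2 floor


bands = [score_band(s) for s in (0.84, 0.79, 0.55, 0.2, 0.1)]
```

A caller could use the band to decide, for example, whether to include only high-band facts in a short prompt.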