Retrieve - Atlas SDK Docs

Cost: 2 operations per call

`POST /brain/retrieve`

Runs the full query-adaptive retrieval pipeline:

Session context: loads the rolling topic vector from Redis (if `session_id` is provided)
Episodic search: similarity search over text chunks in Qdrant
Semantic search: vector lookup in the Neo4j entity graph
Graph traversal: multi-hop BFS from seed entities (if `max_hops` > 1)
Working memory injection: appends hot-cached facts from the current session
Hybrid scoring: scores all candidates gathered by the preceding steps
Conflict resolution: deduplicates facts by slot, preferring user-sourced and newer facts
LongContextReorder: reorders results for optimal position in the LLM prompt
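
The final reordering step can be illustrated with a minimal sketch. It assumes the common "lost in the middle" strategy, where the highest-scoring facts go to the start and end of the list and the weakest land in the middle. This page does not show how Atlas implements the step, and `reorder_for_long_context` is a hypothetical name used only for illustration.

```python
from typing import Any


def reorder_for_long_context(facts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Place the strongest facts at both ends and the weakest in the middle.

    Assumption: this mirrors the general long-context reordering idea,
    not necessarily what the Atlas pipeline does internally.
    """
    ranked = sorted(facts, key=lambda f: f["score"], reverse=True)
    front: list[dict[str, Any]] = []
    back: list[dict[str, Any]] = []
    for i, fact in enumerate(ranked):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        (front if i % 2 == 0 else back).append(fact)
    # Reverse the back half so relevance rises again toward the end.
    return front + back[::-1]


facts = [
    {"fact": f"fact {n}", "score": s}
    for n, s in enumerate([0.9, 0.8, 0.7, 0.6, 0.5])
]
print([f["score"] for f in reorder_for_long_context(facts)])
```

With five facts, the two best-scoring ones end up first and last, which is where LLMs tend to attend most reliably.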

Request body

{
  "query": "What database does Acme Corp use and who chose it?",
  "k": 5,
  "persona": "shared",
  "session_id": "session-abc",
  "include_episodic": true,
  "include_semantic": true,
  "include_working": true,
  "max_hops": 2,
  "min_score": 0.2,
  "use_compression": false
}

Field	Type	Required	Default	Description
`query`	`string`	✓	—	Natural language query.
`k`	`int`		`5`	Number of results to return (1–20).
`persona`	`string`		`null`	Restrict results to a specific persona. Facts stored under `"shared"` are always included.
`session_id`	`string`		`null`	Session identifier. Enables session context and working memory blending.
`include_episodic`	`bool`		`true`	Search Qdrant vector store.
`include_semantic`	`bool`		`true`	Search Neo4j entity graph.
`include_working`	`bool`		`true`	Inject hot-cached session facts.
`max_hops`	`int`		`null`	Graph traversal depth (1–5). Multi-hop traversal runs only when the value is greater than 1. Set `null` to skip traversal.
`min_score`	`float`		`null`	Minimum hybrid score threshold (0.0–1.0).
`use_compression`	`bool`		`false`	LLM-based contextual compression (requires OpenAI).
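
The ranges in the table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (`build_retrieve_body` is not part of the SDK) that validates the documented ranges and builds the JSON body. It assumes that omitting a field has the same effect as sending `null`, and it leaves out the `include_*` flags for brevity.

```python
from typing import Any, Optional


def build_retrieve_body(
    query: str,
    k: int = 5,
    persona: Optional[str] = None,
    session_id: Optional[str] = None,
    max_hops: Optional[int] = None,
    min_score: Optional[float] = None,
    use_compression: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: validate documented ranges, drop unset fields."""
    if not query:
        raise ValueError("query is required")
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")
    if max_hops is not None and not 1 <= max_hops <= 5:
        raise ValueError("max_hops must be between 1 and 5, or None")
    if min_score is not None and not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0, or None")

    body: dict[str, Any] = {
        "query": query,
        "k": k,
        "persona": persona,
        "session_id": session_id,
        "max_hops": max_hops,
        "min_score": min_score,
        "use_compression": use_compression,
    }
    # Assumption: fields left at None can be omitted so server defaults apply.
    return {key: value for key, value in body.items() if value is not None}


print(build_retrieve_body("What database does Acme Corp use?", max_hops=2, min_score=0.2))
```

Failing fast on an out-of-range `k` or `max_hops` avoids spending an operation on a request that would likely be rejected.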

Response

{
  "context": "- Sarah Jenkins is the Lead Software Engineer at Acme Corp. (score: 0.84, via: episodic)\n- acme_corp USE postgres (score: 0.79, via: semantic)\n- Sarah Jenkins joined in 2019 and proposed the Postgres migration. (score: 0.71, via: episodic)",
  "facts": [
    {
      "fact": "Sarah Jenkins is the Lead Software Engineer at Acme Corp.",
      "score": 0.84,
      "source_type": "episodic",
      "created_at": 1719859200.0,
      "access_count": 4,
      "confidence": 1.0,
      "metadata": {}
    },
    {
      "fact": "acme_corp USE postgres",
      "score": 0.79,
      "source_type": "semantic",
      "created_at": 1719859200.0,
      "access_count": 1,
      "confidence": 0.9,
      "metadata": null
    }
  ],
  "episodic_count": 2,
  "semantic_count": 1,
  "working_count": 0,
  "reasoning_chain": null,
  "latency_ms": 187.3
}

Field	Type	Description
`context`	`string`	Pre-formatted memory context string ready to inject into an LLM system prompt.
`facts`	`array`	Ranked list of memory facts.
`facts[].fact`	`string`	Human-readable fact text.
`facts[].score`	`float`	Hybrid score (0.0–1.0). Higher is more relevant.
`facts[].source_type`	`string`	`"episodic"`, `"semantic"`, or `"working"`.
`facts[].created_at`	`float`	Unix timestamp when this memory was created.
`facts[].confidence`	`float`	Extraction confidence (1.0 for user-sourced).
`episodic_count`	`int`	Number of facts from the Qdrant vector store.
`semantic_count`	`int`	Number of facts from the Neo4j graph.
`working_count`	`int`	Number of facts injected from working memory.
`reasoning_chain`	`array | null`	Traversal steps when `max_hops` > 1.
`latency_ms`	`float`	Total retrieval time in milliseconds.
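
As a rough illustration of working with these fields, the sketch below counts facts per `source_type`, converts `created_at` to a timezone-aware datetime, and guards against a `null` `reasoning_chain`. It operates on a trimmed copy of the sample response as a plain dictionary; the helper names are hypothetical and not part of the SDK.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Any

# Trimmed copy of the sample response from this page.
response: dict[str, Any] = {
    "facts": [
        {
            "fact": "Sarah Jenkins is the Lead Software Engineer at Acme Corp.",
            "score": 0.84,
            "source_type": "episodic",
            "created_at": 1719859200.0,
        },
        {
            "fact": "acme_corp USE postgres",
            "score": 0.79,
            "source_type": "semantic",
            "created_at": 1719859200.0,
        },
    ],
    "reasoning_chain": None,
}


def facts_by_source(resp: dict[str, Any]) -> Counter:
    """Count returned facts per source_type."""
    return Counter(f["source_type"] for f in resp["facts"])


def created_at_utc(fact: dict[str, Any]) -> datetime:
    """Convert the Unix timestamp in created_at to an aware UTC datetime."""
    return datetime.fromtimestamp(fact["created_at"], tz=timezone.utc)


print(facts_by_source(response))
for fact in response["facts"]:
    print(created_at_utc(fact).isoformat(), fact["fact"])

# reasoning_chain may be null, so fall back to an empty list before iterating.
for step in response["reasoning_chain"] or []:
    print(step)
```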

Code examples

from atlas_mem import AtlasMem

brain = AtlasMem(api_key="atlas_...", base_url="https://api.bsyncs.com", user_id="user-123")

# Basic search
results = brain.search("Who is the lead engineer at Acme?", k=5)

# Format the top facts as a prompt-ready string
print(results.format(max_facts=5))

# Iterate over individual facts
for fact in results:
    print(f"[{fact['source_type']}] {fact['fact']} ({fact['score']:.2f})")

# With multi-hop graph traversal
results = brain.search(
    "What infrastructure choices did the engineering team make?",
    k=10,
    max_hops=3,
)
print(results.reasoning_chain)

Using `context` in your LLM prompt

The `context` field is a pre-formatted string designed to drop directly into an LLM system prompt:

import openai
from atlas_mem import AtlasMem

brain = AtlasMem(api_key="atlas_...", base_url="https://api.bsyncs.com", user_id="user-123")

def answer(user_question: str) -> str:
    memory = brain.search(user_question, k=5)

    messages = [
        {
            "role": "system",
            "content": f"""You are a helpful assistant. Use the memory context below to personalise your answer.

MEMORY:
{memory.context}

If memory is not relevant, answer from general knowledge.""",
        },
        {"role": "user", "content": user_question},
    ]

    resp = openai.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

Score interpretation

Score range	Meaning
`0.7 – 1.0`	High confidence — directly relevant
`0.4 – 0.7`	Medium confidence — probably relevant
`0.2 – 0.4`	Low confidence — tangentially related
`< 0.2`	Filtered out by `v_floor` (not returned by default)
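
The bands above can be turned into a small client-side helper, for example to label facts in logs. This is a sketch only: `score_band` is a hypothetical name, and because the table's ranges share their endpoints, it assumes each band includes its lower bound (so 0.7 counts as high and 0.4 as medium).

```python
def score_band(score: float) -> str:
    """Map a hybrid score to the band names used in the table above.

    Assumption: lower bounds are inclusive (0.7 is "high", 0.4 is "medium").
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.2:
        return "low"
    # Scores below 0.2 are normally filtered out before they reach the client.
    return "filtered"


for s in (0.84, 0.79, 0.55, 0.25):
    print(s, score_band(s))
```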
