Search

How does full-text search work?

Pass a query string to trigger BM25 full-text search over all indexed grain fields. Areev uses a Tantivy index that tokenizes subjects, relations, objects, descriptions, and content fields.

Full-text search in the Areev context database ranks results by term frequency and inverse document frequency. Grains that mention query terms more often and more uniquely score higher. This is the default search mode for AI memory recall — any time you pass a query parameter without structural filters, BM25 handles the request end-to-end.

Results include a score field indicating relevance. BM25 works well for keyword-oriented lookups (“deployment failed staging”) where the exact terms matter. For meaning-based retrieval where synonyms and paraphrases should match, combine BM25 with vector search or use an embedding directly.

Python:

import areev

db = areev.open("./my-data")
hits = db.recall(query="deployment failed staging", limit=10)
for hit in hits:
    print(hit["score"], hit["subject"])

HTTP:

POST /api/memories/ops-log/recall
Content-Type: application/json

{
  "query": "deployment failed staging",
  "limit": 10
}

CLI:

areev recall --query "deployment failed staging" --limit 10
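The ranking described above follows the standard BM25 formula. As a minimal, self-contained sketch (illustrative only, not Areev's Tantivy-backed implementation), here is how term frequency and inverse document frequency combine into a per-document score:

```python
import math

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each doc (a list of tokens) against query_terms with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many docs contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        s = 0.0
        for t in query_terms:
            tf = d.count(t)
            if tf == 0:
                continue
            # Rare terms get a higher IDF weight; length-normalize TF.
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "deployment failed on staging cluster".split(),
    "deployment succeeded on production".split(),
    "lunch menu updated".split(),
]
print(bm25_scores(["deployment", "failed", "staging"], docs))
```

The first document matches all three query terms, including the rarer ones, so it scores highest; the last matches none and scores zero.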

How do I filter by time and metadata?

Use temporal expressions, namespace filters, grain type filters, and tag filters to narrow results. The CLI accepts natural language temporal expressions like "today", "last 7 days", or "yesterday".

Filters operate as post-retrieval constraints applied after the primary search engine scores candidates. You can combine any number of filters in a single request. The namespace filter restricts results to a specific organizational partition. The grain_type filter limits results to a single type (belief, event, state, etc.). Tag filters use AND logic — a grain must have all specified tags to match. The confidence_threshold and importance filters discard grains below the given thresholds.
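Conceptually, the post-retrieval constraints above amount to predicate checks over already-scored candidates. A sketch of that logic (illustrative only, not Areev's code; the field names are assumptions matching the filter names):

```python
def passes_filters(grain, namespace=None, grain_type=None, tags=None,
                   confidence_threshold=None, importance=None):
    """Return True if a scored candidate grain survives every filter."""
    if namespace is not None and grain["namespace"] != namespace:
        return False
    if grain_type is not None and grain["grain_type"] != grain_type:
        return False
    # Tag filters use AND logic: the grain must carry every requested tag.
    if tags and not set(tags).issubset(grain["tags"]):
        return False
    if confidence_threshold is not None and grain["confidence"] < confidence_threshold:
        return False
    if importance is not None and grain["importance"] < importance:
        return False
    return True

grain = {"namespace": "production", "grain_type": "event",
         "tags": {"deploy", "urgent"}, "confidence": 0.9, "importance": 0.6}
print(passes_filters(grain, namespace="production", tags=["deploy"]))
print(passes_filters(grain, tags=["deploy", "billing"]))
```

The second call fails because AND logic requires every requested tag to be present on the grain.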

Temporal filtering supports two modes: natural language expressions via temporal_expr in the HTTP API or the CLI --temporal flag (parsed server-side), and explicit epoch millisecond ranges via time_range_start/time_range_end in the HTTP API. The natural language parser handles relative expressions (“last 7 days”, “yesterday”, “this month”) and absolute dates. Use temporal filters when your AI agent memory grows large and you need to focus on recent context.

Python:

hits = db.recall(query="errors", namespace="production", grain_type="event")
hits = db.recall(query="errors", time_range_start=1709251200000, time_range_end=1709337600000)

HTTP:

POST /api/memories/ops-log/recall
Content-Type: application/json

{
  "query": "errors",
  "namespace": "production",
  "grain_type": "event",
  "limit": 20
}

CLI:

areev recall --query "errors" --namespace production --grain-type event \
  --temporal "last 7 days" --importance 0.5 --confidence 0.8 --limit 20
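A natural-language expression like "last 7 days" ultimately reduces to the same epoch-millisecond range that time_range_start/time_range_end take explicitly. A toy sketch of that mapping (hypothetical; the actual server-side parser handles far more forms):

```python
from datetime import datetime, timedelta, timezone

def parse_temporal(expr, now=None):
    """Map a small set of relative expressions to (start_ms, end_ms)."""
    now = now or datetime.now(timezone.utc)
    if expr == "today":
        start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elif expr == "yesterday":
        day = now - timedelta(days=1)
        start = day.replace(hour=0, minute=0, second=0, microsecond=0)
        now = start + timedelta(days=1)  # range ends at midnight today
    elif expr.startswith("last ") and expr.endswith(" days"):
        n = int(expr.split()[1])
        start = now - timedelta(days=n)
    else:
        raise ValueError(f"unsupported expression: {expr}")

    def to_ms(dt):
        return int(dt.timestamp() * 1000)

    return to_ms(start), to_ms(now)

start_ms, end_ms = parse_temporal("last 7 days")
```

The returned pair can then be used wherever an explicit epoch-millisecond range is accepted.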

How does semantic vector search work?

Pass a pre-computed embedding vector to perform KNN similarity search over the HNSW index. This finds semantically similar grains regardless of exact keyword overlap.

Vector search uses the HNSW approximate nearest-neighbor algorithm backed by USearch or FAISS (depending on the memory’s vector_backend setting). You supply the embedding from your own model — Areev does not embed text for you at query time. The search returns grains whose stored embeddings are closest to the input vector by cosine distance.
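Nearest-neighbor search by cosine distance conceptually means ranking stored embeddings by angular similarity to the query vector. A brute-force sketch of that idea (HNSW approximates the same result without scanning every vector; this is not Areev's implementation):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query_vec, stored, k=2):
    """Return the k grain ids whose embeddings are closest by cosine distance."""
    ranked = sorted(stored.items(),
                    key=lambda kv: cosine_sim(query_vec, kv[1]),
                    reverse=True)
    return [grain_id for grain_id, _ in ranked[:k]]

stored = {"g1": [1.0, 0.0], "g2": [0.9, 0.1], "g3": [0.0, 1.0]}
print(knn([1.0, 0.05], stored, k=2))
```

The two vectors pointing in nearly the same direction as the query rank first, regardless of magnitude.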

When you provide both query and embedding, Areev fuses BM25 text scores with vector similarity scores using Reciprocal Rank Fusion (RRF). You can also combine vector search with structural filters (subject, relation, object) for maximum precision. This multi-signal approach is especially effective for autonomous memory systems where keyword matching alone misses semantic connections.

Python:

embedding = my_model.encode("user preferences for dark themes")
hits = db.recall(embedding=embedding, limit=5)

HTTP:

POST /api/memories/ops-log/recall
Content-Type: application/json

{
  "embedding": [0.012, -0.034, 0.056],
  "limit": 5
}
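Reciprocal Rank Fusion combines the BM25 and vector result lists using only ranks, not raw scores, which sidesteps the problem of the two scoring scales being incomparable. A minimal sketch of the standard formula (k=60 is the common default in the RRF literature, not necessarily Areev's constant):

```python
def rrf(rankings, k=60):
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) across lists."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranked = ["g7", "g2", "g9"]    # keyword-match order
vector_ranked = ["g2", "g5", "g7"]  # embedding-similarity order
print(rrf([bm25_ranked, vector_ranked]))
```

Items that place well in both lists (here g2) rise above items that top only one list, which is exactly the behavior you want when fusing keyword and semantic signals.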

How do I detect contradictions in results?

Enable contradiction detection to penalize grains that conflict with each other. Areev identifies grains sharing the same subject and relation but with different objects, and reduces their scores.

Contradiction detection is part of the interference engine. When enabled, the query pipeline scans result pairs for conflicting subject-relation combinations. If “john likes coffee” and “john likes tea” both appear, the engine applies a score penalty to the older or lower-confidence grain, pushing the most authoritative version to the top.

Use contradiction detection when querying a context database that accumulates beliefs over time without strict supersession discipline. It acts as a safety net, surfacing the most current truth even when outdated beliefs remain in storage. The penalty is a post-retrieval adjustment — it does not remove grains, only reorders them.

Python:

hits = db.recall(query="john preferences", detect_contradictions=True)

HTTP:

POST /api/memories/ops-log/recall
Content-Type: application/json

{
  "query": "john preferences",
  "subject": "john",
  "detect_contradictions": true
}

CLI:

areev recall --query "john preferences" --detect-contradictions --json
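A conceptual sketch of the penalty step (illustrative only, not the interference engine's code; it simplifies the "older or lower-confidence" rule to confidence alone): grains sharing a subject-relation pair but differing in object form a conflict, and all but the most authoritative one are down-weighted before results are re-sorted.

```python
def penalize_contradictions(hits, penalty=0.5):
    """Down-weight all but the highest-confidence grain in each conflicting
    subject-relation group, then re-sort by adjusted score."""
    groups = {}
    for h in hits:
        groups.setdefault((h["subject"], h["relation"]), []).append(h)
    for group in groups.values():
        objects = {h["object"] for h in group}
        if len(objects) > 1:  # same subject+relation, different objects
            keep = max(group, key=lambda h: h["confidence"])
            for h in group:
                if h is not keep:
                    h["score"] *= penalty
    return sorted(hits, key=lambda h: h["score"], reverse=True)

hits = [
    {"subject": "john", "relation": "likes", "object": "coffee",
     "confidence": 0.6, "score": 0.9},
    {"subject": "john", "relation": "likes", "object": "tea",
     "confidence": 0.9, "score": 0.8},
]
print([h["object"] for h in penalize_contradictions(hits)])
```

The "coffee" grain started with the higher retrieval score, but its lower confidence earns it the penalty, so the "tea" grain surfaces first; both remain in the result set.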
Related:

  • Add and Query — basic grain operations
  • CAL — declarative query language for advanced retrieval
  • Scopes — hierarchical namespace search with scope paths