Crypto-Erasure

How does crypto-erasure work?

Crypto-erasure renders user data permanently unrecoverable by destroying the per-user Data Encryption Key (DEK) rather than overwriting individual grain blobs. Areev encrypts every grain with AES-256-GCM using the user’s DEK, so destroying the key makes all of that user’s stored memory computationally infeasible to decrypt. There is no partially-deleted state.

The forget_user() operation executes a deterministic sequence:

  1. Emit a UserErased audit event (audit-first ordering).
  2. Delete all grain blobs and index entries (hexastore, FTS, vector, entity_latest).
  3. Redact cross-user grains where the erased entity appears as subject or object.
  4. Destroy the user’s DEK from the key store (crypto-erasure).
  5. Run secondary warm/cold tier blob deletion.
  6. Return an ErasureProof.

Grain deletion happens before key destruction because the DEK is needed to locate encrypted index entries. Destroying one key is an O(1) operation, whereas overwriting data requires O(n) blob rewrites, and key destruction provides cryptographic assurance that any ciphertext left behind is unreadable.
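The ordering described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the MemoryStore class, its dict-based layout, and the grain record shape are hypothetical and are not Areev's implementation. Index cleanup and tiered storage cleanup are omitted.

```python
import hashlib
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy stand-in for the real store; every name here is hypothetical."""
    deks: dict = field(default_factory=dict)       # user_id -> DEK bytes
    grains: dict = field(default_factory=dict)     # grain_id -> grain record
    audit_log: list = field(default_factory=list)  # append-only event list

    def forget_user(self, user_id: str) -> dict:
        dek = self.deks[user_id]
        fingerprint = hashlib.sha256(dek).hexdigest()[:16]
        owned = [gid for gid, g in self.grains.items() if g["owner"] == user_id]

        # 1. Audit-first: record the event before touching any data.
        self.audit_log.append({"event": "UserErased", "user_id": user_id,
                               "key_fingerprint": fingerprint,
                               "count": len(owned)})
        # 2. Delete the user's own grains (index cleanup omitted here).
        for gid in owned:
            del self.grains[gid]
        # 3. Redact other users' grains that reference the erased entity.
        redacted = 0
        for grain in self.grains.values():
            hit = False
            for role in ("subject", "object"):
                if grain[role] == user_id:
                    grain[role] = "[erased]"
                    hit = True
            if hit:
                redacted += 1
        # 4. Destroy the DEK only after the steps that still need it.
        del self.deks[user_id]
        # 5. Secondary warm/cold cleanup would run here (omitted).
        # 6. Return a proof-like record.
        return {"user_id": user_id, "count": len(owned),
                "key_fingerprint": fingerprint,
                "timestamp": int(time.time() * 1000),
                "cross_user_grains_redacted": redacted}


store = MemoryStore()
store.deks["john"] = secrets.token_bytes(32)
store.deks["mary"] = secrets.token_bytes(32)
store.grains["g1"] = {"owner": "john", "subject": "john", "object": "coffee"}
store.grains["g2"] = {"owner": "mary", "subject": "mary", "object": "john"}
proof = store.forget_user("john")
print(proof)
```

In this model, mary's grain survives with its object replaced by "[erased]", while john's grain and DEK are gone.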

The UserErased audit event records the key fingerprint (first 16 hex chars of SHA-256 of the key) and grain count for forensic correlation without revealing the actual key material.
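As a minimal sketch of the fingerprint described above, the following computes the first 16 hex characters of SHA-256 over the key bytes using only the Python standard library. The function name key_fingerprint and the randomly generated stand-in key are assumptions for illustration.

```python
import hashlib
import secrets


def key_fingerprint(dek: bytes) -> str:
    """Return the first 8 bytes of SHA-256(dek) as 16 lowercase hex characters."""
    return hashlib.sha256(dek).digest()[:8].hex()


dek = secrets.token_bytes(32)  # stand-in for a 256-bit DEK
fp = key_fingerprint(dek)
print(fp)
```

Because SHA-256 is a one-way function, the fingerprint lets an auditor correlate events that refer to the same key without being able to recover the key from it.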

Python:

import requests

# Erase all data for a user
resp = requests.post(
    "http://localhost:4009/api/memories/default/forget",
    json={"user_id": "john"},
    timeout=30,
)
resp.raise_for_status()  # fail loudly on an HTTP error instead of parsing an error body
proof = resp.json()
# proof["count"], proof["key_fingerprint"]

HTTP:

POST /api/memories/default/forget HTTP/1.1
Host: localhost:4009
Content-Type: application/json

{"user_id": "john"}

CLI:

areev erase john

What does the erasure proof contain?

The forget_user() call returns an ErasureProof struct that serves as a verifiable record of deletion. Data subjects or regulators can use it as evidence of GDPR Art. 17 compliance.

The proof includes the following fields:

  • user_id: the erased user
  • count: number of grains erased
  • key_fingerprint: computed as hex(SHA-256(dek)[..8]), 16 hex characters that identify the destroyed key without revealing key material
  • timestamp: Unix milliseconds
  • tiers_erased: list of storage tiers cleaned
  • warm_objects_deleted and cold_objects_deleted: tiered storage cleanup counts
  • user_record_deleted: boolean confirming user record removal
  • refresh_tokens_revoked: auth token cleanup count
  • hook_events_tombstoned: CDC event cleanup count
  • cross_user_grains_redacted: number of other users’ grains where the erased entity appeared as subject or object and was replaced with “[erased]”

Six compliance verification checks validate erasure completeness: erasure_crypto, erasure_completeness, erasure_proof, erasure_key_destruction, erasure_data_inaccessible, and erasure_memory_clean.

{
  "user_id": "john",
  "count": 42,
  "key_fingerprint": "a1b2c3d4e5f67890",
  "timestamp": 1741531800000,
  "tiers_erased": ["hot"],
  "warm_objects_deleted": null,
  "cold_objects_deleted": null,
  "user_record_deleted": true,
  "refresh_tokens_revoked": 0,
  "hook_events_tombstoned": null,
  "cross_user_grains_redacted": 3
}
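A client might load the proof into a typed structure before storing or forwarding it. The sketch below assumes the JSON field names shown above; the ErasureProof dataclass and its looks_well_formed() helper are hypothetical client-side code, and the helper is not one of the six compliance checks. Treating the fingerprint as lowercase hex is also an assumption based on the sample.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErasureProof:
    user_id: str
    count: int
    key_fingerprint: str
    timestamp: int                        # Unix milliseconds
    tiers_erased: list
    warm_objects_deleted: Optional[int]   # null in the sample above
    cold_objects_deleted: Optional[int]
    user_record_deleted: bool
    refresh_tokens_revoked: int
    hook_events_tombstoned: Optional[int]
    cross_user_grains_redacted: int

    def looks_well_formed(self) -> bool:
        """Client-side sanity checks only (hypothetical helper)."""
        return (
            re.fullmatch(r"[0-9a-f]{16}", self.key_fingerprint) is not None
            and self.count >= 0
            and self.timestamp > 0
        )


raw = """
{
  "user_id": "john",
  "count": 42,
  "key_fingerprint": "a1b2c3d4e5f67890",
  "timestamp": 1741531800000,
  "tiers_erased": ["hot"],
  "warm_objects_deleted": null,
  "cold_objects_deleted": null,
  "user_record_deleted": true,
  "refresh_tokens_revoked": 0,
  "hook_events_tombstoned": null,
  "cross_user_grains_redacted": 3
}
"""
# Unpacking raises TypeError if the response has missing or unexpected keys.
proof = ErasureProof(**json.loads(raw))
print(proof.count, proof.key_fingerprint, proof.looks_well_formed())
```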

How does tiered storage erasure work?

When warm or cold storage tiers are configured, forget_user() performs a secondary cleanup pass after destroying the DEK. Blobs in object storage (S3/Azure/GCS) and archive storage (Glacier/Archive) are explicitly deleted. Areev treats tiered cleanup as best-effort: even if secondary cleanup partially fails due to a network error or storage timeout, the data remains cryptographically inaccessible because the DEK is already destroyed.

Primary erasure in the hot tier destroys the DEK (making all local ciphertext unrecoverable) and deletes grain entries from the Fjall blobs partition. Secondary cleanup enumerates the user’s grains in the tier map, deletes each blob from the object storage backend, and emits a TierErasureAttempt audit event recording attempted vs successful deletions for operational follow-up.

Primary erasure (hot tier):
  1. Delete grain entries from Fjall blobs partition
  2. Destroy DEK -> all local ciphertext is unrecoverable

Secondary cleanup (warm/cold tiers):
  1. Enumerate user's grains in the tier map
  2. Delete each blob from the object storage backend
  3. Audit: TierErasureAttempt event with counts
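The best-effort secondary pass can be sketched as a loop that keeps going after individual failures and records attempted versus successful deletions. Everything here is a hypothetical model: FlakyBackend stands in for an object-storage client, and the tier map and audit event layout are assumptions, not Areev's actual structures.

```python
class FlakyBackend:
    """Hypothetical object-store client whose delete() fails for some keys."""

    def __init__(self, objects, failing):
        self.objects = set(objects)
        self.failing = set(failing)

    def delete(self, key: str) -> None:
        if key in self.failing:
            raise TimeoutError(f"timed out deleting {key}")
        self.objects.discard(key)


def secondary_cleanup(tier_map, user_id, backend, audit_log) -> int:
    """Delete each of the user's tiered blobs; never abort on a single failure."""
    keys = tier_map.get(user_id, [])
    succeeded = 0
    for key in keys:
        try:
            backend.delete(key)
            succeeded += 1
        except (TimeoutError, ConnectionError):
            # Best-effort: the DEK is already destroyed, so a leftover blob
            # is unreadable ciphertext. Keep going and report the gap.
            continue
    audit_log.append({"event": "TierErasureAttempt", "user_id": user_id,
                      "attempted": len(keys), "succeeded": succeeded})
    return succeeded


tier_map = {"john": ["warm/a", "warm/b", "cold/c"]}
backend = FlakyBackend(tier_map["john"], failing={"warm/b"})
audit_log = []
deleted = secondary_cleanup(tier_map, "john", backend, audit_log)
print(deleted, audit_log[0])
```

A gap between attempted and succeeded in the audit event is the signal for operational follow-up, for example retrying the remaining deletions later.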

How does scope-level erasure work?

Areev supports scope-level crypto-erasure, where all grains within a scope are deleted by destroying the scope’s DEK. Each scope has its own random DEK wrapped by the master key, so destroying a scope key is independent of user-level DEKs: a scope erasure does not affect a user’s data in other scopes, and vice versa.

Scope-level erasure is useful for tenant isolation scenarios where an entire project or department needs to be purged. The ScopeErased audit event records the scope path and grain count. The operation covers the same steps as user-level erasure: audit logging, index cleanup, key destruction, and tiered storage cleanup.
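The independence of scope keys and user keys can be illustrated with a toy key store in which each key is generated randomly and looked up by its own identifier. The KeyStore class is hypothetical, and wrapping under the master key is deliberately left out of this sketch.

```python
import secrets


class KeyStore:
    """Hypothetical key store: one random 256-bit DEK per (kind, name) pair."""

    def __init__(self):
        self._keys = {}

    def create(self, kind: str, name: str) -> None:
        # Keys are random, not derived from one another, so they can be
        # destroyed independently.
        self._keys[(kind, name)] = secrets.token_bytes(32)

    def destroy(self, kind: str, name: str) -> None:
        del self._keys[(kind, name)]

    def has(self, kind: str, name: str) -> bool:
        return (kind, name) in self._keys


keys = KeyStore()
keys.create("user", "john")
keys.create("scope", "projects/acme")
keys.create("scope", "projects/other")

keys.destroy("scope", "projects/acme")  # scope-level crypto-erasure

print(keys.has("scope", "projects/acme"))   # the erased scope's key is gone
print(keys.has("scope", "projects/other"))  # other scopes are untouched
print(keys.has("user", "john"))             # user-level DEK is untouched
```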

areev erase-scope projects/acme
# Scope 'projects/acme' erased: 128 grains deleted, key fingerprint: 9f8e7d6c5b4a3210

Related:
  • Key Management: How DEKs are created, wrapped, and rotated
  • Encryption: AES-256-GCM encryption details
  • Audit Trail: Hash-chained audit entries for erasure events
  • GDPR: GDPR Art. 17 right-to-erasure requirements