# Awareness Memory Cloud Python SDK
Python SDK for Awareness Memory Cloud APIs and MCP-style memory workflows.
## Install

```shell
pip install awareness-memory-cloud
```

Framework extras:

```shell
pip install "awareness-memory-cloud[langchain]"
pip install "awareness-memory-cloud[crewai]"
pip install "awareness-memory-cloud[frameworks]"
```
## Quickstart

```python
import os

from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(
    base_url=os.getenv("AWARENESS_API_BASE_URL", os.getenv("AWARENESS_BASE_URL", "https://awareness.market/api/v1")),
    api_key="YOUR_API_KEY",
)

# Store a memory item
client.write(
    memory_id="memory_123",
    content="Customer asked for SOC2 evidence and retention policy.",
    kwargs={"source": "python-sdk", "session_id": "demo-session"},
)

# Retrieve relevant items for a query
result = client.retrieve(
    memory_id="memory_123",
    query="What did customer ask for?",
    custom_kwargs={"k": 3},
)
print(result["results"])
```
## API Coverage (SDK/API aligned)

`MemoryCloudClient` now includes:

- Memory: `create_memory`, `list_memories`, `get_memory`, `update_memory`, `delete_memory`
- Content: `write`, `list_memory_content`, `delete_memory_content`
- Retrieval/Chat: `retrieve`, `chat`, `chat_stream`, `memory_timeline`
- MCP ingest: `ingest_events`, `ingest_content`
- Export: `export_memory_package`, `save_export_memory_package`
- Async jobs & upload: `get_async_job_status`, `upload_file`, `get_upload_job_status`
- Insights/API keys/wizard: `insights`, `create_api_key`, `list_api_keys`, `revoke_api_key`, `memory_wizard`
## MCP-style Helpers (SDK/MCP aligned)

These helpers mirror MCP tool semantics:

- `begin_memory_session`
- `recall_for_task`
- `remember_step`
- `remember_batch`
- `backfill_conversation_history`

Example:

```python
session = client.begin_memory_session(memory_id="memory_123", source="python-sdk")

client.remember_step(
    memory_id="memory_123",
    text="Refactored auth middleware and added tests.",
)

ctx = client.recall_for_task(
    memory_id="memory_123",
    task="summarize latest auth changes",
    limit=8,
    multi_level=False,  # set True for broader context retrieval across sessions and time ranges
    cluster_expand=False,  # set True for topic-based context expansion and deeper exploration
)
print(ctx["results"])
```
## Read Exported Packages

The SDK includes export readers:

- `read_export_package(path)`
- `read_export_package_bytes(bytes)`
- `parse_jsonl_bytes(bytes)`

```python
from memory_cloud import read_export_package

parsed = read_export_package("memory_export.zip")
print(parsed["manifest"])
print(len(parsed["chunks"]))
print(bool(parsed["safetensors"]))
print(parsed.get("kv_summary"))
```
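The JSONL decoding that a reader like `parse_jsonl_bytes` performs can be sketched in a few lines of standard-library Python. This is an illustrative reimplementation, not the SDK's internal code:

```python
import json


def parse_jsonl_bytes(data: bytes) -> list[dict]:
    """Parse newline-delimited JSON: each non-empty line is one JSON record."""
    return [json.loads(line) for line in data.splitlines() if line.strip()]


chunks = parse_jsonl_bytes(b'{"id": 1, "text": "a"}\n{"id": 2, "text": "b"}\n')
print(len(chunks))  # 2
```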
## Client Auto-Extraction

Pass an OpenAI or Anthropic client to `MemoryCloudClient` to automatically extract insights whenever `remember_step`/`remember_batch` returns an `extraction_request`:

```python
import openai

from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(
    base_url="...",
    api_key="...",
    extraction_llm=openai.OpenAI(),  # or anthropic.Anthropic()
)

# Now remember_step automatically extracts insights in the background
client.remember_step(memory_id="mem-xxx", text="Fixed auth bug in login.py by adding JWT refresh.")
```
When `extraction_llm` is provided, every `remember_step`/`remember_batch` call that receives an `extraction_request` from the server will:

- Call the provided LLM with the extraction prompt (background thread, non-blocking)
- Parse the JSON response with brace-depth matching plus retry
- Submit the extracted insights via `submit_insights()`
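The brace-depth matching step can be sketched as follows. This is a hypothetical reimplementation for illustration, not the SDK's actual parser (which would also need to handle braces inside JSON strings):

```python
import json


def extract_first_json_object(text: str) -> dict:
    """Find the first balanced {...} span in an LLM reply and parse it."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # outermost object closed: parse exactly this span
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced braces")


reply = 'Here are the insights:\n{"insights": [{"text": "JWT refresh added"}]}\nDone.'
print(extract_first_json_object(reply)["insights"][0]["text"])  # JWT refresh added
```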
Client extraction options: `extraction_llm` (an OpenAI or Anthropic client), `extraction_model` (default `"gpt-4o-mini"` for OpenAI, `"claude-haiku-4-5-20251001"` for Anthropic), `extraction_max_tokens` (default 16384; env: `AWARENESS_EXTRACTION_MAX_TOKENS`), `user_id`, `agent_role`.
## Agent Profiles & Sub-Agent Prompts

Retrieve enriched agent profiles with auto-generated activation prompts:

```python
# List all agent profiles (with system_prompt and activation_prompt)
agents = client.list_agents(memory_id="mem-xxx")
for agent in agents["agents"]:
    print(agent["agent_role"], agent["title"])
    print(agent["activation_prompt"])  # Ready-to-use prompt for sub-agent spawning

# Get the activation prompt for a specific role
prompt = client.get_agent_prompt(memory_id="mem-xxx", agent_role="backend_engineer")
```
If a profile has a custom `system_prompt` (set in the frontend Settings), it is used as-is. Otherwise, a prompt is auto-generated from the profile fields (`identity`, `critical_rules`, `workflow`, etc.).
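The fallback generation amounts to assembling the profile fields into prompt text. The sketch below is hypothetical (the server's actual template is not documented here); only the field names `system_prompt`, `identity`, `critical_rules`, and `workflow` come from the description above:

```python
def build_activation_prompt(profile: dict) -> str:
    """Assemble a sub-agent prompt from profile fields (illustrative sketch)."""
    # A custom system_prompt short-circuits generation entirely
    if profile.get("system_prompt"):
        return profile["system_prompt"]
    parts = [f"You are {profile.get('title', profile.get('agent_role', 'an agent'))}."]
    if profile.get("identity"):
        parts.append(f"Identity: {profile['identity']}")
    for rule in profile.get("critical_rules", []):
        parts.append(f"- ALWAYS: {rule}")
    if profile.get("workflow"):
        parts.append(f"Workflow: {profile['workflow']}")
    return "\n".join(parts)


profile = {"agent_role": "backend_engineer", "title": "Backend Engineer",
           "identity": "Owns the auth service",
           "critical_rules": ["run tests before merging"]}
print(build_activation_prompt(profile))
```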
## Interceptor (Transparent Injection)

`AwarenessInterceptor` wraps any OpenAI/Anthropic client to automatically inject memory context before each call and store the conversation afterwards:

```python
import openai

from memory_cloud import AwarenessInterceptor, MemoryCloudClient

client = MemoryCloudClient(base_url="...", api_key="...")

interceptor = AwarenessInterceptor(
    client=client,
    memory_id="mem-xxx",
    min_relevance_score=0.5,  # Filter low-score results (default 0.5)
    max_inject_items=5,  # Cap injected items (default 5)
    query_rewrite="rule",  # Query rewrite mode (default "rule")
)

oai = openai.OpenAI()
interceptor.wrap_openai(oai)
# Now all oai.chat.completions.create() calls get memory injection automatically
```
Interceptor options: `retrieve_limit` (default 8), `max_context_chars` (default 4000), `min_relevance_score` (default 0.5), `max_inject_items` (default 5), `auto_remember` (default `True`), `enable_extraction` (default `True`), `extraction_model` (default `"gpt-4o-mini"`), `extraction_max_tokens` (default 16384), `query_rewrite` (default `"rule"`).
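The filtering and capping that these options imply can be sketched as follows. This is a hypothetical sketch using the documented defaults; the interceptor's exact selection logic may differ:

```python
def select_context(results, min_relevance_score=0.5, max_inject_items=5,
                   max_context_chars=4000):
    """Filter recalled items by score, cap the count, truncate the total text."""
    kept = [r for r in results if r.get("score", 0.0) >= min_relevance_score]
    kept = sorted(kept, key=lambda r: r.get("score", 0.0), reverse=True)
    kept = kept[:max_inject_items]
    context = "\n".join(r["text"] for r in kept)
    return context[:max_context_chars]


hits = [{"text": "JWT refresh added", "score": 0.9},
        {"text": "unrelated note", "score": 0.2}]
print(select_context(hits))  # only the high-score item survives
```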
### Query Rewrite Modes

The interceptor uses context-aware query rewriting to improve recall accuracy:

- `"rule"` (default): Layer 1 (context-aware query from recent conversation turns) plus Layer 2 (structural keyword extraction for full-text search). Zero additional latency or token cost.
- `"llm"`: Uses the wrapped LLM to generate an optimal `semantic_query` + `keyword_query`. Best for ambiguous queries like "continue yesterday's work" or non-technical domains. Adds ~200-500 ms latency per query.
- `"none"`: Disables query rewriting and uses the raw last user message as-is (legacy behavior).
Framework adapters (LangChain, CrewAI, PraisonAI, AutoGen) also support `query_rewrite` via `MemoryCloudBaseAdapter`.
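The shape of a two-layer `"rule"` rewrite can be sketched as below. This is a hypothetical illustration, not the SDK's actual implementation; the stopword list and tokenization are invented for the example:

```python
import re

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "or", "for", "what", "did", "we", "is"}


def rule_rewrite(turns: list[str], last_query: str) -> tuple[str, str]:
    """Layer 1: fold recent turns into a context-aware semantic query.
    Layer 2: strip stopwords to get a keyword query for full-text search."""
    semantic_query = " ".join(turns[-2:] + [last_query])
    keywords = [w for w in re.findall(r"[A-Za-z_][\w.]*", last_query.lower())
                if w not in STOPWORDS]
    return semantic_query, " ".join(keywords)


sem, kw = rule_rewrite(["we refactored auth middleware"],
                       "what did we change in login.py?")
print(kw)  # change login.py
```

Short follow-up questions ("and the tests?") gain meaning from the folded-in turns, while the keyword layer keeps identifiers like `login.py` intact for full-text search.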
## Injected Demo (Streaming + Recall)

Run the end-to-end injected-mode demo (client-side LLM extraction, zero-LLM server path):

```shell
python3 scripts/run_sdk_injected_conversation_demo.py --full-user-journey --stream
```

The demo validates:

- Prompt-only usage via interceptor injection (no manual recall/remember calls in business code)
- Background extraction-request handling (`remember_step` -> `extraction_request` -> `submit_insights`)
- Cross-session recall in a fresh simulated follow-up session
- Streaming token output for runtime observability
## Framework Integrations

All integrations share a unified adapter pattern based on `MemoryCloudBaseAdapter`. Each provides:

- `wrap_llm()` / `wrap_function()`: transparent memory injection
- `awareness_recall()` / `awareness_record()` / `memory_insights()`: explicit tool methods
- `inject_into_messages()`: manual message-level injection
- `get_tool_functions()`: tool definitions for manual registration
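What message-level injection amounts to can be sketched without any framework. This is a hypothetical illustration of the pattern; the real adapters first retrieve the context from Memory Cloud, which a pre-fetched string stands in for here:

```python
def inject_into_messages(messages: list[dict], recalled_context: str) -> list[dict]:
    """Prepend recalled memory as a system message (illustrative sketch)."""
    if not recalled_context:
        return messages  # nothing relevant recalled: leave messages untouched
    system = {"role": "system",
              "content": f"Relevant memory:\n{recalled_context}"}
    return [system] + list(messages)


msgs = inject_into_messages(
    [{"role": "user", "content": "Summarize the auth changes."}],
    "Refactored auth middleware and added tests.",
)
print(msgs[0]["role"])  # system
```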
### LangChain

```python
import openai

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.langchain import MemoryCloudLangChain

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudLangChain(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
mc.wrap_llm(openai.OpenAI())

# Or use as a LangChain Retriever
retriever = mc.as_retriever()
docs = retriever.invoke("What did we decide yesterday?")
```
### CrewAI

```python
import openai

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.crewai import MemoryCloudCrewAI

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudCrewAI(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
mc.wrap_llm(openai.OpenAI())

# Or use explicit tools
result = mc.awareness_recall("What happened?")
```
### PraisonAI

```python
import openai

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.praisonai import MemoryCloudPraisonAI

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudPraisonAI(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
mc.wrap_llm(openai.OpenAI())

# Or get tool dicts for PraisonAI agent config
tools = mc.build_tools()
```
### AutoGen / AG2

```python
from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.autogen import MemoryCloudAutoGen

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudAutoGen(client=client, memory_id="memory_123")

# Injection: hook into agent message processing
# (assistant and user_proxy are your existing AutoGen agents)
mc.inject_into_agent(assistant)

# Or register explicit tools
mc.register_tools(caller=assistant, executor=user_proxy)
```
## Environment Variables

```shell
export AWARENESS_API_BASE_URL="https://your-domain.com/api/v1"
export AWARENESS_API_KEY="aw_xxx"

# Optional: configure extraction LLM behavior
export AWARENESS_EXTRACTION_MODEL="gpt-4o-mini"     # Model used for insight extraction (default: gpt-4o-mini)
export AWARENESS_EXTRACTION_MAX_TOKENS="16384"      # Max tokens for extraction output (default: 16384)
```
All environment variables can also be set via constructor parameters (which take priority over env vars).
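That precedence (constructor argument over environment variable over built-in default) can be sketched as a small resolver; this is an illustrative pattern, not the SDK's internal code:

```python
import os


def resolve_setting(explicit, env_var: str, default):
    """Constructor argument wins; otherwise fall back to env var, then default."""
    if explicit is not None:
        return explicit
    return os.getenv(env_var, default)


os.environ["AWARENESS_EXTRACTION_MODEL"] = "gpt-4o-mini"
print(resolve_setting(None, "AWARENESS_EXTRACTION_MODEL", "fallback"))  # env var wins
print(resolve_setting("claude-haiku-4-5-20251001", "AWARENESS_EXTRACTION_MODEL", "fallback"))  # explicit wins
```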
See the AI Frameworks guide for complete integration examples with LangChain, CrewAI, PraisonAI, and AutoGen.