Awareness

Documentation

Guides, SDK references, and ecosystem integrations for Awareness.

Python SDK

Awareness Memory Cloud Python SDK

Python SDK for Awareness Memory Cloud APIs and MCP-style memory workflows.

Install

pip install awareness-memory-cloud

Framework extras:

pip install awareness-memory-cloud[langchain]
pip install awareness-memory-cloud[crewai]
pip install awareness-memory-cloud[frameworks]

Quickstart

import os
from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(
    base_url=os.getenv("AWARENESS_API_BASE_URL", os.getenv("AWARENESS_BASE_URL", "https://awareness.market/api/v1")),
    api_key="YOUR_API_KEY",
)

client.write(
    memory_id="memory_123",
    content="Customer asked for SOC2 evidence and retention policy.",
    kwargs={"source": "python-sdk", "session_id": "demo-session"},
)

result = client.retrieve(
    memory_id="memory_123",
    query="What did customer ask for?",
    custom_kwargs={"k": 3},
)
print(result["results"])

API Coverage (SDK/API aligned)

MemoryCloudClient now includes:

  • Memory: create_memory, list_memories, get_memory, update_memory, delete_memory
  • Content: write, list_memory_content, delete_memory_content
  • Retrieval/Chat: retrieve, chat, chat_stream, memory_timeline
  • MCP ingest: ingest_events, ingest_content
  • Export: export_memory_package, save_export_memory_package
  • Async jobs & upload: get_async_job_status, upload_file, get_upload_job_status
  • Insights/API keys/wizard: insights, create_api_key, list_api_keys, revoke_api_key, memory_wizard

MCP-style Helpers (SDK/MCP aligned)

These helpers mirror MCP tool semantics:

  • begin_memory_session
  • recall_for_task
  • remember_step
  • remember_batch
  • backfill_conversation_history

Example:

session = client.begin_memory_session(memory_id="memory_123", source="python-sdk")
client.remember_step(
    memory_id="memory_123",
    text="Refactored auth middleware and added tests.",
)
ctx = client.recall_for_task(
    memory_id="memory_123",
    task="summarize latest auth changes",
    limit=8,
    multi_level=False,        # set True to broaden retrieval across sessions and time ranges
    cluster_expand=False,     # set True for topic-based context expansion and deeper exploration
)
print(ctx["results"])

Read Exported Packages

The SDK includes export readers:

  • read_export_package(path)
  • read_export_package_bytes(bytes)
  • parse_jsonl_bytes(bytes)

Example:

from memory_cloud import read_export_package

parsed = read_export_package("memory_export.zip")
print(parsed["manifest"])
print(len(parsed["chunks"]))
print(bool(parsed["safetensors"]))
print(parsed.get("kv_summary"))

Client Auto-Extraction

Pass an OpenAI or Anthropic client to MemoryCloudClient to automatically extract insights when remember_step/remember_batch returns an extraction_request:

import openai
from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(
    base_url="...",
    api_key="...",
    extraction_llm=openai.OpenAI(),  # or anthropic.Anthropic()
)

# Now remember_step automatically extracts insights in the background
client.remember_step(memory_id="mem-xxx", text="Fixed auth bug in login.py by adding JWT refresh.")

When extraction_llm is provided, every remember_step/remember_batch call that receives an extraction_request from the server will:

  1. Call the provided LLM with the extraction prompt (background thread, non-blocking)
  2. Parse the JSON response with brace-depth matching + retry
  3. Submit extracted insights via submit_insights()

Client extraction options:

  • extraction_llm: OpenAI or Anthropic client
  • extraction_model: default "gpt-4o-mini" for OpenAI, "claude-haiku-4-5-20251001" for Anthropic
  • extraction_max_tokens: default 16384 (env: AWARENESS_EXTRACTION_MAX_TOKENS)
  • user_id, agent_role
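Step 2's brace-depth matching can be illustrated with a small standalone function. This is a simplified sketch, not the SDK's actual parser: it scans an LLM reply for the first balanced top-level JSON object, tolerating surrounding prose, and retries from the next brace on a decode failure.

```python
import json

def extract_first_json_object(text: str):
    """Return the first balanced top-level JSON object found in `text`.

    Illustrative brace-depth matching: track nesting depth while skipping
    braces that appear inside JSON strings; retry on malformed candidates.
    """
    start = text.find("{")
    while start != -1:
        depth = 0
        in_string = False
        escaped = False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start : i + 1])
                    except json.JSONDecodeError:
                        break  # malformed candidate; retry from the next "{"
        start = text.find("{", start + 1)
    return None

reply = 'Here are the insights:\n{"insights": [{"type": "decision", "text": "Use JWT refresh"}]}\nDone.'
parsed = extract_first_json_object(reply)
print(parsed["insights"][0]["text"])  # → Use JWT refresh
```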

Agent Profiles & Sub-Agent Prompts

Retrieve enriched agent profiles with auto-generated activation prompts:

# List all agent profiles (with system_prompt and activation_prompt)
agents = client.list_agents(memory_id="mem-xxx")
for agent in agents["agents"]:
    print(agent["agent_role"], agent["title"])
    print(agent["activation_prompt"])  # Ready-to-use prompt for sub-agent spawning

# Get activation prompt for a specific role
prompt = client.get_agent_prompt(memory_id="mem-xxx", agent_role="backend_engineer")

If a profile has a custom system_prompt (set in the frontend Settings), it is used as-is. Otherwise, a prompt is auto-generated from the profile fields (identity, critical_rules, workflow, etc.).
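The fallback can be pictured roughly as follows. This is an illustrative sketch with hypothetical field handling, not the SDK's internal generator: a non-empty custom system_prompt wins; otherwise a prompt is assembled from the structured profile fields.

```python
def resolve_activation_prompt(profile: dict) -> str:
    """Illustrative fallback: prefer a custom system_prompt; otherwise
    assemble a prompt from structured profile fields."""
    custom = (profile.get("system_prompt") or "").strip()
    if custom:
        return custom  # used as-is when set in the frontend Settings
    parts = []
    if profile.get("identity"):
        parts.append(f"You are {profile['identity']}.")
    if profile.get("critical_rules"):
        parts.append("Critical rules:\n" + "\n".join(f"- {r}" for r in profile["critical_rules"]))
    if profile.get("workflow"):
        parts.append("Workflow: " + profile["workflow"])
    return "\n\n".join(parts)

profile = {
    "agent_role": "backend_engineer",
    "identity": "a senior backend engineer",
    "critical_rules": ["Never commit secrets", "Write tests first"],
    "workflow": "plan, implement, test, review",
}
print(resolve_activation_prompt(profile))
```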


Interceptor (Transparent Injection)

The AwarenessInterceptor wraps any OpenAI/Anthropic client to automatically inject memory context pre-call and store conversations post-call:

from memory_cloud import MemoryCloudClient, AwarenessInterceptor
import openai

client = MemoryCloudClient(base_url="...", api_key="...")
interceptor = AwarenessInterceptor(
    client=client,
    memory_id="mem-xxx",
    min_relevance_score=0.5,   # Filter low-score results (default 0.5)
    max_inject_items=5,        # Cap injected items (default 5)
    query_rewrite="rule",      # Query rewrite mode (default "rule")
)

oai = openai.OpenAI()
interceptor.wrap_openai(oai)
# Now all oai.chat.completions.create() calls get memory injection automatically

Interceptor options:

  • retrieve_limit (default 8)
  • max_context_chars (default 4000)
  • min_relevance_score (default 0.5)
  • max_inject_items (default 5)
  • auto_remember (default True)
  • enable_extraction (default True)
  • extraction_model (default "gpt-4o-mini")
  • extraction_max_tokens (default 16384)
  • query_rewrite (default "rule")
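The effect of min_relevance_score and max_inject_items can be sketched as a simple post-retrieval filter. This illustrates the documented behavior only; the interceptor's actual filtering code may differ:

```python
def filter_for_injection(results, min_relevance_score=0.5, max_inject_items=5):
    """Drop low-score hits, then cap the count, best-scoring first."""
    kept = [r for r in results if r.get("score", 0.0) >= min_relevance_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:max_inject_items]

hits = [
    {"text": "auth refactor notes", "score": 0.91},
    {"text": "old logo discussion", "score": 0.22},
    {"text": "JWT refresh decision", "score": 0.78},
]
print(filter_for_injection(hits))  # the 0.22 hit is filtered out
```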

Query Rewrite Modes

The interceptor uses context-aware query rewriting to improve recall accuracy:

  • "rule" (default): Layer 1 (context-aware query from recent conversation turns) + Layer 2 (structural keyword extraction for full-text search). Zero additional latency or token cost.
  • "llm": Uses the wrapped LLM to generate optimal semantic_query + keyword_query. Best for ambiguous queries like "continue yesterday's work" or non-technical domains. Adds ~200-500ms latency per query.
  • "none": Disables query rewriting. Uses the raw last user message as-is (legacy behavior).
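As a rough picture of "rule" mode's Layer 2, structural keyword extraction can be approximated by dropping filler words and keeping identifier-like tokens. This is a toy illustration, not the shipped rewriter:

```python
import re

STOP_WORDS = {"the", "a", "an", "what", "did", "we", "to", "in", "on",
              "for", "of", "and", "is", "was", "about"}

def rule_keyword_query(message: str) -> str:
    """Toy Layer-2 rewrite: keep code-like and content-bearing tokens
    for full-text search, drop filler words."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", message)
    keywords = [t for t in tokens if t.lower() not in STOP_WORDS]
    return " ".join(keywords)

print(rule_keyword_query("What did we decide about login.py and the JWT refresh?"))
# → decide login.py JWT refresh
```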

Framework adapters (LangChain, CrewAI, PraisonAI, AutoGen) also support query_rewrite via MemoryCloudBaseAdapter.

Injected Demo (Streaming + Recall)

Run the end-to-end injected-mode demo (client LLM extraction, server zero-LLM path):

python3 scripts/run_sdk_injected_conversation_demo.py --full-user-journey --stream

The demo validates:

  • Prompt-only usage via interceptor injection (no manual recall/remember calls in business code)
  • Background extraction request handling (remember_step -> extraction_request -> submit_insights)
  • Cross-session recall in a fresh simulated follow-up session
  • Streaming token output for runtime observability

Framework Integrations

All integrations share a unified adapter pattern based on MemoryCloudBaseAdapter. Each provides:

  • wrap_llm() / wrap_function() — transparent memory injection
  • awareness_recall() / awareness_record() / memory_insights() — explicit tool methods
  • inject_into_messages() — manual message-level injection
  • get_tool_functions() — tool definitions for manual registration
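Conceptually, inject_into_messages() prepends recalled context to the outgoing chat messages. A minimal sketch of that shape (illustrative only; the real adapters also handle scoring, truncation, and post-call storage):

```python
def inject_context(messages, recalled_snippets, max_context_chars=4000):
    """Prepend recalled memory as a system message, truncated to a character budget."""
    context = "\n".join(f"- {s}" for s in recalled_snippets)[:max_context_chars]
    system_msg = {"role": "system", "content": "Relevant memory:\n" + context}
    return [system_msg] + list(messages)

msgs = [{"role": "user", "content": "Summarize the latest auth changes."}]
out = inject_context(msgs, ["Refactored auth middleware", "Added JWT refresh"])
print(out[0]["content"])
```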

LangChain

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.langchain import MemoryCloudLangChain

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudLangChain(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
import openai
mc.wrap_llm(openai.OpenAI())

# Or use as a LangChain Retriever
retriever = mc.as_retriever()
docs = retriever.invoke("What did we decide yesterday?")

CrewAI

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.crewai import MemoryCloudCrewAI

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudCrewAI(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
import openai
mc.wrap_llm(openai.OpenAI())

# Or use explicit tools
result = mc.awareness_recall("What happened?")

PraisonAI

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.praisonai import MemoryCloudPraisonAI

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudPraisonAI(client=client, memory_id="memory_123")

# Injection: wrap the LLM client
import openai
mc.wrap_llm(openai.OpenAI())

# Or get tool dicts for PraisonAI agent config
tools = mc.build_tools()

AutoGen / AG2

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.autogen import MemoryCloudAutoGen

client = MemoryCloudClient(base_url="...", api_key="...")
mc = MemoryCloudAutoGen(client=client, memory_id="memory_123")

# Injection: hook into agent message processing
# (assistant / user_proxy below are your existing AutoGen agent instances)
mc.inject_into_agent(assistant)

# Or register explicit tools
mc.register_tools(caller=assistant, executor=user_proxy)

Environment Variables

export AWARENESS_API_BASE_URL="https://your-domain.com/api/v1"
export AWARENESS_API_KEY="aw_xxx"

# Optional: configure extraction LLM behavior
export AWARENESS_EXTRACTION_MODEL="gpt-4o-mini"        # Model used for insight extraction (default: gpt-4o-mini)
export AWARENESS_EXTRACTION_MAX_TOKENS="16384"         # Max tokens for extraction output (default: 16384)

All environment variables can also be set via constructor parameters; explicit constructor arguments take priority over the corresponding environment variables.
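That resolution order can be sketched as follows; resolve_setting is a hypothetical helper showing constructor-over-env priority, using the quickstart's default base URL:

```python
import os

def resolve_setting(explicit, env_var, default):
    """Constructor argument wins, then the environment variable, then the default."""
    if explicit is not None:
        return explicit
    return os.getenv(env_var, default)

os.environ["AWARENESS_API_BASE_URL"] = "https://staging.example.com/api/v1"
# Env var is used when no explicit value is passed:
print(resolve_setting(None, "AWARENESS_API_BASE_URL", "https://awareness.market/api/v1"))
# An explicit constructor argument overrides the env var:
print(resolve_setting("https://prod.example.com/api/v1", "AWARENESS_API_BASE_URL",
                      "https://awareness.market/api/v1"))
```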

See the AI Frameworks guide for complete integration examples with LangChain, CrewAI, PraisonAI, and AutoGen.