In Part 1 , we explored the shift from simple, stateless copilots to more sophisticated, stateful agents. We demonstrated how an LLM can act as an autonomous planner, generating multi-step execution plans using the Observe-Think-Act pattern. We also explored how agents can integrate with real-world tools through the Model Context Protocol, transforming conceptual plans into actual actions. We defined an agent as an autonomous entity with a persistent internal state, a planning loop, and the ability to use external tools. This internal state encompassing memory, goals, and conversation history is what gives an agent its personality and purpose. Building one agent in isolation is simple. Building a system that can reliably manage thousands of them, each with its own unique state and long-running tasks, is a monumental challenge. This is where most conventional architectures begin to break down. So, how do we build a system that can reliably and concurrently manage not one, but thousands of these stateful, independent agents?

The answer does not include locks or complex threading logic, but a powerful architectural pattern: the Actor Model.

In Part 1, we demonstrated the MinimalAgent—a simple implementation where the LLM generates a complete execution plan in a single call. While this illustrated the core concept of autonomous planning, real-world agents need more: persistent state across multiple interactions, the ability to remember facts, and actual tool execution capabilities. In this post, we’ll evolve from that conceptual foundation to build SimpleAgent, a production-ready implementation that handles concurrent sessions, maintains memory, and integrates real tools. This is where the complexity of managing thousands of concurrent, stateful agents becomes apparent, and why we need a robust runtime foundation like Akka.

The Concurrency Crisis in an Agentic World

As of writing this blog post, technology is moving from single-shot, stateless LLM calls to long-running, stateful agentic workflows. From autonomous research agents that continuously monitor and analyze global financial markets in real-time to AI healthcare assistants that maintain persistent, personalized patient interactions across months of treatment, the need for robust, scalable agent architectures has never been more critical. In enterprise settings, we’re seeing customer service platforms that must orchestrate millions of concurrent, stateful conversations while maintaining strict data isolation and sub-second response times. Traditional approaches to concurrency, that rely heavily on shared memory and locks, are showing their cracks. They lead to a minefield of potential race conditions, deadlocks, and non-deterministic bugs that are notoriously difficult to debug in complex systems. Imagine two threads trying to update an agent’s memory simultaneously; one adds a new fact while the other summarizes the conversation. Without intricate and error-prone locking, the summary might be generated from a corrupted, incomplete state, leading to a nonsensical response.

These problems are amplified in agentic systems where thousands of individual agents must maintain their own internal state, execute plans, and interact with a volatile external world. This volatility is not just about network failures; it’s inherent in the core components of agentic AI. For example, calls to external LLM providers have non-deterministic response times—one request might take 500ms, while a slightly more complex one takes 10 seconds. An architecture based on blocking threads would grind to a halt, unable to serve other agents while waiting for one slow response. We need an architecture that embraces this reality. The Akka Actor Model, a staple of the JVM ecosystem, provides a compelling solution. Its core tenets of isolation and asynchronous message-passing offer a blueprint for building agents that are concurrent, stateful, and remarkably resilient. When combined with the high-level Akka Agentic SDK, this powerful model becomes accessible, allowing developers to focus on the agent’s intelligence rather than the low-level concurrency plumbing.

Why Actors Fit Agentic Systems

  • Isolation: Each agent encapsulates its own state and processes one message at a time. No shared mutable state or locks are needed, which makes state transitions deterministic and easier to reason about and test.

  • Backpressure: Agents receive work via mailboxes that can be bounded. Bounded queues prevent memory blow‑ups during spikes and enable predictable load shedding. Combine with timeouts/retries for external calls, and avoid blocking operations inside handlers.

  • Supervision: Parents define restart/stop/escalate policies. Failures are contained to the faulty agent, which can be restarted cleanly. Keep ephemeral state in‑memory and persist only what must survive restarts.

  • Asynchrony: Long I/O (LLMs, tools, databases) integrates via non‑blocking messaging/futures. Handlers stay responsive while awaiting results, and completed results are delivered back to the mailbox as new messages.

  • Elasticity: Clustering and sharding distribute agents across nodes with location transparency. Use stable routing keys so the same logical agent handles a given user/session while the platform balances thousands of agents across the cluster.

Architecture at a Glance

The Actor Model is built on a few simple components, which the Akka Agentic SDK abstracts into a cohesive framework.

  1. Agent as Actor: An agent is an actor. It encapsulates its own private state (memory, goals) and behavior within a protective boundary. No other part of the system can access this state directly, turning the actor into a secure, digital vault for the agent’s identity and knowledge.
  2. Messages: Agents communicate exclusively by sending immutable messages. A message could be a user query (like the Interact record) or an internal command. The immutability is key; since messages cannot be changed in transit, you eliminate the risk of one component accidentally modifying data that another component is relying on.
  3. Mailbox & Single Handler: Each agent has a mailbox, a queue for incoming messages. The Akka runtime ensures that the agent’s message handler (the interact method in our example) processes these messages one at a time. This sequential processing is the secret to state safety. By handling one message completely before starting the next, the actor guarantees that its internal state is never subject to a race condition, eliminating the need for any manual locking within the agent’s logic.
  4. Supervision: A parent actor can supervise its children. If an agent fails with an unexpected exception, its supervisor can apply a fault-tolerance strategy, such as restarting it, stopping it, or escalating the failure. This is Akka’s famous “let it crash” philosophy: instead of littering your business logic with defensive try-catch blocks for every conceivable error, you write the “happy path” logic and let a dedicated supervisor handle the inevitable failures, leading to cleaner code and a more resilient system.

This architecture creates a clear separation of concerns where each component has a distinct responsibility. The following diagram illustrates this flow:

flowchart LR %% Client subgraph Client U[User / Ingress] end %% Akka System with Agent internals subgraph AkkaSystem S[Supervisor] subgraph Agent MB[(Mailbox)] BH[interact] STM[(Short-term Memory)] DUR[(Durable Memory Cache)] EO[[Effects Orchestrator]] end end %% External services subgraph External LLM[(LLM API)] TOOL[(Tool API)] STORE[(Facts Store: file/DB)] VDB[(Vector DB)]:::optional end %% Message + data flow U --> MB MB --> BH BH <--> STM BH <--> DUR DUR <-->|load/save| STORE BH --> EO EO --> LLM LLM --> EO EO --> BH BH --> TOOL TOOL --> BH BH -->|reply| U %% Supervision S -. monitors .-> BH S -. restart on failure .-> BH %% Styles / notes classDef optional fill:#f6f6f6,stroke:#bbb,stroke-dasharray: 5 5; class MB bp; classDef bp fill:#fffbe6,stroke:#e0c200,stroke-width:1px; class EO async; classDef async fill:#e7f5ff,stroke:#4dabf7; class STORE dur; classDef dur fill:#eef7ee,stroke:#37b24d;

Figure 1: Component flow highlighting mailbox backpressure, behavior, short‑term and durable memory, non‑blocking effects, and supervision.

Request Lifecycle (Sequence)

sequenceDiagram participant U as User/Ingress participant MB as Mailbox participant B as Behavior participant M as LLM API participant T as Tool U->>MB: Interact(message) MB->>B: Dequeue message B->>B: Update short-term state B->>M: Non-blocking model request Note right of B: Actor remains free to handle
other queued messages M-->>B: Model response alt Tool required? B->>T: Invoke tool T-->>B: Tool result end B-->>U: Reply

Supervision Tree

flowchart TB S[Supervisor] A[Agent] E[Failure] R[Restart] S --> A A -->|throws| E S -. handles .-> R R --> A

Design Choices & Trade-offs

The primary alternative to the Actor Model is Shared-Memory Concurrency. In this model, multiple threads access and modify the same data structures (e.g., a shared HashMap of user sessions), using synchronized blocks or ReentrantLocks to prevent data corruption. To make the right choice, it’s crucial to understand the fundamental differences between these two philosophies.

  • Shared-Memory:
    • Pros: Potentially lower latency for fine-grained tasks where threads frequently need to access the same data. It can feel more intuitive for developers accustomed to standard object-oriented programming.
    • Cons: Extremely difficult to get right at scale. Prone to deadlocks, race conditions, and requires careful, explicit lock management. Debugging is a nightmare because bugs are often timing-dependent and hard to reproduce.
  • Actor Model (Share-Nothing):
    • Pros: Enforced isolation eliminates entire classes of concurrency bugs by design. State management is simplified. Supervision provides a clean, declarative model for fault tolerance. High-level SDKs can abstract away boilerplate, increasing developer productivity.
    • Cons: Communication is always asynchronous, which can be a mental shift for developers. Debugging the logical flow of a complex message chain can be challenging without the right tools and requires thinking in terms of message flows rather than call stacks.

For a system with thousands of independent agents, the safety, scalability, and resilience benefits of the Actor Model’s share-nothing architecture are a clear and decisive winner.

Building Our First Agent: A Practical Walkthrough

Theory is essential, but code is concrete. Let’s move from architectural diagrams to a practical implementation. We will now work through an example of building a simple but powerful agent using the Akka Agentic SDK. This hands-on example will show you how to manage memory, integrate tools, and orchestrate LLM calls in a robust, actor-based system.

Step 1: Setting Up the Environment

You’ll need a Java Development Kit (JDK) and a build tool like Maven or Gradle. A common and highly recommended practice in Akka projects is to use the akka-javasdk-parent POM. The <parent> element in pom.xml lets your project inherit a centralized configuration. Think of it as a “bill of materials” (BOM) that pins compatible versions of SDK modules and transitive dependencies so you don’t have to juggle them manually. For Maven, add this to your pom.xml:

  <parent>
    <groupId>io.akka</groupId>
    <artifactId>akka-javasdk-parent</artifactId>
    <version>3.5.2</version>
  </parent>

See repo config for a complete example: application.confhttps://github.com/PradeepLoganathan/akka-agent-sdk-demo/blob/main/src/main/resources/application.conf

Step 2: Implementing the SimpleAgent.java Code

Now, let’s dive into SimpleAgent.java. Every agent extends Agent, which provides the actor lifecycle and the effects() API for non‑blocking orchestration. Our minimal agent focuses on five implementation patterns. Full source: https://github.com/PradeepLoganathan/akka-agent-sdk-demo/blob/main/src/main/java/com/example/demo/application/SimpleAgent.java

  1. Type‑Safe Messaging: Define a compact protocol with an enum of actions and a record message. This prevents stringly‑typed mistakes and makes behavior easy to reason about.

    public enum Action { SAY, ASK, REMEMBER, RECALL, SUMMARY }
    public record Interact(String sessionId, Action action, String text, String key, String value) {}
    
  2. Single Message Handler: public Effect<String> interact(Interact req) is the only entry point. The runtime processes one message at a time, so you don’t need locks for internal state.

  3. Short‑Term Session Memory: Keep a per‑session Map<String,String> for small facts referenced during the conversation. It’s simple, fast, and ephemeral—perfect for a first agent.

    Evolution from MinimalAgent: In Part 1, we defined an AgentState class to demonstrate the structure needed for stateful agents (goal, memory list, step tracking, completion flags). Here in SimpleAgent, we’ve evolved that concept into a practical implementation using a Map<String,String> for session-based memory. This simpler approach is ideal for our example and demonstrates the actor’s ability to maintain state, while production systems might combine the structured AgentState pattern with persistence mechanisms like Akka Persistence for durable state management.

  4. Declarative Orchestration with effects(): Instead of managing threads and futures, declare your intent (model + memory + tools + messages). The SDK performs non‑blocking calls and delivers the result to your mailbox, allowing the agent to remain responsive.

    return effects()
      .model(ModelProvider.openAi().withApiKey(...).withModelName("gpt-4o-mini"))
      .memory(MemoryProvider.limitedWindow())
      .tools(this)
      .systemMessage(sys)
      .userMessage(prompt)
      .thenReply();
    
  5. Tool Integration: Annotate plain Java methods with @FunctionTool; the SDK exposes them to the model with a generated JSON schema and handles invocation and result injection.

    @FunctionTool(name = "upper", description = "Uppercase a string")
    public String toUpper(@Description("text to uppercase") String text) {
      return text == null ? "" : text.toUpperCase();
    }
    

    This simple annotation-based approach is ideal for lightweight, inline tools. For more complex integrations—such as querying databases, accessing knowledge bases, or orchestrating external APIs—you can leverage the Model Context Protocol (MCP) with Akka , which we introduced in Part 1. MCP provides a standardized way to expose rich tool capabilities that your agents can discover and use dynamically, transforming the conceptual tools from our MinimalAgent planning example into real, executable actions.

Optionally, you can enhance the system prompt with a simple, deterministic summary of known facts (from the session map) to guide the model without overstuffing context.

Step 3: Wiring HTTP Endpoints (Thin API Layer)

Expose five endpoints (say, ask, remember, recall/{key}, summary) that map directly to actions. The SDK’s ComponentClient routes each HTTP request to your agent with inSession(sessionId) to keep a user’s conversation isolated.

@HttpEndpoint("/agent/{sessionId}")
public class AgentEndpoint {
  public record SayRequest(String text) {}
  public record SayResponse(String reply) {}

  @Post("/say")
  public HttpResponse say(String sessionId, SayRequest req) {
    var out = client.forAgent().inSession(sessionId)
      .method(SimpleAgent::interact)
      .invoke(new SimpleAgent.Interact(sessionId, Action.SAY, req.text(), null, null));
    return HttpResponses.ok(new SayResponse(out));
  }

  @Get("/recall/{key}")
  public HttpResponse recall(String sessionId, String key) {
    var out = client.forAgent().inSession(sessionId)
      .method(SimpleAgent::interact)
      .invoke(new SimpleAgent.Interact(sessionId, Action.RECALL, null, key, null));
    return HttpResponses.ok(new RecallResponse(key, out));
  }
}

Full source: AgentEndpoint.javahttps://github.com/PradeepLoganathan/akka-agent-sdk-demo/blob/main/src/main/java/com/example/demo/api/AgentEndpoint.java

Step 4: Minimal UI (Optional but Helpful)

A tiny HTML page generates a sessionId, calls the five endpoints, and renders a micro chat transcript. This makes it easy to observe the mailbox → handler → model/tool → reply loop end‑to‑end.

<!-- static-resources/index.html (excerpt) -->
<script>
  function newSessionId() { document.getElementById('sessionIdInput').value = crypto.randomUUID(); }
  async function post(path, body) {
    const sessionId = document.getElementById('sessionIdInput').value;
    const res = await fetch(`/agent/${sessionId}${path}`, { method:'POST', headers:{'Content-Type':'application/json'}, body: JSON.stringify(body) });
    return res.json();
  }
  async function say() {
    const text = document.getElementById('sayText').value;
    const out = await post('/say', { text });
    document.getElementById('sayOut').textContent = out.reply;
  }
</script>

Full sources: index.htmlhttps://github.com/PradeepLoganathan/akka-agent-sdk-demo/blob/main/src/main/resources/static-resources/index.html and UiEndpoint.javahttps://github.com/PradeepLoganathan/akka-agent-sdk-demo/blob/main/src/main/java/com/example/demo/api/UiEndpoint.java

Deploying Our First Agent

We’ve built a functional agent and seen it work on our local machine. Now comes the critical step: moving from a local process to a scalable, deployable service. This is where the integrated tooling of the Akka platform shines. We’ll walk through how your code is automatically packaged into a container and deployed to a cluster with a single command.

  • Prerequisites: Akka CLI installed; repo built; OPENAI_API_KEY available.
  • Start (dev): deploy locally and open the console.

From Source Code to Container, Automatically

A key challenge in modern development is reliably packaging applications. The Akka SDK simplifies this with a “zero-config” approach to containerization. You don’t need to write or maintain a Dockerfile for the common use case.

This is made possible by the akka-javasdk-parent you added to your pom.xml. Think of this parent POM as a “bill of materials” that comes with a pre-configured, production-ready build process. When you initiate a deployment, here’s what happens under the hood:

  1. Build Trigger: The Akka CLI drives your Maven build (mvn package).
  2. Containerization: The parent POM’s integrated plugin (like Jib) takes over. It intelligently assembles your application into a lean, optimized OCI-compliant container image.
  3. Image Contents: This isn’t just a fat JAR. The image is constructed with best practices:
    • A minimal Java Runtime Environment (JRE) base image is used.
    • Your application’s dependencies, resources, and compiled classes are added in separate layers for efficient caching and faster rebuilds.
    • An entrypoint is configured to boot your service on the Akka runtime.
  4. Security: Crucially, secrets like your OPENAI_API_KEY are not embedded in the image. The container is built to receive configuration and secrets from the environment at runtime, making your images portable and secure.

However, If you provide a custom docker file the CLI can be configured to use it—but for most services the parent POM’s containerization is sufficient and simpler to maintain.

# From your project (with Akka CLI installed)
akka services deploy --env dev

Expected console snippet:

──────────────────────────────────────────────────────────────
│ SERVICE             │ STATE    │ ADDRESS                   │
──────────────────────────────────────────────────────────────
│ akka-agent-sdk-demo │ Running  │ localhost:9200            │
──────────────────────────────────────────────────────────────
Local console: http://localhost:9889
(use Ctrl+C to quit)
  • Verify and iterate:

    • UI: http://localhost:9200
    • Console: http://localhost:9889
    • Logs: akka services logs akka-agent-sdk-demo -f
    • Status: akka services status
  • Promote to prod:

# Deploy
akka services deploy --env prod

# Inspect
akka services status
akka services logs akka-agent-sdk-demo --env prod -f

# Scale replicas
akka services scale akka-agent-sdk-demo --replicas 3 --env prod

Notes:

  • The CLI builds, containers, and deploys for you—no Dockerfiles or raw manifests needed.
  • The CLI prints the public address/URL for the deployed service.

Operationalising Our First Agent

  • Secrets
    • Avoid committing keys. Set them via CLI per environment.
akka services secrets set OPENAI_API_KEY=... --env dev
akka services secrets set OPENAI_API_KEY=... --env prod
  • Config Profiles

    • Keep dev defaults in application.conf; promote overrides per environment (timeouts, logging, mailbox capacity).
  • Backpressure & Resilience

    • Keep handlers non‑blocking; do not perform blocking I/O in message handling.
    • Add timeouts/retries/backoff for model/tool calls; return clear fallback errors on persistent failure.
  • Observability

    • Structured logs with session/request IDs.
    • Consider tracing/metrics in a follow‑up (latency of model/tool calls; message throughput).
  • Scaling

    • Scale horizontally with the CLI; monitor latencies and mailbox depth; increase replicas before saturation.

Evaluation & Testing

Metrics & Harness:

  • Throughput: How many interact calls per second can the system process? This can be benchmarked using a load testing tool like Gatling to simulate thousands of concurrent users.

  • Latency: The end-to-end time from sending a request to the agent to receiving a reply from the Effect, especially for model() interactions which involve network calls.

  • Resilience: We can test the system’s resilience by creating a supervisor for the SimpleAgent. In a test, we can send a message that intentionally throws an exception and verify that the supervisor correctly restarts the agent, allowing it to serve new requests.

Testing: Using a framework like JUnit 5, you can test the agent’s logic in a controlled environment. For the memory functions (RECALL, REMEMBER), you can send Interact messages and assert that the reply is what you expect. For model interactions, you would typically mock the ModelProvider to return a canned response. For example, your test could configure a mock provider that, when called, immediately returns an Effect with a predefined string. This allows you to verify your agent’s logic in isolation, ensuring it behaves correctly without the cost and non-determinism of real LLM API calls.

Failure Modes & Mitigations

  • Agent Failure: A bug in the interact method (e.g., a NullPointerException) causes the actor to crash.

    • Mitigation: An Akka Supervisor will intercept this exception. A Restart strategy will create a new instance of the SimpleAgent, allowing it to process new messages. The state in the old instance (if not static) would be lost unless a persistence mechanism is used.
  • LLM API Failure: The external OpenAI API is down, returns a rate-limit error, or sends back malformed data.

    • Mitigation: The Akka Agentic SDK’s runtime should handle this gracefully. A well-designed system would incorporate retry logic with exponential backoff for transient network errors and route to a fallback model or response for persistent failures.
  • Backpressure: The system receives requests faster than the agent can process them (e.g., waiting for long-running LLM responses).

    • Mitigation: Akka actors have bounded mailboxes. When a mailbox is full, the system can choose to drop messages, crash, or block the sender. This prevents the agent from running out of memory and ensures system stability under high load. Akka also provides a Stash mechanism, allowing an agent to temporarily set aside messages it’s not ready to handle.

Operations

  • Observability: The logging within the SimpleAgent is a good start. In production, you would use structured logging and integrate with a distributed tracing tool like OpenTelemetry. Each Interact request would have a unique trace ID. This allows you to see a complete flame graph of a single user request, pinpointing whether latency is occurring in the agent’s logic, the LLM call, or a tool execution.

  • State Persistence: The file-based persistence for durable facts is simple but effective for this example. For a high-scale, clustered application, you would replace this with a distributed database (like Cassandra or PostgreSQL) or a key-value store (like Redis), managed via a dedicated module like Akka Persistence, which provides tools for event sourcing and state snapshots.

Ship It Checklist

  • Validate the Part 1 link with a local Hugo build and fix ref if needed.
  • Configure model credentials via env vars; add retry/backoff policy.
  • Enable structured logging + trace IDs; add basic metrics (latency/throughput).
  • Load test one agent with Gatling; observe mailbox/backpressure.
  • Add a bounded mailbox and supervision strategy in config.

Thought-Provoking Question

In a distributed system of AI agents, is the fundamental unit of intelligence the agent itself, with its encapsulated state and logic, or is it the message that passes between them, carrying intent and information that transforms the state of the entire system?