# Thespian Remoting Layer - Design Document
## Overview
This document describes the design of a remoting layer for the Thespian
actor framework. The remoting layer allows actors in separate OS processes
- and eventually separate machines - to communicate transparently, as
if they were in the same actor system. Once established, the communication
channel is symmetric and bidirectional: either side can send messages,
establish links, and propagate exit signals to the other.

The implementation is written entirely in Zig and lives under
`src/remote/`. No new C or C++ primitives are expected to be required.

---
## Goals
- Enable transparent actor-to-actor communication across process
boundaries.
- Propagate actor links and exit signals across the boundary faithfully.
- Clean up fully on transport collapse, with best-effort notification on
both sides.
- Keep the initial scope narrow: child process transport only, core
infrastructure only.
- Design the abstractions so that Unix socket and TCP transports can be
added later with minimal disruption.
---
## Non-Goals (Today)
- Unix socket transport.
- TCP transport.
- Remote spawn (spawning an actor on a remote system).
- Transport recovery or reconnection.
- Multi-hop routing.
- Authentication or encryption.
---
## Core Concepts
### Actor System Identity
Each running Thespian instance is an _actor system_. For remoting purposes,
systems do not need globally unique identities in the initial
implementation (child process transport is strictly 1:1). Identity concerns
will become relevant when Unix socket and TCP transports introduce the
listen/accept/connect model, at which point a system identifier will be
assigned at connection time.
### Well-Known Actors
An actor can be registered under a name in the local `env` proc table (e.g.
`env.proc("log")`). Remoting extends this: a well-known actor on a remote
system can be looked up by name via the endpoint. The endpoint itself is
registered as a well-known actor in the local system under a configurable
name (e.g. `"remote"`), making it discoverable by other local actors.
### Remote Actor Identity
A remote actor is identified on the wire by an opaque 64-bit integer ID
assigned by its home system at the time the proxy is created. Well-known
actors are additionally reachable by name for the initial lookup, but
thereafter addressed by ID. This keeps message routing in steady state
uniform regardless of whether the original actor was named or anonymous.
### Proxies
A _proxy_ is a local actor that represents a remote actor within the local
system. It holds a `pid` that local actors can send to, link against, and
receive exit signals from, just as with any local actor. Internally, the
proxy forwards everything to the endpoint for transmission over the
transport.
Proxies are created on demand. The primary trigger is receiving a message
from a remote actor that carries a remote actor ID as its `from` address:
the endpoint creates a proxy for that remote ID if one does not already
exist, and substitutes the proxy's local `pid` as the `from` before
delivering the message to the local destination.

The endpoint maintains a table mapping remote actor IDs to local proxy
`pid`s. On transport collapse, all proxies are sent an exit signal.

---
## Architecture
```
┌──────────────────────────────────────┐           ┌──────────────────────────────────────┐
│ System A                             │           │ System B                             │
│                                      │           │                                      │
│ Actor A ──> Proxy B ──> Endpoint ────┼───────────┼──── Endpoint ──> Proxy A ──> Actor B │
│                                      │ Transport │                                      │
│ Actor A <── Proxy B <── Endpoint <───┼───────────┼──── Endpoint <── Proxy A <── Actor B │
└──────────────────────────────────────┘           └──────────────────────────────────────┘
```
The two key actors are:
- **Endpoint** (`src/remote/endpoint.zig`) - one per remote connection.
Owns the transport I/O, manages the proxy table, serialises outgoing
messages and deserialises incoming ones.
- **Proxy** (`src/remote/proxy.zig`) - one per remote actor that is
currently referenced locally. Forwards messages and link/exit operations
to the endpoint.
---
## Transport: Child Process
The child process transport uses a pair of OS pipes: the parent writes to
the child's stdin and reads from the child's stdout. The child does the
reverse. Both sides run an endpoint actor; the parent spawns the child and
passes the file descriptors to its endpoint.
This is the simplest possible 1:1 connection topology. There is no
listen/accept phase - the connection is established at process spawn time
and is torn down when the child exits or the parent closes the pipes.
The existing `subprocess.zig` already handles spawning and pipe I/O at the
message level. The remoting endpoint will use the underlying file
descriptor stream primitives directly to get a byte-stream interface
suitable for framed CBOR.
### Future Transports
When Unix socket and TCP transports are added, the endpoint will accept a
_transport interface_ - a pair of read/write abstractions over a byte
stream. The endpoint logic (framing, message dispatch, proxy management) is
identical regardless of transport; only the connection establishment
differs. For Unix sockets and TCP, a separate _listener_ actor will accept
incoming connections and spawn an endpoint actor for each one.

---
## Wire Protocol
### Framing
Messages are framed with a 4-byte big-endian unsigned length prefix
followed by a CBOR-encoded payload. The length field encodes the byte
length of the CBOR payload only, not including itself.
```
┌───────────────┬─────────────────────────────┐
│ Length: u32BE │ CBOR Payload (Length bytes) │
└───────────────┴─────────────────────────────┘
```
Maximum payload size matches `thespian.max_message_size` (currently 32 KB).
Frames exceeding this limit cause the transport to be torn down with an
error.
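The framing rules are simple enough to sketch end to end. The following
Python fragment is illustrative only (the real implementation is
`framing.zig` and operates on byte streams rather than in-memory buffers):

```python
import struct

MAX_MESSAGE_SIZE = 32 * 1024  # mirrors thespian.max_message_size

def write_frame(payload: bytes) -> bytes:
    """Return the payload prefixed with its 4-byte big-endian length."""
    if len(payload) > MAX_MESSAGE_SIZE:
        raise ValueError("frame exceeds max_message_size")
    return struct.pack(">I", len(payload)) + payload

def read_frame(buf: bytes) -> tuple[bytes, bytes]:
    """Split one complete frame off the front of buf; return (payload, rest)."""
    if len(buf) < 4:
        raise ValueError("incomplete length prefix")
    (length,) = struct.unpack(">I", buf[:4])
    if length > MAX_MESSAGE_SIZE:
        raise ValueError("frame exceeds max_message_size")
    if len(buf) < 4 + length:
        raise ValueError("incomplete payload")
    return buf[4 : 4 + length], buf[4 + length :]
```

Note that the length check is applied on both sides: an oversized outbound
payload is rejected before it is written, and an oversized inbound length
prefix triggers teardown before any payload bytes are consumed.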
### CBOR Message Envelope
Every wire message is a CBOR array. The first element is a tag identifying
the message type. The envelope format is:
```
["msg_type", ...fields]
```
The defined message types are:

| Tag | Direction | Fields | Meaning |
| ------------------- | --------- | ------------------------------------------------ | -------------------------------------------------------------------- |
| `"send"` | both | `from_id: u64`, `to_id: u64`, `payload: cbor` | Deliver payload to actor `to_id`, from proxy of `from_id` |
| `"send_named"` | both | `from_id: u64`, `to_name: text`, `payload: cbor` | Deliver to a well-known actor by name |
| `"link"` | both | `local_id: u64`, `remote_id: u64` | Establish a link between a local actor and a remote actor |
| `"exit"` | both | `id: u64`, `reason: text` | Remote actor `id` has exited with reason |
| `"proxy_id"` | both | `name: text`, `id: u64` | Response to `send_named`: here is the opaque ID for this named actor |
| `"transport_error"` | both | `reason: text` | Signal that the sending side is tearing down |
The `from_id` and `to_id` fields are opaque 64-bit integers assigned by the
_home system_ of the actor. ID `0` is reserved and invalid. Named lookups
(`send_named`) trigger the remote side to create or locate a proxy for the
named actor and reply with its assigned ID via `proxy_id`, after which the
sender can use the ID directly.
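To make the envelope bytes concrete, here is a minimal CBOR encoder
covering only the types the envelopes need (unsigned integers, text
strings, byte strings, arrays). This is a Python sketch for illustration,
not the encoder Thespian actually uses:

```python
def _head(major: int, n: int) -> bytes:
    # CBOR header byte: 3-bit major type plus 5-bit additional info.
    # Values below 24 are encoded inline; larger values use a 1/2/4/8-byte
    # big-endian argument selected by additional info 24..27.
    if n < 24:
        return bytes([(major << 5) | n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | ai]) + n.to_bytes(size, "big")
    raise ValueError("value too large")

def cbor_encode(value) -> bytes:
    """Encode the subset of CBOR used by the wire envelopes."""
    if isinstance(value, int):
        return _head(0, value)                       # major 0: unsigned int
    if isinstance(value, bytes):
        return _head(2, len(value)) + value          # major 2: byte string
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data            # major 3: text string
    if isinstance(value, list):
        body = b"".join(cbor_encode(v) for v in value)
        return _head(4, len(value)) + body           # major 4: array
    raise TypeError(f"unsupported envelope type: {type(value)}")

# An "exit" envelope: remote actor 7 exited normally.
frame = cbor_encode(["exit", 7, "normal"])  # -> b"\x83dexit\x07fnormal"
```

For a `send` envelope, the `payload` field is carried as an already-encoded
CBOR item, so the actor message crosses the wire verbatim.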

---
## Endpoint Actor
### Responsibilities
- Own the transport read/write loops.
- Assign and track remote actor IDs.
- Maintain the proxy table (`remote_id → local proxy pid`).
- Serialise outbound messages into framed CBOR.
- Deserialise inbound frames and dispatch to the correct local actor or
proxy.
- On transport collapse: exit all proxies, unregister from env, exit self.
### State
```zig
const Endpoint = struct {
    allocator: std.mem.Allocator,
    reader: *FileStream, // inbound byte stream
    writer: *FileStream, // outbound byte stream
    inbound: std.AutoHashMap(u64, tp.pid), // remote_id -> local proxy pid
    outbound: std.AutoHashMap(usize, u64), // local pid handle -> assigned outbound ID
    next_id: u64, // monotonically increasing ID allocator
    name: [:0]const u8, // registered name in env (e.g. "remote")
};
```
The `outbound` table maps local actor handle identities to the IDs by which
they are known on the wire. When the proxy passes a `from` pid to the
endpoint, the endpoint looks up or assigns an outbound ID for it. The remote
side will create an inbound proxy for that ID on first reference.
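The assign-or-reuse step can be sketched as follows (Python for
illustration; `pid_handle` stands in for however the local pid's identity
is keyed, and the class name is invented for this sketch):

```python
class OutboundIds:
    """Assign connection-scoped wire IDs to local actors on first outbound use.

    Mirrors the `outbound` table and `next_id` counter in the Endpoint state.
    """

    def __init__(self):
        self.outbound: dict[int, int] = {}  # local pid handle -> wire ID
        self.next_id = 1                    # ID 0 is reserved and invalid

    def id_for(self, pid_handle: int) -> int:
        # Reuse the existing ID so the remote side keeps a single proxy
        # per local actor; otherwise allocate the next counter value.
        wire_id = self.outbound.get(pid_handle)
        if wire_id is None:
            wire_id = self.next_id
            self.next_id += 1
            self.outbound[pid_handle] = wire_id
        return wire_id
```

Because the counter is per-endpoint and monotonic, IDs never collide within
a connection's lifetime and no coordination with the remote side is needed.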
### Message Interface (local actors → endpoint)
Local actors interact with the endpoint by sending messages to its `pid`.
The endpoint understands the following local messages:

| Message | Meaning |
| --------------------------------------------------------- | ------------------------------------------------------------ |
| `{"send", from_id: u64, remote_id: u64, payload: cbor}` | Forward payload to remote actor, with originating actor's ID |
| `{"send_named", from_id: u64, name: text, payload: cbor}` | Forward payload to remote well-known actor |
| `{"link", remote_id: u64}` | Link the calling actor to a remote actor |
| `{"proxy_exit", remote_id: u64, reason: text}` | A local proxy is reporting that its remote peer exited |
### Startup Sequence
1. Endpoint actor is spawned (by the parent process after forking, or by
the child after it starts).
2. Endpoint registers itself in the local `env` under its configured name.
3. Endpoint links to the `env` logger (if present) so it is cleaned up on
exit.
4. Endpoint starts the read loop: a dedicated receive of framed bytes from
the transport.
5. Endpoint is now ready to accept local send requests and inbound wire
messages.
### Read Loop
The endpoint issues a read on the transport stream. On each completion:
1. Accumulate bytes until a complete frame is available (length prefix
satisfied).
2. Decode the CBOR envelope.
3. Dispatch based on message type tag.
4. Issue the next read.
On any read error or EOF, the endpoint initiates teardown.
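Step 1, frame accumulation, can be sketched as an incremental parser
(Python, illustrative; the class and method names are invented for this
sketch):

```python
import struct

class FrameAccumulator:
    """Buffer transport bytes and extract every complete frame."""

    def __init__(self, max_size: int = 32 * 1024):
        self.buf = bytearray()
        self.max_size = max_size

    def feed(self, data: bytes) -> list[bytes]:
        """Append newly read bytes; return all frames completed by them."""
        self.buf.extend(data)
        frames = []
        while len(self.buf) >= 4:
            (length,) = struct.unpack(">I", self.buf[:4])
            if length > self.max_size:
                # Oversized frame: the endpoint tears the transport down.
                raise ValueError("frame exceeds max_message_size")
            if len(self.buf) < 4 + length:
                break  # length prefix not yet satisfied; wait for more bytes
            frames.append(bytes(self.buf[4 : 4 + length]))
            del self.buf[: 4 + length]
        return frames
```

A single read completion may finish zero, one, or several frames, which is
why the extraction loops until the buffer no longer holds a full frame.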
### Teardown
On transport collapse (read error, EOF, or `transport_error` received):
1. Send `transport_error` to the remote side if the connection is still
writable.
2. Send an exit signal to every proxy in the proxy table.
3. Clear the proxy table.
4. Unregister from the local `env`.
5. Exit self with reason `"transport_error"`.
Because the endpoint is linked to all its proxies, and the proxies are
linked to local actors that hold references to them, the exit propagates
naturally through the local actor graph.

---
## Proxy Actor
### Responsibilities
- Present a local `pid` that any local actor can send to or link against.
- Forward all received messages to the endpoint for transmission.
- Forward link requests to the endpoint.
- On exit signal received from the endpoint (transport collapse or remote
actor exit): exit self, propagating to any linked local actors.
### State
```zig
const Proxy = struct {
    allocator: std.mem.Allocator,
    endpoint: tp.pid, // the local endpoint actor
    remote_id: u64, // the remote actor's opaque ID
};
```
### Lifecycle
A proxy is created by the endpoint in two situations:
1. **Inbound message with unknown `from_id`**: The endpoint receives a wire
message from a remote actor ID it has not seen before. It spawns a proxy
for that ID and records it in the proxy table before delivering the message
locally.
2. **Explicit lookup response (`proxy_id`)**: After a `send_named`
exchange, the endpoint now knows the remote ID for a named actor and
creates a proxy for it.
A proxy is destroyed when:
- It receives an exit signal from the endpoint (which forwards the remote
actor's exit reason).
- The transport collapses and the endpoint exits all proxies.
When a proxy exits, it notifies the endpoint via `{"proxy_exit", remote_id,
reason}` so the endpoint can remove it from the proxy table. This prevents
the table from growing unboundedly over a long-lived connection.
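The on-demand creation and `proxy_exit` cleanup can be sketched together
(Python, illustrative; `spawn_proxy` stands in for spawning a real proxy
actor and returning its pid):

```python
class ProxyTable:
    """Endpoint-side table of remote_id -> local proxy, created on demand."""

    def __init__(self, spawn_proxy):
        self.proxies: dict[int, object] = {}
        self.spawn_proxy = spawn_proxy  # callable: remote_id -> proxy pid

    def ensure(self, remote_id: int):
        # Inbound message with an unknown from_id: create the proxy first,
        # then the message is delivered with the proxy substituted as `from`.
        proxy = self.proxies.get(remote_id)
        if proxy is None:
            proxy = self.spawn_proxy(remote_id)
            self.proxies[remote_id] = proxy
        return proxy

    def on_proxy_exit(self, remote_id: int):
        # A proxy reported its exit: drop the entry so the table does not
        # grow without bound over a long-lived connection.
        self.proxies.pop(remote_id, None)
```

If the same remote actor is referenced again after its proxy exits, a fresh
proxy is created on the next inbound reference, which matches the on-demand
lifecycle above.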
### Message Handling
Every message received by the proxy is forwarded to the endpoint, including
the `from` address:

```
receive(from, m):
    endpoint.send({"send", from_id(from), self.remote_id, m})
```
The `from` address is a local `pid_ref`. Before forwarding, the proxy must
resolve it to a local actor ID — or request that the endpoint assign one if
this is the first time this local actor has sent outbound. The remote side
will create a proxy for the `from_id` if one does not already exist, so
that the remote actor can reply directly to the originating actor.

---
## File Layout
```
src/remote/
├── endpoint.zig # Endpoint actor
├── proxy.zig # Proxy actor
├── framing.zig # Length-prefix read/write helpers
└── protocol.zig # Wire message encoding/decoding (CBOR envelopes)
```
The `framing.zig` module provides two functions:
```zig
pub fn write_frame(writer: anytype, payload: []const u8) !void;
pub fn read_frame(reader: anytype, buf: []u8) ![]u8;
```
The `protocol.zig` module provides encode/decode for each wire message
type, working directly with `cbor` values.

---
## Design Decisions and Rationale
### Why one endpoint per connection?
With child process transport the relationship is strictly 1:1, so there is
never more than one endpoint per remote system. When TCP is added, a
listener will spawn one endpoint per accepted connection - the same model.
Keeping endpoint state entirely local to one actor avoids shared mutable
state and fits the actor model cleanly.
### Why on-demand proxy creation?
Explicit proxy management (create before use, destroy explicitly) would
require a handshake protocol and additional message types. On-demand
creation based on observed `from_id` values is simpler and covers the
primary use case: an actor on the remote side sends you a message, and you
need to be able to reply to it. Explicit creation can be added later for
the named-lookup case.
### Child Process Transport and `subprocess.zig`
The parent-side endpoint uses `subprocess.zig` as-is to spawn the child and
communicate over pipes. `subprocess.zig` delivers incoming bytes as
`{"stream", "stdout", "read_complete", bytes}` messages which the endpoint
receives and accumulates into frames.
The child side cannot use `subprocess.zig` — it must read from its own
stdin and write to its own stdout. A separate _stdio endpoint_ variant
wraps file descriptor 0 (stdin) and file descriptor 1 (stdout) directly,
using the same `thespian/c/file_stream.h` primitives, and presents
identical behaviour to the parent-side endpoint once running.
### Child Process Endpoint Modes
The child process endpoint will eventually support two modes:
- **Fork+exec** (different binary): the parent spawns an entirely separate
executable. The child starts fresh, initialises a Thespian context, and
runs the stdio endpoint. This is the general case for connecting to actors
in a different binary.
- **Fork-only** (same binary, no exec, or re-exec): the child is a fork of
the parent process. This avoids the overhead of loading a new binary and
allows the child to share code with the parent. Re-exec (where the child
exec's itself with a flag indicating it should run as a remote endpoint)
is an alternative that gives a clean address space without a separate
binary. Both variants use the stdio endpoint on the child side.
The distinction between modes is an implementation detail of how the child
is launched; the endpoint protocol is identical in both cases.
### Why CBOR for the wire protocol?
CBOR is already the native message format throughout Thespian. Using it on
the wire means the payload of a `send` message is the actor message
verbatim - no transcoding required. The framing overhead is minimal (4
bytes per message).
### Why 64-bit opaque IDs?
A simple monotonic counter per endpoint is collision-free within a
connection lifetime and requires no coordination. Named actors get an ID
assigned at first reference. IDs are connection-scoped, not globally
unique, which is sufficient for the 1:1 child process model.

---
## Open Questions (Deferred)
- **System identity**: When TCP is added, endpoints will need to identify
themselves to each other (to detect loops, to route correctly in multi-hop
scenarios). A UUID or similar token exchanged in a handshake is the likely
approach.
- **Backpressure**: The current model has no backpressure - a fast sender
can overwhelm a slow transport. This is acceptable for the initial
implementation but will need attention under load.
- **Named actor re-registration**: If a well-known actor exits and is
restarted under the same name, proxies on the remote side will hold stale
IDs. A generation counter or re-lookup mechanism will be needed.