The Initialization phase allows Clients and Agents to negotiate protocol versions, capabilities, and authentication methods.

Before a Session can be created, Clients MUST initialize the connection by calling the initialize method with the latest protocol version they support and the capabilities they offer:
{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": 1,
    "clientCapabilities": {
      "fs": {
        "readTextFile": true,
        "writeTextFile": true
      }
    }
  }
}
The Agent MUST respond with the chosen protocol version and the capabilities it supports:
{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": 1,
    "agentCapabilities": {
      "loadSession": true,
      "promptCapabilities": {
        "image": true,
        "audio": true,
        "embeddedContext": true
      }
    },
    "authMethods": []
  }
}
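
As a minimal sketch of the exchange above, the following Python builds the initialize request and reads the fields of the response. The helper names `build_initialize_request` and `parse_initialize_response` are hypothetical and not part of the protocol. How the message is actually transported is out of scope here.

```python
import json


def build_initialize_request(request_id, protocol_version, client_capabilities):
    """Return the JSON-RPC initialize request as a dict (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "clientCapabilities": client_capabilities,
        },
    }


def parse_initialize_response(raw, expected_id):
    """Extract (version, agent capabilities, auth methods) from a raw JSON response."""
    message = json.loads(raw)
    if message.get("id") != expected_id:
        raise ValueError("response id does not match the initialize request")
    if "result" not in message:
        raise ValueError("initialize failed: %r" % (message.get("error"),))
    result = message["result"]
    return (
        result["protocolVersion"],
        # Omitted fields are read as empty, i.e. nothing is advertised.
        result.get("agentCapabilities", {}),
        result.get("authMethods", []),
    )


# Same payload as the example request above.
request = build_initialize_request(
    0, 1, {"fs": {"readTextFile": True, "writeTextFile": True}}
)
wire = json.dumps(request)
```

The serialized `wire` string is what a Client would send. The tuple returned by `parse_initialize_response` is what it would keep for the rest of the connection.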

Protocol Version

The protocol version that appears in initialize requests and responses is a single integer identifying a MAJOR protocol version. It is incremented only when breaking changes are introduced. Clients and Agents MUST agree on a protocol version and act according to its specification. See Capabilities to learn how non-breaking features are introduced.

Version Negotiation

The initialize request MUST include the latest protocol version the Client supports. If the Agent supports the requested version, it MUST respond with the same version. Otherwise, the Agent MUST respond with the latest version it supports. If the Client does not support the version specified by the Agent in the initialize response, the Client SHOULD close the connection and inform the user about it.
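
The negotiation rules above can be sketched as two small functions, one for each side. This assumes each peer tracks the versions it supports as a set of integers. The function names are hypothetical and chosen only for illustration.

```python
def agent_choose_version(requested, agent_supported):
    """Agent side: echo the requested version if supported, else the Agent's latest."""
    if requested in agent_supported:
        return requested
    return max(agent_supported)


def client_accepts_version(chosen, client_supported):
    """Client side: True if the Client can speak the version the Agent chose."""
    return chosen in client_supported
```

For example, if a Client requests version 3 from an Agent that supports only versions 1 and 2, the Agent answers with 2. A Client that supports only version 3 would then get `False` from `client_accepts_version` and should close the connection and inform the user.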

Capabilities

Capabilities describe features supported by the Client and the Agent. All capabilities included in the initialize request are OPTIONAL. Clients and Agents SHOULD support all possible combinations of their peer’s capabilities.

The introduction of new capabilities is not considered a breaking change. Therefore, Clients and Agents MUST treat all capabilities omitted in the initialize request as UNSUPPORTED.

Capabilities are high-level and are not attached to a specific base protocol concept. Capabilities may specify the availability of protocol methods, notifications, or a subset of their parameters. They may also signal behaviors of the Agent or Client implementation.
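
One way to honor the "omitted means unsupported" rule is to read every boolean capability through a single lookup that defaults to `False`. The sketch below assumes capabilities are held as nested dictionaries decoded from JSON. `has_capability` is a hypothetical helper, not a protocol-defined function.

```python
def has_capability(capabilities, *path):
    """Return True only if the boolean flag at `path` is present and set to true.

    Anything missing along the path counts as unsupported.
    """
    node = capabilities
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    # Only a literal JSON `true` enables the feature.
    return node is True
```

With the capabilities `{"fs": {"readTextFile": True}}`, the call `has_capability(caps, "fs", "readTextFile")` is `True`, while `has_capability(caps, "fs", "writeTextFile")` is `False` because the flag was omitted. Keys the reader does not recognize are never looked up, so a peer that advertises newer capabilities does not break an older implementation.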

Client Capabilities

The Client SHOULD specify whether it supports the following capability:

File System

The Client MAY expose its File System abstraction to varying degrees:
readTextFile (boolean): The fs/read_text_file method is available.
writeTextFile (boolean): The fs/write_text_file method is available.

Agent Capabilities

The Agent SHOULD specify whether it supports the following capabilities:
loadSession (boolean, default: false): The session/load method is available.
promptCapabilities (PromptCapabilities Object): Object indicating the different types of content that may be included in session/prompt requests.

Prompt Capabilities

As a baseline, all Agents MUST support ContentBlock::Text and ContentBlock::ResourceLink in session/prompt requests. Optionally, they MAY support richer types of content by specifying the following capabilities:
image (boolean, default: false): The prompt may include ContentBlock::Image.
audio (boolean, default: false): The prompt may include ContentBlock::Audio.
embeddedContext (boolean, default: false): The prompt may include ContentBlock::Resource.
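
A Client could use these flags to check a prompt before sending it. The sketch below is illustrative only. It assumes each content block is a dictionary with a `"type"` field, and that the wire-level type strings are `text`, `resource_link`, `image`, `audio`, and `resource`. Those strings and the helper name `unsupported_blocks` are assumptions not stated in this section.

```python
# Assumed mapping from a content block's "type" string to the prompt capability
# that gates it. None marks the baseline types every Agent must accept.
REQUIRED_CAPABILITY = {
    "text": None,
    "resource_link": None,
    "image": "image",
    "audio": "audio",
    "resource": "embeddedContext",
}


def unsupported_blocks(prompt, prompt_capabilities):
    """Return the types of the blocks in `prompt` that the Agent did not advertise."""
    rejected = []
    for block in prompt:
        block_type = block.get("type")
        if block_type not in REQUIRED_CAPABILITY:
            # Unknown type: treat it as unsupported.
            rejected.append(block_type)
            continue
        needed = REQUIRED_CAPABILITY[block_type]
        # Omitted flags default to false.
        if needed is not None and prompt_capabilities.get(needed, False) is not True:
            rejected.append(block_type)
    return rejected
```

An empty result means the prompt uses only content the Agent declared support for. Otherwise the Client can drop or convert the listed blocks before calling session/prompt.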

Once the connection is initialized, you’re ready to create a session and begin the conversation with the Agent.