HyperAgent SDK
HyperAgent Class
The HyperAgent class provides an interface for running autonomous web agents. It manages browser instances, task execution, and integration with Model Context Protocol (MCP) servers.
Creating a HyperAgent Instance
Initializing
Description: Initializes a new instance of HyperAgent. Configures the language model, browser provider (local or Hyperbrowser), debug mode, and custom actions.
Parameters:
params (HyperAgentConfig, optional): Configuration object.
  llm (BaseChatModel, optional): The language model instance to use. Defaults to GPT-4o if the OPENAI_API_KEY environment variable is set.
  browserProvider ("Hyperbrowser" | "Local", optional): Specifies whether to use Hyperbrowser's cloud browsers or a local Playwright instance. Defaults to Local.
  debug (boolean, optional): Enables detailed logging if set to true. Defaults to false. Logs are dumped to the ./debug directory.
Throws:
HyperagentError if no LLM is provided and the OPENAI_API_KEY environment variable is not set.
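The configuration options above can be sketched as follows. This is a minimal illustration, not the SDK's own code: AgentConfigSketch is a local stand-in for HyperAgentConfig (which may contain more fields, such as llm), and canUseDefaultLlm is a hypothetical helper that mirrors the documented Throws condition so it can be checked before constructing the agent.

```typescript
// Local stand-ins for the documented config shape (assumption: the real
// HyperAgentConfig type is exported by the SDK and may have more fields).
type BrowserProvider = "Hyperbrowser" | "Local";

interface AgentConfigSketch {
  browserProvider?: BrowserProvider;
  debug?: boolean;
}

// Hypothetical helper: true when the constructor would have a model to use,
// either a custom llm or the default enabled by OPENAI_API_KEY.
function canUseDefaultLlm(
  hasCustomLlm: boolean,
  env: Record<string, string | undefined>
): boolean {
  return hasCustomLlm || Boolean(env["OPENAI_API_KEY"]);
}

// Local browser with debug logging enabled.
const config: AgentConfigSketch = { browserProvider: "Local", debug: true };

// With the SDK installed, the config would be passed to the constructor:
//   const agent = new HyperAgent(config);
```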
Browser and Page Management
List all current pages
async getPages(): Promise<HyperPage[]>
Description: Retrieves all currently open pages within the agent's browser context. Each page is enhanced with ai and aiAsync methods for task execution.
Returns:
Promise<HyperPage[]> - An array of HyperPage objects.
Usage:
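A minimal sketch of listing open pages. PageLike and PagesSource are local stand-ins for HyperPage and HyperAgent so the snippet compiles on its own; url() is the standard Playwright Page method, which a HyperPage is assumed to inherit.

```typescript
// Stand-in types: only the members this sketch touches.
interface PageLike {
  url(): string;
}

interface PagesSource {
  getPages(): Promise<PageLike[]>;
}

// Collect the URL of every page currently open in the agent's context.
async function listOpenUrls(agent: PagesSource): Promise<string[]> {
  const pages = await agent.getPages();
  return pages.map((page) => page.url());
}
```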
Create a new page
async newPage(): Promise<HyperPage>
Description: Creates and returns a new page (tab) in the agent's browser context. The returned page is enhanced with ai and aiAsync methods.
Returns:
Promise<HyperPage> - A new HyperPage object.
Usage:
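A minimal sketch of opening a tab and running a task on it. The stand-in types are local; goto is the standard Playwright navigation method, while the exact signature of ai (a task string in, a result out) is an assumption, since this page only names the method.

```typescript
// Stand-in for HyperPage: navigation plus the ai method named above.
interface HyperPageLike {
  goto(url: string): Promise<unknown>;
  ai(task: string): Promise<unknown>; // assumed signature
}

interface PageFactory {
  newPage(): Promise<HyperPageLike>;
}

// Open a fresh tab, navigate to the URL, then run one task on that tab.
async function runOnFreshPage(
  agent: PageFactory,
  url: string,
  task: string
): Promise<unknown> {
  const page = await agent.newPage();
  await page.goto(url);
  return page.ai(task);
}
```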
Get current page
async getCurrentPage(): Promise<Page>
Description: Gets the agent's currently active page. If no page exists or the current page is closed, it creates a new one. Note: This returns a standard Playwright Page object, not a HyperPage.
Returns:
Promise<Page> - The current or a new Playwright Page.
Usage:
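A minimal sketch of inspecting the active page. Because the result is a plain Playwright Page, only standard Playwright methods (title and url here) are used; the interfaces are local stand-ins.

```typescript
// Stand-in for the Playwright Page members used below.
interface PlainPage {
  url(): string;
  title(): Promise<string>;
}

interface CurrentPageSource {
  getCurrentPage(): Promise<PlainPage>;
}

// Build a short "Title (url)" label for the agent's active page.
async function describeCurrentPage(agent: CurrentPageSource): Promise<string> {
  const page = await agent.getCurrentPage();
  return `${await page.title()} (${page.url()})`;
}
```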
Close agent
async closeAgent(): Promise<void>
Description: Closes the agent, including the browser instance, browser context, and any active MCP connections. Cancels any tasks that are still running or paused.
Returns:
Promise<void>
Usage:
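One way to make sure the browser and MCP connections are always released is to wrap the work in try/finally. This is a sketch of that pattern, not part of the SDK; withAgent is a hypothetical helper and Closable is a local stand-in for HyperAgent.

```typescript
// Stand-in: anything exposing closeAgent() as documented above.
interface Closable {
  closeAgent(): Promise<void>;
}

// Run `work`, then close the agent whether `work` succeeded or threw.
async function withAgent<A extends Closable, T>(
  agent: A,
  work: (agent: A) => Promise<T>
): Promise<T> {
  try {
    return await work(agent);
  } finally {
    await agent.closeAgent();
  }
}
```

Since closeAgent also cancels running or paused tasks, any results should be read inside the callback.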
Task Execution
Execute a task
async executeTask(task: string, params?: TaskParams, initPage?: Page): Promise<TaskOutput>
Description: Executes the given task instruction and waits for it to finish. The agent uses its LLM and configured actions to perform the task on a browser page. The returned promise resolves with the final output once the task completes, or rejects if the task fails.
Parameters:
task (string): The natural language instruction for the task (e.g., "Find the contact email on this page").
params (TaskParams, optional): Optional task settings, such as the outputSchema mentioned under Returns.
initPage (Page, optional): A specific Playwright Page to start the task on. Defaults to the agent's currentPage.
Returns:
Promise<TaskOutput> - The result of the task, typically a string summary or structured data if outputSchema was provided.
Throws: Rethrows any error encountered during task execution.
Usage:
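Because executeTask rethrows errors, callers usually want a try/catch around it. The sketch below converts the outcome into a result object; TaskRunner is a local stand-in, the output is typed as unknown because the TaskOutput shape is not spelled out here, and tryTask is a hypothetical helper.

```typescript
// Stand-in for the executeTask method (optional arguments omitted).
interface TaskRunner {
  executeTask(task: string): Promise<unknown>;
}

type Outcome =
  | { ok: true; output: unknown }
  | { ok: false; error: string };

// Run a task to completion and capture either its output or the error text.
async function tryTask(agent: TaskRunner, task: string): Promise<Outcome> {
  try {
    return { ok: true, output: await agent.executeTask(task) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```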
Execute a task asynchronously
async executeTaskAsync(task: string, params?: TaskParams, initPage?: Page): Promise<Task>
Description: Executes a given task instruction asynchronously. It immediately returns a Task control object, allowing management (pause, resume, cancel) of the background task.
Parameters:
task (string): The natural language instruction for the task (e.g., "Find the contact email on this page").
params (TaskParams, optional): Optional task settings.
initPage (Page, optional): A specific Playwright Page to start the task on. Defaults to the agent's currentPage.
Returns:
Promise<Task> - A Task control object with methods:
  getStatus(): TaskStatus
  pause(): TaskStatus
  resume(): TaskStatus
  cancel(): TaskStatus
Usage:
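A minimal sketch of driving the Task control object. TaskStatus is treated as an opaque string here because its concrete values are not listed on this page; the interfaces are local stand-ins and pauseAndResume is a hypothetical helper.

```typescript
// Assumption: TaskStatus can be handled as a string-like value.
type StatusLike = string;

interface TaskControl {
  getStatus(): StatusLike;
  pause(): StatusLike;
  resume(): StatusLike;
  cancel(): StatusLike;
}

interface AsyncRunner {
  executeTaskAsync(task: string): Promise<TaskControl>;
}

// Start a background task, pause it, resume it, and record each status.
async function pauseAndResume(
  agent: AsyncRunner,
  task: string
): Promise<StatusLike[]> {
  const control = await agent.executeTaskAsync(task);
  const seen: StatusLike[] = [control.getStatus()];
  seen.push(control.pause());
  seen.push(control.resume());
  return seen;
}
```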
Model Context Protocol (MCP) Integration
MCP allows extending the agent's capabilities by connecting to external servers that provide additional tools/actions.
Initialize an MCP client
async initializeMCPClient(config: MCPConfig): Promise<void>
Description: Initializes the MCP client and attempts to connect to all servers specified in the configuration. Actions provided by successfully connected servers are registered with the agent.
Parameters:
config (MCPConfig): Configuration object containing an array of servers (each with connection details like URL and ID).
Returns:
Promise<void>
Usage:
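Since initialization only attempts each connection, it can help to check afterwards which servers actually connected. The sketch below assumes every server entry carries an id field and that getMCPServerIds reports those same IDs; both are assumptions, and the remaining connection fields are left open because their names are not given here.

```typescript
// Assumed entry shape: an id plus whatever connection details the SDK expects.
interface ServerConfigSketch {
  id: string;
  [key: string]: unknown;
}

interface McpConfigSketch {
  servers: ServerConfigSketch[];
}

interface McpHost {
  initializeMCPClient(config: McpConfigSketch): Promise<void>;
  getMCPServerIds(): string[];
}

// Initialize MCP, then split configured server IDs into connected and missing.
async function initAndReport(
  agent: McpHost,
  config: McpConfigSketch
): Promise<{ connected: string[]; missing: string[] }> {
  await agent.initializeMCPClient(config);
  const connected = agent.getMCPServerIds();
  const missing = config.servers
    .map((server) => server.id)
    .filter((id) => connected.indexOf(id) === -1);
  return { connected, missing };
}
```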
Connect to a single MCP server
async connectToMCPServer(serverConfig: MCPServerConfig): Promise<string | null>
Description: Connects to a single MCP server at runtime. Registers actions provided by the server if the connection is successful.
Parameters:
serverConfig (MCPServerConfig): Configuration for the specific server to connect to.
Returns:
Promise<string | null> - The server ID if connection was successful, otherwise null.
Usage:
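A failed connection is reported as null rather than as an exception, so the return value needs an explicit check. A minimal sketch, with the server configuration typed loosely because its fields are not listed here; connectOrThrow is a hypothetical helper.

```typescript
// Stand-in for the connectToMCPServer method.
interface McpConnector {
  connectToMCPServer(
    serverConfig: Record<string, unknown>
  ): Promise<string | null>;
}

// Connect and return the server ID, turning a null result into an error.
async function connectOrThrow(
  agent: McpConnector,
  serverConfig: Record<string, unknown>
): Promise<string> {
  const serverId = await agent.connectToMCPServer(serverConfig);
  if (serverId === null) {
    throw new Error("MCP server connection failed");
  }
  return serverId;
}
```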
Disconnect from an MCP server
disconnectFromMCPServer(serverId: string): boolean
Description: Disconnects from a specific MCP server identified by its ID. Note: This does not automatically unregister the actions provided by that server.
Parameters:
serverId (string): The ID of the server to disconnect from.
Returns:
boolean - true if disconnection was successful or the server wasn't connected, false if an error occurred.
Usage:
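Because true covers both "disconnected" and "was never connected", checking isMCPServerConnected first tells the two apart. A minimal sketch with local stand-in types; disconnectIfConnected is a hypothetical helper.

```typescript
// Stand-in for the two MCP methods used below.
interface McpDisconnector {
  isMCPServerConnected(serverId: string): boolean;
  disconnectFromMCPServer(serverId: string): boolean;
}

type DisconnectResult = "not-connected" | "disconnected" | "failed";

// Disconnect only when connected, and report which case occurred.
function disconnectIfConnected(
  agent: McpDisconnector,
  serverId: string
): DisconnectResult {
  if (!agent.isMCPServerConnected(serverId)) {
    return "not-connected";
  }
  return agent.disconnectFromMCPServer(serverId) ? "disconnected" : "failed";
}
```

Actions registered by the server remain registered after disconnecting, as noted in the description.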
Check if an MCP server is connected
isMCPServerConnected(serverId: string): boolean
Description: Checks if the agent is currently connected to a specific MCP server.
Parameters:
serverId (string): The ID of the server to check.
Returns:
boolean - true if connected, false otherwise.
Usage:
List MCP server IDs
getMCPServerIds(): string[]
Description: Retrieves the IDs of all currently connected MCP servers.
Returns:
string[] - An array of connected server IDs.
Usage:
Get MCP server info
getMCPServerInfo(): Array<{ id: string; toolCount: number; toolNames: string[]; }> | null
Description: Gets information about all connected MCP servers, including their IDs and the tools (actions) they provide.
Returns:
Array<{ id: string; toolCount: number; toolNames: string[]; }> | null - An array of server info objects, or null if the MCP client isn't initialized.
Usage:
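A minimal sketch of handling both return cases. The ServerInfo type copies the element shape from the signature above; summarizeServers is a hypothetical helper that formats the result of getMCPServerInfo().

```typescript
// Element shape taken from the documented return type.
type ServerInfo = { id: string; toolCount: number; toolNames: string[] };

// Turn the server info (or null) into one printable line per server.
function summarizeServers(info: ServerInfo[] | null): string[] {
  if (info === null) {
    return ["MCP client not initialized"];
  }
  return info.map(
    (server) =>
      `${server.id}: ${server.toolCount} tool(s) [${server.toolNames.join(", ")}]`
  );
}

// With an agent instance: summarizeServers(agent.getMCPServerInfo())
```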
Utility Methods
Pretty print contents of an action output
pprintAction(action: ActionType): string
Description: Generates a human-readable string representation of an agent action, if a pretty-print function is defined for that action type. Useful for logging or debugging.
Parameters:
action (ActionType): The action object (containing type and params).
Returns:
string - A formatted string representing the action, or an empty string if no specific pretty-print function exists for the action type.
Usage:
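Since an empty string means no pretty-printer exists for the action type, a logging helper can fall back to a generic rendering. A minimal sketch; ActionLike and ActionPrinter are local stand-ins, and the JSON fallback format is this example's own choice, not SDK behavior.

```typescript
// Stand-in for ActionType: the two fields named in the parameter description.
interface ActionLike {
  type: string;
  params: unknown;
}

interface ActionPrinter {
  pprintAction(action: ActionLike): string;
}

// Prefer the agent's pretty-printed form; otherwise fall back to type + JSON.
function describeAction(agent: ActionPrinter, action: ActionLike): string {
  const pretty = agent.pprintAction(action);
  if (pretty !== "") {
    return pretty;
  }
  return `${action.type} ${JSON.stringify(action.params)}`;
}
```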