HyperAgent Types

HyperbrowserConfig

When browserProvider is set to Hyperbrowser, this object configures the cloud browser session. It accepts parameters defined in the Hyperbrowser Session Parameters documentation. Key categories include:

  • browserConfig : Parameters passed to Playwright's connectOverCDP method.

  • sessionConfig : Parameters used when creating a Hyperbrowser session. A detailed list of parameters can be found in the session configuration parameters documentation.

  • hyperbrowserConfig : Parameters used when creating a Hyperbrowser instance. Use this to pass the API key to Hyperbrowser when passing it through the environment isn't possible.
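As a rough illustration of how the three categories fit together, the sketch below builds such an object against a local stand-in type. Only the three top-level keys come from this page. The nested keys are assumptions: timeout is an option of Playwright's connectOverCDP, while useProxy and apiKey are illustrative names that should be checked against the Hyperbrowser documentation.

```typescript
// Local stand-in type for illustration; the real type is exported by the library.
interface HyperbrowserConfigSketch {
  browserConfig?: Record<string, unknown>; // forwarded to Playwright's connectOverCDP
  sessionConfig?: Record<string, unknown>; // used when creating the Hyperbrowser session
  hyperbrowserConfig?: Record<string, unknown>; // used when creating the Hyperbrowser instance
}

const cloudConfig: HyperbrowserConfigSketch = {
  // connectOverCDP accepts a timeout in milliseconds
  browserConfig: { timeout: 30000 },
  // Assumed session parameter name; see the session configuration parameters
  sessionConfig: { useProxy: true },
  // Assumed key name for the API key; placeholder value, not a real key
  hyperbrowserConfig: { apiKey: "YOUR_HYPERBROWSER_API_KEY" },
};

const configuredCategories = Object.keys(cloudConfig);
```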

LocalOptions

When browserProvider is Local, this object passes configuration directly to Playwright's browserType.launch() method.

Refer to the Playwright launch options documentation for all available parameters.
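For orientation, a minimal sketch of a LocalOptions value follows. The local type lists only three of Playwright's launch options (headless, slowMo, args); it is a simplified stand-in, not the library's own type.

```typescript
// Simplified stand-in covering a few Playwright launch options.
interface LocalOptionsSketch {
  headless?: boolean; // run the browser without a visible window
  slowMo?: number; // slow each Playwright operation down by this many milliseconds
  args?: string[]; // extra command-line arguments for the browser process
}

const localOptions: LocalOptionsSketch = {
  headless: false, // show the browser window while developing
  slowMo: 100,
  args: ["--window-size=1280,800"], // Chromium flag setting the initial window size
};
```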

TaskParams

This object provides additional configuration and callback hooks when calling executeTask or executeTaskAsync, or when running tasks through page.ai.

  • maxSteps (number, optional): Sets a maximum limit on the number of steps the agent can take to complete the task. Helps prevent infinite loops or excessive execution time.

  • debugDir (string, optional): Specifies a directory path where debug information (such as agent outputs and screenshots) is saved during task execution. Used only if the debug flag is set in the HyperAgent config. Defaults to ./debug if unset.

  • outputSchema (z.AnyZodObject, optional): A Zod schema defining the desired structure for the task's final output. When provided, the agent attempts to format its result according to this schema. Essential for extracting structured data. More information is available in the output-to-schema example.

    import { z } from "zod";

    // Example: expect an object with a name and an email address
    const schema = z.object({
      name: z.string(),
      email: z.string().email(),
    });

    // `agent` is an already-constructed HyperAgent instance
    const result = await agent.executeTask(
      "Find the contact person and their email on this page.",
      { outputSchema: schema }
    );
    // result.output might look like: { "name": "Jane Doe", "email": "jane.doe@example.com" }

  • onStep ((step: AgentStep) => void, optional): A callback function that is executed after each step the agent completes. Receives an AgentStep object containing details about the agent's output and the actions taken in that step.

  • onComplete ((output: TaskOutput) => void, optional): A callback function executed when the task successfully completes. Receives the final TaskOutput object.

  • debugOnAgentOutput ((step: AgentOutputType) => void, optional): A more granular callback specifically for debugging. It's executed with the raw AgentOutputType (agent's thoughts, next goal, planned actions) before actions are executed in a step.
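To make the timing of the hooks concrete, here is a self-contained sketch. simulateTask is a hypothetical, synchronous stand-in written for this example; it is not the library's executeTask and does not drive a browser. The types are simplified local copies of the ones described on this page. How the real library reports a task that runs out of steps is not stated here, so the stub simply marks it 'failed'.

```typescript
// Simplified local copies of the documented types (assumptions, not the library's exports).
interface AgentStepSketch {
  idx: number;
  agentOutput: { thoughts: string; memory: string; nextGoal: string; actions: unknown[] };
  actionOutputs: unknown[];
}
interface TaskOutputSketch {
  status?: string;
  steps: AgentStepSketch[];
  output?: string;
}
interface TaskParamsSketch {
  maxSteps?: number;
  onStep?: (step: AgentStepSketch) => void;
  onComplete?: (output: TaskOutputSketch) => void;
}

// Hypothetical stand-in: fabricates one step per goal so the hooks can be observed.
function simulateTask(plannedGoals: string[], params: TaskParamsSketch): TaskOutputSketch {
  const limit = params.maxSteps ?? plannedGoals.length;
  const steps: AgentStepSketch[] = [];
  for (const goal of plannedGoals.slice(0, limit)) {
    const step: AgentStepSketch = {
      idx: steps.length,
      agentOutput: { thoughts: "", memory: "", nextGoal: goal, actions: [] },
      actionOutputs: [],
    };
    steps.push(step);
    if (params.onStep) params.onStep(step); // fires after each completed step
  }
  const finished = steps.length === plannedGoals.length;
  const taskOutput: TaskOutputSketch = { status: finished ? "completed" : "failed", steps };
  if (finished) {
    taskOutput.output = "done";
    if (params.onComplete) params.onComplete(taskOutput); // fires only on success
  }
  return taskOutput;
}

const stepLog: string[] = [];
const simulated = simulateTask(["open page", "find email", "report"], {
  maxSteps: 2, // stops the three-goal task after two steps
  onStep: (step) => {
    stepLog.push(`step ${step.idx}: ${step.agentOutput.nextGoal}`);
  },
  onComplete: () => {
    stepLog.push("complete");
  },
});
```

In this run the step limit is reached before the third goal, so onStep fires twice and onComplete is never called.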

AgentOutput

  • Description: Represents the structured output generated by the agent's language model in a single reasoning cycle. It includes the agent's thought process, intermediate memory, the next goal it aims to achieve, and the specific actions it plans to execute to reach that goal.

  • Properties:

    • thoughts (string): The agent's reasoning about the current state and the previous action's success.

    • memory (string): Information the agent decides to retain for subsequent steps.

    • nextGoal (string): The immediate objective the agent sets for the upcoming actions.

    • actions (Array<ActionType>): An array of actions the agent intends to perform.
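The properties above could be modelled roughly as follows. The shape of an individual action is not described on this page, so ActionSketch and the action names in the sample are hypothetical. A helper like describeAgentOutput is the kind of function one might call from a debugOnAgentOutput callback.

```typescript
// Hypothetical action shape; the real ActionType is defined by the library.
type ActionSketch = { type: string; params: Record<string, unknown> };

// Local mirror of the properties listed above.
interface AgentOutputShape {
  thoughts: string;
  memory: string;
  nextGoal: string;
  actions: ActionSketch[];
}

// Builds a one-line summary: the next goal followed by the planned action types.
function describeAgentOutput(agentOutput: AgentOutputShape): string {
  const actionTypes = agentOutput.actions.map((action) => action.type);
  return `${agentOutput.nextGoal} [${actionTypes.join(", ")}]`;
}

const sampleAgentOutput: AgentOutputShape = {
  thoughts: "The home page loaded, but it shows no contact details.",
  memory: "Company name: Example Corp",
  nextGoal: "Open the contact page",
  actions: [
    { type: "goToUrl", params: { url: "https://example.com" } },
    { type: "clickElement", params: { index: 4 } },
  ],
};

const agentOutputSummary = describeAgentOutput(sampleAgentOutput);
```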

AgentStep

  • Description: Encapsulates the details of a single step performed by the agent during task execution. It links the agent's reasoning output (AgentOutputType) with the actual results (ActionOutput[]) obtained from executing the planned actions.

  • Properties:

    • idx (number): The sequential index of this step within the task.

    • agentOutput (AgentOutputType): The agent's reasoning output for this step.

    • actionOutputs (ActionOutput[]): An array containing the results from executing each action planned in agentOutput.
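Because a step pairs the plan with its results, it is a natural place to look for failed actions. The sketch below assumes each ActionOutput carries a success flag and a message. That shape is an assumption made for illustration and should be checked against the library's type definitions.

```typescript
// Assumed shape of a single action result (not confirmed by this page).
interface ActionOutputSketch {
  success: boolean;
  message: string;
}

// Local mirror of AgentStep, reduced to the fields this example reads.
interface StepShape {
  idx: number;
  agentOutput: { nextGoal: string };
  actionOutputs: ActionOutputSketch[];
}

// Returns one line per failed action, prefixed with the step index.
function failedActions(step: StepShape): string[] {
  return step.actionOutputs
    .filter((actionOutput) => !actionOutput.success)
    .map((actionOutput) => `step ${step.idx}: ${actionOutput.message}`);
}

const sampleStep: StepShape = {
  idx: 3,
  agentOutput: { nextGoal: "Open the contact page" },
  actionOutputs: [
    { success: true, message: "Navigated" },
    { success: false, message: "Element not found" },
  ],
};

const stepFailures = failedActions(sampleStep);
```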

TaskOutput

  • Description: Represents the final outcome of a completed agent task. It includes the overall status, a history of all steps taken, and the final output string, if generated.

  • Properties:

    • status (TaskStatus, optional): The final status of the task (e.g., 'completed', 'failed').

    • steps (AgentStep[]): An array containing all the steps executed during the task.

    • output (string, optional): The final textual output or result produced by the task (e.g., a summary, extracted data).
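Since status and output are both optional, code that consumes a TaskOutput should check them before use. The following sketch does so. It assumes that the output string holds JSON when an outputSchema was supplied, which this page does not state explicitly, and it models only the two status values mentioned above.

```typescript
// Only the two status values named on this page; the real TaskStatus may have more.
type TaskStatusSketch = "completed" | "failed";

// Local mirror of the TaskOutput properties listed above.
interface TaskOutputShape {
  status?: TaskStatusSketch;
  steps: unknown[];
  output?: string;
}

// Returns parsed JSON when possible, the raw string otherwise,
// and undefined when the task did not complete or produced no output.
function readTaskOutput(task: TaskOutputShape): unknown {
  if (task.status !== "completed" || task.output === undefined) {
    return undefined;
  }
  try {
    return JSON.parse(task.output);
  } catch {
    return task.output; // plain text, such as a summary
  }
}

const structuredResult = readTaskOutput({
  status: "completed",
  steps: [],
  output: '{"name":"Jane Doe","email":"jane.doe@example.com"}',
});
const textResult = readTaskOutput({ status: "completed", steps: [], output: "Summary: no contact found." });
const failedResult = readTaskOutput({ status: "failed", steps: [] });
```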
