OpenAI CUA

OpenAI's Computer-Using Agent (CUA) is an AI model designed to interact with computer interfaces much like a human user. It can navigate graphical user interfaces (GUIs) and perform tasks such as clicking buttons, typing text, and managing multi-step processes, which lets it automate complex workflows without relying on specialized APIs. CUA is the model that powers Operator, OpenAI's AI agent for web-based tasks such as filling out forms, ordering products, and scheduling appointments. Operator uses CUA to mimic human interactions within a browser, so it can handle repetitive or intricate tasks on behalf of users.

Hyperbrowser's CUA agent lets you execute agent tasks on the web using OpenAI's Computer-Using Agent with a single call. Hyperbrowser exposes endpoints for starting and stopping a CUA task and for getting its status and results.

By default, CUA tasks are asynchronous: you start the task, then check its status until it completes. If you don't want to handle the monitoring yourself, our SDKs provide a simple function that handles the whole flow and returns the data once the task is completed.

Installation

npm install @hyperbrowser/sdk

yarn add @hyperbrowser/sdk

pip install hyperbrowser

uv add hyperbrowser

Usage

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const hbClient = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await hbClient.agents.cua.startAndWait({
    task: "what are the top 5 posts on Hacker News",
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

import os
from hyperbrowser import Hyperbrowser
from hyperbrowser.models import StartCuaTaskParams
from dotenv import load_dotenv

load_dotenv()

hb_client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


def main():
    resp = hb_client.agents.cua.start_and_wait(
        StartCuaTaskParams(
            task="what are the top 5 posts on Hacker News"
        )
    )

    print(f"Output:\n\n{resp.data.final_result}")


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"Error: {e}")

Start CUA task

curl -X POST https://app.hyperbrowser.ai/api/task/cua \
    -H 'Content-Type: application/json' \
    -H 'x-api-key: <YOUR_API_KEY>' \
    -d '{
        "task": "what are the top 5 posts on Hacker News"
    }'

Get CUA task status

curl https://app.hyperbrowser.ai/api/task/cua/{jobId}/status \
    -H 'x-api-key: <YOUR_API_KEY>'

Get CUA task

curl https://app.hyperbrowser.ai/api/task/cua/{jobId} \
    -H 'x-api-key: <YOUR_API_KEY>'

Stop CUA task

curl -X PUT https://app.hyperbrowser.ai/api/task/cua/{jobId}/stop \
    -H 'x-api-key: <YOUR_API_KEY>'
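The start-then-poll flow behind these endpoints can be sketched as a small waiting loop. This is a minimal illustration, not the SDK's implementation: `wait_for_task`, its parameters, and the status names in `TERMINAL_STATUSES` are assumptions made for the example (the page does not list the actual status values). `get_status` stands in for whatever call you use to hit the status endpoint.

```python
import time
from typing import Callable, Iterable

# Assumed terminal status names; the actual values are not listed on this page.
TERMINAL_STATUSES = {"completed", "failed", "stopped"}


def wait_for_task(
    get_status: Callable[[], str],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    terminal: Iterable[str] = TERMINAL_STATUSES,
) -> str:
    """Call get_status() until it returns a terminal status or the timeout elapses."""
    terminal_set = set(terminal)
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal_set:
            return status
        if waited >= timeout:
            raise TimeoutError(f"task still '{status}' after {waited:.0f}s")
        # Wait before asking again so the status endpoint is not hammered.
        sleep(poll_interval)
        waited += poll_interval


if __name__ == "__main__":
    # Stand-in for a real status request (GET .../task/cua/{jobId}/status).
    fake_statuses = iter(["pending", "running", "completed"])
    print(wait_for_task(lambda: next(fake_statuses), sleep=lambda _: None))
```

Passing `sleep` and `get_status` in as callables keeps the loop independent of any HTTP client and makes it easy to exercise without network access.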

Task parameters

task - The instruction or goal to be accomplished by CUA.

sessionId - An optional existing browser session ID to connect to instead of creating a new one.

maxFailures - The maximum number of consecutive failures allowed before the task is aborted.

maxSteps - The maximum number of interaction steps CUA can take to complete the task.

keepBrowserOpen - When enabled, keeps the browser session open after task completion.

The agent may not complete the task within the specified maxSteps. If that happens, try increasing the maxSteps parameter.
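As a rough sketch of how these parameters map onto a start-task request body, the helper below assembles the JSON payload and leaves out anything that was not set. `build_cua_payload` is a hypothetical helper written for this example, and the `maxSteps` value of 40 is arbitrary; only the parameter names come from the list above.

```python
import json
from typing import Any, Dict, Optional


def build_cua_payload(
    task: str,
    session_id: Optional[str] = None,
    max_failures: Optional[int] = None,
    max_steps: Optional[int] = None,
    keep_browser_open: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build the JSON body for a start-task request, omitting unset parameters."""
    if not task.strip():
        raise ValueError("task must be a non-empty instruction")
    candidates = {
        "task": task,
        "sessionId": session_id,
        "maxFailures": max_failures,
        "maxSteps": max_steps,
        "keepBrowserOpen": keep_browser_open,
    }
    # Drop parameters that were not provided so server-side defaults apply.
    return {key: value for key, value in candidates.items() if value is not None}


if __name__ == "__main__":
    # 40 is an arbitrary illustrative value, not a documented default.
    payload = build_cua_payload("what are the top 5 posts on Hacker News", max_steps=40)
    print(json.dumps(payload, indent=2))
```

If a task stops before finishing, calling the helper again with a larger `max_steps` is one way to retry with a higher step budget.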

Reuse Browser Session

You can pass an existing sessionId to the OpenAI CUA task so that it runs on that session. To keep the session open after the task finishes, also supply the keepBrowserOpen param.

In the examples below, the second call to the AI Agent does not set keepBrowserOpen to true, so the browser session closes after that task runs. The session is also stopped explicitly at the end with the stop function to make sure it gets closed.

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const hbClient = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const session = await hbClient.sessions.create();

  try {
    const result = await hbClient.agents.cua.startAndWait({
      task: "What is the title of the first post on Hacker News today?",
      sessionId: session.id,
      keepBrowserOpen: true,
    });

    console.log(`Output:\n${result.data?.finalResult}`);

    const result2 = await hbClient.agents.cua.startAndWait({
      task: "Tell me how many upvotes the first post has.",
      sessionId: session.id,
    });

    console.log(`\nOutput:\n${result2.data?.finalResult}`);
  } catch (err) {
    console.error(`Error: ${err}`);
  } finally {
    await hbClient.sessions.stop(session.id);
  }
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

import os
from hyperbrowser import Hyperbrowser
from hyperbrowser.models import StartCuaTaskParams
from dotenv import load_dotenv

load_dotenv()

hb_client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


def main():
    session = hb_client.sessions.create()

    try:
        resp = hb_client.agents.cua.start_and_wait(
            StartCuaTaskParams(
                task="What is the title of the first post on Hacker News today?",
                session_id=session.id,
                keep_browser_open=True,
            )
        )

        print(f"Output:\n{resp.data.final_result}")

        resp2 = hb_client.agents.cua.start_and_wait(
            StartCuaTaskParams(
                task="Tell me how many upvotes the first post has.",
                session_id=session.id,
            )
        )

        print(f"\nOutput:\n{resp2.data.final_result}")
    except Exception as e:
        print(f"Error: {e}")
    finally:
        hb_client.sessions.stop(session.id)


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"Error: {e}")
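The create / try / finally pattern used above can also be wrapped in a context manager so the session is always stopped. This is a sketch rather than part of the SDK: `managed_session` is a hypothetical helper, and it relies only on the `sessions.create()` and `sessions.stop(id)` calls shown in the examples.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def managed_session(client: Any) -> Iterator[Any]:
    """Create a browser session and always stop it, even if a task raises."""
    session = client.sessions.create()
    try:
        yield session
    finally:
        # Runs on success and on error, mirroring the finally block above.
        client.sessions.stop(session.id)


# Usage, assuming hb_client and StartCuaTaskParams from the example above:
#
# with managed_session(hb_client) as session:
#     hb_client.agents.cua.start_and_wait(
#         StartCuaTaskParams(
#             task="What is the title of the first post on Hacker News today?",
#             session_id=session.id,
#             keep_browser_open=True,
#         )
#     )
```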

Session Configurations

The sessionOptions apply only when a new session is created, that is, when no sessionId is provided.

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const hbClient = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await hbClient.agents.cua.startAndWait({
    task: "what are the top 5 posts on Hacker News",
    sessionOptions: {
      acceptCookies: true,
    }
  });

  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});

import os
from hyperbrowser import Hyperbrowser
from hyperbrowser.models import StartCuaTaskParams, CreateSessionParams
from dotenv import load_dotenv

load_dotenv()

hb_client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


def main():
    resp = hb_client.agents.cua.start_and_wait(
        StartCuaTaskParams(
            task="what are the top 5 posts on Hacker News",
            session_options=CreateSessionParams(
                accept_cookies=True,
            ),
        )
    )

    print(f"Output:\n\n{resp.data.final_result}")


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"Error: {e}")

curl -X POST https://app.hyperbrowser.ai/api/task/cua \
    -H 'Content-Type: application/json' \
    -H 'x-api-key: <YOUR_API_KEY>' \
    -d '{
        "task": "what are the top 5 posts on Hacker News",
        "sessionOptions": {
            "acceptCookies": true
        }
    }'
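Because sessionOptions only take effect when no sessionId is given, a client-side guard can make that rule explicit before a request is sent. The sketch below is illustrative only: `session_fields` is a hypothetical helper, and warning (rather than raising) when both are passed is a design choice made for this example.

```python
import warnings
from typing import Any, Dict, Optional


def session_fields(
    session_id: Optional[str] = None,
    session_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the session-related part of a start-task request body."""
    if session_id is not None:
        if session_options:
            # An existing session is reused as-is, so the options would have no effect.
            warnings.warn(
                "sessionOptions are dropped because sessionId is set",
                stacklevel=2,
            )
        return {"sessionId": session_id}
    if session_options:
        return {"sessionOptions": dict(session_options)}
    return {}


if __name__ == "__main__":
    body = {"task": "what are the top 5 posts on Hacker News"}
    body.update(session_fields(session_options={"acceptCookies": True}))
    print(body)
```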

Hyperbrowser's CAPTCHA solving and proxy usage features require a paid plan.

Using a proxy and solving CAPTCHAs will slow down web navigation in the CUA task, so use them only if necessary.


Last updated 23 days ago

CUA tasks can be configured with a number of parameters. The main ones are described under Task parameters above; for the complete list and the detailed usage/schema, see the API reference.

sessionOptions - Configuration for the browser session that is created to run the task. It applies only when no sessionId is provided.

Additionally, the browser session used by the AI Agent will time out based on your team's default Session Timeout settings or the session's timeoutMinutes parameter if provided. You can adjust the default Session Timeout in your team's settings.


You can also provide configurations for the session that will be used to execute the CUA task, just as you would when creating a new session. These could include using a proxy or solving CAPTCHAs. For the full list of session configurations, check out the session documentation.
