| title | Computer Action |
|---|---|
| openapi | POST /computer-use |
| description | Execute computer actions in the virtual desktop environment |
Execute actions like mouse movements, clicks, keyboard input, and screenshots in the Bytebot desktop environment.
The type of computer action to perform. Must be one of: `move_mouse`, `trace_mouse`, `click_mouse`, `press_mouse`, `drag_mouse`, `scroll`, `type_keys`, `press_keys`, `type_text`, `wait`, `screenshot`, `cursor_position`. The target coordinates to move to.<Expandable title="coordinates properties">
<ParamField body="x" type="number" required>
X coordinate (horizontal position)
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate (vertical position)
</ParamField>
</Expandable>
Example Request
{
"action": "move_mouse",
"coordinates": {
"x": 100,
"y": 200
}
}<Expandable title="path">
<ParamField body="0" type="object">
<Expandable title="properties">
<ParamField body="x" type="number" required>
X coordinate
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate
</ParamField>
</Expandable>
</ParamField>
</Expandable>
{" "}
Keys to hold while moving the mouse along the path.Example Request
{
"action": "trace_mouse",
"path": [
{ "x": 100, "y": 100 },
{ "x": 150, "y": 150 },
{ "x": 200, "y": 200 }
],
"holdKeys": ["shift"]
}<Expandable title="coordinates properties">
<ParamField body="x" type="number" required>
X coordinate (horizontal position)
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate (vertical position)
</ParamField>
</Expandable>
{" "}
Mouse button to click. Must be one of: `left`, `right`, `middle`.{" "}
Number of clicks to perform.{" "}
Keys to hold while clicking (e.g., ['ctrl', 'shift']) Key nameExample Request
{
"action": "click_mouse",
"coordinates": {
"x": 150,
"y": 250
},
"button": "left",
"clickCount": 2
}<Expandable title="coordinates properties">
<ParamField body="x" type="number" required>
X coordinate (horizontal position)
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate (vertical position)
</ParamField>
</Expandable>
{" "}
Mouse button to press/release. Must be one of: `left`, `right`, `middle`.{" "}
Whether to press or release the button. Must be one of: `up`, `down`.Example Request
{
"action": "press_mouse",
"coordinates": {
"x": 150,
"y": 250
},
"button": "left",
"press": "down"
}<Expandable title="path">
<ParamField body="0" type="object">
<Expandable title="properties">
<ParamField body="x" type="number" required>
X coordinate
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate
</ParamField>
</Expandable>
</ParamField>
</Expandable>
{" "}
Mouse button to use for dragging. Must be one of: `left`, `right`, `middle`.{" "}
Keys to hold while dragging.Example Request
{
"action": "drag_mouse",
"path": [
{ "x": 100, "y": 100 },
{ "x": 200, "y": 200 }
],
"button": "left"
}<Expandable title="coordinates properties">
<ParamField body="x" type="number" required>
X coordinate
</ParamField>
<ParamField body="y" type="number" required>
Y coordinate
</ParamField>
</Expandable>
{" "}
Scroll direction. Must be one of: `up`, `down`, `left`, `right`.{" "}
Number of scroll steps to perform.{" "}
Keys to hold while scrolling.Example Request
{
"action": "scroll",
"direction": "down",
"scrollCount": 5
}{" "}
Delay between key presses in milliseconds.Example Request
{
"action": "type_keys",
"keys": ["a", "b", "c", "enter"],
"delay": 50
}{" "}
Whether to press or release the keys. Must be one of: `up`, `down`.Example Request
{
"action": "press_keys",
"keys": ["ctrl", "shift", "esc"],
"press": "down"
}{" "}
Delay between characters in milliseconds.Example Request
{
"action": "type_text",
"text": "Hello, Bytebot!",
"delay": 50
}Example Request
{
"action": "paste_text",
"text": "Special characters: ©®™€¥£ émojis 🎉"
}Example Request
{
"action": "wait",
"duration": 2000
}Example Request
{
"action": "screenshot"
}Example Request
{
"action": "cursor_position"
}Example Request
{
"action": "application",
"application": "firefox"
}Available Applications:
firefox- Mozilla Firefox web browser1password- Password managerthunderbird- Email clientvscode- Visual Studio Code editorterminal- Terminal/console applicationdesktop- Switch to desktopdirectory- File manager/directory browser
Responses vary based on the action performed:
Most actions return a simple success response:
{
"success": true
}Returns the screenshot as a base64 encoded string:
{
"success": true,
"data": {
"image": "base64_encoded_image_data"
}
}Returns the current cursor position:
{
"success": true,
"data": {
"x": 123,
"y": 456
}
}{
"success": false,
"error": "Error message"
}import requests
def control_computer(action, **params):
url = "http://localhost:9990/computer-use"
data = {"action": action, **params}
response = requests.post(url, json=data)
return response.json()
# Move the mouse
result = control_computer("move_mouse", coordinates={"x": 100, "y": 100})
print(result)const axios = require("axios");
async function controlComputer(action, params = {}) {
const url = "http://localhost:9990/computer-use";
const data = { action, ...params };
const response = await axios.post(url, data);
return response.data;
}
// Move mouse example
controlComputer("move_mouse", { coordinates: { x: 100, y: 100 } })
.then((result) => console.log(result))
.catch((error) => console.error("Error:", error));