OpenAI has released an AI agent named “Operator”
The new tool can browse and interact with web pages by typing, scrolling, and clicking buttons.
The startup OpenAI has unveiled its own AI agent, “Operator,” which can carry out tasks on the web on a user's behalf. The new tool can browse web pages and interact with them by typing text, scrolling, and clicking buttons.
Operator can be asked to handle a variety of simple, repetitive tasks, such as filling out forms, ordering supplies, or booking hotels. According to OpenAI's announcement, the ability to use the same interfaces and tools that people interact with every day broadens the scope of AI applications, saving time on routine tasks and opening up new opportunities for business engagement.
Operator is powered by a new AI model, Computer-Using Agent (CUA). It combines GPT-4o's vision capabilities with advanced reasoning acquired through reinforcement learning. The agent perceives the screen through screenshots and can perform the same actions a human would with a mouse and keyboard.
Before completing tasks such as booking a hotel or sending an email, the model is trained to ask the user for confirmation.
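For readers curious how such a loop might look in practice, the cycle of taking a screenshot, choosing an action, and pausing for confirmation can be sketched roughly as below. This is an illustrative Python sketch only, not OpenAI's implementation or API: the `Action` type, the `SENSITIVE_KINDS` set, and the `propose`, `take_screenshot`, `execute`, and `confirm` callbacks are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical action kinds that require user confirmation (an assumption,
# not a list published by OpenAI).
SENSITIVE_KINDS = {"submit_booking", "send_email"}


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "scroll", "send_email"
    detail: str = ""   # free-form description of the target or text


def run_agent(
    propose: Callable[[str], Optional[Action]],
    take_screenshot: Callable[[], str],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 20,
) -> List[Action]:
    """Run a screenshot -> propose -> (confirm) -> execute loop.

    Stops when the model proposes no further action, when the user
    declines a sensitive action, or after max_steps iterations.
    """
    performed: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()   # the agent "sees" only the screen
        action = propose(screenshot)     # stand-in for the model's decision
        if action is None:               # model signals the task is done
            break
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            break                        # user declined: stop before acting
        execute(action)                  # simulated mouse/keyboard input
        performed.append(action)
    return performed


def demo(user_says_yes: bool) -> List[str]:
    """Replay a scripted three-step task and return the executed action kinds."""
    script = iter([
        Action("click", "search box"),
        Action("type", "hotel in Paris"),
        Action("submit_booking", "confirm reservation"),
    ])
    log: List[str] = []
    run_agent(
        propose=lambda _shot: next(script, None),
        take_screenshot=lambda: "<screenshot placeholder>",
        execute=lambda a: log.append(a.kind),
        confirm=lambda _a: user_says_yes,
    )
    return log
```

In this sketch, `demo(False)` stops before the booking step because the simulated user declines, while `demo(True)` carries out all three actions.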
For now, Operator is running as a research preview and will be refined based on user feedback. The AI agent is available to ChatGPT Pro subscribers ($200 per month) in the United States through a dedicated site. OpenAI plans to expand access to more users in the future.
At this stage the agent does not work flawlessly, and if it runs into problems it will ask the user to complete the task.
As a reminder, in October 2024 the AI startup Anthropic released an updated version of its Claude 3.5 Sonnet model that can interact with a computer much as a human does, including moving the cursor, clicking on-screen elements, and typing text.