Amazon’s Nova Act: Bringing Smarter AI to Your Browser

April 2, 2025

Amazon has introduced Nova Act, an AI system that the company says outperforms existing technologies in both reliability and efficiency. Its distinguishing feature is the ability to break complex tasks down into simpler steps, such as performing searches, processing payments, and answering questions about what’s on the screen. Developers can extend it by adding custom instructions, calling APIs, and using the Playwright library to interact with the browser directly.
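To make the idea of breaking a task into simpler steps concrete, here is a minimal Python sketch. It does not use the Nova Act SDK: `run_steps`, `fake_act`, and the step list are hypothetical names invented for illustration, and `fake_act` is a stand-in for whatever call a real browser agent would expose.

```python
from typing import Callable, List

# Hypothetical workflow: each entry is one small instruction that can
# succeed or fail on its own, instead of one large open-ended prompt.
STEPS: List[str] = [
    "search for a coffee maker",
    "select the first result",
    "add the item to the cart",
]


def run_steps(act: Callable[[str], bool], steps: List[str]) -> List[str]:
    """Run the steps in order and stop at the first failure.

    `act` is any callable that takes an instruction and returns True on
    success. Returns the list of steps that completed.
    """
    completed: List[str] = []
    for step in steps:
        if not act(step):
            break  # a failed step halts the workflow so a human can review
        completed.append(step)
    return completed


def fake_act(instruction: str) -> bool:
    """Stand-in for a browser agent call (an assumption, not a real API).

    It simulates a failure on any instruction that mentions the cart.
    """
    return "cart" not in instruction


if __name__ == "__main__":
    done = run_steps(fake_act, STEPS)
    print(f"completed {len(done)} of {len(STEPS)} steps")
```

Keeping each instruction small makes it clear which step failed, which matters while agents like this still need human oversight.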

In Amazon’s internal tests, Nova Act achieved a success rate of over 90% on user interface tasks such as selecting dates and managing popups. According to Amazon, this puts it ahead of competitors like Anthropic and OpenAI on benchmarks like ScreenSpot and GroundUI Web. Nova Act also remains effective in environments it wasn’t specifically trained for, such as browser games, and it has been integrated into Amazon’s Alexa+ voice assistant.

Amazon sees Nova Act as a stepping stone towards more advanced AI agents. While it currently relies on supervised fine-tuning, the focus is shifting towards reinforcement learning in diverse environments. This approach is similar to OpenAI’s Computer-Using Agent, which was trained using web data. The ultimate aim is to create AI agents capable of executing multi-step tasks autonomously, from wedding planning to complex IT operations. However, for now, significant human oversight is still necessary.

The Nova Act SDK is now available in preview for U.S. developers. Alongside it, the nova.amazon.com platform gives U.S. developers and customers access to Amazon’s language models, including Nova Micro, Lite, and Pro, as well as the image and video generation models Nova Canvas and Nova Reel. These models are already offered through Amazon Bedrock; nova.amazon.com is meant to make them easier to explore. As Rohit Prasad, SVP of Amazon Artificial General Intelligence, puts it, “Nova.amazon.com puts the power of Amazon’s frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova.”

Nova Act lets developers create AI agents that can navigate browsers and perform actions, much like OpenAI’s Operator. According to Amazon, the system helps developers split complex processes into manageable tasks, such as web searches and payment handling. The platform also supports detailed instructions to make tasks more reliable.

“We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user. Today, such agents are still in an early stage,” Amazon explains. This launch marks Amazon’s strategic entry into the rapidly growing field of AI agents, which have the potential to automate various white-collar jobs by executing tasks more efficiently than humans.

For those interested in exploring the future of AI, Nova Act offers a glimpse into what’s possible. It’s an exciting time to be a developer or tech enthusiast, as tools like these open up new possibilities for innovation and efficiency.
