The Future of the Web & Web Browsers in the New Era of Ai Agents
It’s Time To Build a Modern Day Web Agent in 2025
My weekend feed was filled with all the Ai headlines related to both China’s new DeepSeek model and OpenAi’s new Operator launch. While DeepSeek grabbed the majority of the headlines (links in footnotes below), Operator and its potential impact on the Web deserve more airtime.
TL;DR: OpenAi’s new Operator and others like Anthropic’s Computer Use, Google Chrome’s upcoming Mariner, and Microsoft’s Copilot are worth paying attention to and eventually paying for.
Reminder: My goal with Cook’s PlayBooks posts is to do the curated research for you and to combine these “Best Of’s” with my own historical context and knowledge. So stick with my YALP (yet another long post) and you’ll find demo videos and my source links for future reference.
A little over 60 days ago (November 21, 2024), I wrote a post titled 2025: Enter the Ai 1st Era discussing the upcoming AI Agent Revolution. January 2025 is not yet over and OpenAi’s Operator launch appears to have the ability to “jump the shark” by taking web browsers from being a personal “User Agent” to being a broader “Web Agent” and eventually an “Agent of Agents”.
WHAT IS OpenAi’s OPERATOR? … I’m copying/pasting from their website below:
“Today we’re releasing Operator, an agent that can go to the web to perform tasks for you. Using its own browser, it can look at a webpage and interact with it by typing, clicking, and scrolling.
Operator is one of our first agents, which are Ai’s capable of doing work for you independently—you give it a task and it will execute it.
Operator can be asked to handle a wide variety of repetitive browser tasks such as filling out forms, ordering groceries, and even creating memes.
The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of Ai, helping people save time on everyday tasks while opening up new engagement opportunities for businesses.
Operator can “see” (through screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.”
My Summary:
Operator is OpenAi’s first Ai agent targeted at web browsing, a critical piece of software we’ve all been using for over 30 years to navigate and extract information from our world wide web of information on the internet and, more recently, from our cloud service providers.
Operator has the potential of marrying all the capabilities of ChatGPT (or other LLMs - Large Language Models) with the Browser and our Cloud-based “Browser-Service” architectures.
WHAT IF?
What if Operator can become your Agent of the Web - not just your User-Agent but your new Agent-Agent?
What if Operator can now browse, type, and click on browser buttons for you?
What if you really don’t need all those browser tabs?
What if you don’t need pages of Google Search links or all those clicks?
What if you don’t need to do all that copying/pasting and bookmarking in your own personal attempt to organize all the information?
What if many of today’s SaaS Enterprise applications, and the underlying APIs to their proprietary cloud-based software, services, and databases, can be acted upon by Ai Agents that simply bring you the answers?
FROM WEB BROWSER TO WEB AGENT
I believe we are witnessing a fundamental shift in how we use browsers and interact with the web, not much different from Firefox’s rebirth of the browser in the 2004-2005 era, which in turn planted the seeds for the cloud-based “client-server” (“browser-server”?) architecture of the 2010 era, which today has evolved into mobile apps and cloud-based apps.
Operator and its competitors Google Mariner, Anthropic’s Computer Use, and Microsoft’s Copilot are general-purpose Ai agents meant to be customized by the human user to take actions on the user’s behalf and allow the computer to now use the keyboard, mouse, and touch screen. Yes, the computer is now acting on its own based on a simple prompt by the human user (Cue: 2001: A Space Odyssey… “I’m sorry Dave, I’m afraid I can’t do that!”).
I continue to believe these new Ai capabilities are just as powerful as Mobile Apps were on our small phone touchscreens (circa 2010) which eventually disrupted the prior Era of Desktop Browsing.
I believe Operator is a re-imagination of what a “browser” can be and how it can serve us as we evolve yet again into a new digital interface.
Here’s a quick 1-minute demo:
And another quick 1-minute clip from that demo:
Finally, here’s a demo of Google’s DeepMind Mariner, launching soon according to insiders.
Destined to be built into the Google Chrome browser, this will be Google’s version of OpenAi’s Operator. I may have to revise this post when Google releases Mariner in their Chrome Browser, which I predict will be in the next 30-60 days max - February or March 2025.
My personal backstory on Browsers:
In 2005, I joined the small Mozilla team shortly after the launch of the Firefox 1.0 web browser in November 2004. The Mozilla team had kept the original Netscape code base alive, and their team of a dozen or so engineers and thousands of volunteers were determined not to let Microsoft and their horrific blue IE (Internet Explorer Browser) win. To be clear, Microsoft had effectively won by then, with reports of 98.6% browser market share by mid-2004.
The small, nimble, and passionate Mozilla-Firefox team changed all that in a flash in November of 2004. They launched a faster, more secure, and easier-to-use interface (including tabbed browsing) and took out an ad in the NY Times. Mozilla-Firefox exploded onto the scene and, over the next 4 years, hit a peak of 30% browser market share and brought back to life a software category left for dead.
The link below is a great history of browsers and browser wars for any of you who are history nerds like me.
https://hackernoon.com/how-the-browser-wars-changed-the-landscape-of-the-internet
The Browser’s Past and Present Role: User Agents
To understand the evolutionary inflection point that today’s Operator launch represents, it’s important to revisit a core component of historical browser functionality: the User Agent.
A User Agent is your web browser’s identifier of “You”: a digital passport, sent to each website as the User-Agent request header, that tells the site what platform, device, and software you’re using. It enables websites to adapt content for your specific computing setup. Key elements it identifies include:
Platform or OS: Are you on Windows, macOS, iOS, Android, ChromeOS, Linux, etc.?
What Browser Are You Using?: Firefox, Chrome, Safari, Edge, Opera, etc.
What Hardware Are You Using?: Desktop, Mac, Intel, ARM, iPhone, iPad, Watch, Android, etc.
Software Versions: The version of your browser to ensure compatibility.
All of this information is embedded in your browser’s “User Agent” code buried behind that lovely graphical user interface you use every day.
Ultimately, browser User Agents allow the browser and underlying website to tailor the design and functionality of the information being delivered. The result is mobile-optimized pages on phones and different user experiences for desktops vs. tablets. Fundamentally, the User Agent was built for humans: to help the browser mediate between a human operator and the internet/web.
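For the technically curious, here is a rough sketch of how a website might read those key elements out of a User-Agent string. It is a simplified illustration, not how any particular site or library does it: the function name parse_user_agent, the ClientInfo fields, and the matching rules are my own assumptions, and real User-Agent strings have many more edge cases than this handles.

```python
import re
from dataclasses import dataclass


@dataclass
class ClientInfo:
    os: str
    browser: str
    version: str
    is_mobile: bool


# Order matters: Edge and Opera strings also contain "Chrome/", and
# Chrome strings also contain "Safari/", so the more specific tokens go first.
_BROWSER_PATTERNS = [
    ("Edge", re.compile(r"Edg/([\d.]+)")),
    ("Opera", re.compile(r"OPR/([\d.]+)")),
    ("Firefox", re.compile(r"Firefox/([\d.]+)")),
    ("Chrome", re.compile(r"Chrome/([\d.]+)")),
    ("Safari", re.compile(r"Version/([\d.]+).*Safari/")),
]

# Order matters here too: iPhone strings mention "Mac OS X" and
# Android strings mention "Linux".
_OS_TOKENS = [
    ("iPhone", "iOS"),
    ("iPad", "iOS"),
    ("Android", "Android"),
    ("CrOS", "ChromeOS"),
    ("Windows", "Windows"),
    ("Mac OS X", "macOS"),
    ("Linux", "Linux"),
]


def parse_user_agent(ua: str) -> ClientInfo:
    """Best-effort guess at OS, browser, version, and mobile-ness (hypothetical helper)."""
    os_name = next((name for token, name in _OS_TOKENS if token in ua), "Unknown")
    browser, version = "Unknown", ""
    for name, pattern in _BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            browser, version = name, match.group(1)
            break
    # Mobile browsers conventionally include "Mobi" somewhere in the string.
    return ClientInfo(os_name, browser, version, "Mobi" in ua)


if __name__ == "__main__":
    sample = (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 "
        "Mobile/15E148 Safari/604.1"
    )
    print(parse_user_agent(sample))
```

A site could use a result like this to decide whether to serve its mobile or desktop layout, which is exactly the "tailoring" described above.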
If you really want to go down the tech rabbit hole, “cookies” are also a historical browser invention. They remember who you are, what you’ve viewed before, etc., to make it easier for the website to serve you better. That same power makes it possible to track you and offer up psychic-like suggestions or ads of what to click next.
Maybe we should now think of the new era of Ai Agents like Operator, Google’s Mariner, and Microsoft’s Copilot as Version 2.0 of Browser Cookies.
From User Agent to Web Agent to Agent of Agents
Operator-like Ai Agents change the game. Unlike a User Agent acting as a passive identifier, Operator is an active agent for your digital life. It doesn’t just fetch content—it can act on your behalf without you touching the keyboard. A world with no more PEBKAC?
OpenAi calls it a Computer-Using Agent (CUA), and it performs tasks autonomously within a dedicated web browser. Instead of requiring APIs or manual input, Operator interacts with websites just as you would. Examples:
Navigating web pages: Clicking buttons, filling forms, scrolling, searching.
Executing complex multi-step tasks: Booking travel, making reservations, shopping online.
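To make the “look at the page, then act” idea concrete, here is a minimal sketch of the loop such an agent might run. Everything in it is a hypothetical stand-in that I made up for illustration: FakeBrowser only records actions, scripted_policy replaces the Ai model with a fixed list of steps, and the confirm callback is where a human approves a sensitive step. It is not OpenAi’s API or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Action:
    kind: str                # "click", "type", "scroll", or "done"
    target: str = ""         # what on the page the action applies to
    text: str = ""           # text to type, if any
    sensitive: bool = False  # True if a human should approve it first


class FakeBrowser:
    """Stand-in for a real browser: records actions instead of performing them."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def screenshot(self) -> str:
        # A real agent would receive pixels; a short description is enough here.
        return f"page state after {len(self.log)} actions"

    def perform(self, action: Action) -> None:
        self.log.append(f"{action.kind} {action.target} {action.text}".strip())


def scripted_policy(steps: Iterable[Action]) -> Callable[[str, str], Action]:
    """Stand-in for the model: replays fixed steps, then reports it is done."""
    remaining = iter(steps)

    def decide(task: str, observation: str) -> Action:
        return next(remaining, Action("done"))

    return decide


def run_agent(task, browser, decide, confirm, max_steps=20) -> bool:
    """Observe, decide, act. Returns True if the task finished, False otherwise."""
    for _ in range(max_steps):
        observation = browser.screenshot()   # "see"
        action = decide(task, observation)   # pick the next step
        if action.kind == "done":
            return True
        if action.sensitive and not confirm(action):
            return False                     # the human declined, so stop
        browser.perform(action)              # "interact"
    return False


if __name__ == "__main__":
    steps = [
        Action("type", "search box", "oat milk"),
        Action("click", "Add to cart"),
        Action("click", "Place order", sensitive=True),
    ]
    browser = FakeBrowser()
    finished = run_agent("order groceries", browser, scripted_policy(steps),
                         confirm=lambda action: True)
    print(finished, browser.log)
```

The design choice worth noticing is the confirm step: routine clicks happen on their own, while anything marked sensitive waits for a human decision.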
Add in the vision and text reasoning capabilities of GPT-4, Apple Intelligence, Microsoft Remote, or Google, and Operator should be able to communicate with others on your behalf.
Think of it this way. If your browser was the solution to finding all the information available on the internet’s World Wide Web and its client-server architecture, then Operator is poised to be the next evolution of your Web Agent - browsing and taking actions on your behalf.
The next evolution of Operator, or of competitors from Google, Microsoft, and others, will be an “Agent-Agent” where your Agent will be interacting with my Agent, or the Company’s Agent will be interacting with the Customer’s Agent.
Humans will be and should still be “in the loop”, but like a great Agent, a vast majority of the manual task-oriented work will be done for us and we will remain in the position of making the final decisions offered up to us.
Dharmesh Shah says in my weekend’s feed:
“Current websites were (mostly) built for humans, and APIs were (mostly) built for developers (a special type of human). Mobile sites were built for humans using mobile devices. I wonder if we’ll see websites created specifically for use by agent AIs.”
The idea of websites designed not for human users but for Ais like Operator opens up a new chapter for the internet. Imagine websites optimized for autonomous Agents that streamline interaction, prioritize efficiency, and allow seamless integration with Ai capabilities.
Why This Matters: Ramifications of Agent-Centric Browsing
New Standards for Website Design: Traditional web design will now evolve at a much faster pace. Sites optimized for Ai Agents may use simplified layouts, better semantic markup, and direct Ai-readable HTML to deliver information. Just as the world moved from desktop to mobile in the 2010 era and the mantra became mobile-first design, it seems inevitable that websites will now need to incorporate Agent-first design.
Expanded Roles for Browsers: Browsers will no longer be mere gateways to the internet. They’ll become dynamic platforms for Ai Agents, evolving to support features requiring new browser code written assuming an Ai Agent is the one browsing.
Massive Shift in Business Workflows: Tasks that used to require hours of manual effort (due diligence, scheduling, e-commerce comparisons) can now be delegated to an Agent. Aaron Levie’s example of using Operator for M&A due diligence showcases this potential. Instead of a team poring over documents, an Ai Agent compiles insights autonomously. When you watch the demo linked below, remember that what you see on the screen is Operator moving the mouse and clicking…NOT Aaron.
Operator Building a Due Diligence Portal for an M&A Transaction
Trust and Accountability: I write a lot about trust and accountability in my posts. As Operator-like Agents handle sensitive tasks, trust becomes paramount. How do we ensure actions are secure, ethical, and accurate? This will drive innovation in transparency tools and Agent oversight systems.
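As one illustration of the “better semantic markup” idea under New Standards for Website Design above: some sites already embed machine-readable data in their pages as JSON-LD using schema.org vocabulary. The sketch below shows how an Agent could read that data directly instead of guessing from the visual layout. The sample page, the product, and the extract_json_ld helper are all made up for this example; only Python’s standard library is used.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.items: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD object found in the page (hypothetical helper)."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.items


# A made-up product page: the layout is for humans, the JSON-LD is for machines.
PAGE = """
<html><body>
  <div class="hero"><h1>Oat Milk 1L</h1><span class="price">$3.49</span></div>
  <script type="application/ld+json">
  {"@context": "https://schema.org", "@type": "Product", "name": "Oat Milk 1L",
   "offers": {"@type": "Offer", "price": "3.49", "priceCurrency": "USD"}}
  </script>
</body></html>
"""

if __name__ == "__main__":
    for item in extract_json_ld(PAGE):
        print(item["name"], item["offers"]["price"], item["offers"]["priceCurrency"])
```

An Agent-first site would, in this spirit, publish the facts an Agent needs (name, price, availability) in a structured form alongside the human-facing design.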
What Comes Next: Operator as Your Personal Web Concierge
Operator isn’t just a tool; it’s a shift in mindset. As Dharmesh Shah and Aaron Levie and many others are quickly highlighting, we need to pay attention.
We’re moving from browsing with a browser to delegating tasks to an Ai Agent within the browser.
Think about your daily routines:
Shopping
Booking Travel
Researching
With an Operator-style Ai Agent, the information retrieved will no longer be passive but active. What is likely to pop up on your screen are “Confirmations” of the optional actions to be taken.
In the Mobile First Era, mobile apps became the focused “code” to interact with the internet and the cloud behind the scenes in a containerized and focused “app.”
Operator introduces a new “convenience layer” and likely will introduce its own new design interface. This new interface will likely require more “space” than an app can provide. Maybe, just maybe we’ll see the second coming of the browser as your new Agent-Agent?
Closing Thoughts: The New Era of the Web Agent
Pay attention to the coming Ai Agent Era. Browsers are about to become more than a software tool. They will now evolve into intelligent Agents capable of reshaping how we work, live, and interact with the internet and the cloud going forward.
Today, you use a browser daily. Tomorrow, you’ll use Operator and others like Chrome’s Mariner and Microsoft’s Copilot as your regular web Agent. Tasks that once required your human hands on a keyboard or touchscreen will be seamlessly handled by your agent, with you as the final decision maker.
The implications are massive and the opportunity is clear: businesses, developers, and leaders need to adapt now to thrive in this new era of Agent-centric design. Just as the internet transitioned from desktop-first to mobile-first, we’re now moving into the Era of an Agent-first Web.
The only question right now? Are you paying attention? Are you ready to evolve and change the way you use the web?
If you have a podcast and want to go more in-depth on this or other subjects, contact me here: www.benchboard.com/contact
===
Sources and further reading:
OPERATOR SOURCES: Add’l Reading
Introduction to Operator and Agents
Cook’s PlayBooks: 1 minute Clip of the above (for those short on time)
Google’s Project Mariner Demo:
The History of Browsers and How They Changed the Landscape of the Internet