> We’re also bringing the creative power of Nano Banana directly into Chrome, allowing you to transform images on the fly without needing to download and re-upload images or open another tab.
Are there really people who are like "Man, if only I could do this straight in Chrome"? Is this something worth bloating a browser (further) with?
Yes. AI is the next abstraction layer. By analogy, instead of wielding the wrench (application) yourself, you have a plumber (AI) fix your pipes. The browser is just a rich interface to that AI.
Honestly? Yeah. Not really Gemini or AI chat bot stuff, but I definitely appreciate when the software I use that interacts with images includes at least some minor drawing/highlighting/text overlay capabilities. So, while I don't like this feature in particular, I'm not against the general idea.
I was trying to use Gemini connected apps with YouTube to get details about my channel's performance. It could not do it; instead I had to settle for what it could see publicly on my channel.
Am I being too cynical, or does anyone else envision a future where you ask Chrome to buy you something, anything, online and instead of it actually buying you the “best” item, you end up with items it “prefers” where Google make money from suggestions and/or completion of sale?
I know it calls out that there’ll need to be user confirmation before the final purchase, but if you’re already not expending the effort to find the product or service yourself, are you really going to sit and research what it’s given you? If you are, then what’s the point of using the agent?
Just seems like the next evolution in Google’s ad revenue generation.
Serious question, not snark: Does anyone actually want this? I honestly can't imagine a use for these features even among people tech savvy enough to understand them.
I like doing side projects, I don't like wasting a day of work potential on any of these web apps: Google Cloud, AWS, Azure, Appstore Connect, Google's Android App Store, RevenueCat, Stripe, etc
I dread having to log in to these systems and waste hours achieving the simplest tasks.
This is what I'm using Claude for. E.g. I log in to App Store Connect, tell it what I need (3 subscription tiers), and it will do all the clicking and editing in Apple's stupid UI. Then I will ask it to create a summary for RevenueCat, and use another Claude session in there to click all the buttons to configure it based on what just happened in App Store Connect.
I've seen at least one decent use case from "normies" around me: Bypassing stupid company processes to achieve actual automated productivity in your rote processes instead of the theatre of it.
Sounds like a contrived situation, but there's a surprising amount of "thought leader" CEOs out there who make completely nonsensical decisions under the banner of "saving costs and automating things".
(Real-world example I know of) A company pays for the cheapest tier of Gemini it can find and tells everyone to use it. But it won't pay for Asana seats, so every user in the 100-person startup is a guest and can't use the connector in any AI app to TRULY do useful task management with AI.
Having some better access to AI in the browser would pave over that pain for someone who currently doesn't want to spend their own money on something like Claude for Cowork and the Chrome extension to drive the browser, or open a terminal to have Claude Code do it.
Having a browser that works for me would be useful yeah. Stuff like skip the story and give me the recipe, or click through the pointless extra steps, or reformat my address into the bizarre format the website wants.
Of course the single biggest thing my browser can do to help me is blocking ads, which means it's curious to see this just after Google killed adblock in chrome.
Adblock continues to be just as effective as it ever was in Chrome.
Even before the removal of MV2, the claims that it would kill adblock were ridiculous as many adblockers had already switched to MV3 but it was at least understandable that people could be ignorant of that fact. Now that everything is on MV3 how can people still be claiming that Google killed adblock when Chrome users still have working adblockers?
I like having AI in my browser, I use Claude quite a bit.
Examples: using my budgeting app directly to figure out why some forecasting event went wrong, or helping me correlate SOC2 tickets with GitHub pull requests and flagging all that are older than $date.
It’s surprisingly convenient for a narrow set of tasks.
At one of the previous companies I worked at, we were automating a very valid use case: a bunch of people crawling through a set of URLs daily/weekly to find the PDFs and summarise the changes from the previous week. I'm guessing these features are geared towards them.
It’ll give me a list of what’s available, the searching process isn’t made any more fun by including restaurants which will be a pain in the ass to book for a given date.
I use Claude's Chrome plugin all the time, as well as ChatGPT's agent mode. I prefer agent mode when I don't need to log in but want it to do a search.
However, Gemini in Chrome requires you to allow them to use your data to improve their model, which I won't consent to. Google workspace account seems exempt so I plan to try it out there.
Right now I paste screenshots of AWS/Azure/GCP into Claude and ask it questions on how to navigate around / what to do / how to set things up. This seems like a much better experience solely to not have to deal with the weird mac screenshot UX.
Me personally: absolutely not - and I fundamentally do not understand the need for something like this. I would never use such a tool under any possible circumstance knowing what I know about the current technology underpinning these clankers.
This feels on par with Microsoft's push to shove Copilot down everyone's throat at every step possible, whether we like/need it or not.
I wish this executive (author of that post) https://xcancel.com/laparisa?lang=en would show their browser in REAL LIFE everyday use. Do they really use it?
One more reason not to use Chrome. I don't want this AI bloat. And for sure I don't want to redirect my personal website content through Google's privacy-sucking data pipelines.
What I just discovered: The good old google search without AI bloat but with privacy via https://www.startpage.com. Highly recommended!
I must be using web browsers completely wrong. Like browsing a page isn't a problem for me. I can do it at the speed of my needs.
I'm having a hard time understanding why I would tell Gemini to create an account on some website for me or send an email. Those are usually just a tab away. That's why I feel like I'm missing something here.
Basically none of their examples are just "browse a page"? They're multi-step tasks combining data from multiple pages.
Like the first example in the demo carousel (the Y2K party) starts from a photo and a prompt of roughly "buy the props needed for replicating this photo from Etsy". It first analyzes the image in the current tab, identifies a bunch of things to buy, searches for them on Etsy, customizes the orders, adds them to the shopping basket, and then asks for a confirmation to actually send an order.
The second one auto-fills a form with a couple of dozen fields from the data that's in a PDF in another tab. (And in the fiction of a demo, presumably a PDF that you already had around, not one that you made just for the purposes of using it to auto-fill the form.)
I'm not the target market for this: automating a browser with my credentials is just too scary, but I can certainly see the utility. There's a huge number of tasks that take a minute or two and are not worth creating bespoke automation for, but that are also pretty mechanical processes.
Maybe I’m a curmudgeon who can’t imagine throwing an elaborate Y2K party because all my friends were alive and threw parties at the real Y2K, but… these all feel extremely contrived.
It’s as if they used AI to generate use cases for their AI tool because they weren’t really sure what it’s for…
I feel that way about IDEs too, though. My text editor has snippets, my file manager shows me what files are where, and my terminal lets me run programs. Why it's important to people that these functions be grafted into a single window escapes me.
Maybe you're only using well-designed sites? Try making a booking with a Chinese airline and you'll quickly wish for an assistant to delegate it all to.
Funny you say that, I was literally just booking a flight with Air China yesterday and the UX was 10x better than the average Wizz Air/Ryanair experience - a clear, readable UI (with a great table comparison of prices ±3 days from the selected dates), no ads, no random services getting pushed in your face, no booking tabs automatically opening in the background.
Gee thanks, now I have a big Ask Google button in the URL bar, but only on YouTube for some reason. How can I disable it? I could not figure out how to disable it like the others.