In the Google Pixel 2024 keynote, one of the presenters, Dave Citron, had to try three times to get Gemini to perform the preapproved, practiced demonstration task: a 33.3% success rate for something that Google KNOWS will work.
Most of the demo failures during today's keynote felt more like beta supporting software and bad-network kinds of failures. The only one I saw that felt like a genuine generative-AI failure was one of the hot air balloon image generations, which produced a circle of garbage in the sky; and that was one variation among the five or six the AI spat out in seconds.
At https://www.youtube.com/live/N_y2tP9of8A?t=4119 in the Google Pixel 2024 keynote, the generative AI in Magic Editor still produces wacky objects: a hot air balloon that looks deflated?! To be fair, it did also produce two decent-looking hot air balloons.
At this stage, I'd be happy with Google Assistant "only" understanding what I'd like to play on YouTube (+ Music) while driving. The far-fetched dream would be for it to understand the streets I'd like to navigate to via Google Maps (not in the US, not in English). The barrier is so low compared to the rest.
UX-wise, I'm not sure why everyone is so keen on voice commands for AI.
Why let everyone in earshot know what you're up to? Sure, it's easier than typing all that text given the crappy feedback of smartphone onscreen keyboards.
If you show a pic to the Google Lens app, it will identify the text and highlight those passages.
In the AI-verse, Lens could detect the concert tour in the pic.
- if the tour was in the past, Lens could look up any media in Photos with GPS locations matching those concert locations/dates and show them.
- if the tour is happening now or in the future, Lens could detect the dates and locations and check your calendar for when you're available. If you're free when the tour comes to your town, Lens would show a big green checkmark next to that date/location pair, or highlight it; otherwise it goes into search mode and crosses out the dates/locations where you aren't available. For out-of-town locations where you are free, Lens could start up Maps to search for flights, perhaps search your contacts to see if you have friends/relatives in or near those cities, or look in your calendar to see whether you've been to any of them before, etc. etc. (a rough sketch of this decision logic follows below).
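None of that exists today, so treat this as a sketch of the decision logic only; every helper name is made up and just stands in for whatever Photos / Calendar / Maps / Contacts API such a feature would actually call:

    from datetime import date
    from typing import List, Tuple

    # Sketch only: the helpers below are made-up stubs, not real Google APIs.

    def photos_with_gps_near(city: str, when: date) -> List[str]:
        return ["IMG_0001.jpg"]   # stub: pretend Photos found one geotagged shot

    def calendar_is_free_on(when: date) -> bool:
        return True               # stub: pretend the calendar says we're free

    def handle_tour_poster(tour_stops: List[Tuple[str, date]], home_town: str) -> None:
        today = date.today()
        for city, when in tour_stops:
            if when < today:
                media = photos_with_gps_near(city, when)
                if media:
                    print(f"{city} {when}: you were there, {len(media)} photo(s) found")
            elif not calendar_is_free_on(when):
                print(f"{city} {when}: busy, cross it out")
            elif city == home_town:
                print(f"{city} {when}: free, big green checkmark")
            else:
                print(f"{city} {when}: free, search flights and contacts near {city}")

    handle_tour_poster([("Berlin", date(2024, 6, 1)), ("Vienna", date(2025, 3, 14))], "Vienna")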
I mostly agree, but partially because I can't tell (even though I'm paying $20/month) what tier of Gemini I am getting. I know most people won't care, but because they won't tell me, I will assume I'm getting Gemini Flash from 6 months ago, and I'm not paying $20/month for that. I'm sure if they were honest about the model people wouldn't pay for it.
They're making the mistake of optimizing for general case users (who don't care about model version) when they need to attract power users so that they can find product-market fit.
There's a vanilla 2TB plan for $10/month; the plan with 2TB and Gemini is $20/month. You could say Gemini is $10/month then, but if you don't actually need the 2TB or the other benefits, you're effectively paying $20/month for it because you can't unbundle Gemini from everything else.
Wait, where's this $10/month plan? I don't see it as an option. It's listed at https://one.google.com/about, but I can't actually choose it.
Edit: I found the answer on Reddit, as usual. You have to go to your plan settings, and only then does the plan show up under one.google.com/settings. That's the only way to downgrade.
I have Gemini Advanced but I can't tell which model it is, yeah. Same problem. Whatever it is, it's useless. GPT-2 level. Can't do anything reasonable.
Claude 3.5 Sonnet and ChatGPT-4o are roughly the same, with the former just pipping it for me, and then way out there is this shitty Google product that is worse than Llama-3 running on my own laptop. Even Llama-3 is better at remembering what's going on.
Fortunately, this time I managed to look it up, and apparently I have it for free because I have Google One 5TB. And there's no way to spend more money, so that's what I have. It's so bad that when Claude and ChatGPT run out of messages for me, I just use local Llama rather than Gemini. Atrocious product.
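For what it's worth, the local-Llama fallback is about this much code against Ollama's REST API (a sketch, assuming you have the Ollama server running locally and have already pulled a model; "llama3" here is just an assumption, use whatever tag you pulled):

    import requests

    def ask_local_llama(prompt: str, model: str = "llama3") -> str:
        # POST to Ollama's local chat endpoint and return the reply text.
        resp = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "stream": False,   # one JSON object back instead of a token stream
            },
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["message"]["content"]

    print(ask_local_llama("Summarize this thread in two sentences: ..."))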
In my experience, the latest experimental model is a bit better than the latest Claude/ChatGPT at creativity, but a little worse at general reasoning. They're still mostly comparable and certainly of the same generation.
Where it truly stands out is the 2M context window. That's game-changing for things like analyzing publications and books.
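In practice that mostly means shoving the whole document into a single request. A minimal sketch with the google-generativeai Python package; the model name, and whether your tier really exposes the full 2M-token window, are assumptions, so check your own plan:

    import google.generativeai as genai

    # Sketch: ask one question about an entire book in a single request.
    # "gemini-1.5-pro" is an assumption; use whatever long-context model your plan offers.
    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel("gemini-1.5-pro")

    with open("book.txt", encoding="utf-8") as f:
        book = f.read()

    response = model.generate_content(
        [book, "List the recurring themes in this book, with one supporting quote each."]
    )
    print(response.text)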
Yeah, in practice, for the tasks I set it. High hallucination rate. Low context window. Frequently refuses to act and suggests Googling. If the other guys didn't exist, it could be useful, but as it stands it's as useful as GPT-2 because neither of them hit the threshold of usefulness.
I'm sure some benchmarks are decent but when Google finally shutters the chatbot I'll be glad because then I won't constantly be wondering if I'm paying for it.
It's a shame because Google's AI features otherwise are incredible. Google Photos has fantastic facial recognition, and I can search it with descriptions of photos and it finds them. Their keyboard is pretty good. But Gemini Advanced is better off not existing. If it's the same team, I suppose they can't keep making hits. If it's a different team, then they're two orders of magnitude less capable.
It doesn't actually work. I pasted in a House Resolution and asked it a question, and it immediately spazzed out and asked me to Google. I used Claude and it just worked. That's the thing about Gemini: it has a lot of stats, but it doesn't work. With Claude I could then ask about a specific section and look at the actual text. With Gemini it just doesn't do it at all.
This feels a lot like when people would tell me how the HP laptop had more gigahertz and stuff and it would inevitably suck compared to a Mac.
The output from an LLM is like the path a marble takes across a surface shaped by its training data, and answers to “why” questions just continue the marble's path. You may get good-sounding answers to your “why” questions, but they are just good-sounding answers, not the real reasons, because LLMs lack the ability to introspect their own thinking. Note: humans do not have this ability either, unless using formal step-by-step reasoning.
I pay for Gemini Advanced and it's much better than GPT-2, I think. I often search the same thing in both Gemini and GPT-4 and it's a toss-up which is better (each sometimes gets questions right that the other gets wrong).
But recently I asked Gemini "Bill Clinton is famous for his charisma and ability to talk with people. Did he ever share tips on how he does it?" and Gemini responded with some generic "can't talk about politics" answer, which was a real turn-off.
Bottom left. Settings > Manage Subscription. It turns out I have it included till the end of the year because of the 5TB Google One sub https://news.ycombinator.com/item?id=41237943
Some VP of AI wants to become an SVP? And some marketing director on phones wants a "SAVE $250" sticker on all the new phones and is happy to play along?
A phone with scuba glasses and the woke AI. In fact, it looks like Bender from Futurama, only with a nondescript personality. Thanks but no. No need for an AI to lecture me or dodge questions because it thinks it might remotely offend a hypothetical alien civilization.
Can't stand the iPhone either. At least it has a good camera and audio, but most of the apps are either junk or you have to pay the Apple rent-seeking. A lot of shady Chinese apps. I spent hours trying to find a decent calculator like the stock one on Android. And the on-screen keyboard is outright annoying. Couldn't turn off haptics either.
So I'm still using my Pixel 4a until it dies. The iPhone is gathering dust on my nightstand.
Who thinks they are booking phone sales as AI revenue to juice the numbers?