Hacker News

People keep saying it because that's literally how LLMs work. They run Monte Carlo sampling over a very impressive latent linguistic space. These models are not fundamentally different from the Markov chains of yore except that these latent representations are incredibly powerful.
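To make the Markov chain comparison concrete, here is a minimal sketch of sampling from a bigram chain. The corpus and the function names (`build_bigram_counts`, `sample_chain`) are made up for illustration. The chain conditions only on the previous word, whereas an LLM conditions on the whole context, so treat this as an analogy for the sampling step and not as a model of an LLM.

```python
import random
from collections import defaultdict


def build_bigram_counts(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def sample_chain(counts, start, steps, rng):
    """Random walk: sample each next token in proportion to its bigram count."""
    out = [start]
    for _ in range(steps):
        successors = counts.get(out[-1])
        if not successors:
            break  # the last token never had a successor in the corpus
        words = list(successors)
        weights = [successors[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the cat ran".split()
counts = build_bigram_counts(corpus)
print(" ".join(sample_chain(counts, "the", 5, random.Random(0))))
```

Passing a seeded `random.Random` makes the walk repeatable; a different seed can give a different continuation from the same counts.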

We haven't even started to approach the largest problem, which is moving beyond what is essentially a greedy token-level search of this linguistic space. That is, we can't really pick an output that maximizes the likelihood of the entire sequence; rather, we're simply maximizing the likelihood of each part of the sequence.
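A toy illustration of that gap, assuming a made-up two-step conditional distribution (the `COND` table and the function names are hypothetical, not from any real model): picking the most likely token at each step can end on a sequence whose total probability is lower than the best full sequence.

```python
# Hypothetical toy model: P(next token | prefix). Not taken from any real LLM.
COND = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def seq_prob(seq):
    """Probability of a whole sequence under the chain rule."""
    p = 1.0
    for i, tok in enumerate(seq):
        p *= COND[seq[:i]][tok]
    return p


def greedy_decode(length=2):
    """Pick the single most likely token at every step."""
    seq = ()
    for _ in range(length):
        dist = COND[seq]
        seq += (max(dist, key=dist.get),)
    return seq


def all_sequences(length, prefix=()):
    """Enumerate every sequence of the given length (feasible only for toys)."""
    if len(prefix) == length:
        yield prefix
        return
    for tok in COND[prefix]:
        yield from all_sequences(length, prefix + (tok,))


def exhaustive_decode(length=2):
    """Pick the sequence with the highest total probability."""
    return max(all_sequences(length), key=seq_prob)


greedy = greedy_decode()
best = exhaustive_decode()
print(greedy, round(seq_prob(greedy), 2))  # ('a', 'x') 0.33
print(best, round(seq_prob(best), 2))      # ('b', 'x') 0.36
```

Exhaustive search is only possible here because the toy space has four sequences; with a real vocabulary the number of sequences grows exponentially with length, which is why approximate search is used in practice.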

LLMs are not reasoning machines. They are basically semantic compression machines with a built-in search feature.



> LLMs are not reasoning machines. They are basically semantic compression machines with a built-in search feature.

This is just a god of the gaps argument. Understanding is a form of semantic compression. So you're saying we have a system that can learn and construct a database of semantic information, then search it and compose novel, structured and coherent semantic content to respond to an a priori unknown prompt. Sounds like a form of reasoning to me. Maybe it's a limited, deeply flawed type of reasoning (not that human reason is perfect), but that doesn't support your contention that it's not reasoning at all.


It’s basically an argument that boils down to “it’s not reasoning because I don’t like it”


I bite the bullet on the god of the gaps


The best compression is some form of understanding


The best compression relies on understanding. What an LLM is, mostly, is data about how humans use words. We understand how to make this data (which is a compression of human text) and how to use it (to generate something). AKA it’s “production rules”, but statistical.

The only issue is ambiguity. What can be generated strongly depends on the order of the tokens. A slight variation can change the meaning, and then the result is worthless. Understanding is the guardrail against meaningless statements, and LLMs lack it.


You seem to entirely miss how attention layers work...


That's a fascinating insight and it sounds so true!

Can you compress for me Van Gogh's Starry Night, please? I'd like to send a copy to my dear old mother who has never seen it. Please make sure when she decompresses the picture she misses none of the exquisite detail in that famous painting.


Okay, yes, so not really having an artist's vocabulary, I couldn't compress it as well as someone who has a better understanding of Starry Night. An artist who understands what makes Starry Night great could create a work that evokes similar feelings and emotions. I know this because Van Gogh created many similar works playing with the same techniques, colors, and subjects, such as Cypresses in Starry Night and Starry Night over the Rhone. He was clearly working from a concise set of ideas and techniques, which I would argue is understanding/compression.


Fine, but we were talking about compression, not about imitation, or inspiration, and not about creating "a work that evokes similar feelings and emotions". If I compress an image, what I get when I decompress it is that image, not "feelings and emotions", yes? In fact, that's kind of the whole point: I can send an image over the web and the receiver can form their own feelings and emotions, without having to rely on mine.
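The round trip described here can be sketched with Python's standard `zlib` module. The byte string below is a made-up stand-in for raw image data, and this shows only the lossless case; lossy codecs such as JPEG do not return the exact original bytes.

```python
import zlib

# Hypothetical stand-in for raw image bytes; any bytes object would do.
original = bytes(range(256)) * 64

compressed = zlib.compress(original, 9)  # 9 = highest compression level
restored = zlib.decompress(compressed)

# Lossless: decompression returns exactly the bytes that went in.
assert restored == original
print(len(original), "bytes ->", len(compressed), "bytes")
```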


Simple reasoning is a side effect of compression. That is all.

I see from your profile you are focused on your own personal and narrow definition of reasoning. But I’d argue there is a much broader and simpler definition: can you summarize and apply learnings? This can.


To clarify, what I have in my profile is not my "own personal" definition of reasoning. It's how reasoning is understood in computer science and AI, and I am an expert on the subject through my doctoral studies and my current post-doc research.

That's important to understand. What I have in my profile is not some idiosyncratic idea about reasoning, it's the standard, formal understanding of what reasoning means, as it has developed in practice, in AI research in the last many decades.

I appreciate that there are many people who opine about reasoning who are not aware of that prior work and come up with their own ideas about what "reasoning" means, and some are even AI researchers, which is very concerning, but I can't do anything about that except push back against such uninformed opinions.

>> This can.

I'm sorry, what can?


Academics have gotten AI wrong since its inception and now are relegated to the trailing edges of the field. Mostly because they increasingly insist on theory-as-fact in soft arenas that are clearly still in motion. Reasoning has been one thing, it can continue to grow to be another. But even from your definition, I can provide abductive, inductive, and other examples of it reasoning to this degree just fine. However, your examples are a bit... silly to be honest.

But keep lecturing everyone -- it's very common for post-grads to be so up their own behind in their research that they've closed their world off until they are the only ones right in it.


Unfortunately I'm used to people on the internet wearing their ignorance on their sleeve like a badge of honour and so I'm not surprised by the insults in your comment. Just a bit sad to be honest :(


I don't think you can evaluate if an LLM is reasoning by looking purely at the mechanics, because if we looked inside a human brain we wouldn't be able to conclude that it can reason either (our test is 'I think, therefore I am', not 'all these neurons look like they are plugged together in such a way that it enables reason').


Exactly right and well said.


This type of self-affirmation has a quality of denial.

Also the above description is reductive to the point of "Cars can't get you anywhere because they aren't horses."


Beam search.

Sophisticated folks aren't doing simplistic/stupid decoding.

Gotta go beyond LLMs 101 to see what's actually happening. Even in training folks are building models which predict several tokens ahead.
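For readers who haven't seen it, a minimal beam search sketch over a made-up conditional distribution (the `COND` table and the `beam_search` name are hypothetical, for illustration only). With a beam width of 1 it reduces to greedy decoding; a wider beam keeps several partial sequences alive and can recover a higher-probability full sequence.

```python
import math

# Hypothetical toy model: P(next token | prefix). Not taken from any real LLM.
COND = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(beam_width, length=2):
    """Return (best sequence, its probability) found with the given beam width."""
    # Each beam entry is (log-probability, token tuple).
    beams = [(0.0, ())]
    for _ in range(length):
        candidates = []
        for logp, seq in beams:
            for tok, p in COND[seq].items():
                candidates.append((logp + math.log(p), seq + (tok,)))
        # Keep only the beam_width highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    best_logp, best_seq = beams[0]
    return best_seq, math.exp(best_logp)


print(beam_search(beam_width=1))  # greedy path: ('a', 'x'), probability about 0.33
print(beam_search(beam_width=2))  # wider beam: ('b', 'x'), probability about 0.36
```

Beam search is still approximate: it finds the best sequence among those it kept, which matches the true optimum here only because the toy space is tiny.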



