"generalize to its dataset" is a contradiction, especially as these models are trained in the one epoch regimen on datasets of the scale of all of the internet. if you think being able to generalize in ways similar to the whole of the internet does not give your meaningful abilities to reason, I'm not sure what I can tell you
Not "to" but over, example the same code written in one language over the other language.
> if you think being able to generalize in ways similar to the whole of the internet does not give your meaningful abilities to reason, I'm not sure what I can tell you
If after reading papers below that show empirically that they can't reason, you will still think they can reason, then I don't know what I can tell you.