That's not what they are saying. SOTA models include much more than just language, and the scale of training data is related to its "intelligence". Restricting the corpus in time => less training data => less intelligence => less ability to "discover" new concepts not in its training data