That's not what they are saying. SOTA models include much more than just languag... | Hacker News


wasabi991011 37 days ago | on: TimeCapsuleLLM: LLM trained only on data from 1800...

That's not what they are saying. SOTA models include much more than just language, and the scale of training data is related to its "intelligence". Restricting the corpus in time => less training data => less intelligence => less ability to "discover" new concepts not in its training data.

withinboredom 36 days ago

Could always train them on data up to 2015ish and then see if you can rediscover LLMs. There's plenty of data.

franktankbank 37 days ago

My thought was: perhaps less bullshit, though? Was language more restricted then? Was the scope of ideas narrower?
