It feels to me that there may be even more potential in flipping this idea around: human coders write tests to exact specifications, then an LLM-driven coding system evolves code until it passes them.
My concern with having LLMs write tests is that it's hard to be convinced they've written the right tests. By coupling human TDD with a genetic algorithm of some sort that uses LLMs to generate candidate populations of solutions, one could be assured that once a solution gets far enough through the tests [assuming one ever does], it is guaranteed to have the correct behavior (as far as "correct" has been defined by the tests).
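A minimal sketch of what that loop might look like. Everything here is hypothetical: the tests, the `add` task, and especially `llm_propose`, which is a stub standing in for a real LLM call that would be prompted with the parent source and the failing tests.

```python
# Human-written tests act as the exact specification: (args, expected).
TESTS = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]

def fitness(src: str) -> int:
    """Number of tests a candidate passes (0 if it doesn't even run)."""
    ns = {}
    try:
        exec(src, ns)
        f = ns["add"]
        return sum(1 for args, want in TESTS if f(*args) == want)
    except Exception:
        return 0

def llm_propose(parent: str) -> list[str]:
    # Stand-in for the LLM: a real system would ask the model for
    # mutations of `parent` given the failing tests. Fixed candidates
    # here so the sketch runs offline.
    return [
        "def add(a, b):\n    return a - b\n",
        "def add(a, b):\n    return a + b\n",
        parent,
    ]

def evolve(seed: str, generations: int = 5) -> str:
    best = seed
    for _ in range(generations):
        pool = [best] + llm_propose(best)
        best = max(pool, key=fitness)       # selection: keep the fittest
        if fitness(best) == len(TESTS):     # all tests pass: done
            break
    return best

winner = evolve("def add(a, b):\n    return 0\n")
print(fitness(winner) == len(TESTS))  # prints True
```

The key property is the one described above: the model never touches the fitness function, so a candidate that survives every test is correct by construction, to exactly the extent the tests define "correct".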
the idea with llm-written tests first is that tests should be extremely easy to read. of course ideally production code should be too, but that's not always possible. if a test is extremely complicated, that's a code smell, or a sign it should be broken up.
this way it's very easy to verify the llm's output (weird typos or hallucinated imports would be caught by intellisense anyway)