It's basically impossible to implement a CPython-compatible language without a GIL (or you lose single-threaded performance by using very fine-grained atomics/locking). Python has very specific multithreading semantics that are a function of the CPython bytecode and the GIL, and programs rely on them.
Other than refcounting (which is not part of the Python language spec; the spec even says that conforming code shouldn't rely on it), what other semantics did you have in mind?
Guaranteed or not, that behavior constrains whether an implementation is usable as a drop-in replacement interpreter, especially if people can't tell which programs will break, and doubly so if the breakage is a subtle data-corruption race that doesn't show up in tests.
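
To make that concrete, here's a minimal sketch of the kind of code I mean. Several threads append to a shared list with no lock, which works on CPython because `list.append` behaves atomically there. As far as I know that's a property of the implementation, not something the language reference promises. The names (`shared`, `worker`, `run`, `N_THREADS`, `N_ITEMS`) are made up for illustration.

```python
import threading

N_THREADS = 8      # hypothetical values, chosen only for the demo
N_ITEMS = 10_000

shared = []        # deliberately NOT protected by a lock


def worker(tid):
    for i in range(N_ITEMS):
        # Relies on list.append being atomic with respect to other threads.
        # CPython provides this in practice; an interpreter without
        # equivalent protection could lose or corrupt entries here.
        shared.append((tid, i))


def run():
    threads = [
        threading.Thread(target=worker, args=(tid,))
        for tid in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(shared)


total = run()
print(total == N_THREADS * N_ITEMS)
```

On CPython no append goes missing, so nobody ever adds the lock. A replacement interpreter that doesn't give the same behavior wouldn't necessarily crash on this; it could just silently drop entries once in a while, which is exactly the sort of thing a test suite won't catch.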