> but I think that the problem with the way it manages side effects is that it does so through lazy evaluation, and lazy evaluation is hard to wrap your head around.
Haskell manages side effects by making assertions about whether a function has side effects part of its type. Lazy evaluation isn't quite orthogonal to that, but the connection goes in the other direction from what you imply. Because evaluation order in a non-strict language can be complex to reason about, such languages become impractical if unrestricted side effects are allowed. In other words, restricting side effects helps non-strictness, but non-strictness is not necessary to restrict side effects.
Though as SPJ described in "Wearing The Hair Shirt", non-strictness might be necessary to motivate language designers to sufficiently restrict side effects...
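To make the "part of its type" point concrete, here is a minimal sketch. The names `double` and `bumpAndRead` are made up for illustration; the only library used is `Data.IORef` from base. Nothing in it depends on laziness: the distinction between the two functions lives entirely in their type signatures.

```haskell
module Main where

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Pure: the type Int -> Int leaves no room for side effects.
double :: Int -> Int
double x = x * 2

-- Effectful: the IO in the result type declares that running this
-- action may have side effects (here, mutating a reference).
bumpAndRead :: IORef Int -> IO Int
bumpAndRead ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

main :: IO ()
main = do
  ref <- newIORef 0
  n <- bumpAndRead ref  -- effects are sequenced explicitly inside IO
  print (double n)      -- prints 2
```

Trying to write `double (bumpAndRead ref)` is a type error, because an `IO Int` is not an `Int`. A strict language could enforce the same separation with the same kind of types.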