
Interesting, how does it compare to Inngest and DBOS?


Hey, I work on Restate. There are lots of differences throughout the architecture and the developer experience, but the one most relevant to this article is that Restate is itself a self-contained distributed stream-processing engine, which it uses to offer extremely low latency durable execution with strong consistency across AZs/regions. Other products tend to layer on top of other stores, which will inherit the good things and the bad things about those stores when it comes to throughput/latency/multi-region/consistency.

We are putting a lot of work into high throughput, low latency, distributed use cases, hence some of the decisions in this article. We felt that this necessitated a new database.


Hi,

I'm building a distributed application based on Hypergraphs, because the data being processed is mostly re-executable in different ways.

It's so refreshing to read this. I also sat down many nights thinking about the same problem you've solved. I'm so glad about this!

Would it be possible to plug other storage engines into Restate? The data-structure that needs to be persisted allows multiple-path execution and instant re-ordering without indexing requirements.

I'm mostly programming in Julia and would love to see a little support for it too =)

Great work guys!


Thank you for the kind words!

The storage engine is pretty tightly integrated with the log, but the programming model lets you attach quasi-arbitrary state to keys.

To see whether this fits your use case, it would be great to better understand the data and structure you are working with. Do you have a link where we could look at this?
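To illustrate what "attaching state to keys" can look like, here is a toy Python sketch of a keyed-state model in the spirit of what's described above. The names and structure are purely illustrative assumptions, not Restate's actual SDK: each key owns its own isolated state map, and handlers operate on the state for one key at a time.

```python
# Toy sketch of keyed state (illustrative only, not a real SDK API):
# each key gets its own isolated state dict, and a handler is invoked
# against the state belonging to exactly one key.
class KeyedObject:
    def __init__(self):
        self._state = {}  # key -> per-key state map

    def handle(self, key, handler):
        # Fetch (or create) the state for this key and run the handler on it.
        state = self._state.setdefault(key, {})
        return handler(state)


def increment(state):
    # Handler that mutates per-key state: counts invocations for this key.
    state["count"] = state.get("count", 0) + 1
    return state["count"]


counter = KeyedObject()
counter.handle("user-a", increment)         # -> 1
counter.handle("user-a", increment)         # -> 2
print(counter.handle("user-b", increment))  # -> 1, keys are independent
```

In a real system the per-key state would of course be durable and access per key would be serialized; the sketch only shows the programming-model shape.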


> Do you have a link where we could look at this?

Hi, thank you for your reply, highly appreciated.

Happy to explain in more detail =) But it's not public yet.

I'm working on the consensus module optimised for hypergraphs with a succinct data-structure. The edges serve as an order-free index (FIT). Achieving max-flow, flow-matching, graph-reduction via circuits is amongst the goals.

Targeting low-latency/high-performance distributed inference, enabling layer combination of distinct models and resumable multi-use computations as a sort of distributed compute cache.

The data structure follows a data format designed for persistence, resumability, and service resilience. But although I've learned quite a bit about state management, it's still a topic I have much respect for, and I think using restate.dev may be better than reinventing the wheel. I didn't have it in mind to also build a cellular automaton for state management; it may be trivial, but I currently don't feel I have the capacity for it. Restate looks like a great production-ready solution that would let me avoid delaying a release.

I intend to open-source it once it's mature. (But I believe binary distribution will be the more popular choice.)


I find this type of thing very interesting technically, but not very interesting commercially.

It would seem to me that durable execution implies long-running jobs, but this kind of work suggests micro-optimisation of a couple of ms. Do the applications inherently not care about this stuff?

What am I missing? Or is it just that, at a big enough scale, anything matters?


The way we think about durable execution is that it is not just for long-running code, where you may want to suspend and later resume. In those cases, low-latency implementations would not matter, agreed.

But durable execution is immensely helpful for anything that has multiple steps that build on each other. Anytime your service interacts with multiple APIs, updates some state, keeps locks, or queues events. Payment processing, inventory, order processing, ledgers, token issuing, etc. Almost all backend logic that changes state ultimately benefits from a durable execution foundation. The database stores the business data, but there is so much implicit orchestration/coordination-related state - having a durable execution foundation makes all of this so much easier to reason about.

The question is then: Can we make the overhead low enough and the system lightweight enough such that it becomes attractive to use it for all those cases? That's what we are trying to build here.
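To make the multi-step argument concrete, here is a toy Python sketch of the core durable-execution idea: each step's result is recorded in a journal, so a retry after a crash replays recorded results instead of re-executing side effects. This is purely an illustration of the concept, not how Restate actually implements it.

```python
class Journal:
    """Toy journal for durable execution (illustrative only):
    persists each completed step's result so that a re-run after a
    failure replays recorded results rather than repeating side effects."""

    def __init__(self):
        self.entries = []   # results of completed steps, in order
        self.cursor = 0     # replay position within the current run

    def restart(self):
        # Simulate a retry after a crash: replay from the beginning.
        self.cursor = 0

    def run(self, step):
        if self.cursor < len(self.entries):
            # Already executed in a previous attempt: replay the result.
            result = self.entries[self.cursor]
        else:
            # New step: execute it and record the result durably.
            result = step()
            self.entries.append(result)
        self.cursor += 1
        return result


calls = []  # tracks real side effects, to show they run only once

def process_order(journal):
    # Three steps that build on each other, e.g. a payment flow.
    payment_id = journal.run(lambda: calls.append("charge") or "pay-1")
    journal.run(lambda: calls.append("reserve") or f"res-{payment_id}")
    return journal.run(lambda: calls.append("ship") or "shipped")


j = Journal()
process_order(j)        # first attempt: all three steps execute
j.restart()
out = process_order(j)  # retry: journal replays, no duplicate side effects
print(calls)  # ['charge', 'reserve', 'ship']
print(out)    # shipped
```

The interesting engineering question the comment raises is exactly how cheap that journaling can be made, since every step adds a write to the journal's durable store.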


(from DBOS) Great question. For better or worse, discussions about workflows and durable execution often intertwine, usually ending up in a debate about what types of jobs or workflows require durable exec.

But really, any system that runs the risk of failing or committing an error should have something in place to observe it, undo it, resume it. Your point about "big enough scale" is true - you can write your own code to handle that, and manually troubleshoot and repair corrupted data up to a certain point. But that takes time.

By making durable execution more lightweight/seamless (a la DBOS or Restate), the use of durable execution libs becomes just good programming practice for any application where the cost of failure is a concern.


How does it compare against Trigger or Hatchet?


Here is a comparison to Temporal, maybe that helps with a comparison to those systems as well? https://news.ycombinator.com/item?id=43511814



