Redis is brilliant for simple job queues but it doesn’t have the structures for more advanced features. Things like scheduled jobs can be done through sorted sets and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile. Streams, available in Redis 5+, can handle a lot more use cases fluently, but you still can’t get scheduled jobs in the same queue.
After replicating most of Sidekiq’s pro and enterprise behavior using older data structures I attempted to migrate to streams. What I discovered is that all the features I really wanted were available in SQL (specifically PostgreSQL). I’m not the first person to discover this, but it was such a refreshing change.
All the goodies only possible by gluing Redis structures together through Lua scripts were much more straightforward in an RDBMS. Who knows, maybe the recent port of disque to a plug-in will change things.
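To illustrate the point about scheduled jobs living in the same queue, here is a minimal sketch of a single jobs table where "scheduled" is just a timestamp column. It is not Oban's schema: the table, column names, and helper functions are made up for illustration, and it uses Python's built-in sqlite3 as a stand-in so it runs anywhere. On PostgreSQL the claiming query would typically add `FOR UPDATE SKIP LOCKED` so concurrent workers skip rows that another worker has already locked.

```python
import sqlite3

# Hypothetical schema: one table holds both immediate and scheduled jobs.
SCHEMA = """
CREATE TABLE jobs (
    id           INTEGER PRIMARY KEY,
    queue        TEXT    NOT NULL,
    args         TEXT    NOT NULL,
    state        TEXT    NOT NULL DEFAULT 'available',
    attempt      INTEGER NOT NULL DEFAULT 0,
    scheduled_at REAL    NOT NULL
);
CREATE INDEX jobs_fetch_idx ON jobs (queue, state, scheduled_at, id);
"""


def connect():
    """Open an in-memory database (stand-in for a PostgreSQL connection)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def enqueue(conn, queue, args, run_at):
    """Insert a job; a job scheduled for later is the same row with a later timestamp."""
    with conn:
        cur = conn.execute(
            "INSERT INTO jobs (queue, args, scheduled_at) VALUES (?, ?, ?)",
            (queue, args, run_at),
        )
    return cur.lastrowid


def fetch(conn, queue, now):
    """Claim the oldest due job, or return None if nothing is due yet."""
    with conn:  # the select and the state change commit together
        row = conn.execute(
            "SELECT id, args FROM jobs"
            " WHERE queue = ? AND state = 'available' AND scheduled_at <= ?"
            " ORDER BY scheduled_at, id LIMIT 1",
            (queue, now),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE jobs SET state = 'executing', attempt = attempt + 1"
            " WHERE id = ?",
            (row[0],),
        )
    return row
```

Because a claimed job stays in the table with a different state, persistence and retries come from the same row rather than from a separate backup structure.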
It seems like PG has been pigeonholed as only suitable for “small-medium” size queues, but without numbers to define what “small-medium” means. A few million jobs an hour is entirely reasonable for PG based on anecdata (and I’ve load tested up to 54 million jobs an hour).
PG is an amazing tool that can handle more than most people think (or at least more than I thought).
There has been some progress on that front. For index size there is a new rebuild command in PG12, REINDEX CONCURRENTLY, which rebuilds an index without blocking writes. There is also pluggable storage now, which enables much more efficient row deletion. I can’t find a link to the new storage format, but here is the announcement for 12: https://www.postgresql.org/about/news/1943/
Just wanted to say that Oban is really fantastic. Been using it in production for a couple weeks to handle recurring scheduled jobs and it was a real joy to work with. Thank you!
> Things like scheduled jobs can be done through sorted sets and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile.
What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.
> What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.
Using sorted sets for scheduled tasks isn't the fragile part, gluing it all together to prevent losing jobs by shifting into backup lists (or hashes) is the fragile part in my experience.
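To make the moving parts concrete, here is a minimal in-memory sketch of that layout. It is a model only, not Redis client code: the class and method names are hypothetical, a heap stands in for the sorted set, a deque for the main list, and a dict for the backup list or hash. The comments note where real Redis needs a Lua script or a command like RPOPLPUSH to keep each step atomic.

```python
import heapq
from collections import deque


class ReliableQueue:
    """In-memory stand-in for the Redis structures; no real Redis calls."""

    def __init__(self):
        self.scheduled = []    # models a sorted set scored by run-at time
        self.ready = deque()   # models the main list workers pop from
        self.in_flight = {}    # models the backup list/hash: job -> claim time

    def schedule(self, job, run_at):
        heapq.heappush(self.scheduled, (run_at, job))

    def promote_due(self, now):
        # In Redis this read-then-move must be one atomic unit (a Lua script
        # or MULTI/EXEC), otherwise two pollers can promote the same job.
        while self.scheduled and self.scheduled[0][0] <= now:
            _, job = heapq.heappop(self.scheduled)
            self.ready.append(job)

    def claim(self, now):
        # Roughly what RPOPLPUSH provides: pop and back up in a single step.
        if not self.ready:
            return None
        job = self.ready.popleft()
        self.in_flight[job] = now
        return job

    def ack(self, job):
        # The worker finished, so the backup copy can go.
        self.in_flight.pop(job, None)

    def rescue(self, now, timeout):
        # Re-enqueue jobs whose worker vanished without acking.
        stale = [j for j, t in self.in_flight.items() if now - t >= timeout]
        for job in stale:
            del self.in_flight[job]
            self.ready.append(job)
        return stale
```

Each method here is trivially atomic because it runs in one process. The fragility shows up when every one of these transitions has to be made atomic across separate Redis keys, and a crash between two steps either loses the job or duplicates it.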
Yep, TaskTiger also uses Lua quite extensively ([0]) to ensure atomicity when moving tasks between various stages of processing (queued, active, scheduled, errored).
Someone mentioned dramatiq-pg (https://gitlab.com/dalibo/dramatiq-pg), which also uses PG as the task queue backend and jsonb for message payload & result. How similar is that compared to Oban?
I don't know about the technical differences, but having used dramatiq in production for 3 years now I can say it's 100% awesome, it's been rock solid for us handling many thousands of tasks daily.
I love PG but would never use it for message queueing. Postgres needs periodic table maintenance and autovacuum tuning for tables with a high number of writes, and IMO it can never be as good as RabbitMQ for distributed task queues. RabbitMQ is purpose-built around messages and it's extremely reliable and performant.
> Streams, available in 5+ can handle a lot more use cases fluently
Streams do solve the backup queue issue, but not being able to filter messages from the consumer's pending list makes implementing a retry mechanism and exponential backoff hard (without reaching for a sorted set).
Though, I can understand the Redis team's decision to keep it simple.
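For context on why the pending list alone falls short: a retry with backoff needs a "not before" time per message, which is exactly what a sorted set score holds. A minimal sketch of the delay calculation, where the function names and the `base` and `cap` defaults are arbitrary choices for illustration:

```python
def backoff_seconds(attempt, base=15, cap=3600):
    """Delay before retry number `attempt` (1-based): base * 2**(attempt - 1), capped."""
    return min(cap, base * 2 ** (attempt - 1))


def next_run_at(now, attempt):
    """Timestamp to use as the sorted set score for the retried message."""
    return now + backoff_seconds(attempt)
```

A consumer would then only pick up messages whose score is at or before the current time, which a stream's pending list has no way to express.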
After replicating most of Sidekiq’s pro and enterprise behavior using older data structures I attempted to migrate to streams. What I discovered is that all the features I really wanted were available in SQL (specifically PostgreSQL). I’m not the first person to discover this, but it was such a refreshing change.
That led me to develop a Postgres-based job processor in Elixir: https://github.com/sorentwo/oban
All the goodies only possible by gluing Redis structures together through Lua scripts were much more straightforward in an RDBMS. Who knows, maybe the recent port of disque to a plug-in will change things.