Hacker News

Redis is brilliant for simple job queues, but it doesn’t have the structures for more advanced features. Things like scheduled jobs can be done through sorted sets, and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile. Streams, available in Redis 5+, can handle a lot more use cases fluently, but you still can’t get scheduled jobs in the same queue.
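The sorted-set scheduling pattern mentioned above works by using the job's run-at timestamp as the score (ZADD) and periodically popping everything whose score has passed (ZRANGEBYSCORE followed by ZREM). A minimal in-memory stand-in sketches the idea without a live Redis server; all names here are my own:

```python
import heapq

class ScheduledQueue:
    """In-memory stand-in for a Redis sorted set used as a scheduler.

    schedule() ~ ZADD queue <run_at> <job>
    pop_due()  ~ ZRANGEBYSCORE queue -inf <now>, then ZREM
    """

    def __init__(self):
        self._heap = []  # (run_at, job) pairs, ordered by run_at

    def schedule(self, job, run_at):
        heapq.heappush(self._heap, (run_at, job))

    def pop_due(self, now):
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due

q = ScheduledQueue()
q.schedule("send_email", run_at=100)
q.schedule("purge_cache", run_at=50)
print(q.pop_due(now=60))  # → ['purge_cache']
```

In real Redis the ZRANGEBYSCORE + ZREM pair has to be wrapped in a Lua script or MULTI/EXEC to stay atomic, which is where the gluing starts.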

After replicating most of Sidekiq’s pro and enterprise behavior using older data structures I attempted to migrate to streams. What I discovered is that all the features I really wanted were available in SQL (specifically PostgreSQL). I’m not the first person to discover this, but it was such a refreshing change.

That led me to develop a Postgres-based job processor in Elixir: https://github.com/sorentwo/oban

All the goodies that are only possible by gluing Redis structures together through Lua scripts were much more straightforward in an RDBMS. Who knows, maybe the recent port of Disque to a plug-in will change things.



Oban looks fantastic! PG is great for a small-to-medium job queue IMHO, and pg_notify with Elixir's Postgrex works really well.

We're currently using ecto_job, which works really well for us, so we have no reason to switch. Plus we like that it's in different tables.


It seems like PG has been pigeonholed as only suitable for “small-medium” queues, but without numbers to define what “small-medium” means. A few million jobs an hour is entirely reasonable for PG based on anecdata (and I’ve load tested up to 54 million jobs an hour).

PG is an amazing tool that can handle more than most people think (or at least more than I thought).


What about VACUUM? Did they add something to PostgreSQL to lower space consumption?


There has been some progress on that front. For index bloat there is the new REINDEX CONCURRENTLY command in PG 12. There is also pluggable storage now, which enables much more efficient row deletion. I can’t find a link to the new storage format, but here is the announcement for 12: https://www.postgresql.org/about/news/1943/


Just wanted to say that Oban is really fantastic. Been using it in production for a couple weeks to handle recurring scheduled jobs and it was a real joy to work with. Thank you!


> Things like scheduled jobs can be done through sorted sets and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile.

What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.


> What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.

Using sorted sets for scheduled tasks isn't the fragile part; gluing it all together to prevent losing jobs by shifting them into backup lists (or hashes) is the fragile part, in my experience.
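For context, the backup-list dance being referred to is Redis's classic RPOPLPUSH "reliable queue" pattern. A toy in-memory version (names invented) shows how many moving parts are involved:

```python
from collections import deque

class ReliableQueue:
    """Toy version of the RPOPLPUSH reliable-queue pattern.

    reserve() ~ RPOPLPUSH queue backup   (atomic in Redis)
    ack()     ~ LREM backup 1 <job>
    requeue_orphans() is the reaper a separate process must run to
    rescue jobs whose worker died before ack'ing.
    """

    def __init__(self):
        self.queue = deque()
        self.backup = []

    def push(self, job):
        self.queue.appendleft(job)

    def reserve(self):
        if not self.queue:
            return None
        job = self.queue.pop()
        self.backup.append(job)  # the job survives a worker crash here
        return job

    def ack(self, job):
        self.backup.remove(job)

    def requeue_orphans(self):
        # In real deployments, deciding WHICH backup entries are orphans
        # (vs. merely slow workers) is the fragile part being described.
        while self.backup:
            self.queue.appendleft(self.backup.pop())

q = ReliableQueue()
q.push("resize_image")
job = q.reserve()     # worker takes the job; backup holds a copy
# ...worker crashes before ack()...
q.requeue_orphans()   # reaper puts it back on the queue
```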


eval + multi/exec solves that issue


Sure, you can glue things together with lua. Here is a script that handles descheduling from a sorted set: https://github.com/sorentwo/kiq/blob/master/priv/scripts/des...

That definitely does the job. My point is that it is much more complex than a select/update statement in SQL.


Yep, TaskTiger also uses Lua quite extensively ([0]) to ensure atomicity when moving tasks between various stages of processing (queued, active, scheduled, errored).

[0] https://github.com/closeio/tasktiger/blob/master/tasktiger/r...


Oban seems to be Elixir-specific; is there a Python port?


No, unfortunately not. It does use jsonb for job arguments to make interop easy, but that only helps on the enqueuing side, not processing.


Someone mentioned dramatiq-pg (https://gitlab.com/dalibo/dramatiq-pg), which also uses PG as the task queue backend and jsonb for message payloads and results. How does that compare to Oban?


I don't know about the technical differences, but having used dramatiq in production for 3 years now, I can say it's 100% awesome; it's been rock solid for us, handling many thousands of tasks daily.


I love PG but would never use it for message queueing. Postgres needs periodic table maintenance and autovacuum tuning for tables with a high number of writes, and IMO it can never be as good as RabbitMQ for distributed task queues. RabbitMQ is purpose-built around messages, and it's extremely reliable and performant.


This is exactly my complaint with RQ as well. They are not leveraging the built-in Redis features, even though they are entirely based on Redis.


> Streams, available in 5+ can handle a lot more use cases fluently

Streams do solve the backup queue issue, but not being able to filter messages from a consumer's pending list makes implementing retry mechanisms and exponential backoff hard (without reaching for a sorted set).

Though, I can understand the Redis team's decision to keep it simple.
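For the retry side, the usual workaround is to compute an exponential-backoff timestamp and park the message in a sorted set scored by it. The backoff math itself is simple; a sketch with "full jitter" (function name and constants are arbitrary):

```python
import random

def backoff_seconds(attempt, base=1.0, cap=300.0, jitter=True):
    """Exponential backoff: base * 2^attempt, capped, with optional
    'full jitter' (a uniformly random delay in [0, capped value])."""
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay

# deterministic growth without jitter:
print([backoff_seconds(a, jitter=False) for a in range(5)])
# → [1.0, 2.0, 4.0, 8.0, 16.0]
```

The returned delay would be added to the current time and used as the sorted-set score, which is exactly the extra structure the parent would rather not maintain alongside the stream.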


Could you run this as a standalone instance with an HTTP API to perform actions? Then I could use it from Python.



