Hacker News

Redis is brilliant for simple job queues, but it doesn’t have the structures for more advanced features. Things like scheduled jobs can be done through sorted sets, and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile. Streams, available in Redis 5+, can handle a lot more use cases fluently, but you still can’t get scheduled jobs in the same queue.
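The sorted-set scheduling pattern mentioned above works by using the job's run-at timestamp as the score (ZADD) and periodically popping everything whose score has passed (ZRANGEBYSCORE followed by ZREM). A minimal in-memory stand-in sketches the idea without a live Redis server; all names here are my own:

```python
import heapq

class ScheduledQueue:
    """In-memory stand-in for a Redis sorted set used as a scheduler.

    schedule() ~ ZADD queue <run_at> <job>
    pop_due()  ~ ZRANGEBYSCORE queue -inf <now>, then ZREM
    """

    def __init__(self):
        self._heap = []  # (run_at, job) pairs, ordered by run_at

    def schedule(self, job, run_at):
        heapq.heappush(self._heap, (run_at, job))

    def pop_due(self, now):
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due

q = ScheduledQueue()
q.schedule("send_email", run_at=100)
q.schedule("purge_cache", run_at=50)
print(q.pop_due(now=60))  # → ['purge_cache']
```

In real Redis the ZRANGEBYSCORE + ZREM pair has to be wrapped in a Lua script or MULTI/EXEC to stay atomic, which is where the gluing starts.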

After replicating most of Sidekiq’s pro and enterprise behavior using older data structures I attempted to migrate to streams. What I discovered is that all the features I really wanted were available in SQL (specifically PostgreSQL). I’m not the first person to discover this, but it was such a refreshing change.

That led me to develop a Postgres-based job processor in Elixir: https://github.com/sorentwo/oban

All the goodies that are only possible by gluing Redis structures together through Lua scripts were much more straightforward in an RDBMS. Who knows, maybe the recent port of Disque to a plug-in will change things.



Oban looks fantastic! PG is great for a small-to-medium job queue IMHO, and pg_notify with Elixir's Postgrex works really well.

We're currently using ecto_job, which works really well for us, so we have no reason to switch. Plus we like that it's in different tables.


It seems like PG has been pigeonholed as only suitable for “small-medium” queues, but without numbers to define what “small-medium” means. A few million jobs an hour is entirely reasonable for PG based on anecdata (and I’ve load tested up to 54 million jobs an hour).

PG is an amazing tool that can handle more than most people think (or at least more than I thought).


What about VACUUM? Did they add something to PostgreSQL to lower space consumption?


There has been some progress on that front. For index bloat there is the new REINDEX CONCURRENTLY command in PG 12. There is also pluggable storage now, which enables much more efficient row deletion. I can’t find a link to the new storage format, but here is the announcement for 12: https://www.postgresql.org/about/news/1943/


Just wanted to say that Oban is really fantastic. Been using it in production for a couple weeks to handle recurring scheduled jobs and it was a real joy to work with. Thank you!


> Things like scheduled jobs can be done through sorted sets and persistent jobs are possible by shifting jobs into backup queues, but it is all a bit fragile.

What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.


> What exactly is fragile about this approach? We've supported scheduled tasks in TaskTiger (https://github.com/closeio/tasktiger) via Redis's sorted sets and haven't had any issues.

Using sorted sets for scheduled tasks isn't the fragile part; gluing it all together to prevent losing jobs by shifting them into backup lists (or hashes) is the fragile part, in my experience.
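For context, the backup-list dance being referred to is Redis's classic RPOPLPUSH "reliable queue" pattern. A toy in-memory version (names invented) shows how many moving parts are involved:

```python
from collections import deque

class ReliableQueue:
    """Toy version of the RPOPLPUSH reliable-queue pattern.

    reserve() ~ RPOPLPUSH queue backup   (atomic in Redis)
    ack()     ~ LREM backup 1 <job>
    requeue_orphans() is the reaper a separate process must run to
    rescue jobs whose worker died before ack'ing.
    """

    def __init__(self):
        self.queue = deque()
        self.backup = []

    def push(self, job):
        self.queue.appendleft(job)

    def reserve(self):
        if not self.queue:
            return None
        job = self.queue.pop()
        self.backup.append(job)  # the job survives a worker crash here
        return job

    def ack(self, job):
        self.backup.remove(job)

    def requeue_orphans(self):
        # In real deployments, deciding WHICH backup entries are orphans
        # (vs. merely slow workers) is the fragile part being described.
        while self.backup:
            self.queue.appendleft(self.backup.pop())

q = ReliableQueue()
q.push("resize_image")
job = q.reserve()     # worker takes the job; backup holds a copy
# ...worker crashes before ack()...
q.requeue_orphans()   # reaper puts it back on the queue
```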


eval + multi/exec solves that issue


Sure, you can glue things together with lua. Here is a script that handles descheduling from a sorted set: https://github.com/sorentwo/kiq/blob/master/priv/scripts/des...

That definitely does the job. My point is that it is much more complex than a select/update statement in SQL.


Yep, TaskTiger also uses Lua quite extensively ([0]) to ensure atomicity when moving tasks between various stages of processing (queued, active, scheduled, errored).

[0] https://github.com/closeio/tasktiger/blob/master/tasktiger/r...


Oban seems to be Elixir-specific; is there a Python port?


No, unfortunately not. It does use jsonb for job arguments to make interop easy, but that only helps on the enqueuing side, not processing.


Someone mentioned dramatiq-pg (https://gitlab.com/dalibo/dramatiq-pg), which also uses PG as the task queue backend and jsonb for message payloads and results. How does that compare to Oban?


I don't know about the technical differences, but having used dramatiq in production for 3 years now, I can say it's 100% awesome; it's been rock solid for us, handling many thousands of tasks daily.


I love PG but would never use it for message queueing. Postgres needs periodic table maintenance and autovacuum tuning for tables with a high number of writes, and IMO it can never be as good as RabbitMQ for distributed task queues. RabbitMQ is purpose-built around messages, and it's extremely reliable and performant.


This is exactly my complaint with RQ as well. They are not leveraging the built-in Redis features, even though they are entirely based on Redis.


> Streams, available in 5+ can handle a lot more use cases fluently

Streams do solve the backup queue issue, but not being able to filter messages from a consumer's pending list makes implementing retry mechanisms and exponential backoff hard (without reaching for a sorted set).

Though, I can understand the Redis team's decision to keep it simple.
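For the retry side, the usual workaround is to compute an exponential-backoff timestamp and park the message in a sorted set scored by it. The backoff math itself is simple; a sketch with "full jitter" (function name and constants are arbitrary):

```python
import random

def backoff_seconds(attempt, base=1.0, cap=300.0, jitter=True):
    """Exponential backoff: base * 2^attempt, capped, with optional
    'full jitter' (a uniformly random delay in [0, capped value])."""
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay

# deterministic growth without jitter:
print([backoff_seconds(a, jitter=False) for a in range(5)])
# → [1.0, 2.0, 4.0, 8.0, 16.0]
```

The returned delay would be added to the current time and used as the sorted-set score, which is exactly the extra structure the parent would rather not maintain alongside the stream.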


Could you run this as a standalone instance with an HTTP API to perform actions? Then I could use it from Python.



