
The problem with that is that, even though PostgreSQL has some great functionality for pub/sub and queues, most Django queue systems don't use it by default. Most of the ones I've seen fall back to polling, which is suboptimal (though it might be fine, depending on the use case).
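For reference, the standard way to make a polled Postgres queue safe with multiple workers is `SELECT ... FOR UPDATE SKIP LOCKED`, which lets concurrent workers claim rows without blocking on or double-claiming each other's jobs. A minimal sketch (the `jobs` table and its columns are invented for illustration; the connection is assumed to be a psycopg2-style DB-API connection):

```python
# Hypothetical schema: jobs(id serial, payload text, status text).
# SKIP LOCKED makes concurrent workers skip rows another worker has
# already locked, so each pending job is claimed exactly once.
DEQUEUE_SQL = """
UPDATE jobs
   SET status = 'running'
 WHERE id = (
       SELECT id
         FROM jobs
        WHERE status = 'pending'
        ORDER BY id
          FOR UPDATE SKIP LOCKED
        LIMIT 1
       )
RETURNING id, payload;
"""

def claim_one_job(conn):
    """Atomically claim one pending job; return (id, payload) or None."""
    with conn.cursor() as cur:
        cur.execute(DEQUEUE_SQL)
        row = cur.fetchone()
    conn.commit()
    return row
```

A worker loop would just call `claim_one_job` in a sleep/poll cycle; the query itself is what makes plain polling multi-worker safe.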

Otherwise, I agree, I would love to be able to use Postgres for queues, and I think there is a Dramatiq Postgres backend that will do queueing properly (EDIT: Yes, Dramatiq-PG).



Postgres PubSub is great for triggering job dispatch, but it isn’t good enough on its own. There are a few caveats that make polling an additional requirement:

1. Message compaction: notifications are deduplicated, so if two jobs are inserted in the same queue, the system may only get one notification.

2. Message saturation: with high levels of activity you'll need to start discarding messages (essentially debouncing), otherwise the database gets throttled.

3. Dedicated connections: pub/sub requires a dedicated connection for listening, which means either a single connection with custom dispatching or one connection per queue.
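Points 1 and 2 above both push you toward the same design: treat a NOTIFY purely as a hint to go poll the jobs table, and coalesce bursts of notifications into at most one wake-up per interval. A minimal, library-agnostic sketch of that debouncing (class and method names are invented for illustration):

```python
import time

class Debouncer:
    """Coalesce bursts of notifications into at most one wake-up per
    `interval` seconds. Payloads are discarded deliberately: a NOTIFY
    is only a hint to poll the jobs table, which also sidesteps the
    deduplication caveat (a missed duplicate costs nothing)."""

    def __init__(self, interval=0.1, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self._last_wake = float("-inf")
        self._pending = False

    def notify(self):
        # Called once per incoming NOTIFY; cheap, just sets a flag.
        self._pending = True

    def should_wake(self):
        # Called from the worker loop; True at most once per interval.
        now = self.clock()
        if self._pending and now - self._last_wake >= self.interval:
            self._pending = False
            self._last_wake = now
            return True
        return False
```

The worker still polls on a slow timer as a fallback, so a dropped or compacted notification only delays a job by one poll interval instead of losing it.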

Relying on PG for everything is awesome regardless!


Is this a problem for low-traffic scenarios? My thinking is basically that I should save the extra dependency until I have more than a few tasks per second, at which point I will switch to something like RabbitMQ.


I don’t think it is much of a problem until you are pushing 2-3k jobs a second. Only message saturation is really a bottleneck anyway; everything else I mentioned is handled by how the library is architected.


That makes sense, thanks. I generally switch long before the database gets anywhere near being the bottleneck; I use Postgres as a broker only for low-traffic side projects, where it's much more worthwhile not to bother with an extra broker.



