Lemmy server mass update of comment reply (child) count with PostgreSQL ltree structure

RoundSparrow@lemmy.ml · edited 1 year ago

RoundSparrow@lemmy.ml · 1 year ago

I agree there is potential to reuse the child_count from child/grandchild rows. But there has to be some sense to the order they are updated in so that the deepest child gets count updated first?

bahmanm@lemmy.ml · 1 year ago

> potential to reuse

I have a feeling that it’s going to make a noticeable difference; it’s way cheaper than a JOIN ... GROUP BY query.

> order they are updated in so that the deepest child gets count updated first

Given the declarative nature of SQL, I’m afraid that’s not possible - at least to my knowledge.

But worry not! That’s why there are stored procedures in almost every RDBMS: to add an imperative flair to the engine.

In purely technical terms, implementing what you’re thinking about is rather straightforward in a stored procedure using a CURSOR. This is possibly the quickest win (plus the idea of COUNT(*), if applicable).
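To make the "deepest child first" ordering concrete, here is a minimal sketch of the idea outside the database. It is plain Python rather than PL/pgSQL, and it is only an illustration: the function name child_counts and the sample paths are hypothetical, and paths are assumed to be ltree-style dotted strings where a row's parent is the path minus its last label.

```python
def child_counts(paths):
    """Return {path: number of descendants} for ltree-style dotted paths."""
    counts = {p: 0 for p in paths}
    # Visit the deepest rows first, so each row's count is final before
    # its parent reads it (the ordering a CURSOR loop would impose).
    for path in sorted(paths, key=lambda p: p.count("."), reverse=True):
        parent, _, _ = path.rpartition(".")
        if parent in counts:
            # Reuse the child's finished count instead of re-counting its subtree.
            counts[parent] += counts[path] + 1
    return counts


# Hypothetical sample: "0" stands for the post root and is not a comment row.
sample = ["0.1", "0.1.2", "0.1.2.3", "0.1.4", "0.5"]
result = child_counts(sample)
```

Each row is touched once and every parent only adds up numbers its children already hold, which is where the saving over a JOIN ... GROUP BY per row would come from. A stored procedure would follow the same shape, with the cursor ordered by path depth descending.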

Now, I’d like to suggest a possibly longer route which I think may be more scalable. The idea is based on the fact that the comments themselves are far more important than the number of child comments.

1. The first priority should be to ensure INSERT/UPDATE/SELECT are super quick on comment and post.
2. The second priority should be to ensure child_count is eventually correctly updated when (1) happens.
3. The last priority should be to try to make (2) as fast as we can while making sure (3) doesn’t interfere w/ (1) and (2) performance.

Before rambling on, I’d like to ask if you think the priorities make sense? If they do, I can elaborate on the implementation.

bahmanm@lemmy.ml · 1 year ago

How did it go @RoundSparrow@lemmy.ml? Any breakthroughs?

RoundSparrow@lemmy.ml · 1 year ago

I found that the full-table update didn’t perform as badly as I thought; it was the API gateway that was timing out. I’m still generating larger amounts of test data to see how it performs in worst-case edge situations.

bahmanm@lemmy.ml · 1 year ago

Can you keep this thread posted, please? Or you can share a PR link so I can follow the progress there. I’m very interested.