I like the idea behind Lemmy/kbin and the fediverse, but perhaps there’s something I don’t understand.
If Lemmy becomes very popular in the future and someone wants to add their own server and federate with everyone, then from that moment the new instance will receive all new comments, posts, etc. from every instance it’s federated with and must save them in its db. This means that if Lemmy gets popular, forget about the little guys helping to spread the “load”, because every instance still has to take in and save all new data. That’s a lot of processing power and storage. How can this work? I can see only a few instances surviving in the future.
If each instance were a node that only took care of its own posts and comments and forwarded them to others upon request, I could understand how it scales, but this is not how it works AFAIK. Another way would be consensus algorithms, where a node saves more than its own data but still not all of it.
Servers only follow communities that at least one local user is subscribed to. So the scalability issue really depends on the size of a community, rather than the size of the Lemmyverse as a whole.
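To make that concrete, here is a rough toy model of the idea in Python. It is not Lemmy's actual code: the `Instance` class, its methods, and the example domains are all made up for illustration. The only point is that an incoming post is stored only when some local user subscribes to its community.

```python
from collections import defaultdict


class Instance:
    """Toy model of a federated instance (hypothetical, not Lemmy's real implementation)."""

    def __init__(self, domain):
        self.domain = domain
        self.subscribers = defaultdict(set)  # community id -> set of local users
        self.stored_posts = []               # list of (community id, text)

    def subscribe(self, user, community):
        # The first local subscriber is what makes this instance follow the community.
        self.subscribers[community].add(user)

    def follows(self, community):
        # .get() avoids creating empty entries for communities nobody subscribed to.
        return bool(self.subscribers.get(community))

    def receive(self, community, text):
        # Keep incoming content only for communities with a local subscriber.
        if self.follows(community):
            self.stored_posts.append((community, text))
            return True
        return False


small = Instance("tiny.example")
small.subscribe("alice", "!books@big.example")
small.receive("!books@big.example", "New release thread")  # stored
small.receive("!memes@big.example", "Some meme")           # ignored, nobody here follows it
```

In this sketch the storage of `tiny.example` grows with the activity of the communities its users follow, not with the total activity of the network.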
One option could be to discourage really big communities, and have lots of smaller ones.
You’ve misunderstood. An instance does not contain all content from every other instance, only the communities that at least one of its users has specifically requested by entering the community id in the !name@instan.ce format in search.
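If it helps to picture that id format, here is a small Python sketch that splits a `!name@instan.ce` string into the community name and the instance that hosts it. The function name `parse_community_id` is my own invention, and the real search box may well accept other forms too. This only handles the one format mentioned above.

```python
def parse_community_id(identifier):
    """Split '!name@instance' into (name, instance).

    Hypothetical helper for illustration; it only handles the
    '!name@instan.ce' form and rejects anything else.
    """
    if not identifier.startswith("!"):
        raise ValueError("community id must start with '!'")
    name, sep, instance = identifier[1:].partition("@")
    if not sep or not name or not instance:
        raise ValueError("expected the form !name@instance")
    return name, instance


name, instance = parse_community_id("!startrek@instan.ce")
```

The instance part tells your home server where to go to fetch the community, and only from that point on does it start receiving that community's content.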
This means that the Star Trek instance will mostly only ever need to host Star Trek content. It won’t get flooded with everything else on the entire network as it grows. Some portion of it, yes, as users on the Star Trek instance will inevitably sub to at least some stuff outside it, too.
Additionally, pictures and media are cached, but not permanently federated. When you upload a picture, you may have noticed the link becoming one that points to the instance you’re posting from. This doesn’t change even when that post gets federated to other instances: they still fetch that image from the instance it was posted from (unless it’s a recent post, in which case the image may well be cached locally, too).
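As a loose illustration of "cached, but not permanently federated", here is a toy Python cache that serves a recently seen image locally and otherwise points back to the origin instance. Everything here is an assumption for the sake of the example: the `MediaCache` class, the one-hour lifetime, and the URL are made up, and the real caching rules may differ.

```python
class MediaCache:
    """Toy time-limited media cache (hypothetical, not the real caching logic)."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds  # assumed cache lifetime
        self._cached_at = {}            # image url -> time it was cached

    def remember(self, url, now):
        # Record that a local copy of the image was made at time `now`.
        self._cached_at[url] = now

    def source_for(self, url, now):
        # Serve the local copy while it is fresh, otherwise fetch from the origin.
        cached_at = self._cached_at.get(url)
        if cached_at is not None and now - cached_at < self.ttl_seconds:
            return "local-cache"
        return "origin"


cache = MediaCache(ttl_seconds=3600)
image = "https://instan.ce/pictrs/image/example.png"  # made-up URL
cache.remember(image, now=0)
recent = cache.source_for(image, now=600)    # still within the assumed lifetime
later = cache.source_for(image, now=7200)    # past it, so back to the origin
```

Either way the image link itself keeps pointing at the instance it was uploaded to, so nothing has to be stored permanently elsewhere.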
This means that what gets federated is mostly just a bunch of text data, and even then, just the subset that is needed. A much lighter load.
At the smallest scale, you could have a node with just one user, perhaps that user creates a community or two. But this means that that instance will ONLY EVER store the subs of that one user, and the content of the communities they created. Not even close to the total content of the entire fediverse.