mardi 5 mai 2020

How to scale a bidirectional relationship table?

I was doing some research on how to store a user's friends in a database and I created my table based on a model I got from this answer here

The person also suggested to set up indexes on the (UserIDLink1, UserIDLink2) and vice versa (UserIDLink2, UserIDLink1). This is a bidirectional relationship also noted by the answer

(A is a friend of B if B is a friend of A)

I'm fairly new to databases but I made these indexes in Postgresql with a btree type and I'm selecting all of the users friends with this:
SELECT * FROM friends WHERE user_id_link1 = '6762' OR user_id_link2 = '6762'

When I use EXPLAIN I see it's still doing a sequence scan but maybe this is because I only have one entry in that table right now.

Either way this doesn't seem efficient nor does it look to scale so well. If I have n = 10,000 users the edge case here would be (n^2) entries if every user was friends with every user. Not very likely but if I had 1,000,000 users that would still be a lot of entries in one table for every bidirectional relationship.

The way I'm selecting all these users doesn't look too great either. I have and an OR operation which is constant complexity but it doubles the amount of columns it's trying to match.

Maybe I'm wrong but this looks like a future disaster.

Here are my schemas

CREATE TABLE friends(
    user_id_link1         TEXT         NOT NULL,
    username_link1        VARCHAR(15)  NOT NULL,
    display_name_link1    VARCHAR(50)  NOT NULL,
    avatar_url_link1      TEXT         NOT NULL,
    user_id_link2         TEXT         NOT NULL,
    username_link2        VARCHAR(15)  NOT NULL,
    display_name_link2    VARCHAR(50)  NOT NULL,
    avatar_url_link2      TEXT         NOT NULL
);
CREATE INDEX  link_user_id2 ON friends 
(
    // also created the vice versa of this
    user_id_link2 DESC,
    user_id_link1 DESC
);

Can this table be spread out on multiple disks?
Is there a better way to optimize this table?
Would it be better to just create a table for each user that way I can use a simple SELECT * FROM 6762_friends?

Aucun commentaire:

Enregistrer un commentaire