I was doing some research on how to store a user's friends in a database and I created my table based on a model I got from this answer here
The person also suggested to set up indexes on the (UserIDLink1, UserIDLink2) and vice versa (UserIDLink2, UserIDLink1). This is a bidirectional relationship also noted by the answer
(A is a friend of B if B is a friend of A)
I'm fairly new to databases but I made these indexes in Postgresql with a btree type and I'm selecting all of the users friends with this:
SELECT * FROM friends WHERE user_id_link1 = '6762' OR user_id_link2 = '6762'
When I use EXPLAIN
I see it's still doing a sequence scan but maybe this is because I only have one entry in that table right now.
Either way this doesn't seem efficient nor does it look to scale so well. If I have n = 10,000 users the edge case here would be (n^2) entries if every user was friends with every user. Not very likely but if I had 1,000,000 users that would still be a lot of entries in one table for every bidirectional relationship.
The way I'm selecting all these users doesn't look too great either. I have and an OR
operation which is constant complexity but it doubles the amount of columns it's trying to match.
Maybe I'm wrong but this looks like a future disaster.
Here are my schemas
CREATE TABLE friends(
user_id_link1 TEXT NOT NULL,
username_link1 VARCHAR(15) NOT NULL,
display_name_link1 VARCHAR(50) NOT NULL,
avatar_url_link1 TEXT NOT NULL,
user_id_link2 TEXT NOT NULL,
username_link2 VARCHAR(15) NOT NULL,
display_name_link2 VARCHAR(50) NOT NULL,
avatar_url_link2 TEXT NOT NULL
);
CREATE INDEX link_user_id2 ON friends
(
// also created the vice versa of this
user_id_link2 DESC,
user_id_link1 DESC
);
Can this table be spread out on multiple disks?
Is there a better way to optimize this table?
Would it be better to just create a table for each user that way I can use a simple SELECT * FROM 6762_friends
?
Aucun commentaire:
Enregistrer un commentaire