design-patterns: Suggestion required to design partitioning in cassandra

lundi 10 février 2020

Suggestion required to design partitioning in cassandra

I have to design the database for customers having prices for millions of materials they acquire through multiple suppliers for the next 24 months. So the database will store prices on a daily basis for every material supplied by a specific supplier for the next 24 months. So we keep past data. Now lookups will happen on: 1. Find the price for a material by the supplier as of the specific date by a customer. 2. Find the price for a material by the supplier for a time period by a customer.

I can think for the primary key as: 1. Partition key: (customer Id, material Id, supplier Id, date) -- will this end up in perf issues as it will make so many partitions in long run? 2. Partition Key: (customer Id, material Id, supplier Id, monthbucket), clustering key: date -- monthbucket will store data for material on monthly basis on same partition and will have value like '202002' for Feb 2020 dates.

Another question is how can I make sure my data distributes evenly across nodes.

Note: 1. combination of customer, material, supplier and date is unique. 2. two customer can have similar material ids.

do you have any suggestions or questions?

Thanks.

design-patterns

lundi 10 février 2020

Suggestion required to design partitioning in cassandra

Aucun commentaire:

Enregistrer un commentaire