Wednesday, September 13, 2023

Single-Machine Distributed Graph Processing for Education: Use Existing Storage or Implement a Custom Solution?

I need to design and implement a distributed graph-processing system for educational purposes. The primary objective is to develop distributed algorithms that traverse graphs and collect information from them.

I'm struggling to choose the right level of granularity for the implementation. I have only one machine, so each process has to stand in for a node, and a process failure has to stand in for a node failure in a distributed system. I also need some way to simulate distributed data placement on a single disk.
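To make the failure part concrete, here is a minimal sketch of what I have in mind, assuming each simulated node is a separate OS process started with Python's standard `subprocess` module. The names `start_node`, `crash_node`, `is_alive`, and `demo` are hypothetical helpers made up for illustration, and the worker body is a placeholder loop rather than real graph-processing code.

```python
import subprocess
import sys

# Placeholder worker body (an assumption): a "node" that idles until killed.
WORKER_SOURCE = "import time\nwhile True:\n    time.sleep(0.1)\n"


def start_node():
    """Start one simulated node as a separate OS process."""
    return subprocess.Popen([sys.executable, "-c", WORKER_SOURCE])


def crash_node(proc):
    """Simulate a crash: kill the process without letting it clean up."""
    proc.kill()
    proc.wait(timeout=5)


def is_alive(proc):
    """A node counts as alive while its process has no exit code yet."""
    return proc.poll() is None


def demo():
    """Start three nodes, crash node 1, and report which nodes are alive."""
    nodes = {node_id: start_node() for node_id in range(3)}
    crash_node(nodes[1])
    status = {node_id: is_alive(proc) for node_id, proc in nodes.items()}
    # Clean up the surviving processes so nothing is left running.
    for proc in nodes.values():
        if is_alive(proc):
            crash_node(proc)
    return status
```

Calling `demo()` should report node 1 as dead and the other two as alive. A failure detector in the real system would presumably rely on heartbeats or timeouts between processes rather than on the parent polling exit codes, so this only shows how a crash could be injected.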

Two questions arise:
  • How should I implement the distributed processing? Would it be better to use an existing storage solution and build a layer on top of its API, or should I implement my own simple storage system to showcase my methods?

  • If I choose to implement my own simple storage system, how can I simulate the data being located on different nodes (processes in my case, since everything runs on a single machine)?
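For the second question, one approach I'm considering is to give each simulated node its own directory on the shared disk and let it read only from there. The sketch below is an illustration under several assumptions: vertex IDs are integers, placement is a simple modulo rule, and each partition is stored as a JSON adjacency list. The names `owner`, `write_partitions`, `load_partition`, `bfs`, and `demo` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def owner(vertex, num_nodes):
    """Assumed placement rule: integer vertex IDs, assigned by modulo."""
    return vertex % num_nodes


def write_partitions(adjacency, num_nodes, root):
    """Write each node's share of the adjacency list to its own directory."""
    root = Path(root)
    for node_id in range(num_nodes):
        part = {str(v): nbrs for v, nbrs in adjacency.items()
                if owner(v, num_nodes) == node_id}
        node_dir = root / f"node-{node_id}"
        node_dir.mkdir(parents=True, exist_ok=True)
        (node_dir / "edges.json").write_text(json.dumps(part))


def load_partition(node_id, root):
    """A simulated node reads only the file in its own directory."""
    text = (Path(root) / f"node-{node_id}" / "edges.json").read_text()
    return {int(v): nbrs for v, nbrs in json.loads(text).items()}


def bfs(start, num_nodes, root):
    """Breadth-first traversal that looks up each vertex in its owner's partition.

    For brevity a single coordinator loads every partition here; in the real
    simulation each partition would be held by its own process.
    """
    partitions = {n: load_partition(n, root) for n in range(num_nodes)}
    visited, frontier, order = {start}, [start], []
    while frontier:
        next_frontier = []
        for v in frontier:
            order.append(v)
            for w in partitions[owner(v, num_nodes)].get(v, []):
                if w not in visited:
                    visited.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return order


def demo():
    """Partition a tiny graph across two simulated nodes and traverse it."""
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    with tempfile.TemporaryDirectory() as root:
        write_partitions(graph, 2, root)
        return bfs(0, 2, root)
```

With this layout, "losing a node" could be modeled by making its directory unreadable to the other processes, but whether that is granular enough for teaching purposes is exactly what I'm unsure about.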
