samedi 30 mai 2020

Design Pattern for lazy Fingerprint evaluation and comparison

First off, as "Fingerprint" is a rather broad term, this question is not about finger tips nor specifically about Hashing.

So here is the scenario: I'm developing a fingerprint for special types of graphs (won't go into much detail about the the actual data of this fingerprint as it is not relevant) and need to compute a fingerprint that identifies these graphs according to certain features. I have a POD (struct) that holds this fingerprint data. As an example consider this fingerprint:

struct Crossing; //unimportant for the question
struct Fingerprint{
  int nodes;
  vector<vector<bool>> adjacencies;
  vector<Crossing> crossings;
};

This is not my actual fingerprint but rather a much simpler example that (kind of) works for my question.

To compute my fingerprint I have to consider all possible labelings (ways to label nodes) of my graph and the one that yields the "smallest" fingerprint is used to generate the actual fingerprint. Now as some of the members of this struct are much faster to compute than others (and can be used to throw out labelings that definitely don't generate a minimal fingerprint) I have decided to evaluate the members of this struct lazily. For now I have done this as follows:

Fingerprint getFingerprint(Graph graph){
  auto labelings = graph.generateLabelings();//type of labelings not relevant to question
  auto min_labeling = labelings[0]; //labeling yielding minimum fingerprint
  bool first = true;
  bool adjacencies_evaluated = false;
  bool crossings_evaluated = false;
  int min_nodes;
  vector<vector<bool>> min_adjacencies;
  vector<Crossing> min_crossings;
  //note that labeling is only needed to calculate fingerprint but not part of it:
  for(auto& labeling: labelings)
  {
    auto nodes = getNodes(graph, labeling); 
    //yes I know the number of nodes is independent of the labeling but in my case
    //the labeling matters for all parts of the fingerprint (this example isn't perfect)
    if(first)
    {
      min_nodes = nodes;
      first = false;
      continue;
    }
    if(nodes < min_nodes)
    {
      min_nodes = nodes;
      min_labeling = labeling;
      bool adjacencies_evaluated = false;
      bool crossings_evaluated = false;
    }
    else
      if(nodes == min_nodes)
      {
        if(!adjacencies_evaluated)
          min_adjacencies = getAdjacencies(graph, min_labeling);
        auto adjacencies = getAdjacencies(graph, labeling);
        adjacencies_evaluated = true;
        if(adjacencies < min_adjacencies)
        {
          min_adjacencies = adjacencies;
          min_labeling = labeling;
          crossings_evaluated = false;
        }
        else
          if(adjacencies == min_adjacencies)
          {
            if(!crossings_evaluated)
              min_crossings = getCrossings(graph, min_labeling);
            auto crossings = getCrossings(graph, labeling);
            crossings_evaluated = true;
            if(crossings < min_crossings)//comparison operators for UDT are available
            {
              min_crossings = crossings;
              //not necessary to update min_labeling here
            }
          }
      }
  }
  //min_nodes is already evaluated
  if(!adjacencies_evaluated)
    min_adjacencies = getAdjacencies(graph, min_labeling);
  if(!crossings_evaluated)
    min_crossings = getCrossings(graph, min_labeling)

  return Fingerprint{min_nodes, min_adjacencies, min_crossings};
}

I have simplified the fingerprint for this question a lot, but the code to generate it is still rather large and most importantly repetitive. The basic idea is to evaluate the first parts of the fingerprint, check if they are sufficient for a comparison, and if not to compute more parts of it. And once the minimum labeling is found the not evaluated parts of the fingerprint are set.

Now my question is, if there is a way to generalize the above idea, so that if I ever change my fingerprint not much has to be changed in this function.

My idea would be to provide some kind of container with function-pointers to the functions calculating the members of fingerprint like:

functions = {&getNodes, &getAdjacencies, &getCrossings}

However I'm not sure how to actually store this to

  1. be able to access them like functions[i](graph, labeling) and iterate over them
  2. have the compiler know which element returns which type

On top of that, later in the process I would need to store the return values in the right fingerprint members. In C++ without reflection this approach seems impossible to me, so is there a way to do this (or any other way) to generalize my fingerprint computation?

Aucun commentaire:

Enregistrer un commentaire