mardi 26 juin 2018

Simplify an extensible "Perform Operation X on Data Y" framework

tl;dr: My goal is to conditionally provide implementations for abstract virtual methods in an intermediate workhorse template class (depending on template parameters), but to leave them abstract otherwise so that classes derived from the template are reminded by the compiler to implement them if necessary.

I am working on an extensible framework to perform "operations" on "data". One main goal is to allow XML configs to determine program flow, and allow users to extend both allowed data types and operations at a later date, without having to modify framework code.

If either one (operations or data types) is kept fixed architecturally, there are good patterns to deal with the problem. If allowed operations are known ahead of time, use abstract virtual functions in your data types (new data have to implement all required functionality to be usable). If data types are known ahead of time, use the Visitor pattern (where the operation has to define virtual calls for all data types).

Now if both are meant to be extensible, I could not find a well-established solution.

My solution is to declare them independently from one another and then register "operation X for data type Y" via an operation factory. That way, users can add new data types, or implement additional or alternative operations and they can be produced and configured using the same XML framework.

If you create a matrix of (all data types) x (all operations), you end up with a lot of classes. Hence, they should be as minimal as possible, and eliminate trivial boilerplate code as far as possible, and this is where I could use some inspiration and help.

There are many operations that will often be trivial, but might not be in specific cases, such as Clone() and some more (omitted here for "brevity"). My goal is to conditionally provide implementations for abstract virtual methods if appropriate, but to leave them abstract otherwise.

Here is a (fairly) minimal compilable example of the situation:

//Base class for all operations on all data types. Will be inherited from. A lot. Base class does not define any concrete operation interface, nor does it necessarily know any concrete data types it might be performed on.

class cOperation
{
public:
  virtual ~cOperation() {}

  virtual std::unique_ptr<cOperation> Clone() const = 0;
  virtual bool Serialize() const = 0;
  //... more virtual calls that can be either trivial or quite involved ...

protected:
  cOperation(const std::string& strOperationID, const std::string& strOperatesOnType)
    : m_strOperationID()
    , m_strOperatesOnType(strOperatesOnType)
  {
    //empty
  }

private:
  std::string m_strOperationID;
  std::string m_strOperatesOnType;
};

//Base class for all data types. Will be inherited from. A lot. Does not know any operations that might be performed on it.
struct cDataTypeBase 
{
  virtual ~cDataTypeBase() {}
};

Now, I'll define an example data type.

//Some concrete data type. Still does not know any operations that might be performed on it.
struct cDataTypeA : public cDataTypeBase
{
  static const std::string& GetDataName()
  {
    static const std::string strMyName = "cDataTypeA";
    return strMyName;
  }
};

And here is an example operation. It defines a concrete operation interface, but does not know the data types it might be performed on.

//Some concrete operation. Does not know all data types it might be expected to work on.
class cConcreteOperationX : public cOperation
{
public:
  virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) = 0;

protected:
  cConcreteOperationX(const std::string& strOperatesOnType)
    : cOperation("concreteOperationX", strOperatesOnType)
  { 
    //empty
  }
};

The following template is meant to be the boilerplate workhorse. It implements as much trivial and repetitive code as possible and is provided alongside the concrete operation base class - concrete data types are still unknown, but are meant to be provided as template parameters.

//ConcreteOperationTemplate: absorb as much common/trivial code as possible, so concrete derived classes can have minimal code for easy addition of more supported data types
template <typename ConcreteDataType, typename DerivedOperationType, bool bHasTrivialCloneAndSerialize = false>
class cConcreteOperationXTemplate : public cConcreteOperationX
{
public:
  //Can perform datatype cast here:
  virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) override
  {
    const ConcreteDataType* pCastData = dynamic_cast<const ConcreteDataType*>(&dataBase);
    if (pCastData == nullptr)
    {
      return false;
    }
    return doSomeConcreteOperationXOnCastData(*pCastData);
  }

protected:
  cConcreteOperationXTemplate()
    : cConcreteOperationX(ConcreteDataType::GetDataName()) //requires ConcreteDataType to have a static method returning something appropriate
  {
    //empty
  }

private:
  //Clone can be implemented here via CRTP
  virtual std::unique_ptr<cOperation> Clone() const override
  {
    return std::unique_ptr<cOperation>(new DerivedOperationType(*static_cast<const DerivedOperationType*>(this)));
  }

  //TODO: Some Magic here to enable trivial serializations, but leave non-trivials abstract
  //Problem with current code is that virtual bool Serialize() override will also be overwritten for bHasTrivialCloneAndSerialize == false
  virtual bool Serialize() const override
  {
    return true;
  }

  virtual bool doSomeConcreteOperationXOnCastData(const ConcreteDataType& castData) = 0;
};

Here are two implementations of the example operation on the example data type. One of them will be registered as the default operation, to be used if the user does not declare anything else in the config, and the other is a potentially much more involved non-default operation that might take many additional parameters into account (these would then have to be serialized in order to be correctly re-instantiated on the next program run). These operations need to know both the operation and the data type they relate to, but could potentially be implemented at a much later time, or in a different software component where the specific combination of operation and data type are required.

//Implementation of operation X on type A. Needs to know both of these, but can be implemented if and when required.
class cConcreteOperationXOnTypeADefault : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeADefault, true>
{
  virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
  {
    //...do stuff...
    return true;
  }
};

//Different implementation of operation X on type A.
class cConcreteOperationXOnTypeASpecialSauce : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeASpecialSauce/*, false*/>
{
  virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
  {
    //...do stuff...
    return true;
  }

  //Problem: Compiler does not remind me that cConcreteOperationXOnTypeASpecialSauce might need to implement this method
  //virtual bool Serialize() override {}
};

int main(int argc, char* argv[])
{
  std::map<std::string, std::map<std::string, std::unique_ptr<cOperation>>> mapOpIDAndDataTypeToOperation;
  //...fill map, e.g. via XML config / factory method...

  const cOperation& requestedOperation = *mapOpIDAndDataTypeToOperation.at("concreteOperationX").at("cDataTypeA");
  //...do stuff...

  return 0;
}

Aucun commentaire:

Enregistrer un commentaire