First, I apologize for the title, because it probably doesn't describe the problem well. I couldn't come up with a better one.
I'll give a simplified example of the real problem I'm trying to solve.
At the core, I have a benchmark that is surrounded by "before" and "after" calls, which record relevant information for the benchmark. The obvious example of something I record is the current timestamp, but there are many more interesting things such as cycle count, memory use, whatever. I call the action of recording these values a stamp, so we have something like this:
Stamp before = stamper.stamp();
// benchmark code goes here
Stamp after = stamper.stamp();
// maybe we calculate (after - before) here, etc
There are a lot of possible things we might want to record, and the information we need is specified at runtime. For example, we may want to calculate the wall-clock time using `std::chrono::high_resolution_clock`. We may want to calculate the CPU time using `clock(3)`, and so on. We may want to calculate the number of instructions executed and branches mispredicted using platform-specific performance counters.
Most of these need only a small snippet of code, and a lot of them share the same code except for a parameter value (e.g., the "instructions" and "branches" counters use the same code except they pass a different identifier for the performance counter to read).
More importantly, many of the values the end user might choose to see are composed as a function of multiple values, e.g., we might report an "instructions per nanosecond" value or a "branches mispredicted per instruction" value, each of which needs two values and then calculates their ratio.
Let's call this type of value that we want to output a metric (so "branches per instruction" is a metric) and the underlying values that we record directly a measurement (so "cycles" or "nanoseconds wall clock time" are measurements). Some metrics are as simple as a single measurement, but in general they can be more complicated (as in the ratio examples). In this framework, a stamp is simply a collection of measurements.
What I'm struggling with is how to create a mechanism where, given a list of desired metrics, a `stamper` object can be created whose `stamp()` method records all the necessary measurements, which can then be translated into metrics.
One option is something like this:
#include <cstddef>
#include <vector>

/* something that can take a measurement */
struct Taker {
    virtual ~Taker() = default;
    /* return the value of the measurement at the
       current instant */
    virtual double take() = 0;
};

// a Stamp is just an array of doubles, one
// for each registered Taker
using Stamp = std::vector<double>;

class Stamper {
    // registered takers, in registration order
    std::vector<Taker*> takers;
public:
    // register a Taker to be called during stamp()
    // returns: the index of the result in the Stamp
    std::size_t register_taker(Taker* t) {
        takers.push_back(t);
        return takers.size() - 1;
    }
    // return a Stamp for the current moment by calling each taker
    Stamp stamp() {
        Stamp result;
        result.reserve(takers.size());
        for (auto taker : takers) {
            result.push_back(taker->take());
        }
        return result;
    }
};
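Since a `Stamp` here is a plain vector of doubles, the `(after - before)` step mentioned earlier could be an element-wise subtraction. This is only a minimal sketch, assuming both stamps come from the same `Stamper`; `stamp_delta` is a hypothetical helper name, not part of the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same alias as in the question: one double per registered Taker.
using Stamp = std::vector<double>;

// Hypothetical helper (an assumption, not from the original code):
// element-wise difference of two stamps.
Stamp stamp_delta(const Stamp& before, const Stamp& after) {
    // Both stamps must have been produced by the same set of takers.
    assert(before.size() == after.size());
    Stamp delta(before.size());
    for (std::size_t i = 0; i < before.size(); ++i) {
        delta[i] = after[i] - before[i];
    }
    return delta;
}
```

The resulting `Stamp` uses the same indices as the inputs, so the index returned by `register_taker` can be used to look up a delta directly.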
Then you have `Taker` implementations for all the measurements you need, including stateful shared implementations for those that vary only in a parameter, like so:
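struct ClockTaker : public Taker {
    double take() override { return static_cast<double>(clock()); }  // clock() from <ctime>
};
struct PerfCounterTaker : public Taker {
    int counter_id;
    explicit PerfCounterTaker(int counter_id) : counter_id{counter_id} {}
    // read_counter() is the platform-specific read, not shown here
    double take() override { return read_counter(counter_id); }
};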
Finally you have a `Metric` interface and implementations which know which measurements they need and how to register the correct `Taker` objects and consume the result. A simple example is the clock metric:
struct Metric {
    virtual ~Metric() = default;
    virtual void register_takers(Stamper& stamper) = 0;
    virtual double get_metric(const Stamp& delta) = 0;
};
struct ClockMetric : public Metric {
    size_t taker_id;
    void register_takers(Stamper& stamper) override { taker_id = stamper.register_taker(new ClockTaker{}); }
    double get_metric(const Stamp& delta) override { return delta[taker_id]; }
};
A more complex Metric may register multiple `Takers`, e.g., for the ratio of two performance counters:
class PerfCounterRatio : public Metric {
    int top_id, bottom_id;
    size_t top_taker, bottom_taker;
public:
    PerfCounterRatio(int top_id, int bottom_id) : top_id{top_id}, bottom_id{bottom_id} {}
    void register_takers(Stamper& stamper) override {
        top_taker = stamper.register_taker(new PerfCounterTaker{top_id});
        bottom_taker = stamper.register_taker(new PerfCounterTaker{bottom_id});
    }
    // ratio of the two counter deltas
    double get_metric(const Stamp& delta) override { return delta[top_taker] / delta[bottom_taker]; }
};
Without fleshing out some additional details not shown, e.g., how the delta is taken, memory management, etc., this basically _works_, but it has the following problems:
- The same `Taker` object may be registered multiple times. For example, if you calculate "instructions per cycle" and "branches per cycle", the "cycles" performance counter will be registered twice. In practice, this is a serious problem because there can be a limit to the number of performance counters you can read, and even without a limit, the more stuff that happens in `stamp()`, the more overhead and noise is added to the measurement.
- The return type of `take()` is constrained by the `Taker` interface to `double` or some other "single" choice. In general, different `Taker` objects may have different types which naturally represent the result, and they would like to use them. Only at the very end, e.g., in `get_metric`, do we need to convert to a common numeric type for display (or maybe not even then, since polymorphic print code could handle different types).
The first problem is the main one and the one I'd like to solve. The second could already be solved by some kind of type erasure or whatever, but the solution to the first should also accommodate the second.
In particular, the `Metric` and `Taker` instances have a many-to-many relationship, but I want the minimal number of measurements to be taken.
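To make the requirement concrete, here is a naive sketch in which takers are registered under a key and a repeated key reuses the existing slot. `KeyedStamper`, the string keys, the `TakerFactory` argument and `ConstantTaker` are all assumptions made for illustration, not part of the code above. The sketch only deduplicates: the string keys give up some type safety and it does nothing about the second problem, so it is at best a starting point.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Taker {
    virtual ~Taker() = default;
    virtual double take() = 0;
};

using Stamp = std::vector<double>;
using TakerFactory = std::function<std::unique_ptr<Taker>()>;

// Hypothetical variant of Stamper: each measurement has a key, and
// registering an already-known key returns the existing index.
class KeyedStamper {
    std::vector<std::unique_ptr<Taker>> takers;
    std::map<std::string, std::size_t> index_by_key;
public:
    std::size_t register_taker(const std::string& key, const TakerFactory& make) {
        auto found = index_by_key.find(key);
        if (found != index_by_key.end()) {
            return found->second;  // same measurement requested again: reuse its slot
        }
        std::unique_ptr<Taker> taker = make();
        assert(taker != nullptr);  // the factory must produce a taker
        takers.push_back(std::move(taker));
        const std::size_t index = takers.size() - 1;
        index_by_key.emplace(key, index);
        return index;
    }

    std::size_t taker_count() const { return takers.size(); }

    // The hot path is unchanged: one take() call per distinct measurement.
    Stamp stamp() {
        Stamp result;
        result.reserve(takers.size());
        for (auto& taker : takers) {
            result.push_back(taker->take());
        }
        return result;
    }
};

// Stand-in taker returning a fixed value, used only to exercise the sketch.
struct ConstantTaker : Taker {
    double value;
    explicit ConstantTaker(double v) : value{v} {}
    double take() override { return value; }
};
```

With "instructions per cycle" and "branches per cycle", registering the key "cycles" twice yields the same index, so `stamp()` calls three takers rather than four. The lookup cost is paid only at registration time, not inside `stamp()`.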
Any pattern that works well here? Type safety should be preserved as much as possible. The `stamp()` method should be as efficient as possible, but the efficiency of the other methods doesn't matter.