Wednesday, September 23, 2015

How to transform heterogeneous flat data into a data structure

I am looking for a way to transform flat data into a data structure. The input for this transformation is not homogeneous: some sources contain more information than I need, while others contain values that need processing first.

Let me explain with an example. Suppose I have some Excel files with car data. The files contain info about cars and their engines.

File 1:

Name | Type | EngineId | Manufacturer | Power (hp) | Torque
Opel | Adam | I4       | Opel         | 69         | 115

File 2:

Brand | Type  | Engine | Power (kW) | Manufacturer
Fiat  | Punto | 1.2-L  | 44         | Chrysler    

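As you can see, the files differ slightly: the first column is called Name in File 1 and Brand in File 2, the engine column is EngineId in one and Engine in the other, Power uses different units of measure (hp vs. kW), Manufacturer is in a different position, and Torque is missing from File 2.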
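To make these differences concrete, here is a minimal sketch of the two normalisations they imply: mapping differing header names to one canonical name, and converting kW to hp. The class `HeaderNormalizer`, its alias table, and the canonical names `PowerHp` and `PowerKw` are hypothetical, and the conversion assumes mechanical horsepower rather than metric horsepower.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderNormalizer
{
    // Hypothetical alias table: maps each file's header to one canonical name.
    private static readonly Dictionary<string, string> Aliases =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Name", "Name" },
            { "Brand", "Name" },
            { "EngineId", "EngineId" },
            { "Engine", "EngineId" },
            { "Power (hp)", "PowerHp" },
            { "Power (kW)", "PowerKw" },
        };

    // Returns the canonical name, or the trimmed header unchanged when no alias is known.
    public static string Canonical(string header)
    {
        string key = header.Trim();
        return Aliases.TryGetValue(key, out string canonical) ? canonical : key;
    }

    // Assumption: mechanical horsepower (1 kW is about 1.341 hp).
    // Metric horsepower would use a factor of about 1.360 instead.
    public static double KilowattsToHorsepower(double kilowatts)
    {
        return kilowatts * 1.34102;
    }
}
```

Column order does not matter with this approach, because values are looked up by header name rather than by position.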

I'd like to transform this to something like:

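public class Car {
    public string Name { get; set; }
    public string Type { get; set; }
    public Engine Engine { get; set; }
}

public class Engine {
    public string Id { get; set; }
    public string Manufacturer { get; set; }
    public double Power { get; set; } // one unit for all sources; the files use hp and kW
    // Needs: using System.Collections.Generic;
    public Dictionary<string,string> OtherAttributes { get; set; }
}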

I think the transformation also needs classes that describe the mapping rules:

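public class MappingRules {
    public string FileType { get; set; } // File 1 vs File 2
    // Named Rules because a C# member cannot share its enclosing type's name.
    // Closed over string here on the assumption that cell values are read as text.
    public List<MappingRule<string>> Rules { get; set; }
}

public class MappingRule<T> {
    public string SourceColumnName { get; set; }
    public string Target { get; set; }
    public ITranslate<T> Translator { get; set; }
}

public interface ITranslate<T> {
    T Convert(T sourceValue);
}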
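As a rough illustration of how such rules might be applied, here is a minimal sketch under several assumptions. It swaps the `ITranslate<T>` interface for plain delegates to stay short, it treats a row as a dictionary from header to cell text (reading the Excel file is out of scope), and it stores Power in mechanical horsepower. `ColumnRule`, `CarMapper`, and `File2Rules` are hypothetical names, and `Car` and `Engine` are repeated so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Engine
{
    public string Id { get; set; }
    public string Manufacturer { get; set; }
    public double Power { get; set; } // horsepower in this sketch
    public Dictionary<string, string> OtherAttributes { get; } = new Dictionary<string, string>();
}

public class Car
{
    public string Name { get; set; }
    public string Type { get; set; }
    public Engine Engine { get; } = new Engine();
}

// Hypothetical rule: writes one source column into the car via a setter delegate.
public class ColumnRule
{
    public string SourceColumnName { get; }
    public Action<Car, string> Apply { get; }

    public ColumnRule(string sourceColumnName, Action<Car, string> apply)
    {
        SourceColumnName = sourceColumnName;
        Apply = apply;
    }
}

public static class CarMapper
{
    // Rules for File 2; File 1 would get its own list keyed on "Name", "EngineId" and "Power (hp)".
    public static readonly List<ColumnRule> File2Rules = new List<ColumnRule>
    {
        new ColumnRule("Brand", (car, v) => car.Name = v),
        new ColumnRule("Type", (car, v) => car.Type = v),
        new ColumnRule("Engine", (car, v) => car.Engine.Id = v),
        new ColumnRule("Manufacturer", (car, v) => car.Engine.Manufacturer = v),
        // Unit conversion lives in the rule: kW to mechanical horsepower.
        new ColumnRule("Power (kW)", (car, v) =>
            car.Engine.Power = double.Parse(v, CultureInfo.InvariantCulture) * 1.34102),
    };

    // Applies every rule whose column is present; columns without a rule land in OtherAttributes.
    public static Car Map(IDictionary<string, string> row, IEnumerable<ColumnRule> rules)
    {
        var car = new Car();
        var handled = new HashSet<string>();
        foreach (var rule in rules)
        {
            if (row.TryGetValue(rule.SourceColumnName, out string value))
            {
                rule.Apply(car, value);
                handled.Add(rule.SourceColumnName);
            }
        }
        foreach (var cell in row)
        {
            if (!handled.Contains(cell.Key))
            {
                car.Engine.OtherAttributes[cell.Key] = cell.Value;
            }
        }
        return car;
    }
}
```

With a rule list per file type, a column such as Torque that has no rule is not lost: it ends up in `OtherAttributes`, and a missing column is simply skipped.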

My question is twofold: how can I achieve this and, more importantly, how do I research this kind of problem?
