samedi 18 mai 2019

Designing an iterator with two pointers to nested objects

Container description

I am building a container that is meant to store sorted unsigned integer values in a very RAM efficient manner. The idea is to group values by common radix. Instead of

std::vector<unsigned int> V = {1234,1254,1264,1265,1267,1268,1271,1819,1832,1856,
                                  1867,1892,3210,3214,3256,3289};

I would have something like

MyContainer V =
{
   {12..
      ..34, ..54, ..64, ..65, ..67, ..68, ..71
   },
   {18..
      ..19, ..32, ..56, ..67, ..92
   }
   {32..
      ..10, ..14, ..56, ..89
   }
};

Briefly, here is the organization of this container...

class MyContainer
{
   // attribute
   std::vector<Block> data;

   // methods
   // ..
}

class Block
{
   // attributes
   unsigned short radix;
   std::vector<unsigned short> suffixs; // contains all the suffix associated with this radix

   // methods
   // ..
}

While advice about this data structure would be welcome, the core of my question is about the implementation of the iterator.

Iterator

I am having issues when building the iterator. It is my first time building a classic iterator design pattern and I am likely making mistakes. Here are the attributes of my iterator

class Iterator
{
   // attributes
   std::vector<Block>::iterator bigP;             // points to a Block
   std::vector<unsigned short>::iterator smallP;  // points to an unsigned int within the Block pointed by bigP
   std::vector<Block>::iterator bigPEnd;

  // methods
  // ...
}

Q1: Should MyContainer::iterator contains iterators (as it is currently the case) or should it contain pointers? Why?

I thought that when the iterator points to the last unsigned int of a Block, then the operator++() should push bigP to the next Block and push smallP to the first element of the

It seems wrong to me that I have to include an iterator to the data.end() (called bigPEnd) but I ended up adding it when I realize that when the operator++() is called while the MyContainer::iterator points to the last unsigned int of the last Block, I had to know that I can't set smallP to bigP->begin() as this would lead to a segmentation fault as *bigP does not exist.

Q2: Do I need a pointer to the last element of data? How to avoid it?

I am also facing a similar issue when building an MyContainer::iterator for an empty vector. Typically, I would construct an iterator with

MyContainer::iterator MyContainer::begin()
{
   return iterator(data.begin(), data.front().suffixs.begin(), data.end());
        //            ^^                 ^^                       ^^
        //           bigP              smallP                    bigPEnd
}

However, data.front() will lead to a segmentation fault when data is empty. If I used pointers I could set smallP to nullptr when data is empty and when bigP == data.end() whether or not the data is empty.

Q3: How can I deal with smallP when there is nothing to point to?

Can you please give me some advice on the implementation of this iterator?

Aucun commentaire:

Enregistrer un commentaire