Using The Pigeonhole Principle in C++ Metaprogramming

The Pigeonhole Principle is one of the most obvious fundamentals in mathematics. It is so obvious that you may be surprised that there is even a name for it. It states that:

"If n items are put into m containers, with n > m, then at least one container must contain more than one item."

Alternatively,

"If there are n items and m containers, with n > m, and only one item can fit in a container, then at least one item must remain out."

For those who prefer visuals and really hate math:

Even though the principle is simple it has been used to prove many complex mathematical theorems and lemmas. Here is one I find quite interesting:

"Incompressible strings of every length exist."

Alternatively,

"There is a file of every size that your favorite zip program can't compress."

The solution is left to the reader as an exercise.

So, does the Pigeonhole Principle show up in programming. Of course it does. That's why std::vector must allocate memory when its capacity is full. OK, but does it manifest in more interesting ways? As it turns out, it has been used in compile-time meta-programming to achieve interesting results. It manifests in preprocessor meta-programming and in template meta-programming in two distinct flavors.

The Pigeonhole Principle in C++ Preprocessor Meta-programming

Check out the following example. Also available here. The original author of this trick is unknown to me.

#include <iostream>

#define COUNT_ARGS(...)     PP_NARG_IMPL(__VA_ARGS__,PP_RSEQ_N()) 
#define PP_NARG_IMPL(...)   PP_ARG_N(__VA_ARGS__) 
#define PP_ARG_N( _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, N, ...) N 
#define PP_RSEQ_N() 10,9,8,7,6,5,4,3,2,1,0 

int main()
{
  std::cout << COUNT_ARGS(a,b,c,d); // prints 4
}

COUNT_ARGS is a "simple" macro that counts the number of variadic arguments it is called with. It does that by using a preprocessing programming trick based on the Pigeonhole principle. Here is how the macro expands:

The COUNT_ARGS macro substitutes the arguments (a,b,c,d) in the __VA_ARGS__ part before calling PP_NARG_IMPL. The PP_RSEQ_N macro is a list of integers from 10 to 0, which is substituted in the PP_NARG_IMPL. Therefore, the PP_NARG_IMPL macro is "called" with actual arguments = a,b,c,d,10,9,8,7,6,5,4,3,2,1,0
The PP_NARG_IMPL macro simply forwards its arguments to the PP_ARG_N macro.
The PP_ARG_N macro is where the Pigeonhole Principle comes in to play. It has 11 named arguments: From _1, _2, _3, etc. and N. Note that _1, _2, etc. are not special. They are just macro arguments with an underscore at the beginning. You may want to rename them as one, two, three, four, etc. It won't make a difference. The PP_ARG_N always expands to its 11th argument because of N.
The original argument list has 15 arguments but there are only 11 arguments to the PP_ARG_N macro. Obviously, not all are going to fit. The PP_ARG_N macro only "picks up" the first actual argument that does not get a slot (i.e., 11th)
As N always coincides with the 11th actual argument, the PP_ARG_N results in that value producing the count.

Needless to say, that's clever! Now let's proceed with template meta-programming.

The Pigeonhole Principle in C++ Template Meta-programming

Check out the following example. Also available here.

int main()
{
 auto x = ::nth<7>(0,"1",'2',3,"4",'5',6,"7");
 std::cerr << x << std::endl;
}

The goal is to access the N-th element in a variadic function argument list. The output of the above program should be 7.

There are many ways to go about implementing it, most using recursion of some sort. However, there is one implementation I came across, which I find particularly interesting. Why? You guessed it ... It uses the Pigeonhole Principle to avoid recursion.

The code was originally written by Richard Smith. I found it through a post by Roland Bock on the boost developers mailing list. If you prefer more comments, please see the same example with comments by LJEvans.

#include <utility>
#include <iostream>

namespace detail
{
  struct any { template<typename T> any(T &&) {} };

  template<typename T, typename U> struct first { typedef T type; };

  template<typename ...Ts>
  struct select_impl 
  {
    template<typename U, typename ...Vs>
 static U &&select(typename first<any, Ts>::type..., U &&u, Vs &&...) 
    {
    return static_cast<U&&>(u);
    }
  };

  template<std::size_t... Idx, typename... Ts>
  static auto select(const std::index_sequence<Idx...>&, Ts&&... ts)
  {
     return select_impl<decltype(Idx)...>::select(static_cast<Ts&&>(ts)...);
  }
}

template<std::size_t N, typename ...Ts>
auto nth(Ts &&...ts)
{
  return detail::select(std::make_index_sequence<N>(), static_cast<Ts&&>(ts)...);
}

int main()
{
 auto x = ::nth<7>(0,"1",'2',3,"4",'5',6,"7"); // prints 7
 std::cerr << x << std::endl;
}

Here is how the nth<7>(...) function works in the example above.

N is 7 and Ts is a variadic parameter pack of integers, character strings, and plain characters.
The std::make_index_sequence is a new addition in C++14 that produces an instance of std::index_sequence given a compile-time integral constant. Here, it produces std::index_sequence<0,1,2,3,4,5,6>.
The formal arguments to the nth function (captured in the parameter pack ts) are forwarded to detail::select using a static_cast. This function must return the nth argument among the forwarded arguments.
In detail::select, the Idx parameter pack represents the indices from 0 to 6. Its deduced by the compiler looking at the type of the index_sequence instance.
The select_impl class template is instantiated with the decltype of each member in the Idx parameter pack. decltype(ts)... expands in to a list of types for every member in Ids. In this case, it is just 'int, int, int,... 7 times. The remaining arguments to select_impl::select are just being forwarded as before.
The select_impl::select has access to Ts parameter pack, which is at the class-template level. Recall that it is 'int,int,int,....'. The list of formal arguments to select_impl::select is broken down in to 3 parts: a variadic piece of N-1 arguments at the beginning, U&& in the middle, and everything else in Vs.
The first N-1 arguments to select_impl::select are "absorbed" using the detail::any class. The detail::any has a single argument constructor that converts argument of any type to any. The first N-1 arguments are thus converted to any. In our example, all the arguments from 0 to 6 are converted to any. The conversion is achieved using an in place parameter pack expansion 'typename first::type...'. For every argument in the Ts parameter pack, the 'first' meta-function is applied, which results into the 'any' type every time.
As the first N-1 arguments are out of the way, U&& necessarily fits the N-th argument. This is where the Pigeonhole Principle springs back in to action.
The remaining argument after the N-th (if any) are left unused in the Vs parameter pack.

So, there it is: returning the N-th argument in a argument list without using recursion. In practice, however, std::make_index_sequence is implemented using recursion. So, the above code is not truly recursion-free.

OK ... So you read it all! I'm sure you found the use of the Pigeonhole Principle in processing variadics in C++ very interesting.

Using The Pigeonhole Principle in C++ Metaprogramming

0 comments:

Post a Comment

Recent Post

Recent Comments

Blog Archive