CLI in C++: Native Designs
This is the fifth installment in the series of posts about designing a Command Line Interface (CLI) parser for C++. The previous posts were:
Today we will start exploring the possible design alternatives for a CLI parser. But first, let’s divide all the possible designs into two categories. In the first category are designs that define the command line interface in the C++ source code itself. We will call them native. In the second category are designs that define the command line interface outside of C++, in a so-called domain-specific language (DSL). Such a definition is then translated to C++ using a DSL compiler. We will call these designs DSL-based. The native approach is preferable since it is more flexible, easier to maintain, and, overall, keeps things simple. If we cannot achieve the ideal solution with a native design, then we will need to decide whether the drawbacks of the best native solutions outweigh the trouble of going the DSL route. Today we will concentrate on the native designs.
Let’s also reiterate the properties of the ideal solution that we have established so far:
- Aggregation: options are stored in an object
- Static naming: option accessors have names derived from option names
- Static typing: option accessors have return types fixed to option types
- No repetition: the option name and option type are specified only once for each option
The two native solutions that we have seen so far and that have come closest to the ideal are the functor-based design and the template-based design. Here is the recap of the functor-based CLI definition:
    struct options: cli::options
    {
      options ()
        : help (false, "--help"),
          version (false, "--version"),
          compression (5, "--compression")
      {
      }

      cli::option<bool> help;
      cli::option<bool> version;
      cli::option<unsigned short> compression;
    };
And here is the template-based version:
    extern const char help[] = "help";
    extern const char version[] = "version";
    extern const char compression[] = "compression";

    typedef cli::options<help, bool,
                         version, bool,
                         compression, unsigned short> options;
    typedef cli::options_spec<options> options_spec;

    int main ()
    {
      options_spec spec;
      spec.option<compression> ().default_value (5);
      ...
    }
Both solutions satisfy the first three properties but fail the “No repetition” one. In both cases we have to repeat the option name at least three times.
To see whether we can improve on the functor-based design, we can try to analyze it on a more elementary level. To satisfy the second property (static naming), we need a C++ identifier (i.e., a function or a functor name) corresponding to the option name. We also need a string representation of the option name so that we can compare it to the elements of the command line array during parsing. Since there is no easy way to get one from the other (the easiest method would probably be to use the debug information), we will have to repeat the option name at least twice. Thus the best definition that we can hope to achieve would be something along these lines (pseudo C++):
    struct options: cli::options
    {
      cli::option<bool, "--help"> help;
      cli::option<bool, "--version"> version;
      cli::option<unsigned short, "--compression", 5> compression;
    };
Unfortunately, string literals cannot be template arguments, neither in the current C++98 nor in the upcoming C++0x. As a result, the function/functor declaration and the place where it is “connected” to the string representation of the option name have to be separated, which brings the number of required option name repetitions to three.
With the template-based design, even if we could use string literals directly as template arguments, it would violate the second property (static naming). The use of variable names in accessing the option values guarantees that if we misspell any of them, it will be detected by the compiler.
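To make the distinction between a string literal and a named array concrete, here is a minimal sketch. The option_name helper is hypothetical and exists only for this illustration; it is not part of the cli:: interface shown above. It relies on the same trick as the template-based recap: an array with external linkage is a valid argument for a const char* template parameter, while a string literal is not.

```cpp
#include <cassert>
#include <cstring>

// Names are arrays with external linkage, as in the template-based recap.
extern const char help[] = "help";
extern const char version[] = "version";

// Hypothetical helper: ties a compile-time name to its run-time string.
template <const char* Name>
struct option_name
{
  static const char* str () { return Name; }
};

// Each array yields a distinct type, so the name is resolved at compile time.
typedef option_name<help> help_name;
typedef option_name<version> version_name;

// Neither of the following lines compiles:
//
// typedef option_name<"help"> bad1; // string literal as a template argument
// typedef option_name<hepl> bad2;   // misspelled identifier is undeclared
```

The second commented-out line is the point of the paragraph above: a misspelled name is a compile-time error rather than a run-time lookup failure.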
Each approach also has a number of implementation-related problems. In the functor-based design, the use of functors instead of normal member functions makes the resulting options class harder to understand. Functors also cannot be easily overridden should we decide to make some of the accessors virtual. Finally, this design needs a global (or thread-local) variable to implement automatic option registration. There is nothing we can do about any of these drawbacks without greatly increasing the verbosity of the CLI definition.
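To show where the global (or thread-local) variable comes from, here is one way the functor-based design could be implemented. This is only a sketch under assumptions: the option_base class, the current_options variable, and the add(), flag(), set(), and parse() functions are names made up for this illustration, and it requires a C++11 compiler for thread_local. The base class constructor runs before the member constructors, so it can record the object being built, and each option then registers itself with that object.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

namespace cli
{
  // Type-erased view of an option, used by the parser.
  struct option_base
  {
    explicit option_base (const char* n): name (n) {}
    virtual ~option_base () {}
    virtual bool flag () const = 0;            // true if no value follows
    virtual void set (const std::string&) = 0; // store the parsed value
    std::string name;
  };

  class options;

  // Hypothetical registration hook: the options object that is currently
  // being constructed on this thread.
  thread_local options* current_options = nullptr;

  class options
  {
  public:
    options () { current_options = this; }
    options (const options&) = delete; // registered pointers would dangle

    void add (option_base* o) { opts_.push_back (o); }

    // Match each argument against the registered option names.
    void parse (int argc, const char* const* argv)
    {
      for (int i = 1; i < argc; ++i)
        for (option_base* o: opts_)
          if (o->name == argv[i])
          {
            if (o->flag ())
              o->set ("");
            else if (i + 1 < argc)
              o->set (argv[++i]);
            break;
          }
    }

  private:
    std::vector<option_base*> opts_;
  };

  template <typename T>
  class option: public option_base
  {
  public:
    option (const T& def, const char* n): option_base (n), value_ (def)
    {
      current_options->add (this); // automatic registration
    }

    const T& operator() () const { return value_; } // the accessor
    bool flag () const override { return false; }
    void set (const std::string& v) override
    {
      std::istringstream is (v);
      is >> value_;
    }

  private:
    T value_;
  };

  // Flags take no value; their presence on the command line sets them.
  template <>
  class option<bool>: public option_base
  {
  public:
    option (bool def, const char* n): option_base (n), value_ (def)
    {
      current_options->add (this);
    }

    bool operator() () const { return value_; }
    bool flag () const override { return true; }
    void set (const std::string&) override { value_ = true; }

  private:
    bool value_;
  };
}

struct options: cli::options
{
  options ()
    : help (false, "--help"),
      version (false, "--version"),
      compression (5, "--compression")
  {
  }

  cli::option<bool> help;
  cli::option<bool> version;
  cli::option<unsigned short> compression;
};
```

With this in place, the values are read as o.help () or o.compression () after a call to o.parse (argc, argv). Note that the sketch never resets current_options and has to disable copying, which hints at the kind of hidden state this design depends on.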
As we have discussed in the previous post, the template-based approach does not scale to a large number of options. But can its implementation be improved using C++0x? At first glance, variadic templates look promising. However, a class template can have only a single unbounded template argument (parameter pack). In other words, there is no way to have a “parallel” pair of unbounded template arguments (option type and option name in our case). One way to resolve this is to wrap each option declaration into a separate type, for example:
    typedef cli::options<cli::option<help, bool>,
                         cli::option<version, bool>,
                         cli::option<compression, unsigned short>> options;
So with the help of C++0x we can make the template-based implementation scale, but this comes at the cost of increased verbosity.
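As a rough illustration of how the wrapped declaration could be implemented, here is a minimal sketch that assumes a compiler with variadic template support (C++11). The try_set(), set(), and get() functions are hypothetical names chosen for this example and are not the cli:: interface from the earlier posts. The options class inherits from every option wrapper, and the accessor finds the right base by deducing the value type from the explicitly given name.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

namespace cli
{
  // One option: a name (an array with linkage) paired with a value type.
  template <const char* Name, typename T>
  struct option
  {
    option (): value () {}

    // Store v if n is this option's name; report whether it matched.
    bool try_set (const std::string& n, const std::string& v)
    {
      if (n != Name)
        return false;

      std::istringstream is (v);
      is >> std::boolalpha >> value;
      return true;
    }

    T value;
  };

  // The options class inherits from every option wrapper in the pack.
  template <typename... O>
  struct options: O...
  {
    // Offer the name/value pair to each base in turn.
    bool set (const std::string& n, const std::string& v)
    {
      bool found = false;
      int unused[] = {
        0, (found |= static_cast<O&> (*this).try_set (n, v), 0)...};
      (void) unused;
      return found;
    }
  };

  // Name is given explicitly; T is deduced from the one matching base.
  template <const char* Name, typename T>
  T& get (option<Name, T>& o) { return o.value; }
}

extern const char help[] = "help";
extern const char version[] = "version";
extern const char compression[] = "compression";

typedef cli::options<cli::option<help, bool>,
                     cli::option<version, bool>,
                     cli::option<compression, unsigned short>> options;

// Static typing: the accessor type follows from the declaration.
static_assert (
  std::is_same<decltype (cli::get<compression> (std::declval<options&> ())),
               unsigned short&>::value,
  "compression is an unsigned short");
```

An option is then read or assigned with cli::get<compression> (o), and a misspelled name in that call fails to compile. The number of options is no longer limited by a fixed template parameter list, but each option name still appears three times: in the array name, in its initializer, and in the typedef.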
In the next post we will explore possible DSL-based design alternatives. Once this is done we will have to weigh the pros and cons of using native vs DSL-based designs and decide which way to go. If you have any thoughts or maybe another promising native design that I have missed, feel free to add them as comments.