type traits and element specific functions: design problem

I am writing a framework for writing an installer. The system should
port across Solaris (SPARC/x86), Linux (x86) and AIX (RS6k).

I am faced with a basic problem. Currently we only plan to support a
UNIX terminal based character-based UI. The whole set of dialog strings
or prompt strings which I am supposed to throw at the user is quite
large. As of now, each of these has been given a numeric ID and mapped
from a message catalog using catopen / catgets calls.

Now each of these dialogs is a request to the user for an input. There
are well-defined rules as to what is acceptable and what is not
acceptable as part of the input in response to each Dialog.

In other words, each Dialog Id has associated with it a validation rule
for the input that is read in response to that Dialog.

Now it is not difficult to create a set of Input Validator classes,
each doing one kind of validation. Some do validation on date time
strings, others on File names and directories, and still others on
numeric values. We have been able to keep the number of such generic
validators to 8 at present.

e.g.
class DateInputValidator : public AbstractInputValidator
{
public:
	bool validate ( std::string& ) const
	{
		...
	}

	...
};

Now let us suppose that each of my (say) 250 Dialogs can be validated
with one of these 8 validators.

Now my Dialogs are not first class C++ objects. They are merely numeric
Ids. On the other hand, the knowledge of which validator to apply to
the input for a dialog should be with the Dialog. A validator need not
know which dialogs it should validate. But since the Dialogs are not
first class C++ objects, we cannot give them the intelligence.

So I am wondering how to approach this problem.
I tried the following approach:

Create a dialog_traits class template. For each dialog Id, specialize
it and define a typedef in the specialization which refers to the type
of the validator appropriate for this dialog. Something like:

template <int DLG_ID> struct dialog_traits;

....

template <>
struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
{
	typedef FileNameInputValidator validator_t;
};

 - This still creates a pretty large number of template classes - one
for each validator and the executable size can increase significantly
because of this.
 - This seems unintrusive and elegant.

Is there a way to keep the code bloat in check, though?



Cheers,
Andy


-- 
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]

Reply garth_rockett (50) 10/16/2006 4:26:10 PM

Andy wrote:
>
> template <int DLG_ID> struct dialog_traits;
> 
> ...
> 
> template <>
> struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
> {
> typedef FileNameInputValidator validator_t;
> };
> 
>  - This still creates a pretty large number of template classes -
>  one
> for each validator and the executable size can increase
> significantly because of this.

Why? There isn't any code generated, just type definitions. Or is that
because of RTTI? Is this really a problem?

>  - This seems unintrusive and elegant.
> 
> Is there a way to check the code-bloat but.

Can you sort the IDs by validator type? If so, you can use a recursive
definition to cut down on code bloat, at a cost of (potentially a lot
of) compilation time:

template <int DLG_ID> struct dialog_traits {
    typedef typename dialog_traits<DLG_ID - 1>::validator_t validator_t;
};

template <>
struct dialog_traits<0> {
    typedef FileNameInputValidator validator_t;
};

template <>
struct dialog_traits<X> {
    typedef DateTimeInputValidator validator_t;
};

Where X is the id of each first dialog using a certain validator type.

Lourens


Reply Lourens 10/16/2006 8:52:05 PM


Andy wrote:
> I am writing a framework for writing an installer. The system should
> port across Solaris (SPARC/x86), Linux (x86) and AIX (RS6k).
>
> I am faced with a basic problem. Currently we only plan to support a
> UNIX terminal based character-based UI. The whole set of dialog strings
> or prompt strings which I am supposed to throw at the user is quite
> large. As of now, each of these has been given a numeric ID and mapped
> from a message catalog using catopen / catgets calls.
>
> Now each of these dialogs is a request to the user for an input. There
> are well-defined rules as to what is acceptable and what is not
> acceptable as part of the input in response to each Dialog.
>
> In other words, each Dialog Id has associated with it a validation rule
> for the input that is read in response to that Dialog.
>
> Now it is not difficult to create a set of Input Validator classes,
> each doing one kind of validation. Some do validation on date time
> strings, others on File names and directories, and still others on
> numeric values. We have been able to keep the number of such generic
> validators to 8 at present.
>
> Now let us suppose that each of my (say) 250 Dialogs can be validated
> with one of these 8 validators.
>
> So I am wondering how to approach this problem.
> I tried the following approach:
>
> Create a dialog_traits class template. For each dialog Id, specialize
> it and define a typedef in the specialization which refers to the type
> of the validator appropriate for this dialog. Something like:
>
> template <int DLG_ID> struct dialog_traits;
>
> template <>
> struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
> {
> 	typedef FileNameInputValidator validator_t;
> };
>
>  - This still creates a pretty large number of template classes - one
> for each validator and the executable size can increase significantly
> because of this.

What "code" could a typedef declaration be adding to a program? Given
that the program allocates no dialog_traits class template
objects, and thereby instantiates no dialog_traits class template
methods (just a typedef declaration), there is no code "bloat" to be
concerned about - because there is no code at all. The class template
dialog_traits generates no code - and allocates no storage - so it
would be difficult to write a class template whose resource
requirements were any lighter than those of dialog_traits.

Granted, all of these dialog_traits template instantiations may
significantly increase the size of a debug build of the program. But
the entire amount of any increase in binary size is simply added
strings. The compiler adds the names of the dialog_traits class
template instantiations to the executable for benefit of the debugger.
But none of this additional data contributes anything during the
program's execution - and all of this added debugging overhead should
be removed in the program's final, release build anyway. So when it
comes time to release the software, the dialog_traits class template
would have ballooned the final build - increasing both the program's
code size and its storage requirements - by precisely zero added bytes
of useless overhead.

Greg


Reply Greg 10/17/2006 4:57:10 AM

Greg Herlihy wrote:
>
> Granted, all of these dialog_traits template instantiations may
> significantly increase the size of a debug build of the program. But
> the entire amount of any increase in binary size is simply added
> strings. The compiler adds the names of the dialog_traits class
> template instantiations to the executable for benefit of the debugger.
> But none of this additional data contributes anything during the
> program's execution - and all of this added debugging overhead should
> be removed in the program's final, release build anyway. So when it
> comes time to release the software, the dialog_traits class template
> would have ballooned the final build - increasing both the program's
> code size and its storage requirements - by precisely zero added bytes
> of useless overhead.
>
> Greg
>
>

I am posting below some dummy code which defines a traits template
class and adds a single typedef. The typedef refers to some defined
class in the program - I have defined a few dummy classes for that
purpose. Sorry for using the macro to generate code for the dummy
classes.

I compile this code using:

g++ trait_exesize.cpp -o trait_exesize_test [-O2]

followed by

strip trait_exesize_test.

The findings are reported following the code snippet.

// --- code start --- file: trait_exesize.cpp --- //

#include <iostream>

#define dummy_class(class_name) struct class_name {\
        class_name(){\
                std::cout << #class_name " constructed" << std::endl;\
        }\
\
        ~class_name(){\
                std::cout << #class_name " destructed" << std::endl;\
        }\
}

dummy_class(A);
dummy_class(B);
dummy_class(C);
dummy_class(D);
dummy_class(E);
dummy_class(F);


enum {
        Dialog_1 = 1,
        Dialog_2,
        Dialog_3,
        Dialog_4,
        Dialog_5,
        Dialog_6,
        Dialog_7,
        Dialog_8,
        Dialog_9,
        Dialog_10,
        Dialog_11,
        Dialog_12
};



template <int DLG_ID> struct dialog_traits {
        typedef int val_t;
};

template <>
struct dialog_traits<1>
{
        typedef A val_t;
};

template <>
struct dialog_traits<2>
{
        typedef B val_t;
};

template <>
struct dialog_traits<3>
{
        typedef C val_t;
};

template <>
struct dialog_traits<4>
{
        typedef D val_t;
};

template <>
struct dialog_traits<5>
{
        typedef E val_t;
};

template <>
struct dialog_traits<6>
{
        typedef F val_t;
};

template <>
struct dialog_traits<7>
{
        typedef A val_t;
};

template <>
struct dialog_traits<8>
{
        typedef B val_t;
};

template <>
struct dialog_traits<9>
{
        typedef C val_t;
};

template <>
struct dialog_traits<10>
{
        typedef D val_t;
};

template <>
struct dialog_traits<11>
{
        typedef E val_t;
};

template <>
struct dialog_traits<12>
{
        typedef F val_t;
};


int main()
{
        /**/dialog_traits<1>::val_t a;
        dialog_traits<2>::val_t b;
        dialog_traits<3>::val_t c;
        dialog_traits<4>::val_t d;
        dialog_traits<5>::val_t e;
        dialog_traits<6>::val_t f;
        dialog_traits<7>::val_t g;
        dialog_traits<8>::val_t h;
        dialog_traits<9>::val_t i;
        dialog_traits<10>::val_t j;
        dialog_traits<11>::val_t k;
        dialog_traits<12>::val_t l;/**/

        return 0;
}


// --- code end --- file: trait_exesize.cpp --- //


Instantiations   Size (dbg)   Stripped Size (dbg)   Size (O2)   Stripped Size (O2)
-----------------------------------------------------------------------------------
12               18161        7212                  16138       5532
11               18129        7180                  16010       5404
10               18097        7148                  15882       5276
 9               18065        7116                  15738       5132
 8               18033        7084                  15610       5004
 7               18001        7052                  15466       4860
 6               17969        7020                  15338       4732
 5               17790        6884                  15182       4576
 4               16283        5420                  15010       4404
 3               15872        5052                  14858       4252
 2               15457        4680                  14686       4080
 1               14904        4200                  14408       3832
 0               13650        3276                  13650       3276

What I see here is that based on the number of instances I create of
the traits, the size increases. And even after an optimized build
followed by a strip on the exe, there is a significant increase in the
executable size.


Lourens -

Your idea is simple and saves me quite a bit of headache.


Cheers,
Andy


Reply Andy 10/17/2006 1:39:27 PM

Andy wrote:

> I am faced with a basic problem. Currently we only plan to support a
> UNIX terminal based character-based UI. The whole set of dialog strings
> or prompt strings which I am supposed to throw at the user is quite
> large. As of now, each of these has been given a numeric ID and mapped
> from a message catalog using catopen / catgets calls.

> Now each of these dialogs is a request to the user for an input. There
> are well-defined rules as to what is acceptable and what is not
> acceptable as part of the input in response to each Dialog.

> In other words, each Dialog Id has associated with it a validation rule
> for the input that is read in response to that Dialog.

> Now it is not difficult to create a set of Input Validator classes,
> each doing one kind of validation. Some do validation on date time
> strings, others on File names and directories, and still others on
> numeric values. We have been able to keep the number of such generic
> validators to 8 at present.

> e.g.
> class DateInputValidator : AbstractInputValidator
> {
> public:
>       bool validate ( std::string& ) const
>       {
>               ...
>       }
>
>       ...
> };

> Now let us suppose that each of my (say) 250 Dialogs can be validated
> with one of these 8 validators.

> Now my Dialogs are not first class C++ objects. They are merely numeric
> Ids. On the other hand, the knowledge of which validator to apply to
> the input for a dialog should be with the Dialog. A validator need not
> know which dialogs it should validate. But since the Dialogs are not
> first class C++ objects, we cannot give them the intelligence.

In that case, you need some sort of map.

One simple solution, if the dialog id's don't have to be
contiguous (and they don't have to be with Solaris' message
catalogs), is to allocate them in blocks, according to the type
of input validation needed.  Thus, for example, dialog id's from
1-1000 require dates, from 1001-2000 filenames, etc.  It's a
hack, but it's a time proven one, and you recover the necessary
validator by simply using dialogId/1000 as an index into an
array of validators.

More elegant and cleaner would be to use a std::map< int,
AbstractInputValidator* >, initialized with all of the mappings.
Presumably, each distinct validator is a static variable.  In
that case, you could make the map a singleton, with the
constructor of the validator responsible for registering itself
for all of the message types it can handle.  Alternatively, in
this case, I think there's a good argument for keeping the
information separately, in a text file with lines like:

     messageid validatortype

A simple AWK script could then generate both the enum for the
message id's and the initialization code for the map.  (If the
AWK script takes care of assigning the numeric values for the
symbolic message id's, you could actually use a simple
AbstractInputValidator*[] as the map.  This is probably the way
I'd go.)

> So I am wondering how to approach this problem.
> I tried the following approach:

> Create a dialog_traits class template. For each dialog Id,
> specialize it and define a typedef in the specialization which
> refers to the type of the validator appropriate for this
> dialog. Something like:

> template <int DLG_ID> struct dialog_traits;
>
> ...

> template <>
> struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
> {
>       typedef FileNameInputValidator validator_t;
> };

>  - This still creates a pretty large number of template classes - one
> for each validator and the executable size can increase significantly
> because of this.

The template itself won't cause code bloat, since it has no
code.

The problem is that it requires static resolution.  I'm not sure
that this is what you want: your function will be passed a
dialog id as a parameter, I would imagine.  Otherwise, you will
get code bloat, since you will need a different function for
each dialog.

Of course, if you do decide on something like this, you could
use the AWK script to generate the specializations.

>  - This seems unintrusive and elegant.

Not to me.  To me it seems more a case of using templates at any
price, even when they aren't appropriate and don't bring any
advantage.  In this case, dynamic dispatch seems significantly
more appropriate (and of course avoids any code bloat).

Unless you adopt my first suggestion, you're going to have to
maintain the mapping message id to input validation by hand.
Putting it in a single file, with two fields per line, is
probably the simplest solution with regards to maintenance; you
can even leave it to a non programmer.  Burying the mapping in
the syntax of explicit template specialization seems like one of
the most complex and painful.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Reply kanze 10/17/2006 3:33:41 PM

Greg Herlihy wrote:
> Andy wrote:

     [...]
> > I tried the following approach:

> > Create a dialog_traits class template. For each dialog Id, specialize
> > it and define a typedef in the specialization which refers to the type
> > of the validator appropriate for this dialog. Something like:

> > template <int DLG_ID> struct dialog_traits;

> > template <>
> > struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
> > {
> >     typedef FileNameInputValidator validator_t;
> > };

> >  - This still creates a pretty large number of template classes - one
> > for each validator and the executable size can increase significantly
> > because of this.

> What "code" could a typedef declaration be adding to a program?

The fact that the functions which use it will have to be
templates as well.  Currently, he has a function along the lines
of:
     void dialog( int dialogId ) ;
which is called:
     dialog( someId ) ;
With this proposed solution, the first function will become a
template, and the call will become:
     dialog< someId >() ;

Personally, I doubt that the code bloat would be that important,
but there will be some.  A more telling argument, IMHO, is that
it requires a constant dialog id at the call site.  The day he
wants to use some sort of script to sequence the dialogs (a
frequent solution when things become long or complicated), he'll
have to redesign it anyway, since he'll need to pass a variable.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Reply kanze 10/17/2006 3:34:45 PM

Andy wrote:
> I am posting below some dummy code which defines a traits template
> class and adds a single typedef. The typedef refers to some defined
> class in the program - I have defined a few dummy classes for that
> purpose. Sorry for using the macro to generate code for the dummy
> classes.
>
> I compile this code using:
> // --- code start --- file: trait_exesize.cpp --- //
>
> #include <iostream>
>
> #define dummy_class(class_name) struct class_name {\
>         class_name(){\
>                 std::cout << #class_name " constructed" << std::endl;\
>         }\
> \
>         ~class_name(){\
>                 std::cout << #class_name " destructed" << std::endl;\
>         }\
> }
>
> dummy_class(A);
> dummy_class(B);
> dummy_class(C);
> dummy_class(D);
> dummy_class(E);
> dummy_class(F);
>
>
> template <int DLG_ID> struct dialog_traits {
>         typedef int val_t;
> };
>
> template <>
> struct dialog_traits<1>
> {
>         typedef A val_t;
> };
>
>...
>
> int main()
> {
>         /**/dialog_traits<1>::val_t a;
>         dialog_traits<2>::val_t b;
>         dialog_traits<3>::val_t c;
>         dialog_traits<4>::val_t d;
>         dialog_traits<5>::val_t e;
>         dialog_traits<6>::val_t f;
>         dialog_traits<7>::val_t g;
>         dialog_traits<8>::val_t h;
>         dialog_traits<9>::val_t i;
>         dialog_traits<10>::val_t j;
>         dialog_traits<11>::val_t k;
>         dialog_traits<12>::val_t l;/**/
>
>         return 0;
> }
>

It's the object allocations of variables a-l that require both storage
and code. So eliminating dialog_traits does not make the program any
smaller. In other words, a program with this main() routine:

    int main()
    {
        A a;
        B b;
        C c;
        D d;
        E e;
        F f;
        A g;
        B h;
        C i;
        D j;
        E k;
        F l;
    }

produces an identical program. So the dialog_traits class template
(being effectively a typedef) adds no overhead to this sample program.

The more pertinent issue is why allocate two objects of each class A-F?
These classes really should be implemented as singletons (or as some kind
of namespace-scope function) if redundant overhead is to be eliminated.

Greg


Reply Greg 10/17/2006 6:36:13 PM

Andy wrote:
>
> I am posting below some dummy code which defines a traits template
> class and adds a single typedef. The typedef refers to some defined
> class in the program - I have defined a few dummy classes for that
> purpose. Sorry for using the macro to generate code for the dummy
> classes.

> [ example code and code size statistics snipped ]

> What I see here is that based on the number of instances I create of
> the traits, the size increases. And even after an optimized build
> followed by a strip on the exe, there is a significant increase in the
> executable size.

It seems that as you add more variables, you add more code to run --
the constructors and the destructors of the dummy classes. No wonder
the code size has to increase, since it actually has to do more work.

If you avoid the constructor/destructor calls by, e.g. using pointer
variables instead, then there will be no difference in the code size.

--
Seungbeom Kim

Reply Seungbeom 10/17/2006 7:31:18 PM

kanze wrote:
> Andy wrote:
>
> > I am faced with a basic problem. Currently we only plan to support a
> > UNIX terminal based character-based UI. The whole set of dialog strings
> > or prompt strings which I am supposed to throw at the user is quite
> > large. As of now, each of these has been given a numeric ID and mapped
> > from a message catalog using catopen / catgets calls.
>
> > Now each of these dialogs is a request to the user for an input. There
> > are well-defined rules as to what is acceptable and what is not
> > acceptable as part of the input in response to each Dialog.
>
> > In other words, each Dialog Id has associated with it a validation rule
> > for the input that is read in response to that Dialog.
>
> > Now it is not difficult to create a set of Input Validator classes,
> > each doing one kind of validation. Some do validation on date time
> > strings, others on File names and directories, and still others on
> > numeric values. We have been able to keep the number of such generic
> > validators to 8 at present.
>
> > e.g.
> > class DateInputValidator : AbstractInputValidator
> > {
> > public:
> >       bool validate ( std::string& ) const
> >       {
> >               ...
> >       }
> >
> >       ...
> > };
>
> > Now let us suppose that each of my (say) 250 Dialogs can be validated
> > with one of these 8 validators.
>
> > Now my Dialogs are not first class C++ objects. They are merely numeric
> > Ids. On the other hand, the knowledge of which validator to apply to
> > the input for a dialog should be with the Dialog. A validator need not
> > know which dialogs it should validate. But since the Dialogs are not
> > first class C++ objects, we cannot give them the intelligence.
>
> In that case, you need some sort of map.
>
> One simple solution, if the dialog id's don't have to be
> contiguous (and they don't have to be with Solaris' message
> catalogs), is to allocate them in blocks, according to the type
> of input validation needed.  Thus, for example, dialog id's from
> 1-1000 require dates, from 1001-2000 filenames, etc.  It's a
> hack, but it's a time proven one, and you recover the necessary
> validator by simply using dialogId/1000 as an index into an
> array of validators.

I am yet to find out. But I will be using GNU gcc and gencat on all the
platforms and I guess the message Ids don't need to be contiguous. So
that solution actually should work great.

>
> More elegant and cleaner would be to use a std::map< int,
> AbstractInputValidator* >, initialized with all of the mappings.
> Presumably, each distinct validator is a static variable.  In
> that case, you could make the map a singleton, with the
> constructor of the validator responsible for registering itself
> for all of the message types it can handle.  Alternatively, in
> this case, I think there's a good argument for keeping the
> information separately, in a text file with lines like:
>
>      messageid validatortype
>
> A simple AWK script could then generate both the enum for the
> message id's and the initialization code for the map.  (If the
> AWK script takes care of assigning the numeric values for the
> symbolic message id's, you could actually use a simple
> AbstractInputValidator*[] as the map.  This is probably the way
> I'd go.)

I am impressed with the extent of insight you show. Indeed, we have a
singleton class which maintains a map of Dialog Ids to Validator
objects. And as you guessed, we have a Validators.hpp file in which we
make static declarations of the appropriate validator objects. These
register themselves with the ValidatorRegistry in their constructor.

>
> > So I am wondering how to approach this problem.
> > I tried the following approach:
>
> > Create a dialog_traits class template. For each dialog Id,
> > specialize it and define a typedef in the specialization which
> > refers to the type of the validator appropriate for this
> > dialog. Something like:
>
> > template <int DLG_ID> struct dialog_traits;
> >
> > ...
>
> > template <>
> > struct dialog_traits<Dialog_Which_File> // say  Dialog_Which_File == 5
> > {
> >       typedef FileNameInputValidator validator_t;
> > };
>
> >  - This still creates a pretty large number of template classes - one
> > for each validator and the executable size can increase significantly
> > because of this.
>
> The template itself won't cause code bloat, since it has no
> code.
>
> The problem is that it requires static resolution.  I'm not sure
> that this is what you want: your function will be passed a
> dialog id as a parameter, I would imagine.  Otherwise, you will
> get code bloat, since you will need a different function for
> each dialog.
>
> Of course, if you do decide on something like this, you could
> use the AWK script to generate the specializations.

If I indeed want static resolution, could I use some type-list aided
mechanism instead of resorting to awk? I might not go this way at all -
but I ask out of curiosity.


Reply arindam 10/18/2006 1:51:31 PM
