C++ regular expressions vs. Perl regular expressions

Hi all,

In Perl, regular expressions are very useful. They let programmers do
almost everything they want, from processing a simple text file to
parsing user input in a large program.

However, I have never come across a C++ regular expression library that
can do what Perl's can. Does one exist?

Thanks.


      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]

betterdie (78)
3/19/2006 9:42:46 PM

"phal" <betterdie@gmail.com> writes:

> Hi all,
>
> In Perl the RegExpr is very good use, it allows programmer to do
> everything they want to do, start from simple text file to a large
> programm parsing user input.
>
> However, I never come across C++ regular expression that able to do
> what perlist can do????

http://www.boost.org/libs/regex

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

David
3/20/2006 10:47:28 AM
phal wrote:
> Hi all,
>
> In Perl the RegExpr is very good use, it allows programmer to do
> everything they want to do, start from simple text file to a large
> programm parsing user input.
>
> However, I never come across C++ regular expression that able to do
> what perlist can do????

C++ has some very powerful libraries for pattern matching and text
manipulation.

1) Boost.Regex
     http://boost.org/libs/regex/doc/index.html

2) Boost.Xpressive
     http://boost-sandbox.sf.net/libs/xpressive/

3) PCRE
     http://pcre.org

All three support Perl regex syntax, with the caveat that none support
the (?{code}) and (?p{code}) constructs for embedding Perl code in a
regex, for obvious reasons.
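
To give a feel for what Perl-style syntax looks like from C++, here is a minimal sketch. It assumes std::regex, the standard-library facility that grew out of Boost.Regex and was standardized in C++11, after this thread was written. Its default ECMAScript grammar is not identical to Perl's, but it accepts everyday Perl-style constructs such as \w, \d, \s and capture groups. The helper names parse_pairs and colonize are made up for the illustration.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collects every "key=value" pair in a line, much like the Perl loop
//   while ($line =~ /(\w+)\s*=\s*(\d+)/g) { push @pairs, [$1, $2]; }
// parse_pairs is a hypothetical helper name used only for this sketch.
std::vector<std::pair<std::string, std::string>> parse_pairs(const std::string& line) {
    static const std::regex pair_re(R"((\w+)\s*=\s*(\d+))");
    std::vector<std::pair<std::string, std::string>> pairs;
    for (std::sregex_iterator it(line.begin(), line.end(), pair_re), end; it != end; ++it) {
        pairs.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return pairs;
}

// Global substitution, comparable to Perl's s/(\w+)\s*=\s*(\d+)/$1:$2/g.
// colonize is likewise a hypothetical name.
std::string colonize(const std::string& line) {
    static const std::regex pair_re(R"((\w+)\s*=\s*(\d+))");
    return std::regex_replace(line, pair_re, "$1:$2");
}
```

Calling parse_pairs("width=640 height = 480") yields the pairs (width, 640) and (height, 480), and colonize on the same string returns "width:640 height:480". The Boost.Regex interface is close to this one, but check its documentation before relying on the details.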

Xpressive allows you to build regex grammars by embedding regexes in
other regexes. (Disclaimer: I wrote Xpressive and just released v1.0.)
This gives you even more power than regexes in Perl 5. Get it here:

http://boost-consulting.com/vault/index.php
    -> Strings - Text Processing
      -> xpressive.zip

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com

Eric
3/20/2006 1:37:03 PM
In article <1142707696.868779.163150@i40g2000cwc.googlegroups.com>,
betterdie@gmail.com says...
> Hi all,
>
> In Perl the RegExpr is very good use, it allows programmer to do
> everything they want to do, start from simple text file to a large
> programm parsing user input.
>
> However, I never come across C++ regular expression that able to do
> what perlist can do????

First of all, if you're writing something that really
centers primarily around REs, it may make sense to stick
with PERL.

Second, unless memory fails me, PERL is written in C++
(or perhaps C) so its RE code should be usable from other
C++ code as well.

Without more detail about the specifics of what you want
to do, it's a bit hard to get more specific than that.
There are certainly quite a few RE libraries written for
use with C and C++, so without knowing what shortcomings
you see in them, it's hard to say what would be better.
Depending on what you want, you might prefer to look into
using something like Flex that generates C++ (or C) code
instead of using a library.

-- 
     Later,
     Jerry.

The universe is a figment of its own imagination.

Jerry
3/20/2006 1:38:13 PM

On Sun, 19 Mar 2006, phal wrote:

> Hi all,
>
> In Perl the RegExpr is very good use, it allows programmer to do
> everything they want to do, start from simple text file to a large
> programm parsing user input.
>
> However, I never come across C++ regular expression that able to do
> what perlist can do????

There are several libraries in C++ that can handle regular expressions.
Obviously, due to the static and non-script nature of C++, they can never
do exactly what Perl can do, but they can do a lot.

Boost.Regex: http://www.boost.org/libs/regex/doc/index.html
Boost.Xpressive:
http://boost-consulting.com/vault/index.php?directory=Strings%20-%20Text%20Processing
PCRE: http://www.pcre.org/
GRETA: http://research.microsoft.com/projects/greta/

Sebastian Redl

Sebastian
3/20/2006 1:48:20 PM
Jerry Coffin wrote:

> First of all, if you're writing something that really
> centers primarily around REs, it may make sense to stick
> with PERL.

That's easy to say, but is it true? IME, Perl and the other badly typed
languages are easy for short projects, but get progressively harder as
the project grows and more complex structures are created. OTOH, C++ and
the statically typed languages check complex hierarchies and structures
before they ever run. This is a positively huge advantage. In any case,
I wrote a simple connective that presented the standard unix C regex fns
in the form of classes. It wasn't that hard, but I could then say stuff
like "if (patt.is_in(astring)) {..." and "... patt.match(2)..." etc.
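
The wrapper itself is not shown in the thread, so the following is only a guess at its general shape: a minimal sketch of a class over the POSIX <regex.h> functions (regcomp, regexec, regfree). Only the member names is_in and match come from the post; the class name Pattern, the use of extended syntax (REG_EXTENDED) and the error handling are assumptions made for the illustration.

```cpp
#include <regex.h>   // POSIX regcomp / regexec / regfree
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reconstruction: only the names is_in() and match() come from
// the post; everything else is assumed for illustration.
class Pattern {
public:
    explicit Pattern(const std::string& expr) {
        if (regcomp(&re_, expr.c_str(), REG_EXTENDED) != 0)
            throw std::invalid_argument("bad regular expression: " + expr);
        groups_.resize(re_.re_nsub + 1);   // whole match plus each subexpression
    }
    ~Pattern() { regfree(&re_); }
    Pattern(const Pattern&) = delete;             // regex_t must not be copied
    Pattern& operator=(const Pattern&) = delete;

    // True if the pattern occurs anywhere in text; remembers the groups.
    bool is_in(const std::string& text) {
        subject_ = text;
        matched_ = regexec(&re_, subject_.c_str(), groups_.size(),
                           groups_.data(), 0) == 0;
        return matched_;
    }

    // Text of group n from the last successful is_in(); 0 is the whole match.
    // Returns an empty string if there was no match or the group is unset.
    std::string match(std::size_t n) const {
        if (!matched_ || n >= groups_.size() || groups_[n].rm_so < 0) return "";
        const std::size_t start = static_cast<std::size_t>(groups_[n].rm_so);
        const std::size_t stop = static_cast<std::size_t>(groups_[n].rm_eo);
        return subject_.substr(start, stop - start);
    }

private:
    regex_t re_;
    std::vector<regmatch_t> groups_;
    std::string subject_;
    bool matched_ = false;
};
```

With this sketch, Pattern patt("([a-z]+)@([a-z.]+)") followed by patt.is_in("mail ron@usq.edu.au now") returns true, after which patt.match(2) gives "usq.edu.au". It needs a POSIX system, since <regex.h> is not part of standard C++.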

It seems to me, that whenever a language like Perl seems faster, it is
better to write whatever is missing from C++ and then use C++.

> Second, unless memory fails me, PERL is written in C++
> (or perhaps C) so its RE code should be usable from other
> C++ code as well.

If you can extract it and design a good class interface, yes.

-- 
Ron House       house@usq.edu.au
                  http://www.sci.usq.edu.au/staff/house

Ron
3/21/2006 11:15:42 AM
In article <441FAB0B.5050101@usq.edu.au>, 
house@usq.edu.au says...
> Jerry Coffin wrote:
> 
> > First of all, if you're writing something that really
> > centers primarily around REs, it may make sense to stick
> > with PERL.
> 
> That's easy to say, but is it true? IME, Perl and the other badly typed
> languages are easy for short projects, but get progressively harder as
> the project grows and more complex structures are created.

Oh, I quite agree -- but in at least a few cases, there 
are things that really are small and are going to stay 
small.

> OTOH, C++ and
> the statically typed languages check complex hierarchies and structures
> before they ever run. This is a positively huge advantage.

Oh, no question about it. At the same time, there really 
are situations where you can write 10 lines of AWK or 
PERL or whatever on that order that would take a LOT more 
in C++. In some cases, the savings is sufficient that 
even if you think there's a chance it'll grow bigger in 
the future, you're better off doing some quick 
exploration. As Brooks pointed out, you might as well 
plan to throw one away -- you will anyway. The hard part 
of that is realizing when you really do need to throw it 
away and start over...

> It seems to me, that whenever a language like Perl seems faster, it is
> better to write whatever is missing from C++ and then use C++.

I can't really agree. If you can provide a real 
improvement, that's great, but some things have already 
been done well enough that it's perfectly fine to just 
use them and be done with it.
 
> > Second, unless memory fails me, PERL is written in C++
> > (or perhaps C) so its RE code should be usable from other
> > C++ code as well.
> 
> If you can extract it and design a good class interface, yes.

I'm not entirely convinced that a good class interface is 
necessarily a requirement either. Some things work well 
as classes -- but some work perfectly well as free 
functions and ought to be used that way. In writing COM 
interfaces, Microsoft (for one example) is almost 
constantly guilty of taking what's really a single 
function and turning it into a complete interface that 
ultimately takes 5 times as much to use as if it was just 
a function you could call and be done with it.

In the case of COM, that's sort of unavoidable, since it 
simply has no way to represent a free function. C++ isn't 
so limited, so we can often do better.
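
As a rough illustration of the free-function style for pattern matching, here is a minimal sketch. It assumes std::regex from C++11 (which postdates this thread), and contains_match is a made-up name, not something from any library discussed here.

```cpp
#include <regex>
#include <string>

// Free-function style: one call, no object for the caller to manage.
// contains_match is a hypothetical name chosen for this sketch.
bool contains_match(const std::string& text, const std::string& pattern) {
    // The pattern is compiled on every call, which keeps the interface
    // trivial at the price of repeated compilation.
    return std::regex_search(text, std::regex(pattern));
}
```

A call such as contains_match(line, "c\\+\\+") is all the caller writes. The trade-off is that nothing is kept between calls, so code that applies one pattern many times may still prefer to hold a compiled pattern in an object.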

-- 
    Later,
    Jerry.

The universe is a figment of its own imagination.

Jerry
3/22/2006 1:59:40 AM
Jerry Coffin wrote:
> In article <441FAB0B.5050101@usq.edu.au>, 
> house@usq.edu.au says...
> 

>>OTOH, C++ and
>>the statically typed languages check complex hierarchies and structures
>>before they ever run. This is a positively huge advantage.

> Oh, no question about it. At the same time, there really 
> are situations where you can write 10 lines of AWK or 
> PERL or whatever on that order that would take a LOT more 
> in C++. In some cases, the savings is sufficient that 
> even if you think there's a chance it'll grow bigger in 
> the future, you're better off doing some quick 
> exploration. As Brooks pointed out, you might as well 
> plan to throw one away -- you will anyway. The hard part 
> of that is realizing when you really do need to throw it 
> away and start over...

The problem is, you get sucked in. You think - just half dozen more 
lines to try out so and so new idea, and before you know it, the whole 
thing is a big job to rewrite and too big for reliability in the script 
language. As for Brooks, he himself has stated that this particular 
aphorism was a mistake. I also once heard Bertrand Meyer discussing a 
big project he once supervised. Everything was going swimmingly, and he 
was away for some months. He came back to find the project in the 
doldrums and way behind schedule. Apparently the management out of the 
blue told the team that their work would be thrown away - you always 
throw one away anyway - and it so wrecked morale as to derail the whole 
project.

>>It seems to me, that whenever a language like Perl seems faster, it is
>>better to write whatever is missing from C++ and then use C++.
> 
> 
> I can't really agree. If you can provide a real 
> improvement, that's great, but some things have already 
> been done well enough that it's perfectly fine to just 
> use them and be done with it.

I think the attitude to adopt is that with all the facilities of C++, 
you should be able to create a usable and powerful interface for almost 
anything. So make it once and be done with it.

>>>Second, unless memory fails me, PERL is written in C++
>>>(or perhaps C) so its RE code should be usable from other
>>>C++ code as well.
>>
>>If you can extract it and design a good class interface, yes.
> 
> 
> I'm not entirely convinced that a good class interface is 
> necessarily a requirement either. Some things work well 
> as classes -- but some work perfectly well as free 
> functions and ought to be used that way. In writing COM 
> interfaces, Microsoft (for one example) is almost 
> constantly guilty of taking what's really a single 
> function and turning it into a complete interface that 
> ultimately takes 5 times as much to use as if it was just 
> a function you could call and be done with it.

That's true. But a pattern matcher goes nicely as an object.

-- 
Ron House       house@usq.edu.au
                 http://www.sci.usq.edu.au/staff/house

Ron
3/23/2006 11:29:39 AM
Hi,

From my point of view, C++ is better than Perl in performance, but Perl
needs less development time. I believe that for web interfaces Perl is
a far better choice than C++.

How do you judge Perl/CGI vs. C++/CGI? I believe you will find Perl/CGI
far better than C++/CGI. I hardly ever encounter a web site using
C++/CGI; most of them use Perl/CGI.

If you compare mod_perl on the Apache web server with C/C++ CGI, the
performance is not very different; sometimes mod_perl is even faster.

Perl/CGI vs. C++/CGI: Perl/CGI needs less time to develop.
Perl/CGI vs. C++/CGI: C++/CGI is more secure than Perl/CGI, because it
is compiled.
Perl/CGI vs. C++/CGI: Perl code is messier than C++ and not easy to
understand.
Perl/CGI vs. C++/CGI: Perl is used more than C++.


phal
3/24/2006 12:42:12 PM
Ron House wrote:
> Jerry Coffin wrote:
>> In article <441FAB0B.5050101@usq.edu.au>,
>> house@usq.edu.au says...

>>> OTOH, C++ and the statically typed languages check complex
>>> hierarchies and structures before they ever run. This is a
>>> positively huge advantage.

>> Oh, no question about it. At the same time, there really are
>> situations where you can write 10 lines of AWK or PERL or
>> whatever on that order that would take a LOT more in C++. In
>> some cases, the savings is sufficient that even if you think
>> there's a chance it'll grow bigger in the future, you're
>> better off doing some quick exploration. As Brooks pointed
>> out, you might as well plan to throw one away -- you will
>> anyway. The hard part of that is realizing when you really
>> do need to throw it away and start over...

> The problem is, you get sucked in.  You think - just half
> dozen more lines to try out so and so new idea, and before you
> know it, the whole thing is a big job to rewrite and too big
> for reliability in the script language.  As for Brooks, he
> himself has stated that this particular aphorism was a
> mistake.  I also once heard Bertrand Meyer discussing a big
> project he once supervised.  Everything was going swimmingly,
> and he was away for some months.  He came back to find the
> project in the doldrums and way behind schedule.  Apparently
> the management out of the blue told the team that their work
> would be thrown away - you always throw one away anyway - and
> it so wrecked morale as to derail the whole project.

There's always some point where you need to move up to a
language which supports "separate compilation".  For some
definition of separate compilation -- the important point being
that different people can work on different parts of the project
in parallel.

In theory, almost all languages allow this in some way.  We've
even modularized our GNU make scripts.  But languages like C++
certainly do it better than the typical scripting language, and
since the change requires significant rework anyway, it
generally represents the point where you switch languages.

Of course, there are many different scripting languages, as
well, and you're certainly better off choosing something other
than Perl (which seems to be optimized for write-only code).

>>> It seems to me, that whenever a language like Perl seems
>>> faster, it is better to write whatever is missing from C++
>>> and then use C++.

>> I can't really agree.  If you can provide a real
>> improvement, that's great, but some things have already been
>> done well enough that it's perfectly fine to just use them
>> and be done with it.

> I think the attitude to adopt is that with all the facilities
> of C++, you should be able to create a usable and powerful
> interface for almost anything. So make it once and be done
> with it.

The first thing I did when learning C++ was to create all of the
classes necessary for doing what AWK does.  I've still got them,
and I still use them regularly.  But I also use AWK -- for small
things, it's just a lot simpler not to have to go through the
compile and link phase.
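
Those classes are not shown in the thread. Purely as an illustration of the kind of helper such a library might contain, here is a minimal sketch of AWK-style field splitting and a pattern-guarded action. It assumes std::regex from C++11, which postdates this thread, and the names split_fields and sum_column are made up for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Splits a record into whitespace-separated fields, as AWK does by default.
// fields[0] plays the role of AWK's $1. (Hypothetical helper names throughout.)
std::vector<std::string> split_fields(const std::string& record) {
    std::istringstream in(record);
    std::vector<std::string> fields;
    for (std::string f; in >> f;) fields.push_back(f);
    return fields;
}

// Rough equivalent of:  awk '/pattern/ { total += $N } END { print total }'
// column is 1-based, like AWK's $N; lines with too few fields are skipped.
double sum_column(std::istream& input, const std::regex& pattern, std::size_t column) {
    double total = 0.0;
    for (std::string line; std::getline(input, line);) {
        if (!std::regex_search(line, pattern)) continue;
        const std::vector<std::string> fields = split_fields(line);
        if (column >= 1 && column <= fields.size())
            total += std::strtod(fields[column - 1].c_str(), nullptr);  // non-numeric text counts as 0
    }
    return total;
}
```

Given the three lines "disk error 3", "net ok 5" and "cpu error 4.5", sum_column with the pattern "error" and column 3 returns 7.5. The AWK one-liner is still shorter, which is the point made above about small jobs.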

>>>> Second, unless memory fails me, PERL is written in C++ (or
>>>> perhaps C) so its RE code should be usable from other C++
>>>> code as well.

>>> If you can extract it and design a good class interface, yes.

>> I'm not entirely convinced that a good class interface is
>> necessarily a requirement either.  Some things work well as
>> classes -- but some work perfectly well as free functions
>> and ought to be used that way.  In writing COM interfaces,
>> Microsoft (for one example) is almost constantly guilty of
>> taking what's really a single function and turning it into a
>> complete interface that ultimately takes 5 times as much to
>> use as if it was just a function you could call and be done
>> with it.

> That's true.  But a pattern matcher goes nicely as an object.

I could hardly imagine it as anything else in C++.

--
James Kanze                                           GABI Software
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


kanze
3/27/2006 2:30:00 PM
kanze wrote:

> The first thing I did when learning C++ was to create all of the
> classes necessary for doing what AWK does.  I've still got them,
> and I still use them regularly.  But I also use AWK -- for small
> things, it's just a lot simpler not to have to go through the
> compile and link phase.

Yes, that's the sort of thing I was getting at. C++ is very amenable to
being turned into special-purpose languages by adding classes and
operators. It only has three significant problems, imho: no built-in GC,
built-in legacy features, and unfriendliness to interpreter implementation.

-- 
Ron House       house@usq.edu.au
                  http://www.sci.usq.edu.au/staff/house

Ron
3/29/2006 10:48:55 AM