C allows adjacent string literals such as

  "This is "
  "a long "
  "string"

to be seen as one string. This wouldn't fit my syntax so I need another
way to allow long string literals. How do you guys allow this in your
languages? Some options I have in mind follow. Is there a better -
clearer - way? What would you recommend or prefer to use as a
programmer?

  "This is \
  a long \
  string"

  "This is " +
  "a long " +
  "string"

  """This is
a long
string"""
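
For comparison, Python happens to accept all of the forms listed above, including C-style adjacent literals. The following is only an illustrative sketch (the variable names are made up for this example); it shows that the first three forms yield the same one-line string, while the triple-quoted form keeps the line breaks.

```python
# Adjacent literals are joined at compile time, as in C.
adjacent = ("This is "
            "a long "
            "string")

# Backslash-newline inside a literal: both characters disappear.
# The continuation lines start in column 0 because Python would keep
# any leading spaces as part of the string.
continued = "This is \
a long \
string"

# Explicit concatenation with an operator.
concatenated = ("This is " +
                "a long " +
                "string")

# Triple quotes: the line breaks become part of the string.
triple = """This is
a long
string"""

print(adjacent == continued == concatenated)
print(repr(triple))
```

Note that the backslash form in Python does not skip indentation on the continuation line, which is the point the rest of the thread goes on to discuss.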

Posted by James, 6/25/2005 12:46:28 PM

"James Harris" <no.email.please> writes:
> C allows adjacent string literals such as
>
> "This is "
> "a long "
> "string"
>
> to be seen as one string. This wouldn't fit my syntax so I need another
> way to allow long string literals. How do you guys allow this in your
> languages?
SML and Haskell do it thus:
"This is \
\a long \
\string"
Whitespace between a pair of \'s is skipped.
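
The gap rule can be sketched in a few lines of Python. This is a simplified model, not how any SML or Haskell compiler is written: the name `remove_gaps` is made up for the example, and the sketch ignores every other escape sequence, so an escaped backslash directly before a gap would be misread.

```python
import re

# A "gap" is a backslash, one or more whitespace characters (the newline
# and any indentation), and a closing backslash.
GAP = re.compile(r"\\\s+\\")


def remove_gaps(body: str) -> str:
    """Return the body of a literal with every gap deleted."""
    return GAP.sub("", body)


# The three source lines of the example, written as one Python string.
source = 'This is \\\n    \\a long \\\n    \\string'
print(remove_gaps(source))
```

Because the closing backslash marks exactly where the text resumes, a continuation line can begin with a space without any special escape.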
My language Kogut does it thus:
"This is \
a long \
string"
Leading indentation after \-newline is skipped. For rare cases when
you insist on starting a continuation line with a space, \s is space,
but usually it's enough to break after a space.
In Kogut when you write thus:
" This is
a long
string
"
then newlines become part of the string, and this time leading
spaces are not removed if present, so large chunks of text can be
embedded almost directly in the program, and only require escaping of
", \, TAB, etc.
In SML and Haskell literal newlines in strings are invalid; you
usually write \n\ at the end of the previous line and \ at the
beginning of the next one.
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Posted by Marcin, 6/25/2005 1:03:03 PM

"Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl> wrote in message
news:874qbm612w.fsf@qrnik.zagroda...
> "James Harris" <no.email.please> writes:
<snip>
> SML and Haskell do it thus:
> "This is \
> \a long \
> \string"
> Whitespace between a pair of \'s is skipped.
>
> My language Kogut does it thus:
> "This is \
> a long \
> string"
> Leading indentation after \-newline is skipped. For rare cases when
> you insist on starting a continuation line with a space, \s is space,
> but usually it's enough to break after a space.
>
> In Kogut when you write thus:
> " This is
> a long
> string
> "
> then newlines become a part of the string, and this time leading
> spaces are not removed if present, so large chunks of text can be
> embedded almost directly in the program, and only require escaping of
> ", \, TAB etc.).
>
> In SML and Haskell literal newlines in strings are invalid; you
> usually write \n\ at the end of the previous line and \ at the
> beginning of the next one.
Thanks Marcin. I thought you might respond. My favourite so far is how
you allow continuations in Kogut. Unlike Unix, which says that the pair
\<newline> gets stripped, my thought is to say that if \ is the last
nonblank of a line then anything between it and the first nonblank of
the next line gets stripped. So
"Dopple\
ganger"
would be one word where
"Dopple \
ganger"
would be two words. I can't think of a reason for needing to begin the
following line with a space. When do you need that?
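
A minimal Python sketch of the rule proposed above, assuming "blank" means space or tab; the names `CONTINUATION` and `join_continued` are hypothetical and exist only for this illustration.

```python
import re

# A backslash that is the last nonblank on its line, any blanks after it,
# the newline, and the leading blanks of the next line are all removed.
CONTINUATION = re.compile(r"\\[ \t]*\n[ \t]*")


def join_continued(body: str) -> str:
    """Apply the 'strip up to the first nonblank' continuation rule."""
    return CONTINUATION.sub("", body)


one_word = join_continued("Dopple\\\n    ganger")
two_words = join_continued("Dopple \\\n    ganger")
print(one_word)
print(two_words)
```

The space before the backslash survives because only what follows the backslash is stripped, which is why breaking after a space is usually enough.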
The difficulty I can see with using a trailing backslash is in
compilation - specifically the lexical phase. Do these long strings
become one token or one-per-line? If one token then their line number
and character offset would have to apply to the beginning of the token I
guess. Then if the string is not closed properly where does the error
get reported?

Posted by James, 6/25/2005 1:23:02 PM

"James Harris" <no.email.please> writes:
> Unlike Unix which says that the pair \<newline> gets stripped my
> thought is to say that if \ is the last nonblank of a line then
> anything between it and the first nonblank of the next line gets
> stripped.
Yes, I didn't mention that but this is actually how it behaves in
Kogut.
> I can't think of a reason for needing to begin the following line
> with a space. When do you need that?
A drawback of these rules is that all ways to split "foo\n bar"
(assuming that foo and bar are long texts) have a glitch:

  "foo
  bar" // has a fixed indentation on the second line

  "foo\n\
  \sbar" // requires escaping the space

  "foo\n \
  bar" // the split is almost at the newline, but not quite

  "foo\
  \n bar" // same again; a newline should belong to the previous line

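
To check that the escaped splits really denote the same string, here is a small Python decoder for the rules as described in this thread. It is a sketch under assumptions: the escape table is a guess at a reasonable minimum (only `\n`, `\s`, `\t`, `\\` and `\"`), and the function name `decode` is made up; it is not Kogut's actual lexer.

```python
def decode(body: str) -> str:
    """Decode the text between the quotes of a literal (sketch only)."""
    simple = {"n": "\n", "s": " ", "t": "\t", "\\": "\\", '"': '"'}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)  # ordinary character, raw newlines included
            i += 1
            continue
        # Look past blanks to see whether this backslash ends the line.
        j = i + 1
        while j < len(body) and body[j] in " \t":
            j += 1
        if j < len(body) and body[j] == "\n":
            j += 1
            while j < len(body) and body[j] in " \t":
                j += 1  # skip the indentation of the continuation line
            i = j
            continue
        if i + 1 >= len(body) or body[i + 1] not in simple:
            raise ValueError(f"bad escape at offset {i}")
        out.append(simple[body[i + 1]])
        i += 2
    return "".join(out)


# The three escaped splits from the post, with indented continuation lines.
splits = [
    "foo\\n\\\n    \\sbar",
    "foo\\n \\\n    bar",
    "foo\\\n    \\n bar",
]
print(all(decode(s) == "foo\n bar" for s in splits))
```

The unescaped form needs no decoding, but it only matches if the second source line starts with exactly one space, which is the "fixed indentation" glitch.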
> The difficulty I can see with using a trailing backslash is in
> compilation - specifically the lexical phase. Do these long strings
> become one token or one-per-line?
One token.
> If one token then their line number and character offset would have
> to apply to the beginning of the token I guess.
Yes.
> Then if the string is not closed properly where does the error get
> reported?
At the beginning of the string. Even if the fragments were somehow
lexed separately, it would not help with this: the actual error may be
well before the point where the compiler detected it, so showing
a position closer to the point of detection would not necessarily
be better.
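
The "one token, position of the opening quote" approach can be sketched in Python as follows. Everything here is hypothetical (the `LexError` class, the `lex_string` function, the 1-based line and column convention); it only illustrates recording the start position once and reusing it for the error.

```python
class LexError(Exception):
    """Error carrying the position of the token that caused it."""

    def __init__(self, message: str, line: int, column: int):
        super().__init__(f"{line}:{column}: {message}")
        self.line = line
        self.column = column


def lex_string(source: str, start: int):
    """Scan one literal whose opening quote is at source[start].

    Returns (raw_body, index_after_closing_quote). Raw newlines are
    allowed inside the literal, so the only failure is reaching the end
    of the input, reported at the position of the opening quote.
    """
    # 1-based line and column of the opening quote, computed up front.
    line = source.count("\n", 0, start) + 1
    column = start - (source.rfind("\n", 0, start) + 1) + 1
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == '"':
            return source[start + 1:i], i + 1
        # A backslash protects the next character, whatever it is.
        i += 2 if ch == "\\" else 1
    raise LexError("unterminated string literal", line, column)


body, end = lex_string('b = "ok" + "oops', 4)
print(body, end)
```

A call such as `lex_string('b = "ok" + "oops', 11)` raises `LexError` pointing at the second opening quote rather than at the end of the file.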

Posted by Marcin, 6/25/2005 1:49:18 PM

"Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl> wrote in message
news:87zmtepmw1.fsf@qrnik.zagroda...
> "James Harris" <no.email.please> writes:
>
<snip>
>> I can't think of a reason for needing to begin the following line
>> with a space. When do you need that?
>
> A drawback of these rules is that all ways to split "foo\n bar"
> (assuming that foo and bar are long texts) have a glitch:
>
> "foo
> bar" // has a fixed indentation on the second line
>
> "foo\n\
> \sbar" // requires escaping the space
>
> "foo\n \
> bar" // the split is almost at the newline, but not quite
>
> "foo\
> \n bar" // same again; a newline should belong to the previous
> line
That makes a lot of sense. Option 3 is clear but maybe not as 'bonny' as
it could be. I can see why you allow \s.

Posted by James, 6/25/2005 2:13:23 PM

On Sat, 25 Jun 2005 12:46:28 -0000, "James Harris" <no.email.please>
wrote:
>
>C allows adjacent string literals such as
>
> "This is "
> "a long "
> "string"
>
>to be seen as one string. This wouldn't fit my syntax so I need another
>way to allow long string literals. How do you guys allow this in your
>languages? Some options I have in mind follow. Is there a better -
>clearer - way? What would you recommend or prefer to use as a
>programmer?
>
> "This is \
> a long \
> string"
>
> "This is " +
> "a long " +
> "string"
>
> """This is
>a long
>string"""
An alternative is to have a multi-line text construct. Here is an
example of what it might look like:
text s =
{
This is
a long
string
}
There are sundry issues to be dealt with in this scheme, e.g. trailing
white space, leading white space, special characters, etc. The scheme
I am using in San looks much like this:
begin text s
|This is
| a long
| string.
end
The first character of each line in the text body is a formatting
control character. There are two such characters at present, : and |.
The vertical bar says that function evaluation and string substitution
are turned on. The colon says that they are turned off. The string text
within a line begins immediately after the control character and ends
with the last non-white space character. Splicing is done without
intervening characters, e.g., no EOL characters are inserted.
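
A rough Python model of the body rules just described. It is not San's implementation: the function names are made up, substitution itself is not modelled (each fragment only records whether it would be enabled), and two points are assumptions of this sketch, namely that indentation before the control character is ignored and that blank lines are skipped.

```python
def parse_text_body(lines):
    """Turn body lines into (substitution_enabled, text) fragments."""
    fragments = []
    for number, raw in enumerate(lines, start=1):
        line = raw.lstrip()  # assumption: indentation before | or : is ignored
        if not line:
            continue  # assumption: blank lines are skipped
        control, text = line[0], line[1:].rstrip()
        if control not in "|:":
            raise ValueError(f"line {number}: expected '|' or ':'")
        # '|' turns evaluation and substitution on, ':' turns them off.
        fragments.append((control == "|", text))
    return fragments


def splice(fragments):
    """Join the fragments with no intervening characters."""
    return "".join(text for _, text in fragments)


body = ["|This is", "| a long", "| string.   "]
print(splice(parse_text_body(body)))
```

Since the text starts immediately after the control character, a leading space is kept, while trailing white space is dropped.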
The main features of the scheme are (a) it creates an assignment and
(b) everything within the text body is prefix delimited.
Richard Harter, cri@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.

Posted by cri, 6/26/2005 3:26:32 PM

James Harris wrote:
> C allows adjacent string literals such as
>
> "This is "
> "a long "
> "string"
>
> to be seen as one string. This wouldn't fit my syntax so I need another
> way to allow long string literals. How do you guys allow this in your
> languages? Some options I have in mind follow. Is there a better -
> clearer - way? What would you recommend or prefer to use as a
> programmer?
>
> "This is \
> a long \
> string"
>
> "This is " +
> "a long " +
> "string"
>
> """This is
> a long
> string"""
The Curl language offers a number of different ways to do this:
"This is " &
"a long " &
"string"
simply concatenates the pieces into "This is a long string"
{stringify
This is
a long
string
}
Trims whitespace on the left but keeps newlines, giving "This is\na long\nstring".
In Curl this is used primarily when the contents contain a snippet of code.
{message
This is
a long
string
}
compresses all adjacent whitespace into a single space: "This is a long
string". This form is intended for text. I am leaving details out,
but it gives you the idea.
The latter two are macros. If you give your language good macro support
then people can write their own macros to suit their purposes.
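
For readers who do not know Curl, the two macro behaviours can be approximated with ordinary Python functions. These are analogues written for illustration, not Curl code; in particular, reading "trims whitespace on the left" as "removes the common indentation" is an assumption of this sketch.

```python
import textwrap


def stringify(block: str) -> str:
    """Remove the common left indentation but keep the line breaks."""
    return textwrap.dedent(block).strip("\n")


def message(block: str) -> str:
    """Collapse every run of whitespace, newlines included, to one space."""
    return " ".join(block.split())


block = """
    This is
    a long
    string
"""
print(repr(stringify(block)))
print(repr(message(block)))
```

The same source text serves both purposes; only the choice of function (or macro) decides how the whitespace is treated.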
- Christopher

Posted by Christopher, 6/27/2005 1:56:04 PM

"James Harris" <no.email.please> wrote in message
news:42bd5219$0$24082$db0fefd9@news.zen.co.uk...
>
> C allows adjacent string literals such as
>
> "This is "
> "a long "
> "string"
>
my language uses this technique, primarily because in many ways it resembles
c.
my language has a fixed token-length limit, and this technique allows parsing
as multiple tokens and merging in the parser.
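
One way to picture this: the lexer emits each quoted piece as its own token, so no single token exceeds the length limit, and a later pass glues neighbouring string tokens together. A minimal Python sketch, using a made-up `(kind, value)` token shape rather than anything from the poster's implementation:

```python
def merge_adjacent_strings(tokens):
    """Merge runs of consecutive STRING tokens into a single token."""
    merged = []
    for kind, value in tokens:
        if kind == "STRING" and merged and merged[-1][0] == "STRING":
            # Extend the previous string token instead of adding a new one.
            merged[-1] = ("STRING", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    return merged


tokens = [
    ("IDENT", "s"),
    ("OP", "="),
    ("STRING", "This is "),
    ("STRING", "a long "),
    ("STRING", "string"),
    ("OP", ";"),
]
print(merge_adjacent_strings(tokens))
```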
> to be seen as one string. This wouldn't fit my syntax so I need another
> way to allow long string literals. How do you guys allow this in your
> languages? Some options I have in mind follow. Is there a better -
> clearer - way? What would you recommend or prefer to use as a programmer?
>
> "This is \
> a long \
> string"
>
> "This is " +
> "a long " +
> "string"
>
> """This is
> a long
> string"""
I like the first of the three for literals.
one does have to be careful though about how their lexer/parser works, eg,
there is the potential for buffer overflow.
the second option makes sense more if they are not a literal (or at least
don't appear as one), eg, the language has string concatenation.
this approach is also possible in my language via 2 operators: + and &, both
with equivalent behavior in the string+string case, but with differing
behavior in the string+non-string case. string+int is defined for offset
operations, other uses of + are undefined.
string&whatever is defined to always stringify 'whatever'.
"foobar"+3 => "bar"
"foobar"&3 => "foobar3"
as for the third option, it is, imo, ugly.
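
The two operators described above can be mimicked in Python with operator overloading. This is only a model of the stated behaviour: the `Str` wrapper class is hypothetical, and cases the post calls undefined (and negative offsets) are not given any special treatment here.

```python
class Str:
    """Tiny string wrapper modelling the + and & operators (sketch)."""

    def __init__(self, text: str):
        self.text = text

    def __add__(self, other):
        if isinstance(other, Str):
            return Str(self.text + other.text)  # string + string: concatenate
        if isinstance(other, int):
            return Str(self.text[other:])       # string + int: offset
        return NotImplemented                   # other uses of + are undefined

    def __and__(self, other):
        # & always stringifies its right-hand operand before concatenating.
        other_text = other.text if isinstance(other, Str) else str(other)
        return Str(self.text + other_text)


print((Str("foobar") + 3).text)
print((Str("foobar") & 3).text)
```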

Posted by cr88192, 6/29/2005 1:07:20 AM