Hi there,
I am new to Lisp, and I would like to parse strings the way a formatted read does in Fortran:
Line="H 1 1.00 1.00 2 2.00 2.00"
read(Line, '(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))') s, i, x, y, j, a, b
I mean that I would like to use Lisp to extract strings and numbers
(integer and real); however, I can't find any way in Lisp to do
something similar to what Fortran does here.
Any help or suggestions will be greatly appreciated. Thanks in advance.
Regards,
Jinsong
jszhao (77)
9/21/2012 2:22:25 PM
Jinsong Zhao <jszhao@yeah.net> writes:
> Hi there,
>
> I am new to Lisp, and I would like to parse strings the way a formatted read does in Fortran:
>
> Line="H 1 1.00 1.00 2 2.00 2.00"
> read(Line, '(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))') s, i, x, y, j, a, b
>
> I mean that I would like to use Lisp to extract strings and numbers
> (integer and real); however, I can't find any way in Lisp to do
> something similar to what Fortran does here.
I guess one question that arises is, how similar to Fortran’s behaviour
do you need the solution to be? For instance,
program test
character(len=4) :: number = '1234' ! Looks like an integer
read (number, '(F4.2)') r ! Read it as a real
print *, r
end program test
prints out 12.34000, because the F “edit descriptor” (analogous to a
"FORMAT directive" in Lisp) has the behaviour that, when the field
contains no decimal point, its last d digits are taken as the fractional
part (here d = 2, from F4.2). (This is a feature that goes right back to
punched‐card days!)
--
Ian ◎
ian.clifton (17)
9/21/2012 3:45:08 PM
Jinsong Zhao <jszhao@yeah.net> writes:
> I am new to Lisp, and I would like to parse strings the way a formatted read does in Fortran:
>
> Line="H 1 1.00 1.00 2 2.00 2.00"
> read(Line, '(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))') s, i, x, y, j, a, b
>
> I mean that I would like to use Lisp to extract strings and numbers
> (integer and real); however, I can't find any way in Lisp to do
> something similar to what Fortran does here.
>
> Any help or suggestions will be greatly appreciated. Thanks in advance.
In Lisp, variables are not typed; values are typed.
Moreover, each type that can be printed readably has its own
recognizable syntax.
For example, a text like: "H 1 1.00 1.00 2 2.00 2.00"
can be parsed by the Lisp reader as a symbol named "H", the integer
one, two floating-point numbers of value one, the integer two, and two
floating-point numbers of value two. Spaces are skipped automatically.
So there's no need for a format string to tell the Lisp reader how to
parse some text. In a lot of situations, that's good.
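To make the point about typed values concrete, here is a minimal sketch. CLASSIFY-TOKENS is a hypothetical helper name, not something from the standard; it reads a few objects with READ and reports which broad type each one has. It binds *READ-EVAL* to NIL as a precaution, since READ would otherwise honour #. in untrusted input.

```lisp
(defun classify-tokens (text count)
  "Hypothetical helper: read COUNT objects from TEXT and name each one's type."
  (let ((*read-eval* nil))              ; don't let #. run code from the input
    (with-input-from-string (in text)
      (loop repeat count
            collect (let ((object (read in)))
                      (typecase object
                        (symbol  'symbol)
                        (integer 'integer)
                        (float   'float)
                        (t       'other)))))))

;; The reader decides the type from the syntax of each token alone.
(classify-tokens "H 1 1.00" 3)
```

The types come from the text itself, not from any declaration attached to a variable.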
In some situations that's not enough, notably when you have a text
that doesn't respect the Lisp syntax. Then you have to write your
own parser, which is usually rather trivial. If you want to do
something like the Fortran read format, you'll have to parse the Fortran
format string, and generate a parser driven by that data.
So, simple case, just using the lisp reader, assuming you know how many
values you have on the line:
(with-input-from-string (line "H 1 1.00 1.00 2 2.00 2.00")
  (loop repeat 7 collect (read line)))
--> (h 1 1.0 1.0 2 2.0 2.0)
Notice that it's not a character or a string that's read first, but a
symbol. You may or may not want to convert it into a string or
character, or even to read it as a character (using READ-CHAR instead of
READ).
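As a rough sketch of that last option, assuming the line always starts with a single-character tag followed by six numbers: READ-LINE-FIELDS is a made-up name, and *READ-EVAL* is bound to NIL so that #. in the input cannot run code.

```lisp
(defun read-line-fields (line)
  "Hypothetical helper: return the first character of LINE as a string,
followed by the six numbers that come after it."
  (with-input-from-string (in line)
    (let ((*read-eval* nil))            ; precaution for untrusted input
      (cons (string (read-char in))     ; the tag, kept as a one-character string
            (loop repeat 6 collect (read in))))))

(read-line-fields "H 1 1.00 1.00 2 2.00 2.00")
```

With the default *READ-DEFAULT-FLOAT-FORMAT*, the numbers come back as integers and single-floats, and the tag is the string "H" rather than a symbol.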
Now, considering the fortran reader format strings such as:
"(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))"
You have to parse them. So you need to know how to write a parser. We
need to have a grammar. You can find it in the Fortran reference
manuals, but I don't have them at hand so for the example, I'll invent
the grammar here:
;; frfs ::= '(' simple-specifier { ',' simple-specifier } ')' .
;; simple-specifier ::= [ count ] specifier .
;; specifier ::= value-specifier | filler-specifier | group-specifier .
;; value-specifier ::= 'I' width | 'A' width | 'F' width '.' decimal .
;; filler-specifier ::= 'x' .
;; group-specifier ::= '(' simple-specifier { ',' simple-specifier } ')' .
;; width ::= integer .
;; decimal ::= integer .
;; count ::= integer .
;; integer ::= /[0-9]+/ .
;; Now, you can give this grammar to a scanner and parser generator, or you
;; can write them yourself by hand, giving something like:
(eval-when (:compile-toplevel :load-toplevel :execute)
  ;; Note: I wrap these functions in eval-when since they'll be used at
  ;; compilation time by the macro at the end.  An alternative would be to
  ;; put that macro in another file, to be loaded after this one.

  (defun eat (ch stream)
    "Read a character from the stream and signal an error if it's not CH."
    (let ((got (read-char stream nil nil)))
      (unless (eql ch got)
        (error "Unexpected character: got ~A, expected ~A" got ch))))
  (defun parse/fortran-read-format-specifier (stream)
    (eat #\( stream)
    (prog1 (loop
             :collect (parse/simple-specifier stream)
             :while (eql #\, (peek-char nil stream nil nil))
             :do (eat #\, stream))
      (eat #\) stream)))
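  (defun parse/simple-specifier (stream)
    ;; LET* makes the order explicit: the optional count is read first.
    ;; NEXT is NIL at end of input, and DIGIT-CHAR-P requires a character.
    (let* ((next  (peek-char nil stream nil nil))
           (count (if (and next (digit-char-p next))
                      (parse/integer stream)
                      1))
           (item  (parse/specifier stream)))
      (if (= 1 count)
          item
          (list 'repeat count item))))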
  (defun parse/specifier (stream)
    (case (peek-char nil stream nil nil)
      ((#\i #\I)
       (read-char stream)
       (list 'integer (parse/integer stream)))
      ((#\a #\A)
       (read-char stream)
       (list 'string (parse/integer stream)))
      ((#\f #\F)
       (read-char stream)
       (let ((width (parse/integer stream)))
         (eat #\. stream)
         (list 'float width (parse/integer stream))))
      ((#\x #\X)
       (read-char stream)
       (list 'filler))
      ((#\()
       (eat #\( stream)
       (prog1 (list* 'group (loop
                              :collect (parse/simple-specifier stream)
                              :while (eql #\, (peek-char nil stream nil nil))
                              :do (eat #\, stream)))
         (eat #\) stream)))
      (t
       (error "Unexpected character in fortran reader format string: ~A"
              (read-char stream)))))
  (defun parse/integer (stream)
    (parse-integer (coerce (loop
                             :for ch = (peek-char nil stream nil nil)
                             ;; CH is NIL at end of input; DIGIT-CHAR-P
                             ;; requires a character.
                             :while (and ch (digit-char-p ch))
                             :collect (read-char stream))
                           'string)))
  #+test
  (with-input-from-string (specifier "(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))")
    (parse/fortran-read-format-specifier specifier))
  ;; --> ((string 1)
  ;;      (filler)
  ;;      (integer 1)
  ;;      (repeat 2 (group (filler) (float 4 2)))
  ;;      (filler)
  ;;      (integer 1)
  ;;      (repeat 2 (group (filler) (float 4 2))))
;; So now that we can parse fortran read format specifier strings into a
;; sexp, we can do something with it. For example, flatten the repeat
;; and group, computing the positions of each field:
  (defun compute-positions (specifiers &optional (pos 0))
    (let ((results '()))
      (dolist (specifier specifiers (values (nreverse results) pos))
        (ecase (first specifier)
          ((string integer float)
           (push (append specifier (list pos)) results)
           (incf pos (second specifier)))
          ((filler)
           (incf pos))
          ((repeat)
           (destructuring-bind (count &rest element-specifiers) (rest specifier)
             (loop
               :repeat count
               :do (multiple-value-bind (elements new-pos)
                       (compute-positions element-specifiers pos)
                     (setf results (nreconc elements results)
                           pos new-pos)))))
          ((group)
           (multiple-value-bind (elements new-pos)
               (compute-positions (rest specifier) pos)
             (setf results (nreconc elements results)
                   pos new-pos)))))))
  #-(and)
  (compute-positions
   (with-input-from-string (specifier "(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))")
     (parse/fortran-read-format-specifier specifier)))
  ;; --> ((string 1 0)
  ;;      (integer 1 2)
  ;;      (float 4 2 4)
  ;;      (float 4 2 9)
  ;;      (integer 1 14)
  ;;      (float 4 2 16)
  ;;      (float 4 2 21))
  ;;     25
;; Now that we know where each field is to be found, we can generate a
;; parser function:
  (defun generate-value-parser (specifier input-name)
    (ecase (first specifier)
      (string (destructuring-bind (width pos) (rest specifier)
                `(subseq ,input-name ,pos ,(+ pos width))))
      (integer (destructuring-bind (width pos) (rest specifier)
                 `(parse-integer ,input-name :start ,pos :end ,(+ pos width))))
      (float (destructuring-bind (width decimals pos) (rest specifier)
               (declare (ignore decimals))
               ;; We could be more specific here: DECIMALS is ignored, so a
               ;; field without a decimal point is read as an integer.
               `(read-from-string ,input-name t nil :start ,pos :end ,(+ pos width))))))
  #-(and)
  (mapcar (lambda (specifier)
            (generate-value-parser specifier 'input))
          (compute-positions
           (with-input-from-string (specifier "(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))")
             (parse/fortran-read-format-specifier specifier))))
  ;; --> ((subseq input 0 1)
  ;;      (parse-integer input :start 2 :end 3)
  ;;      (read-from-string input t nil :start 4 :end 8)
  ;;      (read-from-string input t nil :start 9 :end 13)
  ;;      (parse-integer input :start 14 :end 15)
  ;;      (read-from-string input t nil :start 16 :end 20)
  ;;      (read-from-string input t nil :start 21 :end 25))
  (defun generate-parser (specifiers)
    (compile nil `(lambda (input)
                    (list ,@(mapcar (lambda (specifier)
                                      (generate-value-parser specifier 'input))
                                    (compute-positions specifiers))))))
;; So we can use this generated parser to parse the input line:
  #-(and)
  (funcall (generate-parser
            (with-input-from-string (specifier "(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))")
              (parse/fortran-read-format-specifier specifier)))
           "H 1 1.00 1.00 2 2.00 2.00")
  ;; --> ("H" 1 1.0 1.0 2 2.0 2.0)
);;eval-when
;; And finally, we can wrap all that in a lisp macro to present a nice
;; API:
(defmacro reading-bind (format-string (&rest variables) input-string &body body)
  (if (stringp format-string)
      ;; We can compute the parser from the format string at macroexpansion
      ;; time.  The lambda expression itself is spliced in, rather than a
      ;; compiled function object, so that the expansion can also be
      ;; processed by COMPILE-FILE.
      `(destructuring-bind ,variables
           ((lambda (input)
              (list ,@(mapcar (lambda (specifier)
                                (generate-value-parser specifier 'input))
                              (compute-positions
                               (with-input-from-string (specifier format-string)
                                 (parse/fortran-read-format-specifier specifier))))))
            ,input-string)
         ,@body)
      ;; Otherwise, we compute a new parser at run time:
      (let ((specifier (gensym)))
        `(destructuring-bind ,variables
             (funcall (generate-parser
                       (with-input-from-string (,specifier ,format-string)
                         (parse/fortran-read-format-specifier ,specifier)))
                      ,input-string)
           ,@body))))
#-(and)
(let ((line "H 1 1.00 1.00 2 2.00 2.00"))
  (reading-bind "(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))" (s i x y j a b) line
    (list s
          (list i (list x y))
          (list j (list a b)))))
;; --> ("H" (1 (1.0 1.0)) (2 (2.0 2.0)))
It is interesting to compare all that with the original pure Lisp version:
(with-input-from-string (line "H 1 1.00 1.00 2 2.00 2.00")
  (loop repeat 7 collect (read line)))
--> (h 1 1.0 1.0 2 2.0 2.0)
--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.
pjb (7647)
9/21/2012 5:28:25 PM
Ian Clifton <ian.clifton@chem.ox.ac.uk> writes:
> Jinsong Zhao <jszhao@yeah.net> writes:
>
>> Hi there,
>>
>> I am new to Lisp, and I would like to parse strings the way a formatted read does in Fortran:
>>
>> Line="H 1 1.00 1.00 2 2.00 2.00"
>> read(Line, '(A1,1x,I1,2(1x,f4.2),1x,i1,2(1x,F4.2))') s, i, x, y, j, a, b
>>
>> I mean that I would like to use Lisp to extract strings and numbers
>> (integer and real); however, I can't find any way in Lisp to do
>> something similar to what Fortran does here.
>
> I guess one question that arises is, how similar to Fortran’s behaviour
> do you need the solution to be? For instance,
>
> program test
> character(len=4) :: number = '1234' ! Looks like an integer
> read (number, '(F4.2)') r ! Read it as a real
> print *, r
> end program test
>
> prints out 12.34000, because the F “edit descriptor” (analogous to a
> "FORMAT directive" in Lisp) has the behaviour that, when the field
> contains no decimal point, its last d digits are taken as the fractional
> part (here d = 2, from F4.2). (This is a feature that goes right back to
> punched‐card days!)
So you'd have to write a function and use (parse-fortran-float
input-string position width decimals) instead of (read-from-string
input-string t nil :start position :end (+ position width)) in
generate-value-parser.
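One way such a function might look is sketched below. It is a simplification under stated assumptions, not a full implementation of Fortran's Fw.d input editing: blanks around the number are trimmed, an optional sign is accepted, exponents are not handled, and the field is assumed to lie entirely within the input string. The result is a single-float.

```lisp
(defun parse-fortran-float (input position width decimals)
  "Sketch of Fortran Fw.d input: parse the WIDTH characters of INPUT that
start at POSITION.  An explicit decimal point wins; without one, the last
DECIMALS digits are the fractional part.  Exponents are not handled."
  (let* ((field  (string-trim " " (subseq input position (+ position width))))
         (dot    (position #\. field))
         ;; Sign and digits only, with the decimal point removed.
         (digits (remove #\. field))
         ;; Number of fractional digits: counted after the point if there
         ;; is one, otherwise taken from the edit descriptor.
         (scale  (if dot
                     (- (length field) dot 1)
                     decimals))
         ;; A blank field is read as zero.
         (mantissa (if (string= digits "")
                       0
                       (parse-integer digits))))
    (float (/ mantissa (expt 10 scale)))))

(parse-fortran-float "1234" 0 4 2)      ; no decimal point: scaled by 10^2
(parse-fortran-float "H 1 1.00" 4 4 2)  ; explicit decimal point is kept
```

Doing the scaling with exact rationals and converting to a float only at the end avoids accumulating rounding error while the digits are processed.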
--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.
pjb (7647)
9/21/2012 5:40:19 PM
On 2012-09-22 1:28, Pascal J. Bourguignon wrote:
[snip]
> It is interesting to compare all that with the original pure Lisp version:
>
> (with-input-from-string (line "H 1 1.00 1.00 2 2.00 2.00")
> (loop repeat 7 collect (read line)))
> --> (h 1 1.0 1.0 2 2.0 2.0)
>
>
Hi Pascal,
Thank you very much for your detailed reply. I need some time to
understand your code.
The idea behind my question was to parse the string and assign the
different values to different variables, which may be vectors or arrays.
Thanks again.
Regards,
Jinsong
jszhao (77)
9/22/2012 12:31:20 AM
Jinsong Zhao <jszhao@yeah.net> writes:
> On 2012-09-22 1:28, Pascal J. Bourguignon wrote:
> [snip]
>> It is interesting to compare all that with the original pure Lisp version:
>>
>> (with-input-from-string (line "H 1 1.00 1.00 2 2.00 2.00")
>> (loop repeat 7 collect (read line)))
>> --> (h 1 1.0 1.0 2 2.0 2.0)
>>
>>
>
> Hi Pascal,
>
> Thank you very much for your detailed reply. I need some time to
> understand your code.
>
> The idea behind my question was to parse the string and assign the
> different values to different variables, which may be vectors or
> arrays.
In the example Fortran code you gave, I didn't see any arrays being
assigned. There were as many input values as variables.
When implementing it, I at first translated 2(f4.2) as a vector of two
floats, but obviously that wasn't right, since the values had to be bound
to two scalar variables, not a single vector variable.
But if you have a syntax in the read format specifiers to indicate a
vector or an array, you can easily process it similarly (something like
a mix of repeat and group) and generate the reading of a whole vector or
array.
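As an illustration only, here is a sketch of what the generated code for such a vector field could boil down to. READ-FLOAT-VECTOR is a hypothetical helper with an invented argument list; it assumes COUNT fixed-width fields separated by GAP filler characters, all lying within the input string, and it reads each field with READ-FROM-STRING as generate-value-parser does.

```lisp
(defun read-float-vector (input start count width &key (gap 1))
  "Hypothetical helper: read COUNT fields of WIDTH characters from INPUT,
the first one starting at START and each separated from the next by GAP
filler characters.  Return the values in a fresh vector."
  (let ((result (make-array count))
        (*read-eval* nil))              ; precaution for untrusted input
    (dotimes (i count result)
      (let ((pos (+ start (* i (+ width gap)))))
        (setf (aref result i)
              (read-from-string input t nil
                                :start pos :end (+ pos width)))))))

;; The two reals after the first integer start at position 4:
(read-float-vector "H 1 1.25 3.50 2 2.00 2.00" 4 2 4)
```

A new specifier kind in the format grammar could then expand into a single call like this, bound to one vector variable instead of COUNT scalar ones.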
--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.
pjb (7647)
9/22/2012 12:38:03 AM