read-line will not process German characters?

I want to process an ordinary text file that contains some German characters, for example:
Südafrika

I get the following error:
READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a Unicode-16
   [Condition of type SIMPLE-CHARSET-TYPE-ERROR]

The function looks like

(defun parse-key-file (path)
 (with-open-file (s path :direction :input)
   (do ((l (read-line s) (read-line s)))
        ((eq l 'eof) function-key)
       (parse-line l))))

My .emacs settings:

(setq locale-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(set-default-coding-systems 'utf-8)
(prefer-coding-system 'utf-8)

SLIME setting:

(setq slime-net-coding-system 'utf-8-unix)

I am using CLISP for Windows.

Is this a problem with read-line or with my settings?

Thanks.

Reply xiewensheng (23) 11/19/2010 8:33:22 AM

On 2010-11-19 09:33, keyboard wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
>     [Condition of type SIMPLE-CHARSET-TYPE-ERROR]

See http://en.wikipedia.org/wiki/UTF-8

#xFC is not a valid byte in UTF-8 encoding.

The file you are reading is probably not UTF-8 encoded.
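
To make the point concrete, here is a minimal sketch of a structural UTF-8 check in portable Common Lisp. The name utf-8-bytes-p is hypothetical (it is not part of CLISP), and the check only looks at lead and continuation bytes; it does not reject overlong forms or surrogates.

```lisp
;; Hypothetical helper, not a CLISP function.
;; BYTES is a list of integers in the range 0-255.
(defun utf-8-bytes-p (bytes)
  "Return true if BYTES has a well-formed UTF-8 lead/continuation structure."
  (let ((pending 0))                       ; continuation bytes still expected
    (dolist (b bytes (zerop pending))      ; at the end, nothing may be pending
      (cond ((plusp pending)
             (if (<= #x80 b #xBF)          ; continuation byte 10xxxxxx
                 (decf pending)
                 (return nil)))
            ((<= b #x7F))                  ; plain ASCII
            ((<= #xC2 b #xDF) (setf pending 1))
            ((<= #xE0 b #xEF) (setf pending 2))
            ((<= #xF0 b #xF4) (setf pending 3))
            (t (return nil))))))           ; #xC0, #xC1, #xF5-#xFF, stray #x80-#xBF

;; "Süd" as single-byte text: the lone #xFC is rejected.
(utf-8-bytes-p '(#x53 #xFC #x64))
;; "Süd" as UTF-8: ü is the two bytes #xC3 #xBC.
(utf-8-bytes-p '(#x53 #xC3 #xBC #x64))
```

The first call fails on #xFC, which is the same byte READ-LINE complains about.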
Reply Piotr 11/19/2010 10:28:10 AM


On Nov 19, 3:33 am, keyboard <xiewensh...@gmail.com> wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
>    [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>
> The function looks like
>
> (defun parse-key-file (path)
>  (with-open-file (s path :direction :input)
>    (do ((l (read-line s) (read-line s)))
>         ((eq l 'eof) function-key)
>        (parse-line l))))
>
> my .emacs settings:
>
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (set-default-coding-systems 'utf-8)
> (prefer-coding-system 'utf-8)
>
> slime setting:
>
> (setq slime-net-coding-system 'utf-8-unix)
>
> clisp for windows is used.
>
> Is this a read-line problem or my setting problem?
>
> Thanks.



This encoding looks more like UTF-16 than UTF-8.

You could try one of the following:
a) Open the file with an explicit external format:
   (WITH-OPEN-FILE (s "foo" :direction :input :EXTERNAL-FORMAT CHARSET:UTF-16) ...)
b) Determine the true file encoding using the 'file' program that
comes with Cygwin.
c) Look at the file encoding by opening it in Firefox and choosing
View --> Character Encoding.
d) Look at the raw bytes using the 'od' program and work out the
character encoding yourself.

The u-umlaut (ü) has Unicode code point U+00FC, which can be encoded in
a number of ways. In plain UTF-16 it would be physically represented as
0x00FC. The same text can also be encoded as UTF-8, but based on your
error message that appears unlikely for this file.
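
As a rough illustration of those different byte representations, the sketch below encodes a code point by hand in portable Common Lisp. Both function names are hypothetical helpers written for this post, not CLISP functions, and the UTF-16 one assumes big-endian byte order and a code point below #x10000.

```lisp
;; Hypothetical helpers for illustration only.
(defun utf-8-encode (code)
  "Return the UTF-8 bytes for the Unicode code point CODE as a list."
  (cond ((< code #x80) (list code))
        ((< code #x800)
         (list (logior #xC0 (ash code -6))
               (logior #x80 (logand code #x3F))))
        ((< code #x10000)
         (list (logior #xE0 (ash code -12))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))
        (t
         (list (logior #xF0 (ash code -18))
               (logior #x80 (logand (ash code -12) #x3F))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))))

(defun utf-16be-encode (code)
  "Return the two big-endian UTF-16 bytes for CODE (assumed below #x10000)."
  (list (ldb (byte 8 8) code)    ; high byte
        (ldb (byte 8 0) code)))  ; low byte

(utf-8-encode #xFC)     ; => (195 188), i.e. #xC3 #xBC
(utf-16be-encode #xFC)  ; => (0 252),   i.e. #x00 #xFC
```

So a UTF-8 file would never contain a bare #xFC for ü, while a UTF-16 file would contain a zero byte next to it.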
Reply vanekl 11/19/2010 11:25:02 AM

"keyboard" wrote:

>I want to process a normal txt file contain some German characters:
>Südafrika

>Get the following error:
>READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
>Unicode-16
>   [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>

#xFC is 'u' umlaut (ü) in the Windows ANSI code page (Windows-1252); ISO-8859-1 uses the same value.

(Apologies to the OP for my unintentionally sending this as personal
email instead of to the group. Sorry.)

Regards,
Gerald

Reply Gerald 11/19/2010 12:17:19 PM

On Nov 19, 4:33 pm, keyboard <xiewensh...@gmail.com> wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
>    [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>
> The function looks like
>
> (defun parse-key-file (path)
>  (with-open-file (s path :direction :input)
>    (do ((l (read-line s) (read-line s)))
>         ((eq l 'eof) function-key)
>        (parse-line l))))
>
> my .emacs settings:
>
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (set-default-coding-systems 'utf-8)
> (prefer-coding-system 'utf-8)
>
> slime setting:
>
> (setq slime-net-coding-system 'utf-8-unix)
>
> clisp for windows is used.
>
> Is this a read-line problem or my setting problem?
>
> Thanks.

Fixed, thanks all.

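(defun parse-key-file (path)
  ;; The file is ISO-8859-1 (Latin-1), not UTF-8, so say so explicitly.
  (with-open-file (s path :direction :input
                          :external-format charset:ISO-8859-1)
    ;; Return :eof at end of file instead of signalling an error,
    ;; on the first read as well as on the later ones.
    (do ((l (read-line s nil :eof) (read-line s nil :eof)))
        ((eq l :eof) function-key)
      (parse-line l))))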
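
For anyone wondering why ISO-8859-1 works here: in that encoding every byte value is simply the code point of the character. A minimal sketch, assuming a Lisp whose character codes are Unicode (as in CLISP); latin-1-decode is a hypothetical helper for illustration, not something CLISP provides.

```lisp
;; Hypothetical helper: decode a list of ISO-8859-1 bytes into a string.
;; Assumes CODE-CHAR maps integers to Unicode code points.
(defun latin-1-decode (bytes)
  "Return the string whose characters have the codes listed in BYTES."
  (map 'string #'code-char bytes))

;; The bytes of "Südafrika" as stored in the file; #xFC becomes ü.
(latin-1-decode '(#x53 #xFC #x64 #x61 #x66 #x72 #x69 #x6B #x61))
```

Because every byte from 0 to 255 maps to some character, a Latin-1 read never signals a conversion error, even if the file is really in another encoding.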
Reply keyboard 11/22/2010 2:36:37 AM
