read-line will not process German characters?
I want to process a plain text file that contains some German characters:
Südafrika
I get the following error:
READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
Unicode-16
[Condition of type SIMPLE-CHARSET-TYPE-ERROR]
The function looks like
(defun parse-key-file (path)
  (with-open-file (s path :direction :input)
    (do ((l (read-line s) (read-line s)))
        ((eq l 'eof) function-key)
      (parse-line l))))
My .emacs settings:
(setq locale-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(set-default-coding-systems 'utf-8)
(prefer-coding-system 'utf-8)
SLIME setting:
(setq slime-net-coding-system 'utf-8-unix)
I am using CLISP for Windows.
Is this a problem with read-line or with my settings?
Thanks.
xiewensheng (23)
11/19/2010 8:33:22 AM
On 2010-11-19 09:33, keyboard wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
> [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
See http://en.wikipedia.org/wiki/UTF-8
#xFC is not a valid byte in the UTF-8 encoding.
The file you are reading is probably not UTF-8 encoded.
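To make the point concrete, here is a minimal sketch of a structural UTF-8 check in portable Common Lisp. UTF-8-VALID-P is a hypothetical helper, not something CLISP provides, and it is deliberately simplified: it only checks lead and continuation byte patterns, so it does not reject overlong 3- and 4-byte forms or encoded surrogates. It is enough to show why a lone #xFC byte trips the decoder.

```lisp
;; Hypothetical helper (not part of CLISP): simplified UTF-8 structure check.
(defun utf-8-valid-p (octets)
  "Return true if OCTETS (a sequence of integers 0-255) looks like well-formed UTF-8."
  (let ((pending 0))                    ; continuation bytes still expected
    (and (every (lambda (b)
                  (cond ((plusp pending)
                         ;; Inside a multi-byte sequence: byte must be 10xxxxxx.
                         (when (= (logand b #xC0) #x80)
                           (decf pending)
                           t))
                        ((< b #x80) t)                          ; ASCII
                        ((<= #xC2 b #xDF) (setf pending 1) t)   ; 2-byte lead
                        ((<= #xE0 b #xEF) (setf pending 2) t)   ; 3-byte lead
                        ((<= #xF0 b #xF4) (setf pending 3) t)   ; 4-byte lead
                        ;; #x80-#xC1 and #xF5-#xFF can never start a sequence.
                        (t nil)))
                octets)
         ;; The input must not end in the middle of a sequence.
         (zerop pending))))
```

With this sketch, the bytes 83 #xC3 #xBC 100 ("Süd" in UTF-8) pass, while 83 #xFC 100 (the same text in a single-byte encoding) fail at the #xFC byte.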
Piotr
11/19/2010 10:28:10 AM
On Nov 19, 3:33 am, keyboard <xiewensh...@gmail.com> wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
>    [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>
> The function looks like
>
> (defun parse-key-file (path)
>   (with-open-file (s path :direction :input)
>     (do ((l (read-line s) (read-line s)))
>         ((eq l 'eof) function-key)
>       (parse-line l))))
>
> my .emacs settings:
>
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (set-default-coding-systems 'utf-8)
> (prefer-coding-system 'utf-8)
>
> slime setting:
>
> (setq slime-net-coding-system 'utf-8-unix)
>
> clisp for windows is used.
>
> Is this a read-line problem or my setting problem?
>
> Thanks.
This encoding looks more like UTF-16 than UTF-8.
You could try one of the following:
a) open the file as UTF-16:
   (WITH-OPEN-FILE (s "foo" :direction :input :EXTERNAL-FORMAT CHARSET:UTF-16) ...)
b) determine the true file encoding by using the 'file' program that
comes with Cygwin, OR
c) look at the file encoding by opening it in Firefox and choosing
View --> Character Encoding, OR
d) look at the raw bytes using the 'od' program and work out the
character encoding yourself.
The character ü (u with umlaut) has Unicode code point U+00FC, which
can be encoded in a number of ways. In big-endian UTF-16 it is stored
as the two bytes 0x00 0xFC. In UTF-8 the same code point would be the
two bytes 0xC3 0xBC, but based on your error message that appears
unlikely for this file.
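If 'od' is not at hand, option d) can be approximated from the Lisp side. The following is a rough sketch in portable Common Lisp; FILE-OCTETS and HEX-DUMP are hypothetical helper names, and the 64-byte default limit is an arbitrary choice. It opens the file as raw bytes, so no character decoding (and no decoding error) is involved.

```lisp
;; Hypothetical helpers for inspecting a file's raw bytes.
(defun file-octets (path &optional (limit 64))
  "Return up to LIMIT leading bytes of the file at PATH as a list of integers."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop for i from 0 below limit
          for b = (read-byte in nil nil)   ; NIL at end of file
          while b
          collect b)))

(defun hex-dump (octets)
  "Format the list OCTETS as space-separated two-digit hexadecimal numbers."
  (format nil "~{~2,'0X~^ ~}" octets))
```

Calling (hex-dump (file-octets "foo")) on the file in question then shows how the ü is stored: a single FC byte points to a single-byte encoding such as ISO-8859-1, the pair C3 BC to UTF-8, and 00 FC or FC 00 to UTF-16.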
vanekl
11/19/2010 11:25:02 AM
"keyboard" wrote:
>I want to process a normal txt file contain some German characters:
>Südafrika
>Get the following error:
>READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
>Unicode-16
> [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>
#xFC is 'ü' (u with umlaut) in the Windows "ANSI" code page
(Windows-1252), and likewise in ISO-8859-1.
(Apologies to the OP for my unintentionally sending this as personal
email instead of to the group. Sorry.)
Regards,
Gerald
Gerald
11/19/2010 12:17:19 PM
On Nov 19, 4:33 pm, keyboard <xiewensh...@gmail.com> wrote:
> I want to process a normal txt file contain some German characters:
> Südafrika
>
> Get the following error:
> READ-LINE: Invalid byte #xFC in CHARSET:UTF-8 conversion, not a
> Unicode-16
>    [Condition of type SIMPLE-CHARSET-TYPE-ERROR]
>
> The function looks like
>
> (defun parse-key-file (path)
>   (with-open-file (s path :direction :input)
>     (do ((l (read-line s) (read-line s)))
>         ((eq l 'eof) function-key)
>       (parse-line l))))
>
> my .emacs settings:
>
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (set-default-coding-systems 'utf-8)
> (prefer-coding-system 'utf-8)
>
> slime setting:
>
> (setq slime-net-coding-system 'utf-8-unix)
>
> clisp for windows is used.
>
> Is this a read-line problem or my setting problem?
>
> Thanks.
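Fixed, thanks all. Specifying ISO-8859-1 as the external format did it:
(defun parse-key-file (path)
  (with-open-file (s path :direction :input
                          :external-format charset:ISO-8859-1)
    ;; Pass NIL and :eof to both READ-LINE calls so that end of file
    ;; returns :eof instead of signaling an error, even for an empty file.
    (do ((l (read-line s nil :eof) (read-line s nil :eof)))
        ((eq l :eof) function-key)
      (parse-line l))))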
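For anyone finding this later, the same loop can be written with the encoding as a parameter. This is only a sketch: MAP-FILE-LINES is a hypothetical name, and the value passed as :external-format is implementation-specific (charset:ISO-8859-1 is the CLISP spelling used above; other implementations use different designators). With the default, the implementation's own default encoding applies.

```lisp
;; Hypothetical helper: apply FUNCTION to every line of a text file.
(defun map-file-lines (function path &key (external-format :default))
  "Call FUNCTION on each line of the file at PATH; return the number of lines read."
  (with-open-file (s path :direction :input :external-format external-format)
    ;; READ-LINE with eof-error-p NIL returns NIL at end of file,
    ;; which ends the loop cleanly (also for an empty file).
    (loop for line = (read-line s nil nil)
          while line
          do (funcall function line)
          count t)))
```

In CLISP the call would then look like (map-file-lines #'parse-line path :external-format charset:ISO-8859-1), with parse-line being the function from my original post.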
keyboard
11/22/2010 2:36:37 AM