[2.5.1] "UnicodeDecodeError: 'ascii' codec can't decode byte"?

Hello

I'm getting this error while downloading and parsing web pages:

=====
    title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
48: ordinal  not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

Does someone know how to solve this error?

Thank you.
nospam21 (19047)
10/29/2008 9:29:57 AM
comp.lang.python

Gilles Ganault wrote:
> I'm getting this error while downloading and parsing web pages:
> 
> =====
>     title = m.group(1)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
> 48: ordinal  not in range(128)
> =====
> 
> From what I understand, it's because some strings are Unicode, and
> hence contain characters that are illegal in ASCII.

You just need to use a codec according to the encoding of the webpage. Take
a look at 
  http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.
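
To illustrate the advice above, here is a minimal sketch of decoding the downloaded bytes with the page's encoding before running the regex. It is written for Python 3, not the 2.5.1 from the subject line. The function name extract_title, the sample page, and the choice of Latin-1 are assumptions made for the example; in practice the encoding would come from the Content-Type header or a meta tag.

```python
import re

def extract_title(raw_page, encoding):
    """Decode the raw bytes first, then search the resulting text."""
    # Decoding explicitly avoids any implicit fallback to the ASCII codec.
    text = raw_page.decode(encoding)
    m = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return m.group(1).strip() if m else None

# Hypothetical page body; in Latin-1 the character e-acute is the byte 0xe9,
# the same byte the error message complains about.
raw_page = "<html><head><title>Caf\u00e9</title></head></html>".encode("latin-1")

title = extract_title(raw_page, "latin-1")
print(title)
```

If the encoding really is unknown, raw_page.decode(encoding, "replace") substitutes a placeholder for undecodable bytes instead of raising, at the cost of losing those characters.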

Uli

-- 
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

eckhardt (136)
10/29/2008 10:11:38 AM
Ulrich Eckhardt wrote:
> Gilles Ganault wrote:
>> I'm getting this error while downloading and parsing web pages:
>>
>> =====
>>     title = m.group(1)
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
>> 48: ordinal  not in range(128)
>> =====
>>
>> From what I understand, it's because some strings are Unicode, and
>> hence contain characters that are illegal in ASCII.
> 
> You just need to use a codec according to the encoding of the webpage. Take
> a look at 
>   http://wiki.python.org/moin/Python3UnicodeDecodeError
> It is about Python 3, but the principles apply nonetheless. In any case,
> throwing the error at a websearch will turn up lots of solutions.
> 
I won't believe that statement is producing the error until I see a
traceback. As far as I'm aware, the re module can handle Unicode, and
getting a UnicodeDecodeError from an assignment would be unusual, to say
the least. I suppose it's not impossible that calling the .group() method
of a match object might raise one, but it seems unlikely.
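
For reference, a small sketch of capturing the complete traceback so the failing statement can be identified. It is Python 3, and decode_title and the sample bytes are made up for the example; they only reproduce the same kind of error, not the original poster's code.

```python
import traceback

def decode_title(raw_title, encoding="ascii"):
    # Hypothetical helper: decoding non-ASCII bytes as ASCII raises
    # UnicodeDecodeError.
    return raw_title.decode(encoding)

# 48 ASCII bytes followed by 0xe9, so the bad byte sits at position 48.
raw = b"x" * 48 + b"\xe9"

try:
    decode_title(raw)
except UnicodeDecodeError:
    # format_exc() returns the whole traceback, including the file, line
    # and function where the exception was actually raised.
    report = traceback.format_exc()
    print(report)
```

Posting the whole report, rather than only its last two lines, shows which call triggered the decode.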

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/

steve73 (4801)
10/29/2008 11:29:15 AM