Hi,
I set up a website on a 1and1 hosting with a MySQL db containing different
languages, including greek, hebrew and coptic and it all works fine with
Firefox.
I no longer maintain that site, so have set up my own MySQL db on my
computer (Windows XP) and copied the pages to another 1and1 site. I cannot
get the scripts to display properly now with the new site, and linking into
my own db. Using Navicat I can see that at least the greek and hebrew have
imported and are displaying fine in my db.
I went into the original site and briefly switched the db source to mine,
and it did not display the scripts correctly either - all '?' again. I
switched back to the original database, and all displayed fine again.
Using the Console function in Navicat, I can get the Greek and Hebrew to
display correctly.
So, the source of the problem must be with my database - possibly some
setting somewhere.
I am using php, with ...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
on both sites.
Any suggestions as to why my db is not working properly?
TIA
--
Iain
spam1194 (14) | 8/3/2012 2:51:06 AM
On 03/08/12 04:51, Iain wrote:
> Hi,
>
> I set up a website on a 1and1 hosting with a MySQL db containing
> different languages, including greek, hebrew and coptic and it all works
> fine with Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the greek
> and hebrew have imported and are displaying fine in my db.
I guess your export/import broke the data. When you do a mysqldump, you
can specify which charset to use; you may also have to specify the
charset you are using on your computer and which one is used on the
MySQL server.
When you import, you should use the same values again.
> So, the source of the problem must be with my database - possibly some
> setting somewhere.
>
> I am using php, with ...
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> on both sites.
You can't use two charsets for one page, but in your case you may get
some pages to work if you switch to another charset on your secondary site.
I suggest you do a new dump which you import into your new database.
--
//Aho
user7 (3884) | 8/3/2012 5:02:20 AM
On 03/08/2012 4:51, Iain wrote:
> I set up a website on a 1and1 hosting with a MySQL db containing
> different languages, including greek, hebrew and coptic and it all works
> fine with Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the greek
> and hebrew have imported and are displaying fine in my db.
>
> I went into the original site and briefly switched the db source to
> mine, and it did not display the scripts correctly either - all '?'
> again. I switched back to the original database, and all displayed fine
> again.
>
> Using the Console function in Navicat, I can get the Greek and Hebrew to
> display correctly.
>
> So, the source of the problem must be with my database - possibly some
> setting somewhere.
>
> I am using php, with ...
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> on both sites.
>
> Any suggestions as to why my db is not working properly?
I have the impression that the problem can be at any step of the
process, not necessarily the database, and you are not sure how to
verify it. You'll find pretty good advice in this article:
http://www.itnewb.com/tutorial/UTF-8-Enabled-Apache-MySQL-PHP-Markup-and-JavaScript
Read it carefully and debug each part separately.
--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humour site: http://www.demogracia.com
--
alvaro.NOSPAMTHANX1 (511) | 8/3/2012 7:05:03 AM
"Iain" <spam@smaps.net> wrote in message
news:a80sl8FvblU1@mid.individual.net...
> Hi,
>
> I set up a website on a 1and1 hosting with a MySQL db containing different
> languages, including greek, hebrew and coptic and it all works fine with
> Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the greek
> and hebrew have imported and are displaying fine in my db.
>
> I went into the original site and briefly switched the db source to mine,
> and it did not display the scripts correctly either - all '?' again. I
> switched back to the original database, and all displayed fine again.
>
> Using the Console function in Navicat, I can get the Greek and Hebrew to
> display correctly.
>
> So, the source of the problem must be with my database - possibly some
> setting somewhere.
My guess (and apologies if you've already checked this) is that it's the
charset used for the database. I had similar issues once and it turned out
that the database was using the default charset (Latin1) and not utf8. The
problem went away when I switched charsets and re-imported.
If this is your issue then the notes I made at the time might help:
http://www.cryer.co.uk/brian/mysql/howto_change_database_collation_order.htm
If it is the charset then once you've changed it then you will need to
re-import your data.
Hope this helps.
--
Brian Cryer
http://www.cryer.co.uk/brian
not.here1 (24) | 8/3/2012 11:03:28 AM
Brian Cryer wrote:
> "Iain" <spam@smaps.net> wrote in message
> news:a80sl8FvblU1@mid.individual.net...
> > Hi,
> >
> > I set up a website on a 1and1 hosting with a MySQL db containing
> > different languages, including greek, hebrew and coptic and it all
> > works fine with Firefox.
> >
> > I no longer maintain that site, so have set up my own MySQL db on my
> > computer (Windows XP) and copied the pages to another 1and1 site. I
> > cannot get the scripts to display properly now with the new site,
> > and linking into my own db. Using Navicat I can see that at least
> > the greek and hebrew have imported and are displaying fine in my db.
> >
> > I went into the original site and briefly switched the db source to
> > mine, and it did not display the scripts correctly either - all '?'
> > again. I switched back to the original database, and all displayed
> > fine again. Using the Console function in Navicat, I can get the Greek
> > and
> > Hebrew to display correctly.
> >
> > So, the source of the problem must be with my database - possibly
> > some setting somewhere.
>
> My guess (and apologies if you've already checked this) is that it's
> the charset used for the database. I had similar issues once and it
> turned out that the database was using the default charset (Latin1)
> and not utf8. The problem went away when I switched charsets and
> re-imported.
> If this is your issue then the notes I made at the time might help:
> http://www.cryer.co.uk/brian/mysql/howto_change_database_collation_order.htm
> If it is the charset then once you've changed it then you will need to
> re-import your data.
>
> Hope this helps.
I checked these earlier and changed them. Viewing the database properties
through Navicat, they are:
Character set: utf8 -- UTF-8 Unicode
Collation: utf8_general_ci
That is what the original db is - getting that info from querying the
'INFORMATION_SCHEMA.COLUMNS' table ...
CHARACTER_SET_NAME: utf8 - utf8_general_ci
DEFAULT_COLLATE_NAME: utf8 - utf8_general_ci
The strange thing is that when I do a .sql dump of the various tables, which
can include a 'show create table ...', the tables always come up with
'latin1' and 'latin1_general_ci'. This is also confirmed by doing another
query on the 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table.
The results are always:
(field name:) code, (type:) varchar(4), latin1, latin1_general_ci
(field name:) texts, (type:) text, latin1, latin1_general_ci
However, if I do not edit the 'create table ...' (which uses the 'latin1'),
at the first instance of trying to 'insert', it comes up with an error:
[Err] 1366 - Incorrect string value: '\xCE\xB2\xCE\xB9 ...' which is the
UTF-8 encoding (in this instance, Greek).
Replying to Alvaro:
Thanks for the link. I checked the settings and set up the .htaccess file
(the REM '//' lines have to be removed otherwise it will not work).
I also added the settings to the my.ini file. The 'default-character-set =
utf8' under the server section stopped MySQL from restarting, so I had to
remove that, although there was already a setting,
'character-set-server=utf8' in the server section.
Unfortunately, that has not sorted it out.
Replying to Aho
I have tried several new dumps, importing them with maybe one or
two tweaks, but still no success.
:(
Thanks for all the suggestions so far, but still no success.
--
Iain
spam1194 (14) | 8/3/2012 4:14:29 PM
Iain wrote:
>
> The strange thing is that when I do a .sql dump of the various tables,
> which can include a 'show create table ...', the tables always come up
> with 'latin1' and 'latin1_general_ci'. This is also confirmed by doing
> another query on the 'INFORMATION_SCHEMA.COLUMNS', with the
> corresponding table. The results are always:
> (field name:) code, (type:) varchar(4), latin1, latin1_general_ci
> (field name:) texts, (type:) text, latin1, latin1_general_ci
>
There is a switch on mysqldump --create-options which I THINK you
probably need to use to recreate character sets..
Someone better versed than I will confirm or refute that, I am sure.
--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact
that they know how little is really possible -
and how hard it is to achieve it.
tnp (2255) | 8/3/2012 4:32:53 PM
On 03/08/12 18:14, Iain wrote:
> I have tried several new dumps, with importing them and maybe with one
> or two tweaks, but still no success.
Make a dump with --skip-set-charset and see if it works better. There
are always issues with charsets on export/import; I hope this improves
in MySQL.
Another option is to use sed to replace the charset in the dump you
made (the dump spells it in lower case, and sed is case-sensitive).
sed -i 's/latin1/utf8/g' yourdatabasedump.sql
and then import it.
user7 (3884) | 8/3/2012 5:46:21 PM
The Natural Philosopher wrote:
> Iain wrote:
>
> >
> > The strange thing is that when I do a .sql dump of the various
> > tables, which can include a 'show create table ...', the tables
> > always come up with 'latin1' and 'latin1_general_ci'. This is also
> > confirmed by doing another query on the
> > 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table. The
> > results are always: (field name:) code, (type:) varchar(4), latin1,
> > latin1_general_ci (field name:) texts, (type:) text, latin1,
> > latin1_general_ci
>
> There is a switch on mysqldump --create-options which I THINK you
> probably need to use to recreate character sets..
>
> Someone better versed than I will confirm or refute that, I am sure.
What I have been doing is editing the .sql dump before running it. I have
tried both removing the latin1 references (letting it re-create the table
using the default utf-8) and replacing the latin1 with utf-8. Both run
OK and the data is imported into the table correctly and running a query
shows both the Greek and Hebrew characters coming up correctly(? - don't
know either language, but recognise the characters). So it seems that
within the environment of the database / tables (Navicat), the utf-8 is
working correctly.
Somewhere along the line, the getting of the data from the tables to the php
changes the text from being recognisable within the table environment, to
becoming '???' in the php.
Viewing the page source for both the site where it works, and the site where
it doesn't; where it works, the correct characters appear within the page
source, and the site where it doesn't work, the '?' appears in the page
source.
It's similar stories for both Firefox and MS IE, except the coptic
characters do not display properly on IE. Maybe that's just IE!
--
Iain
spam1194 (14) | 8/3/2012 5:54:47 PM
J.O. Aho wrote:
> On 03/08/12 18:14, Iain wrote:
>
> > I have tried several new dumps, with importing them and maybe with
> > one or two tweaks, but still no success.
>
> Make a dump and use --skip-set-charset and see if it works better,
> there are always issues with charsets when export/import, I hope this
> could improve in mysql.
>
> Other options is that you use sed to replace the charset in the dump
> you made.
>
> sed -i 's/latin1/utf8/g' yourdatabasedump.sql
>
> and then import it.
I can only access the data in the database that's working (the other one)
through php - I do not have direct access to the database itself now.
I'm using a php routine 'mysqldump.php', which I have been modifying to get
other bits of information, eg. from the 'INFORMATION_SCHEMA.COLUMNS' table.
http://forums.phpfreaks.com/index.php?topic=162154.0
There seem to be various versions around.
But surely does it not mean something that the characters display correctly
after I have imported them into my new database? Does that not show that
the dump is correct, and that they have also imported correctly?
(Coptic is a different story at the moment, and may follow once the Greek
and Hebrew are working OK).
--
Iain
spam1194 (14) | 8/3/2012 6:07:58 PM
> I checked these earlier and changed them. Viewing the database properties
> through Navicat, they are:
> Character set: utf8 -- UTF-8 Unicode
> Collation: utf8_general_ci
>
> That is what the original db is - getting that info from querying the
> 'INFORMATION_SCHEMA.COLUMNS' table ...
> CHARACTER_SET_NAME: utf8 - utf8_general_ci
> DEFAULT_COLLATE_NAME: utf8 - utf8_general_ci
You need to be concerned with:
- The character set of the connection. ("SET NAMES utf8")
- The character set of the client.
- The character set of the server (--character-set-server=latin1 is the default)
- The character set of the database (CREATE DATABASE `dbname` DEFAULT CHARACTER SET utf8)
- The character set of the table (CREATE TABLE `tablename` CHARACTER SET utf8)
- The character set of the column (CHARACTER SET utf8 clause in a column definition in
CREATE TABLE)
The character set of a column is the first of the column, table, database, and server
character set that is specified.
- What's really stored in the column. If it doesn't match what MySQL thinks the
character set is for the column, you're in trouble. One of the messier situations
is having the characters actually utf8, but labelled latin1, so as long as the
character sets are all labelled equally WRONG, it works, but if you set one of
them correctly, it tries to convert and you get a mess. The solution is usually
to re-import the data after fixing the column character sets.
- The character set of a string literal in a query is the character set of the connection.
If the character set of the column is different from the character set of the connection,
MySQL will try to convert it (and it may fail, e.g. converting Greek letters from
utf8 to latin1).
Oh, yes, you probably need to worry about collations also.
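
The failed conversion described in the list above can be reproduced without
a database. What follows is only a sketch in Python, assuming that Python's
"latin-1" codec is a fair stand-in for MySQL's latin1 (neither contains Greek
letters). The sample string is the Greek text whose bytes appear in the 1366
error quoted in this thread.

```python
# Sketch: Greek text is valid UTF-8 but has no latin1 representation.
# Assumption: Python's "latin-1" codec stands in for MySQL's latin1.
greek = "\u03b2\u03b9"  # the two Greek letters from the 1366 error message

utf8_bytes = greek.encode("utf-8")
print(utf8_bytes)  # the bytes MySQL reported: \xCE\xB2\xCE\xB9

try:
    greek.encode("latin-1")  # strict conversion, no substitution allowed
    convertible = True
except UnicodeEncodeError:
    convertible = False

print("convertible to latin1:", convertible)
```

A strict conversion has nowhere to put these characters, which fits the
"Incorrect string value" error seen when the column is labelled latin1.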
>
> The strange thing is that when I do a .sql dump of the various tables, which
> can include a 'show create table ...', the tables always come up with
> 'latin1' and 'latin1_general_ci'. This is also confirmed by doing another
> query on the 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table.
That sounds like trouble, perhaps indicating that the tables were created
before the default character set for the database was set.
> The results are always:
> (field name:) code, (type:) varchar(4), latin1, latin1_general_ci
> (field name:) texts, (type:) text, latin1, latin1_general_ci
>
> However, if I do not edit the 'create table ...' (which uses the 'latin1'),
> at the first instance of trying to 'insert', it comes up with an error:
> [Err] 1366 - Incorrect string value: '\xCE\xB2\xCE\xB9 ...' which is the
> UTF-8 encoding (in this instance, Greek).
The query: SHOW VARIABLES LIKE 'character%';
after setting the default database to the one you're using may be useful.
Also try typing \s to the MySQL command-line client.
gordonb.eoeqf (1) | 8/3/2012 6:22:59 PM
Iain wrote:
>
> Somewhere along the line, the getting of the data from the tables to the
> php changes the text from being recognisable within the table
> environment, to becoming '???' in the php.
Hmm.
Have you 'viewed source' to see if the PHP is spitting out UTF-8 but not
informing the browser that it is?
> Viewing the page source for both the site where it works, and the site
> where it doesn't; where it works, the correct characters appear within
> the page source, and the site where it doesn't work, the '?' appears in
> the page source.
Ah...possibly there is a PHP character set variable?
"To change the character encoding in your php.ini file, find the
following line and input your preferred character encoding. In the
example below, UTF-8 is the character set.
default_charset = "UTF-8""
>
> It's similar stories for both Firefox and MS IE, except the coptic
> characters do not display properly on IE. Maybe that's just IE!
>
probably not got the right fonts loaded.
--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact
that they know how little is really possible -
and how hard it is to achieve it.
tnp (2255) | 8/3/2012 6:35:40 PM
The Natural Philosopher wrote:
> Iain wrote:
> Ah...possibly there is a PHP character set variable?
>
> "To change the character encoding in your php.ini file, find the
> following line and input your preferred character encoding. In the
> example below, UTF-8 is the character set.
>
> default_charset = "UTF-8""
>
I don't think that I can gain access to the php.ini file in a commercially
hosted site. I have already put:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
in the corresponding page headers.
> >
> > It's similar stories for both Firefox and MS IE, except the coptic
> > characters do not display properly on IE. Maybe that's just IE!
> >
> probably not got the right fonts loaded.
It works perfectly with Firefox - maybe there's a specific IE coptic font!
Anyway, I don't normally worry too much about this in IE; it has enough
quirks in javascript to keep anyone going. The main original site has the
Firefox logo on it and says that it is optimized for Firefox.
--
Iain
spam1194 (14) | 8/3/2012 8:48:53 PM
Iain wrote:
> The Natural Philosopher wrote:
>> Iain wrote:
>
>> Ah...possibly there is a PHP character set variable?
>>
>> "To change the character encoding in your php.ini file, find the
>> following line and input your preferred character encoding. In the
>> example below, UTF-8 is the character set.
>>
>> default_charset = "UTF-8""
>>
> I don't think that I can gain access to the php.ini file in a
> commercially hosted site.
I think there is usually a locally overridable one.
That you can access.
Knock up a script to display the output of phpinfo() and see what that says.
That's how I have tracked down several 'my test site and my production
site behave differently' issues.
> I have already put:
> <meta http-equiv="content-type" content="text/html;charset=utf-8" />
> in the corresponding page headers.
>
>> >
>> > It's similar stories for both Firefox and MS IE, except the coptic
>> > characters do not display properly on IE. Maybe that's just IE!
>> >
>> probably not got the right fonts loaded.
>
> It works perfectly with Firefox - maybe there's a specific IE coptic
> font! Anyway, I don't normally worry too much about this in IE; it has
> enough quirks in javascript to keep anyone going. The main original
> site has the Firefox logo on it and says that it is optimized for Firefox.
>
Oh, the firefox is on a windows system?
There is something about IE that is ticking away at the back of my
brain, to do with font smoothing. Firefox does a better job IIRC, but IE
may be tweakable, or a font size change may help.
--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact
that they know how little is really possible -
and how hard it is to achieve it.
tnp (2255) | 8/3/2012 10:45:39 PM
The Natural Philosopher wrote:
> I think there is usually a locally overridable one.
> That you can access.
>
> Knock up a script to display the output of phpinfo() and see what that
> says.
> That's how I have tracked down several 'my test site and my
> production site behave differently' issues.
I have created a phpinfo.php file that displays all of the settings for
phpinfo settings: 1, 2, 4, 8, 16, 32, 64.
Both sites have the same settings (they are both 1and1).
default_charset has 'no value' for both Local and Master values on both
sites, and idn.default_charset is set to 'ISO-8859-1' for both Local and
Master values on both sites.
So there are no differences there, nor in all of the other settings, except
for the Environment and PHP variables.
......
> Oh, the firefox is on a windows system?
>
> There is something about IE that is ticking away at the back of my
> brain, to do with font smoothing. Firefox does a better job IIRC, but IE
> may be tweakable, or a font size change may help.
Thanks for the suggestion - it was worth checking.
--
Iain
spam1194 (14) | 8/4/2012 1:04:39 AM
>> Other options is that you use sed to replace the charset in the dump
>> you made.
>>
>> sed -i 's/latin1/utf8/g' yourdatabasedump.sql
>>
>> and then import it.
Beware that queries like:
set names utf-8;
do not work. And if it produces an error message when running a
whole dump, it may be difficult to see. MySQL managed to spell the
character set names differently from what you have to put in the
HTTP headers (in this case, utf8, *not* utf-8 goes in the above set
names query).
Note that if you need 4-byte UTF-8 encodings, you need utf8mb4, not
utf8, as the character set. Unless you need the "Ancient Greek Numbers"
block, you do not need this for Greek, Hebrew, and Coptic.
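
Whether a given text needs utf8mb4 can be checked by looking for code points
above U+FFFF, since those are the ones that take four bytes in UTF-8. A rough
Python sketch follows; the helper name needs_utf8mb4 and the sample characters
are made up for illustration and are not taken from the database in question.

```python
# Sketch: does a string contain characters that need 4-byte UTF-8?
# needs_utf8mb4 is a hypothetical helper; the samples are arbitrary letters.

def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies above U+FFFF (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)

samples = {
    "Greek": "\u03b2\u03b9",                  # beta, iota
    "Hebrew": "\u05e9\u05dc\u05d5\u05dd",     # shin, lamed, vav, final mem
    "Coptic": "\u2c81\u2c83",                 # letters from the Coptic block
    "Ancient Greek Numbers": "\U00010140",    # first code point of that block
}

for label, text in samples.items():
    widest = max(len(ch.encode("utf-8")) for ch in text)
    print(f"{label}: needs utf8mb4 = {needs_utf8mb4(text)}, "
          f"widest character = {widest} bytes")
```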
It may be useful to see what is actually in the database field.
This requires a little knowledge of character encoding.
Something like:
select hex(last_name) from employees where empid = 33;
(pick a record where last_name includes some non-ASCII letters)
can get you a dump of the data actually there.
Edit your import file (in a UTF-8 editor) so that one of the last
names is "André", (that's A n d r e-with-acute-accent. The accented
e is code point 0xE9, encoded in UTF-8 as C3A9.) with a known record
number, here assumed to be 33. Duplicate the import process as
exactly as you can. Now run:
select hex(last_name) from employees where empid = 33;
The correct encoding is:
416E6472C3A9
and if you got that, the correct data went *in* to the database.
If you got:
416E6472E9
it went in encoded as iso-8859-1 or Windows-1252. Somehow it thinks your database
field is latin1, not utf8.
If you got:
416E6472C383C2A9
it got encoded as UTF-8 *TWICE*. It probably thought the data being imported
was in latin1 and translated it to utf-8. Fix the character set for the
database connection. This is what I got with "set names" with the charset
spelled incorrectly.
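
The three hex strings above can be reproduced without MySQL, which may help
when reading the output of hex(). This is a minimal Python sketch, assuming
Python's "latin-1" codec approximates what MySQL does with latin1; the
variable names are illustrative only.

```python
# Sketch: the three possible stored forms of "André" as hex.
name = "Andr\u00e9"  # A n d r e-with-acute-accent

# Case 1: stored correctly as UTF-8.
stored_utf8 = name.encode("utf-8")

# Case 2: stored as iso-8859-1 / Windows-1252 (one byte for the accented e).
stored_latin1 = name.encode("latin-1")

# Case 3: UTF-8 bytes mistaken for latin1 and converted to UTF-8 again.
stored_double = stored_utf8.decode("latin-1").encode("utf-8")

for label, raw in [("utf8", stored_utf8),
                   ("latin1", stored_latin1),
                   ("double-encoded", stored_double)]:
    print(f"{label:15} {raw.hex().upper()}")
```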
If you got the correct data *in* to the database, now get the value out and
display it (in a web browser?). If you get:
A n d r e-with-acute-accent
it's working.
If you get:
A n d r capital-A-with-tilde copyright-symbol
it's getting the bytes out and treating them as iso-8859-1 or Windows-1252.
If you get:
A n d r capital-A-with-tilde unknown-character capital-A-with-circumflex copyright-symbol
it's getting the bytes out, treating them as iso-8859-1 or Windows-1252, and
translating them to UTF-8.
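
The three display outcomes can be simulated the same way. Again only a
sketch in Python, with "latin-1" standing in for iso-8859-1; the results are
printed with ascii() so that the odd characters show up as escapes whatever
the terminal encoding is.

```python
# Sketch: how correctly stored UTF-8 bytes look under each reading.
stored = "Andr\u00e9".encode("utf-8")  # what a correct import leaves behind

# Read as UTF-8: the accented e comes back intact.
shown_ok = stored.decode("utf-8")

# Read as iso-8859-1: capital-A-with-tilde followed by the copyright symbol.
shown_as_latin1 = stored.decode("latin-1")

# Read as iso-8859-1, converted to UTF-8, then read as iso-8859-1 again:
# capital-A-with-tilde, a control character, capital-A-with-circumflex,
# copyright symbol.
shown_double = shown_as_latin1.encode("utf-8").decode("latin-1")

for label, text in [("working", shown_ok),
                    ("treated as latin1", shown_as_latin1),
                    ("treated as latin1 twice", shown_double)]:
    print(f"{label:25} {ascii(text)}")
```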
> I can only access the data in the database that's working (the other one)
> through php - I do not have direct access to the database itself now.
> I'm using a php routine 'mysqldump.php', which I have been modifying to get
> other bits of information, eg. from the 'INFORMATION_SCHEMA.COLUMNS' table.
> http://forums.phpfreaks.com/index.php?topic=162154.0
> There seem to be various versions around.
>
> But surely does it not mean something that the characters display correctly
> after I have imported them into my new database?
If you consistently get the character set *WRONG*, but consistently
*WRONG* (say, everything is consistently labelled as Romulan-13),
and nobody is doing any translating it might look like it works.
Fix any one character set, and it will be translated and look wrong,
even though you are now "closer" to correct.
> Does that not show that
> the dump is correct, and that they have also imported correctly?
> (Coptic is a different story at the moment, and may follow once the Greek
> and Hebrew are working OK).
>
gordonb.3atlh (1) | 8/4/2012 1:30:57 AM
Gordon Burditt wrote:
> > I checked these earlier and changed them. Viewing the database
> > properties through Navicat, they are:
> > Character set: utf8 -- UTF-8 Unicode
> > Collation: utf8_general_ci
>
> >
> > That is what the original db is - getting that info from querying
> > the 'INFORMATION_SCHEMA.COLUMNS' table ...
> > CHARACTER_SET_NAME: utf8 - utf8_general_ci
> > DEFAULT_COLLATE_NAME: utf8 - utf8_general_ci
>
> You need to be concerned with:
>
> - The character set of the connection. ("SET NAMES utf8")
This was the solution!!!
I put the line
mysql_query("SET NAMES 'utf8'")
after
mysql_select_db(...
Why this should be the difference - needed for it to work on another site
from the same hosting company (with identical phpinfo), I do not know. But
this has now got all the characters displaying properly. That is the Greek,
Hebrew, and Coptic. Whether this is the only, or the absolutely correct
solution, I'm not sure. But it works! :)
It still doesn't get the Coptic to appear correctly in MS IE, so I'll have
to play around with that. The Greek and Hebrew are still OK in MS IE though.
Could it be something to do with the data coming from a Windows source,
rather than from a Linux based server, I wonder? Or from a remote source
over the internet, rather than from the same hosting company?
And I had gone through creating a new table, populating it, and including
your "André". As it happened, the correct encoding appeared.
Very many thanks to all who contributed suggestions. I have learnt quite a
bit from all of this testing (including spending lots of time messing around
with different variations on the sql dumps, and importing, etc.)
All responses very much appreciated.
--
Iain
spam1194 (14) | 8/4/2012 3:04:28 AM
>> You need to be concerned with:
>>
>> - The character set of the connection. ("SET NAMES utf8")
>
> This was the solution!!!
>
> I put the line
> mysql_query("SET NAMES 'utf8'")
> after
> mysql_select_db(...
>
> Why this should be the difference -
If you do not put this in, MySQL is supposed to take the utf8 data
in the table and *TRANSLATE IT INTO latin1*. Since there may
not be a translation for some of the characters, especially Greek,
Hebrew, and Coptic, expect some problems. This is where question
marks come from. In the command-line client, try "set names ascii"
and SELECTing something with accented letters. You get question marks.
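
The origin of the question marks can be imitated in a few lines. This is a
sketch in Python, assuming that the errors="replace" mode of str.encode is
a reasonable model of a server that substitutes '?' for characters the
connection character set cannot hold; lossy_convert is a hypothetical name.

```python
# Sketch: lossy conversion to a smaller character set yields '?'.
# Assumption: errors="replace" models the server-side substitution.

def lossy_convert(text: str, charset: str) -> bytes:
    """Encode text, replacing anything the charset cannot hold with '?'."""
    return text.encode(charset, errors="replace")

accented = "Andr\u00e9"   # survives latin1, not ascii
greek = "\u03b2\u03b9"    # survives neither

for text in (accented, greek):
    for charset in ("ascii", "latin-1"):
        print(ascii(text), charset, lossy_convert(text, charset))
```

Once a character has become '?' the original cannot be recovered on the
client side, which is why the '?' also shows up in the page source.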
> needed for it to work on another site
> from the same hosting company (with identical phpinfo), I do not know.
The character set of the connection is a characteristic of MySQL, not PHP.
It seems to me to vary with things like the $LANG environment variable
when you start up the MySQL client. I get different results if I start
the MySQL client in a text console vs. an xterm because of this.
> But
> this has now got all the characters displaying properly. That is the Greek,
> Hebrew, and Coptic. Whether this is the only, or the absolutely correct
> solution, I'm not sure. But it works! :)
If your phpinfo() shows the default charset as iso-8859-1, and you're outputting
UTF-8, you need
header("Content-type: text/html; charset=utf-8");
near the front of your web page (before any text output, even blank lines).
It's probably also a good idea to put, in the <head> section:
<meta content="text/html; charset=utf-8" http-equiv="Content-type">
I don't know why it needs to be declared TWICE. I think it has to do with
quirky browsers. View the page in Firefox and type ctrl-I. You should
see Encoding: UTF-8, not Encoding: Windows-1252 or Encoding: ISO-8859-1.
Otherwise the browser will probably render utf-8 as Windows-1252
and you get wrong characters (funny accented letters where they are not
expected).
> It still doesn't get the Coptic to appear correctly in MS IE, so I'll have
> to play around with that. The Greek and Hebrew are still OK in MS IE though.
>
> Could it be something to do with the data coming from a Windows source,
> rather than from a Linux based server, I wonder? Or from a remote source
> over the internet, rather than from the same hosting company?
I really doubt this. The networking connection shouldn't affect character
set translations.
> And I had gone through creating a new table, populating it, and including
> your "André". As it happened, the correct encoding appeared.
That says it went *INTO* the table correctly. Your problem is getting
it out.
>
> Very many thanks to all who contributed suggestions. I have learnt quite a
> bit from all of this testing (including spending lots of time messing around
> with different variations on the sql dumps, and importing, etc.)
> All responses very much appreciated.
gordonb.7fb73 (1) | 8/4/2012 7:10:10 AM
On 03-08-2012 19:46, J.O. Aho wrote:
> On 03/08/12 18:14, Iain wrote:
>
>> I have tried several new dumps, with importing them and maybe with one
>> or two tweaks, but still no success.
>
> Make a dump and use --skip-set-charset and see if it works better, there
> are always issues with charsets when export/import, I hope this could
> improve in mysql.
>
> Other options is that you use sed to replace the charset in the dump you
> made.
>
> sed -i 's/latin1/utf8/g' yourdatabasedump.sql
>
> and then import it.
>
Don't do that....
That changes the charset used for creating the table, but not the
character set that was used for exporting the data in that table. On my
Linux system mysqldump ALWAYS generates a UTF-8 file (unless the settings
are changed with some parameters); see below, and note the 'c3 a9', which
is the UTF-8 encoding of é (e with acute accent).
mysql> show create table charsetlatin\G
*************************** 1. row ***************************
Table: charsetlatin
Create Table: CREATE TABLE `charsetlatin` (
`i` int(11) DEFAULT NULL,
`t` varchar(20) CHARACTER SET utf8 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci
1 row in set (0.00 sec)
mysql> show create table charsetutf8\G
*************************** 1. row ***************************
Table: charsetutf8
Create Table: CREATE TABLE `charsetutf8` (
`i` int(11) DEFAULT NULL,
`t` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> select * from charsetlatin;
+------+------+
| i | t |
+------+------+
| 1 | é |
+------+------+
1 row in set (0.00 sec)
mysql> select * from charsetutf8;
+------+------+
| i | t |
+------+------+
| 1 | é |
+------+------+
1 row in set (0.00 sec)
mysql> quit
Bye
~> mysqldump test charsetlatin >charsetlatin.sql
~> mysqldump test charsetutf8 >charsetutf8.sql
~> file charsetlatin.sql
charsetlatin.sql: UTF-8 Unicode text
~> file charsetutf8.sql
charsetutf8.sql: UTF-8 Unicode text
~>
~> hexdump charsetlatin.sql
.....
00000530 41 42 4c 45 20 4b 45 59 53 20 2a 2f 3b 0a 49 4e |ABLE KEYS */;.IN|
00000540 53 45 52 54 20 49 4e 54 4f 20 60 63 68 61 72 73 |SERT INTO `chars|
00000550 65 74 6c 61 74 69 6e 60 20 56 41 4c 55 45 53 20 |etlatin` VALUES |
00000560 28 31 2c 27 c3 a9 27 29 3b 0a 2f 2a 21 34 30 30 |(1,'..');./*!400|
00000570 30 30 20 41 4c 54 45 52 20 54 41 42 4c 45 20 60 |00 ALTER TABLE `|
.....
~> hexdump charsetutf8.sql
.....
00000500 4b 45 59 53 20 2a 2f 3b 0a 49 4e 53 45 52 54 20 |KEYS */;.INSERT |
00000510 49 4e 54 4f 20 60 63 68 61 72 73 65 74 75 74 66 |INTO `charsetutf|
00000520 38 60 20 56 41 4c 55 45 53 20 28 31 2c 27 c3 a9 |8` VALUES (1,'..|
00000530 27 29 3b 0a 2f 2a 21 34 30 30 30 30 20 41 4c 54 |');./*!40000 ALT|
.....
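
The point about the sed edit can be shown on a toy dump. The following is
only a sketch in Python: the two-line dump is a simplified stand-in, not
real mysqldump output, and bytes.replace plays the role of the sed
substitution. It shows that relabelling the table charset leaves the data
bytes in the INSERT statement exactly as they were.

```python
# Sketch: relabelling the charset in a dump does not touch the data bytes.
# The dump below is a simplified, hypothetical stand-in for mysqldump output.
dump = (
    "CREATE TABLE `charsetlatin` (`t` varchar(20)) DEFAULT CHARSET=latin1;\n"
    "INSERT INTO `charsetlatin` VALUES (1,'\u00e9');\n"
).encode("utf-8")  # the dump file itself is UTF-8, as the hexdump shows

# Equivalent of replacing latin1 with utf8 throughout the file.
relabelled = dump.replace(b"latin1", b"utf8")

e_acute_utf8 = b"\xc3\xa9"
print("label changed:  ", b"CHARSET=utf8" in relabelled)
print("data bytes before:", dump.count(e_acute_utf8))
print("data bytes after: ", relabelled.count(e_acute_utf8))
```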
luuk (814) | 8/4/2012 9:52:11 AM