Converting database from Latin1 to UTF8.

Before going to conversion step I wish to add some information about UTF-8 character set. UTF-8 character is prepared for world domination, but Latin1 isn’t. If you wish to allow users to post their own languages, and if you want users from all countries to participate, then you have to switch the tables to UTF-8 character set. Latin1 covers only ASCII and western European characters. Also no translation needed when importing / exporting data to UTF-8 aware components ( JavaScript, Java, etc). So I hope, now you understand the importance of using UTF-8 character set in your website. Here is the steps for converting database from Latin-1 to UTF-8.

Take database dump with the help of mysqldump command

mysqldump -u $USERNAME -p $DATABASENAME > databasedump.sql

Then use an editor to replace the charset: here I am using VIM editor.

vim databasedump.sql

:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
:wq

Create a new empty database in cPanel

In phpMyAdmin under OPERATIONS change the COLLATION to “utf8-unicode-ci”
Import the database to the new DB

mysql –default-character-set=utf8 -p -u $USERNAME $NEWDB < databasedump.sql

You should now have the old unchanged database; and a new database in utf8. Connect the website to the new db to test. If you get diamond symbols for special characters, e.g. £ signs; then try to force mysql connection in utf8 in the site’s configuration file via:

$mysql_charset=”utf8″;

Leave a Reply

Your email address will not be published.