I am having a lot of trouble changing the encoding of an existing WordPress site. The site was mostly in English (Latin characters), but we now need to include many other languages, such as Thai and Chinese.
I have changed the encoding in the wp-config file from
define( 'DB_CHARSET', 'utf8mb4' );
to
define( 'DB_CHARSET', 'utf8mb4_unicode_520_ci' );
This now allows me to save the foreign characters, and that part is working perfectly 🙂
I have also changed the collation of all tables in the database to utf8mb4_unicode_520_ci.
However, this change is causing issues for the existing English content.
In the English content I now see the question mark in a black diamond in place of many (but not all) spaces, and in place of apostrophes and quotation marks.
The content in the two different formats appears like this:
Using define( 'DB_CHARSET', 'utf8mb4_unicode_520_ci' );
which�lies approximately 50-minutes from central Bangkok, is one of Thailand�s premier golfing destinations. The land at�the members only�</p>
Using define( 'DB_CHARSET', 'utf8mb4' );
which lies approximately 50-minutes from central Bangkok, is one of Thailand’s premier golfing destinations. The land at the members only
The apostrophes and quotation marks are easily fixed with a search and replace on the database. The spaces are still a problem, however, because I don’t know what to search for in the database in order to replace the “different” spaces.
A search for the � character
returns no results, as this character is not actually stored in the database. So what can I do to replace all the “BAD” space characters with normal space characters?
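To illustrate why that search comes back empty, here is a minimal Python sketch. It assumes (this is only my guess, not something I have confirmed) that the problem characters are stored as single Windows-1252 bytes such as 0xA0 for a non-breaking space and 0x92 for a curly apostrophe. The sample bytes are made up for illustration and are not taken from my database.

```python
# Hypothetical sample: 0xA0 and 0x92 are ASSUMED to be the offending bytes.
stored = b"which\xa0lies ... Thailand\x92s premier"

# Decoding as UTF-8 substitutes U+FFFD (the black diamond) for every byte
# that is not valid UTF-8. The substitution happens at display time only.
shown = stored.decode("utf-8", errors="replace")
print(shown)

# U+FFFD itself is EF BF BD in UTF-8; those bytes never occur in the data.
marker = "\ufffd".encode("utf-8")
print(marker in stored)   # False
print("\ufffd" in shown)  # True
```

If that assumption holds, the thing to search for would be the underlying byte rather than the diamond that the browser draws.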
Please note that I have already tried
remove_filter('the_content', 'wptexturize'); remove_filter( 'the_content', 'wpautop' );
and neither resolves this issue.
I have also tried the UTF8 sanitise plugin, but that did not help either.
Lastly, I have tried exporting the entire database, and there appears to be no difference in the .sql file between the normal spaces and the “BAD” spaces. However, even after exporting and reimporting the same .sql file the problem persists, so there must be a difference between the space characters.
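One way to look for a hidden difference would be to inspect the raw bytes of the export instead of the rendered text. The following is a minimal Python sketch of that idea, under the unconfirmed assumption that the “BAD” characters are stray Windows-1252 bytes mixed into otherwise valid UTF-8. The helper names `find_stray_bytes` and `repair` are hypothetical, and the sample bytes are invented for the example.

```python
def find_stray_bytes(raw: bytes) -> list[tuple[int, int]]:
    """Return (character index, byte value) for each byte that is not valid UTF-8."""
    # surrogateescape maps every undecodable byte 0xNN to the code point U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    return [(i, ord(ch) - 0xDC00) for i, ch in enumerate(text)
            if 0xDC80 <= ord(ch) <= 0xDCFF]


def repair(raw: bytes) -> str:
    """Reinterpret stray bytes as Windows-1252 and turn NBSP into a plain space."""
    text = raw.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        if 0xDC80 <= ord(ch) <= 0xDCFF:
            # ASSUMPTION: the stray byte was originally Windows-1252 text.
            ch = bytes([ord(ch) - 0xDC00]).decode("cp1252", errors="replace")
        # Note: this also converts correctly encoded NBSP (U+00A0) to a space.
        out.append(" " if ch == "\u00a0" else ch)
    return "".join(out)


# Hypothetical sample containing one stray 0xA0 and one stray 0x92.
sample = b"The land at\xa0the members only, Thailand\x92s premier"
print([(i, hex(b)) for i, b in find_stray_bytes(sample)])
print(repair(sample))
```

In practice the input would be the bytes of the exported file, for example read with `open(path, "rb").read()`, and any repair would need to be checked against a backup before touching the live database.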
I’m completely stumped.
Thanks for your help 🙂