It is the scourge of just about anyone who places content on the internet, and it confronts email marketers to an even greater degree because the copy we create is often read by subscribers all over the world. It is the character set, and if you are like just about everyone else who has ever been online, you have noticed the weird characters that creep into copy, making quotation marks, apostrophes, and other symbols display as completely the wrong thing and turning your well-designed and thought-out email into a dog’s breakfast. You don’t have to surrender to the vagaries of the character set if you master the art of choosing the proper one for your email body copy.

7 bit ASCII supports 128 characters while 8 bit ISO 8859 has double

In the final analysis, your email is just one long string of 0s and 1s, and the only way it gets displayed in a form that mortal human beings can read is through a mapping that matches bit sequences to the characters in a character set. The most widely used character set in the English speaking world is ASCII, a 7 bit code which allows for 128 characters. These are taken up by the 26 upper case and 26 lower case letters, the ten digits, a range of punctuation and special characters, and 33 non-printing control codes. However, there is a whole family of ISO 8859 character sets which are based on 8 bit sequences and can therefore represent 256 values, double the number in ASCII. The extra room allows for letters and characters specific to various European languages, but not all of them at once, and so the ISO 8859 family has split into more than a dozen different members, each covering a different group of languages.
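To make the mapping idea concrete, here is a minimal Python sketch of how a single byte turns into different characters depending on which character set is used to read it. The byte value 0xA4 and the two ISO 8859 members chosen here are only illustrative picks, not anything specific to a particular email tool.

```python
# One byte, read through three different character sets.
raw = bytes([0xA4])  # the bit sequence 10100100

as_latin1 = raw.decode("iso-8859-1")    # Western European: currency sign
as_latin9 = raw.decode("iso-8859-15")   # revised Western European: euro sign

# ASCII only defines the values 0-127, so this byte has no meaning there.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(2 ** 7, 2 ** 8)        # 128 256
print(as_latin1, as_latin9)  # ¤ €
print(ascii_ok)              # False
```

The same bits produce a different symbol in each member of the family, which is exactly how a price in euros can turn into a stray currency sign when the wrong character set is assumed.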

Unicode began as 16 bit for a total of 65,536 characters

However, even the extensive ISO 8859 groupings are nowhere near enough to cover the full range of characters used in the world’s languages, such as the ones found in Asia. That is why Unicode has become extremely popular in international email marketing. It was originally designed around 16 bit sequences, which allow for 65,536 characters. It may seem hard to believe, but even that turned out to be too few for all the characters in use around the world, so Unicode has since been extended to a range of 1,114,112 code points, which leaves room for virtually every character you are likely to need. Unicode itself maps each character to a number called a code point, and an encoding then translates those numbers into binary data. The most popular encoding of Unicode is UTF-8, which stores each code point as one to four bytes. Still with me? Good.
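The difference between a Unicode number and its UTF-8 bytes is easier to see than to describe. The following is a small Python sketch; the helper name `describe` and the sample characters are made up for illustration.

```python
def describe(ch):
    """Return the Unicode code point and the UTF-8 bytes for one character."""
    code_point = f"U+{ord(ch):04X}"                     # character -> number
    utf8_bytes = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))  # number -> bytes
    return code_point, utf8_bytes

for ch in ["A", "é", "€", "😀"]:
    print(ch, *describe(ch))

# A U+0041 41
# é U+00E9 c3 a9
# € U+20AC e2 82 ac
# 😀 U+1F600 f0 9f 98 80
```

Plain English letters still take a single byte, which is why UTF-8 copy that sticks to ASCII characters looks identical in either character set, while the emoji sits beyond the 65,536 values a 16 bit sequence could hold and needs four bytes.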

Gmail ignores Content Type and converts to UTF-8

Now if you’ve been able to digest all of that binary conversion stuff you may think that’s all just fine and dandy, except that, as with just about everything else in the email marketing world, it’s not quite that simple to implement. Most email clients will take the character set declared in the Content-Type header of your email and display the text the way you have specified. However, the very popular Gmail and its Android, iPhone, and iPad variants all totally ignore your Content-Type and convert your text to UTF-8, whether your email was designed to be displayed in that encoding or not.

How do you get around this? The best way is to convert all of your special characters to their equivalent HTML entities as you enter them into your email. Therefore, instead of entering & you are going to enter &amp; and instead of a closing curly quotation mark you will enter &rdquo; and so on. Fortunately there are HTML entities for just about every symbol you will ever want to display, although some of the more esoteric ones such as the playing card suits are not universally supported, so even though you might use &hearts; &spades; and so on, you might still end up getting a weird and unintended symbol on some email clients. Pay very close attention to character sets or your emails will turn into a total mess.
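If you would rather not convert every symbol by hand, the entity conversion described above can be sketched in a few lines of Python using the standard library’s entity table. The function name `to_entities` is hypothetical, and the sketch assumes it is given plain copy rather than finished HTML, because it also escapes the angle brackets and quotation marks that tags rely on.

```python
from html.entities import codepoint2name

def to_entities(text):
    """Replace special characters in plain copy with HTML entities.

    Assumes `text` is body copy without HTML tags.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in '&<>"':
            out.append(f"&{codepoint2name[cp]};")   # characters HTML treats specially
        elif cp < 128:
            out.append(ch)                          # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")   # named entity, e.g. &rsquo;
        else:
            out.append(f"&#{cp};")                  # numeric fallback
    return "".join(out)

print(to_entities("Tom & Jerry’s ♥"))  # Tom &amp; Jerry&rsquo;s &hearts;
print(to_entities("日"))                # &#26085;
```

The result contains nothing but ASCII characters, so it reads the same whichever character set the email client decides to apply. Whether the client actually renders a given entity is a separate question, as the card suit example shows.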