What is Punycode?


Punycode
noun
An encoding that converts words that cannot be written in ASCII, like the Greek word for thank you ‘ευχαριστώ’, into an ASCII-only form, like ‘xn--mxahn5algcq2e’, for use in domain names.
What does this actually mean?!

Writing with numbers

As with all things computers, it all boils down to numbers. Every letter, character, or emoji we type has a unique number associated with it so that our computers can process it. ASCII, a character encoding standard, uses 7 bits to encode 128 characters (numbered 0 to 127), enough for the English alphabet in upper and lower case, the digits 0-9, and some punctuation and control characters. Where ASCII falls down is that it does not support scripts such as Greek, Hebrew, and Arabic. This is where Unicode comes in: it defines a code space of 1,114,112 code points (U+0000 to U+10FFFF), which are stored as bytes using encodings such as UTF-8, UTF-16, or UTF-32. That gives us enough room to support virtually every writing system and even our ever-growing collection of emojis.
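To make the "characters are numbers" idea concrete, here is a minimal Python sketch. The helper names `code_point` and `is_ascii` are made up for this illustration; they simply wrap Python's built-in `ord`.

```python
def code_point(ch: str) -> int:
    """Return the Unicode code point (a plain integer) of one character."""
    return ord(ch)


def is_ascii(text: str) -> bool:
    """True if every character fits in ASCII's 7-bit range (0 to 127)."""
    return all(ord(ch) < 128 for ch in text)


print(code_point("A"))        # 65
print(hex(code_point("ε")))   # 0x3b5
print(is_ascii("thanks"))     # True
print(is_ascii("ευχαριστώ"))  # False
```

The Latin letter "A" sits comfortably inside ASCII, while the Greek "ε" has a code point well beyond 127, which is why ASCII alone cannot represent it.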

So where does Punycode come in?

Punycode is a way of converting words that cannot be written in ASCII into an ASCII-only encoding of their Unicode characters. Why would you want to do this? The global Domain Name System (DNS), the naming system for resources connected to the internet, is in practice limited to ASCII: hostnames use only letters, digits, and hyphens. With Punycode, you can include non-ASCII characters in a domain name. Each part of the name is run through an algorithm called “Bootstring”, which produces an ASCII string, and the prefix ‘xn--’ is added to mark the result as Punycode.
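If you want to try this yourself, Python ships with a built-in "punycode" codec. The sketch below is a simplification under stated assumptions: the function names `to_punycode_label` and `from_punycode_label` are invented for this example, it handles a single label (one dot-separated part of a domain), and it skips the normalisation and validation steps that real internationalised domain name handling performs.

```python
PREFIX = "xn--"  # marks a domain label as Punycode-encoded


def to_punycode_label(label: str) -> str:
    """Encode one domain label; ASCII-only labels are returned unchanged."""
    if label.isascii():
        return label
    return PREFIX + label.encode("punycode").decode("ascii")


def from_punycode_label(label: str) -> str:
    """Decode a label produced by to_punycode_label."""
    if label.startswith(PREFIX):
        return label[len(PREFIX):].encode("ascii").decode("punycode")
    return label


print(to_punycode_label("münchen"))             # xn--mnchen-3ya
print(from_punycode_label("xn--mnchen-3ya"))    # münchen

# The Greek example from the definition, encoded and decoded again.
encoded = to_punycode_label("ευχαριστώ")
print(encoded, from_punycode_label(encoded))
```

Notice that in "münchen" the ASCII letters stay readable at the front, and the position of the "ü" is packed into the suffix after the last hyphen.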
