Over the past few months, our customers have sent us many questions
about Internationalized Domain Names. This overview collects the most
frequently asked questions and their answers. It will be updated regularly
to reflect further developments in this area.
Internationalized Domain Names (IDN) are domain names
that contain »special characters« such as the German
umlaut characters (»ä«, »ö« and »ü«).
What looks normal to the inhabitants of one region can be nothing
but »hieroglyphics« to other people, since these local language characters
do not exist in their alphabet.
Prior to the introduction of IDN, the only characters allowed in domain names
were the letters from »a« to »z«, digits from »0« to »9« and the hyphen.
This was primarily due to historical reasons.
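The classic rule described above can be sketched as a small check. This is only an illustration: the function name `is_classic_label` is made up for this example, and the 63-character limit and the ban on a leading or trailing hyphen are assumptions taken from common DNS practice, not from this FAQ.

```python
import re

# Letters a-z, digits 0-9 and the hyphen, as described above.
# Assumptions (not stated in the text): at most 63 characters per label,
# and no hyphen at the start or end of a label.
# re.ASCII keeps the case-insensitive match restricted to ASCII letters.
CLASSIC_LABEL = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)", re.ASCII | re.IGNORECASE)


def is_classic_label(label):
    """Return True if the label uses only the pre-IDN character set."""
    return CLASSIC_LABEL.fullmatch(label) is not None


print(is_classic_label("mueller-bier"))  # True
print(is_classic_label("müller-bier"))   # False
```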
More and more registries are now expanding the set of characters allowed in
the names of their domains. The type and number of additional characters vary
widely from registry to registry, however.
We have summarized the IDN expansions for the different top level domains
for you in an overview.
It is already possible to enter local language characters into your browser.
But the Domain Name System, i.e. the entire underlying infrastructure, does
not know them and therefore cannot handle them. Such characters may be ignored
or may even cause errors.
Until now, a German company called »Müller-Bier« would have had to register
the domain name »mueller-bier.de«. This alternative notation for the missing
umlaut characters has repeatedly led to legal disputes, because trademarks are
always registered with the original spelling containing the umlaut »ü«.
In other language areas it is not even possible to replace local
language characters with a combination of other characters or to
leave out accents, because the meaning of the word would be
different. The Turkish word »Sürat«, for example, means speed while »Surat«
means face.
Users of Asian scripts do not even have the option of
replacing their characters and are therefore limited in their use of
the internet.
The vast majority of internet users, roughly three quarters, come from
non-English-speaking regions. In future, these users should be
able to register domain names in their native language.
Local language characters can be used from the second level of a domain name
onwards. In »köln.de« the character »ö« is located to the left
of the dot, i.e. on the second level. The endings, so-called top level domains
(»de« in our example),
will not be internationalized in the foreseeable future.
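To make the notion of levels concrete, the following minimal sketch splits a domain name at the dots and reports which levels contain local language characters. The helper name `non_ascii_labels` is hypothetical and chosen only for this illustration.

```python
def non_ascii_labels(domain):
    """Return (level, label) pairs for labels with non-ASCII characters.

    Level 1 is the top level domain (the rightmost label), level 2 is the
    label to its left, and so on.
    """
    labels = domain.split(".")
    result = []
    for level, label in enumerate(reversed(labels), start=1):
        if not label.isascii():
            result.append((level, label))
    return result


print(non_ascii_labels("köln.de"))      # [(2, 'köln')]
print(non_ascii_labels("www.köln.de"))  # [(2, 'köln')]
print(non_ascii_labels("example.de"))   # []
```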
Domain names are not only needed for browsing the internet, but
also for many other services and programs, e.g. e-mail,
name servers, etc. The general internet user does not notice those services,
which work in the background, but they are of vital importance.
Adapting all of these services and programs is complex. The internet
community has therefore agreed on a technical method that has
disadvantages in some areas but is less difficult to implement
than the alternatives.
The Unicode Consortium
lists all local language characters existing worldwide. Depending on how
they are counted, there are about 70,000 different characters. Characters
can be letters or whole words represented by a single character (e.g. Chinese
characters). If all existing characters are to be usable, e.g. Arabic ones,
the fact that some languages are written from right to left must be taken
into account. This problem should not be underestimated.
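As a rough illustration of the right-to-left issue, Python's standard `unicodedata` module exposes the writing direction class that Unicode assigns to each character. The function name `has_right_to_left` is an assumption made for this sketch, and real IDN software applies considerably more detailed rules than this simple test.

```python
import unicodedata


def has_right_to_left(text):
    """Return True if any character belongs to a right-to-left class."""
    # "R" (e.g. Hebrew letters) and "AL" (Arabic letters) are the Unicode
    # bidirectional classes for strongly right-to-left characters.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


arabic_word = "\u0645\u0635\u0631"  # three Arabic letters (meem, sad, reh)

print(has_right_to_left("köln"))       # False
print(has_right_to_left(arabic_word))  # True
```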
Many characters are used in different alphabets. This can sometimes confuse
internet users. For example, a well-known American ISP has registered the
domain name »AOL.COM« (often written in capital letters).
Certain Greek letters look exactly the same but have a different encoding
in Unicode, so a different domain name would be registered.
However, users are likely to mix them up.
Of course the danger of abuse through copying websites is great. If
the website is a shop where customers can pay by credit card, the
copy may be used to collect card numbers.
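The look-alike problem can be made visible by printing the Unicode code points behind two strings that appear identical on screen. The sketch below is only an illustration: the look-alike string is constructed for this example, and `scripts_used` is a crude, hypothetical heuristic based on Unicode character names, not a method used by any registry.

```python
import unicodedata

latin = "AOL"
# Constructed look-alike: Greek capital Alpha and Omicron, Latin capital L.
lookalike = "\u0391\u039fL"

for a, b in zip(latin, lookalike):
    print(f"U+{ord(a):04X} {unicodedata.name(a)}  vs  "
          f"U+{ord(b):04X} {unicodedata.name(b)}")

print(latin == lookalike)  # False


def scripts_used(label):
    """Crude heuristic: first word of each letter's Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}


print(sorted(scripts_used(latin)))      # ['LATIN']
print(sorted(scripts_used(lookalike)))  # ['GREEK', 'LATIN']
```

A label that mixes several scripts in this way is a possible warning sign, although many legitimate names also combine scripts.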
In Asian languages it is common that a word can be represented by
different characters (»symbols«). Strictly speaking, all different »symbols«
would have to be registered. This would make it very expensive for a registrant
to register some domains, because there can be up to 20 different symbols for
the same name.
The registries often choose to block the registration of other variants
of a domain name when one variant is being registered. If desired, the
registrant can then additionally register the other variants, but no one else
is allowed to do so.
Mozilla 1.4, Netscape 7.1 and Opera 7.2 support IDN.
Plug-ins are available for Microsoft's IE to make it IDN-aware
(see technical background).
Your browser transforms the characters of the internationalized domain name
into a new character string. The converted string consists only of characters
from a to z, digits, and possibly hyphens. This clever encoding makes it
possible to carry out most of the necessary adaptations on the
client side (i.e. in end-user software such as browsers, e-mail programs, etc.).
The new encoding is called Punycode, also known as AMC-ACE-Z.
The domain name »Müller-Bier.de« is encoded as »xn--mller-bier-9db.de«.
All domain names encoded this way begin with the character string »xn--«.
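The conversion above can be reproduced with Python's built-in `idna` and `punycode` codecs. This is a minimal sketch, not the code behind any particular browser; the helper names `to_ace` and `from_ace` are made up for this example.

```python
def to_ace(domain):
    """Convert a domain name to its ASCII-compatible »xn--« form."""
    # The built-in "idna" codec prepares each label, encodes it with
    # Punycode where needed and adds the »xn--« prefix.
    return domain.encode("idna").decode("ascii")


def from_ace(domain):
    """Convert an ASCII-compatible domain name back to readable form."""
    return domain.encode("ascii").decode("idna")


print(to_ace("müller-bier.de"))           # xn--mller-bier-9db.de
print(from_ace("xn--mller-bier-9db.de"))  # müller-bier.de

# The raw Punycode step for a single label, without the prefix:
print("müller-bier".encode("punycode"))   # b'mller-bier-9db'
```

Names that consist only of the classic characters pass through unchanged, which is why existing domain names are not affected by the encoding.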
You can try out conversions yourself with
our Punycode conversion tool.
Before the engineers agreed on Punycode, a large registry started a rather
questionable test using the so-called RACE encoding, in which all encoded
domain names start with »bq--«.
It was obvious from the beginning that these domain name encodings would be
useless once all parties had come to a final agreement. But since business
interests dominated the actions of this registry, the inadequate encoding was
sold despite these known facts.
When IDN is introduced depends on the registry responsible for
each top level domain.
For .com and .net domains, IDN registrations have been possible since
December 14th, 2003. With DENIC, the registry for .de domains,
Internationalized Domain Names have been possible since March 1st, 2004,
while Afilias (.info domains) has allowed umlauts since March 16th, 2004.
You can find a timetable containing the IDN-expansions for the different
top level domains - if available - in the
following overview.
We set up a pre-registration system for our customers in the run-up
to the IDN introduction for .de and .info domains.
It was not possible to guarantee the registration of the domain names submitted
to the system. The chance of an assignment was substantially increased,
however, since the system automatically sent the collected requests to the
registry directly after its opening. Submitting a domain request to our queue
was free of charge. Our customers only had to pay for successfully registered
domains.
We are planning to offer pre-registration systems like these to our customers
in the future as well, whether additional characters become possible for
existing top level domains or new top level domains are introduced
(like »eu«).
For other registries, important information (such as the number of allowed
characters) is still missing. No reputable registrar can accept
pre-registration requests for domains of such a registry at this point in time.