- Written by Tech Notes
- Published: 26 April 2014
Number of characters in a Domain Name, Labels, TLD, and Hostnames
I started this article more than a year ago, in 2014, and still have not gotten a chance to complete it. I've gone ahead and posted what has been done for now, and it will get updated later.
Intro
There seems to be serious confusion on the Internet about how many characters you can have in a domain name or in a domain name label. I say label because that is what each section of a domain name is technically called. For example, TechNotes.whw1.com has 3 labels. The Hostname TechNotes is a label, and I purposely used the name Hostname to bring your attention to its usage, since people are confused about that too. WHW1 is a label, and so is com (which, by the way, stands for commercial or company).
Now, a moment for an Off Topic subject, but slightly related.
Hostname: As I recall, a Hostname is not the same thing as a host name, or host, but it can be used for those as well. A Hostname is a technical name, and it is typically used to specifically reference the label or subdomain directly to the left of the second-level domain. The www of www.whw1.com could be a Hostname for a world wide web server, with the second-level domain being the whw1 of the domain name whw1.com. A more relaxed usage of Hostname now references the leftmost label of any domain name, so it does not need to be just the 3rd label and can be a label at a deeper level.
Note that in the RFC documents the terms Domain Name and Domain are often used interchangeably, and so TechNotes.whw1.com would be considered a Domain Name as well.
When you say Domain, the word can refer to something with a single label or multiple labels. Thus com is a domain all by itself. It is in fact a Top Level Domain.
In this case, the Top Level Domain (TLD) is com (it can be net, info, tv, biz, us, ca, museum, nyc, sf, localhost, and many more from around the world, both public and private), and the separator is a period (.), which is used between each label.
By various industry standards, and in common usage, when you say "Domain Name" it means the TLD and the second-level domain (the label directly to the left of the TLD) together. WHW1.com is a Domain Name, but again, in the RFCs and technically speaking, a Domain Name can reference any number of labels that make up its domain.
Might as well clarify about sub-domains. WHW1 is the subdomain of .com, and TechNotes is a subdomain of WHW1.com, and there can still be even more subdomains (labels) separated by dots.
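To make the label terminology concrete, here is a minimal Python sketch that splits a dotted name into its labels and lists each domain it contains. The function name split_labels is my own for illustration and is not taken from any RFC or library.

```python
def split_labels(name: str) -> list[str]:
    """Split a dotted domain name into its labels, left to right."""
    # A single trailing dot stands for the root; drop it before splitting.
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


labels = split_labels("TechNotes.whw1.com")
print(labels)       # ['TechNotes', 'whw1', 'com']
print(len(labels))  # 3

# Every suffix of the label list is itself a domain:
# TechNotes.whw1.com, then whw1.com, then com (the TLD).
for i in range(len(labels)):
    print(".".join(labels[i:]))
```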
Resolving the Argument About Whether a Domain Name Is Allowed 253 or 255 Characters
Not only is it difficult to get a clear answer on the question of how many characters a domain name is allowed to have, but a lot of incorrect information is floating around the Internet. In addition to that, the "official" docs defining this are confusing to most, and especially to those who are new to DNS, protocols, Internet function, or domains.
Let's start by going directly to the sources, in numerical order.
A) http://www.ietf.org/rfc/rfc1035.txt : "To simplify implementations, the total length of a domain name (i.e., label octets and label length octets) is restricted to 255 octets or less."
B) http://www.ietf.org/rfc/rfc1123.txt : "Host software MUST handle host names of up to 63 characters and SHOULD handle host names of up to 255 characters."
C) http://tools.ietf.org/rfc/rfc2181.txt : "A full domain name is limited to 255 octets (including the separators)."
D) http://www.ietf.org/rfc/rfc4408.txt : "When the result of macro expansion is used in a domain name query, if the expanded domain name exceeds 253 characters (the maximum length of a domain name), the left side is truncated to fit, by removing successive domain labels until the total length does not exceed 253 characters."
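The truncation described in quote D can be sketched in a few lines of Python. This is only an illustration of the rule as quoted, not an SPF implementation; the function name truncate_left and the choice to always keep at least one label are my own assumptions.

```python
def truncate_left(name: str, limit: int = 253) -> str:
    """Drop labels from the left until the dotted name fits within limit."""
    labels = name.split(".")
    # Assumption: always keep at least one label, even if it is too long.
    while len(".".join(labels)) > limit and len(labels) > 1:
        labels.pop(0)
    return ".".join(labels)


# Five labels of 63 characters plus 4 dots is 319 characters.
long_name = ".".join(["a" * 63] * 5)
short_name = truncate_left(long_name)
print(len(long_name))   # 319
print(len(short_name))  # 191 (three labels of 63 characters plus 2 dots)
```

Note that four such labels would still be 255 characters, which is why two labels are removed in this case and not one.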
So which is it, 253, or 255, and what about the 63?
After all, some places like Wikipedia say the "full domain name may not exceed a total length of 253 ASCII characters in its textual representation": http://en.wikipedia.org/wiki/Domain_name
(Pause)
I am going to pause for a moment to cover the meaning of Octets, as this gets people confused too, and it is relevant to understanding what is explained later. Octets in this case mean bytes, and each ASCII character, as used in a domain name, is stored in a single octet (single byte). Each byte is made up of 8 bits, and so it is an Octet of bits. If you know nothing about a bit, then, for now, you can think of it as a single-digit number that can only be 1 or 0, like a switch being On or Off, and the location/position/order of the digit can have different meanings. Octet comes from the Latin and Greek root octo, which means 8. A set of 8 bits can define a number from 0 (zero) to 255, and so a total of 256 possible combinations (since zero is counted too). So a single byte (8 bits, or octet in this context) alone could indicate any one of 256 characters, or it can be used to represent a maximum numerical value of 255. Of course, human language character sets like Latin have far fewer than 256 characters. Do not confuse this number of 256, or the highest possible value of 255, with the 255 being discussed for the number of characters allowed in a domain name.
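The arithmetic above is easy to check in Python. This is just a small sketch; the helper name max_value is made up for the illustration.

```python
def max_value(bits: int) -> int:
    """Largest unsigned number that fits in the given number of bits."""
    return 2 ** bits - 1


print(2 ** 8)              # 256 possible combinations in one octet
print(max_value(8))        # 255, the highest value one octet can hold
print(format(255, "08b"))  # 11111111, all 8 bits switched On

# One ASCII character takes exactly one octet.
print(len("technotes".encode("ascii")))  # 9
```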
(Continue)
The question at hand is, are there 253 bytes/Octets/characters allowed in a Domain Name or are the allowed characters 255, and again where does the 63 number fit into all this?
The Answer
The answer has to do with understanding the communication protocol for domain information versus the characters displayed to and read by humans. For example, when a domain name of maximum size is communicated from server to server, it uses 255 bytes (octets) to transfer this information. To transmit and receive this information in an understandable form, and to know how long each part of the domain name is, the FIRST BYTE (first octet), which is referred to as a Label Length Octet (note that such a byte exists per label), indicates the length of the first label, and the LAST BYTE (last octet) indicates the end of the domain name. Every dot between labels in the written form takes the place of the Label Length Octet of the label that follows it, so the dots do not add any extra bytes. So a domain name with 253 readable characters (dots included, at 1 byte per character) is transmitted as 255 octets: the 253, plus the first Label Length Octet, plus the ending octet. By the way, the last byte, which indicates the end of the domain, is the indication of the root domain, also referred to as a Null Label. It is represented by a period at the end of every domain name, but writing convention has left the ending period out; this means that "technotes.whw1.com" is truly "technotes.whw1.com.", but as an accepted convention we just don't mention the ending root domain referenced by the last dot.
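The relationship between the readable length and the transmitted length can be checked with a short Python sketch. It assumes plain ASCII labels, and the function name wire_length is my own, not something defined by the RFCs.

```python
def wire_length(name: str) -> int:
    """Octets needed to transmit a dotted ASCII name without compression."""
    if name.endswith("."):
        name = name[:-1]  # the written root dot is counted as the final octet
    labels = name.split(".")
    # One Label Length Octet per label, plus the label characters,
    # plus one final zero octet for the root (Null Label).
    return sum(1 + len(label) for label in labels) + 1


short = "technotes.whw1.com"
print(len(short), wire_length(short))  # 18 20

# A maximum-size name: three labels of 63 characters and one of 61.
longest = ".".join(["a" * 63, "b" * 63, "c" * 63, "d" * 61])
print(len(longest), wire_length(longest))  # 253 255
```

In both cases the transmitted form is exactly 2 octets longer than the readable form written without the trailing dot.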
This brings us to another supporting portion of RFC 1035, where it is stated that "each label is represented as a one octet length field followed by that number of octets. Since every domain name ends with the null label of the root, a domain name is terminated by a length byte of zero. The high order two bits of every length octet must be zero, and the remaining six bits of the length field limit the label to 63 octets or less."
Basically, what this part of the RFC says is that in order to transmit the size and content of a domain label, 1 byte (a label length byte/octet) is consumed, plus the number of bytes for the label characters themselves. Since that one label length byte can only represent a maximum of 63 (because only 6 of its 8 bits are used), each label is limited to 63 characters.
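Here is a small Python sketch of that 63 limit. The constant and function names are mine, chosen for the illustration, and the sketch assumes ASCII labels.

```python
MAX_LABEL_OCTETS = 0b00111111  # six usable bits, all set: 63


def length_octet(label: str) -> int:
    """Return the Label Length Octet value for one ASCII label."""
    size = len(label.encode("ascii"))
    if not 1 <= size <= MAX_LABEL_OCTETS:
        raise ValueError(f"label must be 1 to 63 octets, got {size}")
    # The high order two bits of a valid length octet are always zero.
    assert size & 0b11000000 == 0
    return size


print(MAX_LABEL_OCTETS)                       # 63
print(format(length_octet("a" * 63), "08b"))  # 00111111
print(length_octet("technotes"))              # 9
```

A label of 64 characters would need the seventh bit (01000000), which the quoted rule does not allow, so the sketch raises an error for it.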
So, in order to have the above-mentioned maximum domain size of 253 characters transmitted as 255 octets, what is needed? First, let's visualize the Label Length Octets as dots, so that any domain would start with such a dot and end with a dot for the root domain. The layout is easiest to see with a short domain such as technotes.whw1.com, shown here one octet per line, with each Label Length Octet written as its number:
Example
9 |
't' |
'e' |
'c' |
'h' |
'n' |
'o' |
't' |
'e' |
's' |
4 |
'w' |
'h' |
'w' |
'1' |
3 |
'c' |
'o' |
'm' |
0 |
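The same octet sequence can be produced with a minimal Python sketch. It assumes plain ASCII labels and no message compression, and the function name encode_name is my own, not an API from any DNS library.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted ASCII name as length-prefixed labels plus a zero octet."""
    if name.endswith("."):
        name = name[:-1]  # the root is added below as the zero octet
    out = bytearray()
    for label in name.split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label must be 1 to 63 octets: {label!r}")
        out.append(len(raw))  # Label Length Octet
        out += raw            # the label characters, 1 octet each
    out.append(0)             # Null Label of the root ends the name
    if len(out) > 255:
        raise ValueError("domain name exceeds 255 octets")
    return bytes(out)


encoded = encode_name("technotes.whw1.com")
print(encoded)       # b'\ttechnotes\x04whw1\x03com\x00'
print(len(encoded))  # 20
```

The \t at the start of the printed bytes is simply how Python displays the value 9, the length of "technotes".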
THIS ARTICLE IS IN THE PROCESS OF BEING COMPLETED. PLEASE COME BACK TO SEE ITS COMPLETED FORM. IT SHOULD BE COMPLETED SOON.