## Numbering Systems

Before you can understand how the rest of the coding system used in DEMCS™ works and the implications of the decisions made when designing it, it is important to understand some of the basics of how numbering systems work.

### Decimal (base-10)

The numbering system most people are familiar with is based on ten digits (0-9). Mathematicians call this the base-10 numbering system. Each character can be one of 10 different symbols, so a one-character number can represent 10 different values. If you add another character, you can represent 10 times that original set of 10 values, or 100 different values (00-99). Every additional character multiplies the number of possible values by 10. Expressed with exponents, the number of different possible values is equal to 10^{n}, where n is the number of characters.

### Binary (base-2)

The next best-known numbering system is the binary system, which uses just ones and zeros. Computers use this system because it is much easier for electronics to distinguish between two different states than between 10 different states. The binary system is also known as the base-2 number system. There are 2 different values for each character. The number of possible values is calculated the same way as for base-10 numbers. That is, the number of different values is equal to 2^{n}, where n is the number of characters.
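The formula shared by the decimal and binary systems can be illustrated with a short Python sketch. The function name `capacity` is chosen here for illustration only and is not part of DEMCS™.

```python
def capacity(base: int, n_chars: int) -> int:
    """Return how many distinct values n_chars characters can represent.

    Each character has `base` possible symbols, so the total is base ** n_chars.
    """
    return base ** n_chars


# Compare decimal and binary for a few lengths.
for n in (1, 2, 3, 8):
    print(f"{n} characters: base-10 = {capacity(10, n)}, base-2 = {capacity(2, n)}")
```

Two decimal characters give 100 values, while two binary characters give only 4, which is why binary numbers need many more characters to cover the same range.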

### Hexadecimal (base-16)

Another, less well-known system is the base-16 system, otherwise known as the hexadecimal system. It expresses 16 different values per character by using the numerals 0-9 plus the letters A-F. This system is really only familiar to computer scientists and programmers. It is often used because it is easy to translate directly between binary and hexadecimal (each hexadecimal character corresponds to exactly four binary characters, since 16 = 2^{4}) and because hexadecimal requires fewer characters to represent the same number of different values.
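As a rough illustration of why translating between binary and hexadecimal is easy, the following Python sketch converts one character group at a time, relying on the fact that one hexadecimal character covers exactly four binary characters. The helper names `hex_to_binary` and `binary_to_hex` are made up for this example.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Translate each hexadecimal character into its four-character binary group."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


def binary_to_hex(bits: str) -> str:
    """Translate binary to hexadecimal, four binary characters at a time."""
    # Pad on the left with zeros so the length is a multiple of four.
    padded_length = -(-len(bits) // 4) * 4
    padded = bits.zfill(padded_length)
    groups = (padded[i:i + 4] for i in range(0, len(padded), 4))
    return "".join(format(int(group, 2), "X") for group in groups)


print(hex_to_binary("2F"))      # two hexadecimal characters become eight binary ones
print(binary_to_hex("101111"))  # the same value, going the other way
```

No arithmetic on the whole number is needed in either direction; each group is translated independently, which is what makes the two systems convenient to use together.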

### Alphanumeric (base-36)

There is another "numbering system" that people use all the time without even thinking of it. Numbers of this type aren't usually thought of as actual "numbers" even though we often call them numbers. These are commonly called serial numbers or model numbers. These numbers use a combination of the numerals 0-9 and all of the letters A-Z. Another way to think about these "numbers" is to call them base-36 numbers. Each character can represent one of 36 different possible values - the 10 numerals plus the 26 letters of the alphabet. If hexadecimal can represent more different values in the same number of characters than binary, then imagine how many more different possible values can be represented by a base-36 number. Using the same formula as before, a base-36 number can represent 36^{n} different values where n is the number of characters.

This gives base-36 numbers far more capacity than regular base-10 numbers. Five characters can represent 10^{5} or 100,000 different values in base-10, which seems like quite a lot. However, those same five characters can represent 36^{5} or 60,466,176 different values in base-36. The real benefit comes not from representing gigantic numbers but from representing a normal-sized number with far fewer characters. Remember that the goal of this classification system is to create a distinct code for each branch in the tree of knowledge that comprises educational material. The codes for the sub-subjects under any specific subject only need to be big enough to distinguish one child-subject from its siblings.
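To make the base-36 idea concrete, here is a minimal Python sketch that converts an ordinary integer to base-36 characters and back. The `to_base36` helper and the `DIGITS` constant are illustrative names, not part of DEMCS™; the reverse direction uses Python's built-in `int(text, 36)`.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 10 numerals + 26 letters


def to_base36(value: int) -> str:
    """Write a non-negative integer using the 36 symbols in DIGITS."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    chars = []
    while value:
        value, remainder = divmod(value, 36)
        chars.append(DIGITS[remainder])
    return "".join(reversed(chars))


# The largest five-character value in each system.
largest_base10 = 10 ** 5 - 1   # 99,999
largest_base36 = 36 ** 5 - 1   # 60,466,175
print(to_base36(largest_base10))
print(to_base36(largest_base36))
print(int("ZZZZZ", 36))        # built-in conversion back to an integer
```

The largest five-character base-36 value, `ZZZZZ`, is 60,466,175, which together with zero accounts for the 60,466,176 different values mentioned above.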

### Why DEMCS™ uses base-36 codes

In the realm of educational material, there are typically only about 12 to 20 different child-subjects per parent subject. Expressed in base-10, this requires two characters per level in the hierarchical tree, whereas a single base-36 character can distinguish up to 36 siblings. If the tree had 20 levels, base-10 codes would use 40 characters to specify a topic from the top level all the way down to its individual branch. Since these code numbers are also the names of the folders in which the content is stored, the path also needs 20 slashes, for a total of 60 characters. Using base-36 code numbers eliminates 20 of those characters: the code shrinks to 20 characters, which corresponds to a 40-character path name once the slashes are included. This leaves much more room for the names of the individual files that comprise the content of the topics. While 20 characters may not seem like a lot, remember that this is just an example, and many topics may be buried much more than 20 levels deep in the tree. In addition, some file systems, especially those used on CDs, limit the total length of a path name. Finally, keeping the code numbers as short as possible makes it easier for people to copy them down if necessary.
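The arithmetic in this example can be checked with a short Python sketch. It assumes, as in the example above, one slash per level and the same number of code characters at every level; the function names are hypothetical.

```python
def chars_needed(siblings: int, base: int) -> int:
    """Smallest number of characters that can distinguish `siblings` values."""
    n = 1
    while base ** n < siblings:
        n += 1
    return n


def path_length(levels: int, siblings: int, base: int) -> int:
    """Total path length: code characters plus one slash for each level."""
    return levels * (chars_needed(siblings, base) + 1)


levels, siblings = 20, 20  # figures taken from the example in the text
print("base-10 path length:", path_length(levels, siblings, 10))
print("base-36 path length:", path_length(levels, siblings, 36))
```

With 20 levels and up to 20 siblings per level, this gives 60 characters for base-10 and 40 for base-36, and the saving grows by one character for every additional level in the tree.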
