Number bases: history and possible future.

Number bases: history and possible future. (History in detail. Discuss usage of other bases.)

Gondor2222

Date: Wednesday, 14.08.2013, 00:49 | Message # 1

Space Pilot

Group: Users

United States

Messages: 92

Status: Offline

Warning: this post is VERY long so I've condensed it into a series of spoilers. There's a lot of history in here so if you want to just get to the debate without taking into account ancient linguistic theory feel free to skip to sections relevant to your interests.

Basic explanation if you're not aware of how a positional number system works:

Prehistory of the development of base ::::: and why we call it "ten":

While most people accept that we count in base ::::: because we have that many fingers, I have never heard an argument for nor been able to find literature explaining how one arises from the other. However, I do have a hypothesis:

Counting in quantities of ::::: almost certainly arose before positional number systems, as fingers were the most convenient way to represent numbers before written systems came about. Since we have ::::: fingers, and early humans usually didn't need to count very far above this, most languages assigned a name to each quantity of fingers possible. In other words, they came up with ::::: terms for quantities (though at this stage most considered ( ) dots as not a quantity but merely the absence of quantity). There is evidence for the influence of such a counting system on the names for numbers: reconstructions of Proto-Indo-European (from which all major modern European languages but Finnish, Estonian, and Hungarian are descended; its descendants also include Hittite, Latin, Sanskrit, and Hindi) indicate that the name for "five" was probably the word for "fist" and the name for "ten" was probably the two words for "two hands".

Presumably, early humans interpreted larger numbers than these not as the sum of fingers on one person but as the sum of pairs of hands, and this is why there are no units after (:::::). This means that the number of fingers contained on two full hands is both a unit AND a radix: it can be expressed with either one or two digits. This is probably why the need for a zero was ignored for so long, as what we think of as a multiple of "10" was expressed as a multiple of the units digit "ten". This created a mixed base10-base1 system where both the units and the digit after it go up to "both hands" and the only way to express larger numbers is by reduplicating hands. I.e. any digits to the left of the "hands" digit are in unary, and what we think of as 200 would probably be expressed in this ancient language as "hundred-hundred". 20 would have originally been literally "two tens" rather than two times ten. Etymology for the number 20 in English agrees with this analysis of twenty as "two-tens". Numbers from 11 to 99 would have been expressed as the sum of full pairs of hands (tens) plus non-full pairs of hands (units).

Numbers on the scale of 100 typically aren't needed for specification by early stone age societies (who might have used a term for a large number akin to "zillion" but with a meaning corresponding to what we think of as hundreds or thousands). In fact, the word that came to mean "100" may have originally been this "zillion"-like term before becoming fixed as "ten tens". There is some practical indication of this: although we don't know the original usage of the PIE term (kmtom) or even if it was used as a definite number, its daughter language Proto-Germanic began calling ten tens "kmtom count" in their own language, indicating that the original kmtom was not a definite count to begin with. Etymologically, the PIE word for "100" appears to be some sort of conjugation of "ten", indicating it might have literally meant something similar to "many tens".

My hunts for the etymology of "thousand" were inconclusive: there are reconstructed Proto-Germanic words for "thousand" but their origins are unknown, and the complete incoherence between the oldest attested forms of "thousand" in various languages younger than PIE indicates there likely was no word for "thousand" in that language. This makes more sense if we assume we were right about PIE having no definite word for hundred and instead having a general "large number" term which became a definite term only in its daughter languages. Thousand need not derive from terms for smaller numbers, as the oldest attested ancestor of the Latin word for thousand appears to have been an abstraction derived from the grinding of grain.

The earlier nonpositional systems:

Latin: the long story with the history of each symbol:

Earlier languages tended to use a combination of base-10 and other bases. As described in the prehistory section, the earliest of these was probably a mixed unary-base ::., where counting started with each number below a fist (of ::. fingers) being represented by that actual number rather than an abstract digit, and a second "digit" indicating whether there were one of such hands or two. Latin's smaller numbers exhibit this system: the numbers before a full hand are given in unary (|, ||, |||, and |||| in the initial stages of its development). These strokes originated as tally marks and may either be simplifications to ease notching in rock (straight lines only), representations of the shapes of fingers, or both. ::. was represented with two crossed notches in a ^ shape, probably originally representing a hand, and x was two such double notches, probably originating as a simplified drawing of two hands together. In the original system, the entire system was unary and the extra notches on the ^ and x served merely as placeholders. For example, ::. was marked ||||^. However, since ^ is never marked before notching ||||, ^ eventually came to imply all marks before it, and ::. became just ^. The same happened with the rather copious ||||^||||x becoming just x. In a stroke of genius, the writers of the tally systems apparently realized it would be quicker to write |||| as |^, expressing it as the | before ^, and |||| became |^ and ^|||| became |x. Around this time, the tallies began to be used in the same texts as literary writings, and were quickly either confused or identified with the similar I, V, and X, and became replaced by them. (In the same way a confused stone-age Arabic-numeral society adopting the Latin alphabet might have started to use the numerals "IZ34S67B9".)
The tenth V in the counting system received an extra stroke and became ᗐ. Attempts to make this more quickly identifiable transformed it into ⊥, and then the "we want it to look like our letters" impulse occurred again and it became L. The tenth X also received an extra stroke to become various symbols with the extra stroke in different places (but all vertical), but Ж was most popular. Yet another "we want it to be made of letters" occurred and it became ƆIC, which was abbreviated variously to Ɔ or C, with C of course winning because that was a Latin letter. The hundredth V was circled Ⓥ but became Ð when people attempted to make it faster to write and from there, if you have been paying attention to what happened to the previous letters, obviously became D. Similarly, the hundredth X Ⓧ became ∞ to ease writing, then ⋈ to resemble a letter, and then M. Larger numbers didn't become necessary until after the fall of the western Roman Empire, and so are less important here.

Latin: the short story:

Arabic numerals: origins:

This system can be traced back to the Brahmi numerals. Unlike Roman numerals, only the digits -, =, and ≡ are easily recognizable in Brahmi, and +, which comes after ≡, has only a small number of possible origins. However, due to the cursive script of the language it is nearly impossible to determine anything of the origins of the other symbols without extensive comparisons between languages and large numbers of original texts which are not known to still exist, as several strokes often became a single cursive stroke. This number system may have started as a tally system, but by the time of our first records of it it had already become a hybrid exponential base-10-100 script. That is, it had two sets of nine numerals, the first equivalent to 1 2 3 4 5 6 7 8 9 and the second equivalent to 10 20 30 40 50 60 70 80 90. A number like 78, for example, is written with the unit "70" and then the unit "8". For powers larger than these, each power of 10 had a separate symbol. For example, 4348 is written with the units "4" "1000" "3" "100" "40" "8". The first nine of these symbols were later adopted into a then-new positional system which added a zero and a negative sign (which was originally +, funnily enough) as well as a decimal point, creating a positional system identical to the modern one except for the use of different symbols and the lack of a bar to indicate repeating numbers. This bar was introduced many centuries later by Muslim scholars. The symbols used for the units evolved over time, shown in the preceding image from the second-to-bottom row to the top. When introduced to Europe in the 12th century, Arabic numerals were adopted in a text-like vertical positioning system that gave the numbers "tails" and "necks" like modern English's g's and h's respectively. Centuries later, as typesetting gained speed, attempts to create a uniform system of numerals with all the numerals at the same height and depth became popular and largely replaced these old "tailed and necked" numbers.
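To make the pre-positional scheme above concrete, here is a rough sketch that splits a number into the sequence of units described (a multiplier followed by a power symbol for hundreds and thousands, then a "tens" unit, then a "ones" unit). The function name brahmi_units is made up for illustration, the units are shown as modern numerals rather than Brahmi glyphs, and the handling of exactly one hundred or one thousand (power symbol alone, no multiplier) is an assumption the post does not settle.

```python
def brahmi_units(n: int) -> list[str]:
    """Split n (1..9999) into the units of the hybrid script described above.

    Units are written here as modern numerals, not as Brahmi glyphs.
    """
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1..9999")

    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)

    units: list[str] = []
    for count, power in ((thousands, 1000), (hundreds, 100)):
        if count == 0:
            continue
        # ASSUMPTION: a count of one is written as the power symbol alone.
        if count > 1:
            units.append(str(count))
        units.append(str(power))
    if tens:
        # 10, 20, ..., 90 each have their own symbol (the second set of nine).
        units.append(str(tens * 10))
    if ones:
        # 1..9 each have their own symbol (the first set of nine).
        units.append(str(ones))
    return units


if __name__ == "__main__":
    print(brahmi_units(78))    # ['70', '8']
    print(brahmi_units(4348))  # ['4', '1000', '3', '100', '40', '8']
```

Note that no zero is needed anywhere: an empty place is simply a unit that is not written.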
Originally, Arabic numerals were restricted to the intellectual elite, who learned them alongside Roman numerals and were able to convert between the two, but as more difficult mathematics involving multiplication and division became common in European life, the Arabic numerals spread to different classes, making Roman numerals obsolete for regular mathematics by the 16th century.

The language of numbers and its relevance to the base-debate:

English is currently mostly based in base ::::: (ignoring words like dozen, gross, great gross, and score). It could in its current form be used to express numbers in lower bases by simply skipping certain words, like skipping from eight to ten and from eighteen to twenty if using base (::::.). However, the fact that it has been based in base-::::: for so long means that there are now ambiguities as to the relation between the word, its meaning as a quantity, and its meaning as a representation. For example, I've heard arguments that "10" in hex is not "ten" but "sixteen", and this seems to be the most popular opinion because people seem to insist that a representation is only meaningful in base 10. However, communicating what is written in another base is more efficient if you communicate in the same base as you write, as this eliminates the need for the speaker to translate from the "written base" to the "spoken base". Under such a system, the number after 1F in hex is called "twenty", for example. Although the English words are ultimately based etymologically in base :::::, most people are unaware of these histories and are therefore unlikely to be confused that "ten" in hex does not match its pronunciation as a butchered PIE "two hands". However, people still associate the name with a fundamental quantity in unary (they think of "ten" as not 10 but :::::, an association that makes it difficult to adopt a language in a different base using the same number words.)
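The difference between a quantity and its written representation can be shown in a few lines. This is only an illustrative sketch: written_form is a made-up helper name, and the digit set is limited to bases 2 through 16.

```python
DIGITS = "0123456789ABCDEF"


def written_form(quantity: int, base: int) -> str:
    """Return how a non-negative quantity is written in the given base."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 16")
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    if quantity == 0:
        return "0"
    out = []
    while quantity:
        quantity, remainder = divmod(quantity, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


if __name__ == "__main__":
    # The written form "10" in hex stands for the quantity sixteen.
    print(int("10", 16))                        # 16
    # The number after 1F in hex is written "20".
    print(written_form(int("1F", 16) + 1, 16))  # 20
    # In base nine, the number after 8 is written "10".
    print(written_form(8 + 1, 9))               # 10
```

Reading "20" aloud as "twenty" in the hex case is exactly the "speak in the base you write in" convention argued for above.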
However, the simplest language for communicating in a base above ::::: would recycle all existing words so that it's easier to learn the (now small number of) new words. For example, in a base 12 system with the new units "ah" and "be", pronunciation after nine would go "ah, be, ten, eleven...nineteen, ahteen, beteen, twenty...ninety-be, ahty...bety...bety-be, one-hundred". In this case, "ah" is used instead of the English pronunciation of "A" because "1A" and "18" are homophones or near-homophones under the latter pronunciation. Also "4D" sounds like "40" in systems containing the unit D.
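As a sketch of how that base 12 naming scheme could be generated mechanically, the snippet below names quantities up to the written form "100" (a gross). The function name dozenal_name and the lookup tables are illustrative only; the spellings "ahteen", "beteen", "ahty" and "bety" come from the sequence in the paragraph above, and everything past "one-hundred" is left out because the post does not define it.

```python
ONES = ["zero", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ah", "be"]
# Names for the written forms 10..1B (quantities twelve to twenty-three).
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen", "ahteen", "beteen"]
# Names for the written forms 20, 30, ..., B0 (indices 0 and 1 are unused).
TENS = [None, None, "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety", "ahty", "bety"]


def dozenal_name(quantity: int) -> str:
    """Name a quantity (0..144) the way it would be read aloud in base 12."""
    if not 0 <= quantity <= 144:
        raise ValueError("this sketch only covers 0..144")
    if quantity == 144:
        return "one-hundred"  # written "100" in base 12
    high, low = divmod(quantity, 12)
    if high == 0:
        return ONES[low]
    if high == 1:
        return TEENS[low]
    if low == 0:
        return TENS[high]
    return f"{TENS[high]}-{ONES[low]}"


if __name__ == "__main__":
    for q in (9, 10, 11, 12, 13, 22, 23, 24, 119, 120, 143, 144):
        print(q, dozenal_name(q))
```

Only two new unit words are needed; every other name is recycled from existing English.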

The possible proposals and their merits. In this section, I have expressed those numbers for which base-::::: and the given-base representation refer to different quantities using dots. For example, decimal 6 is represented as ::: in base 4 but 6 in base 7.
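The dot convention can be stated as a small rule: a quantity is drawn as pairs of dots (":") plus one leftover dot (".") when odd, and dots are used only when the quantity's written form in the given base differs from its base-::::: form. The sketch below is one reading of that rule; the names dots, to_base and show are made up for illustration.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def dots(quantity: int) -> str:
    """Draw a quantity as dots: ':' for each pair, '.' for an odd one left over."""
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    return ":" * (quantity // 2) + "." * (quantity % 2)


def to_base(quantity: int, base: int) -> str:
    """Write a non-negative quantity in the given base (2..36)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    if quantity == 0:
        return "0"
    out = []
    while quantity:
        quantity, remainder = divmod(quantity, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def show(quantity: int, base: int) -> str:
    """Use the numeral when both bases agree on it, dots otherwise."""
    written = to_base(quantity, base)
    return written if written == to_base(quantity, 10) else dots(quantity)


if __name__ == "__main__":
    print(show(6, 4))  # six is written "12" in base 4, so dots are used
    print(show(6, 7))  # six is written "6" in base 7 as well, so the numeral stays
    print(dots(10), dots(5), dots(9))
```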

So, from what you guys have read here, what do you think? Questions and discussion on the language of numbers and the theoretical adoption of other bases are welcome in this thread.

Edited by Gondor2222 - Wednesday, 14.08.2013, 00:51

midtskogen

Date: Wednesday, 14.08.2013, 17:37 | Message # 2

Star Engineer

Group: Users

Norway

Messages: 1674

Status: Offline

Quote (Gondor2222)

descendents also include Hittite

Some have argued that Hittite is a sister language of IE, that is, that IE and Hittite share a common ancestor. This is meant to explain some features in which Hittite differs from the rest of IE, such as its lack of the three-gender system. It's a finer point of classification in my opinion, and the gender system in Hittite just serves as an excellent explanation for how genders arose in IE.

Quote (Gondor2222)

reconstructions indicate that the name for "five" was probably the word for "fist"

I don't think that is well supported. I think it's lost to obscurity. It has been proposed that IE penkʷe is the postfix "and" (-kʷe) and a word for thumb, then meaning "and the thumb". A colourful explanation, suggesting that counting arose from naming the individual fingers and last and fifth the thumb. But I doubt it.

Quote (Gondor2222)

the name for "ten" was probably the two words for "two hands"

dekṃ(t) = de-kṃt ("two hands")? It's really tempting to read the de- as "two". And "kṃt" looks very similar to Germanic "hand" which is otherwise unexplained. But this I think belongs to pure speculation. It looks more likely that dekṃt "ten" and kṃtom "hundred" are related (not necessarily excluding the "two hands" hypothesis), but how or even which way doesn't seem entirely clear.

What is pretty certain is that our counting system arose from the number of fingers (otherwise a more practical base would have been chosen), and that the names for the numbers are very, very old and have changed relatively little. They're certainly recognisable if we go back more than 5000 years.

Note that there is another non-IE language in Europe not related to anything else, Basque. It's probably the last surviving remnant of a language group that was widespread in Europe before the IE invasion. It uses a vigesimal counting system, and it has been suggested that Basque has influenced French, Danish and Celtic languages which have a semi-vigesimal system.

Quote (Gondor2222)

its daughter language Proto-Germanic began calling ten tens "kmtom count" in their own language, indicating that the original kmtom was not a definite count to begin with.

I'm not sure what you're saying here, but it's worth noting that "hundred" in Germanic languages meant "120" well into the Middle Ages. It may be an early Germanic invention.

Quote (Gondor2222)

then M. Larger numbers didn't become necessary until after the fall of the western Roman Empire, and so are less important here.

There was a need for larger numerals (than 1000) even during the Republic, for instance in economic matters. There exist a couple of notations: adding lines around the letters or stacking up like this: CCCCIƆƆƆƆ (deciens centena milia, 1,000,000), but I don't have my books here to check what was used in the different periods. I think, though, that such extra notation was frequently omitted, on the assumption that it was understood whether the counting was in singles, thousands or 100,000s.

A discussion of the Babylonian numerals, which were sexagesimal (well, more precisely decimal-sexagesimal perhaps), would fit in here as well. The system survives in our minutes, seconds and angles.

Added (14.08.2013, 20:37)
---------------------------------------------
In my copy of my favourite encyclopedia, Natural History by Pliny, there are many big numbers. Some are written in full, such as "ā turbidō ad lūnam uīciēns centum mīlia stadiōrum" ("from the windy air to the moon two million stades", i.e. 370,000 km). In other places, regular numerals are written with strokes above and/or to the sides to multiply by 1000 and 100,000. But the manuscript has been copied many times and the notation could have been updated in medieval times. So we must look at inscriptions to find certain examples of ancient use. In inscriptions from the Republic I find symbols for 5000, 10,000 and 100,000. 5000 and 10,000 look like an extra D or M stacked outside an inner one, not unlike CCIƆƆ which I already mentioned. So such symbols did exist. The 100,000 symbol (a circle with a vertical bar and two V's inside) was on an inscription dated as early as c. 260 BC.

NIL DIFFICILE VOLENTI

Edited by midtskogen - Wednesday, 14.08.2013, 19:44
