3883 entries. Last updated June 19, 2013.

Linguistics / Translation / Speech Timeline

2,500,000 BCE – 8,000 BCE

Evidence for the Origin of Language in Southwestern Africa Circa 150,000 BCE – 50,000 BCE

Map showing origin and spread of language from southern Africa. Graphic from the journal Science and the New York Times.

Quentin D. Atkinson of the University of Auckland, New Zealand published "Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa," Science 15 April 2011: Vol. 332 no. 6027 pp. 346-349 DOI: 10.1126/science.1199295.

"Human genetic and phenotypic diversity declines with distance from Africa, as predicted by a serial founder effect in which successive population bottlenecks during range expansion progressively reduce diversity, underpinning support for an African origin of modern humans. Recent work suggests that a similar founder effect may operate on human culture and language. Here I show that the number of phonemes used in a global sample of 504 languages is also clinal and fits a serial founder–effect model of expansion from an inferred origin in Africa. This result, which is not explained by more recent demographic history, local language diversity, or statistical non-independence within language families, points to parallel mechanisms shaping genetic and linguistic diversity and supports an African origin of modern human languages" (Abstract)

"The detection of such an ancient signal in language is surprising. Because words change so rapidly, many linguists think that languages cannot be traced very far back in time. The oldest language tree so far reconstructed, that of the Indo-European family, which includes English, goes back 9,000 years at most.

"Quentin D. Atkinson, a biologist at the University of Auckland in New Zealand, has shattered this time barrier, if his claim is correct, by looking not at words but at phonemes — the consonants, vowels and tones that are the simplest elements of language. He has found a simple but striking pattern in some 500 languages spoken throughout the world: a language area uses fewer phonemes the farther that early humans had to travel from Africa to reach it.  

"Some of the click-using languages of Africa have more than 100 phonemes, whereas Hawaiian, toward the far end of the human migration route out of Africa, has only 13. English has 45 phonemes" (http://www.nytimes.com/2011/04/15/science/15language.html?hp, accessed 04-15-2011).

8,000 BCE – 1,000 BCE

One of the Earliest Surviving Examples of Narrative Relief Sculpture and Egyptian Hieroglyphs Circa 3,200 BCE

The Narmer Palette, one of the earliest surviving examples of narrative relief sculpture, was found during excavations at Nekhen (Greek: Ἱεράκων πόλις 'city of hawks', Strabo xvii. p. 817, transliterated as Hierakonpolis, Hieraconpolis, or Hieracompolis; Arabic: الكوم الأحمر‎ Al-Kom Al-Aħmar) in the 1890s. It is also one of the earliest surviving records of Egyptian hieroglyphs.

The Narmer Palette is preserved in the Museum of Egyptian Antiquities (Egyptian Museum), Cairo.

The Earliest Autograph Signatures Circa 3,100 BCE

A pictographic list of titles and professions in ancient Sumer (top), with the scribe's signature on the reverse side (bottom).

Pictographic lexical lists written in ancient Sumerian pictographic script on clay tablets are the earliest literature known, and also the earliest known evidence of school and learning.

An example preserved in the Schøyen Collection (MS 2429/4) is a lexical list of 41 titles and professions, starting: Nam Gist Sita (Lord of the Mace), signed by the scribe Gar.Ama.

The scribal signatures on this tablet and other lexical lists are the earliest autograph signatures extant.

The Palace Archive of Ebla, Syria 2,500 BCE – 2,250 BCE

Ebla Tablet

Ebla tablets in situ.

Distribution of tablets on room shelves.

Between 1974 and 1975 Italian archaeologist Paolo Matthiae from the University of Rome La Sapienza and his team discovered up to 1,800 cuneiform tablets, 4,700 fragments, and many thousands of minor chips, representing the palace archives of the ancient city of Ebla, Syria. The city of Ebla, long known from Egyptian and Akkadian inscriptions, had been discovered by Matthiae in 1968.

Collectively, the tablets discovered at Ebla have come to be known as the Ebla tablets. Found in situ on collapsed shelves, the tablets retained many of their contemporary clay tags, by which their original users could reference them.

"About 80% of the tablets are written using the usual Sumerian combination of logograms and phonetic signs, while the others exhibited an innovative, purely phonetic representation using Sumerian cuneiform of a previously unknown Semitic language, which was called Eblaite. Bilingual Sumerian/Eblaite vocabulary lists were found among the tablets, allowing them to be translated. Giovanni Pettinato and Mitchell Dahood believed the Eblaite language was West Semitic, however I. J. Gelb and others believed it was an East Semitic dialect, closer to the Akkadian language. Now it is commonly accepted that Eblaite is part of the East Semitic branch of Semitic, and very close to the Akkadian language."

"It now appears that the building housing the tablets was not the palace library, which may yet be uncovered, but an archive of provisions and tribute, law cases and diplomatic and trade contacts, and a scriptorium where apprentices copied texts. The larger tablets had originally been stored on shelves, but had fallen onto the floor when the palace was destroyed. The location where tablets were discovered where they had fallen allowed the excavators to reconstruct their original position on the shelves: it soon appeared that they were originally shelved according to subject" (Wikipedia article on Ebla, accessed 01-12-2013).

The Ebla tablets are preserved in Syrian museums in Aleppo, Damascus, and Idlib.

The Earliest Known Dictionaries Circa 2,300 BCE

The Urra=hubullu, currently preserved at the Louvre Museum in Paris.

The oldest known dictionaries are cuneiform tablets from the Akkadian Empire with bilingual wordlists in Sumerian and Akkadian, discovered at Ebla in modern Syria.

The Urra=hubullu glossary, a major Babylonian glossary or encyclopedia from the second millennium BCE, preserved in the Louvre, is an outstanding example of this early form of wordlist.

"The canonical version extends to 24 tablets. The conventional title is the first gloss, ur5-ra and ḫubullu meaning "interest-bearing debt" in Sumerian and Akkadian, respectively. One bilingual version from Ugarit [RS2.(23)+] is Sumerian/Hurrian rather than Sumerian/Akkadian.

"Tablets 4 and 5 list naval and terrestrial vehicles, respectively. Tablets 13 to 15 contain a systematic enumeration of animal names, tablet 16 lists stones and tablet 17 plants. Tablet 22 lists star names.

"The bulk of the collection was compiled in the Old Babylonian period (early 2nd millennium BC), with pre-canonical forerunner documents extending into the later 3rd millennium" (Wikipedia article on Urra=hubullu, accessed 05-08-2009).

"The World's First Typewritten Document" - James Chadwick Circa 2,000 BCE – 1,700 BCE

Sides A (left) and B (right) of the Phaistos Disc.

The Phaistos Disc, a disc of fired clay from the Minoan Palace of Phaistos on the island of Crete, was discovered in 1908 by the Italian archaeologist Luigi Pernier, and remains the most famous document found in Crete.

"It is about 15 cm (5.9 in) in diameter and covered on both sides with a spiral of stamped symbols. Its purpose and meaning, and even its original geographical place of manufacture, remain disputed, making it one of the most famous mysteries of archaeology. This unique object is now on display at the archaeological museum of Heraklion in Crete" (Wikipedia article on Phaistos Disc, accessed 07-26-2009).

Because of the unique features of the disc, and the mysteries surrounding its origin, many people have doubted its authenticity, but no one has yet been able to prove conclusively that it is a forgery.

"The disk has the distinction of being the world's first typewritten document. It was made by taking a stamp or punch bearing the sign to be written in a raised pattern, and impressing this on the wet clay. The maker therefore needed to have as many stamps as there were signs in the script. It has the advantage that even complicated signs can be quickly written, and every example of the same sign is identical and easy to read. The disadvantage is that a considerable outlay of time and effort is required to make the set of stamps before any document can be produced. It is therefore evident that the system was not created solely for a single document; its maker must have intended to reproduce a large number of documents, though it remains some way from being an anticipation of printing.

"It is therefore all the more remarkable that after more than eighty years of excavation not another single scrap of clay impressed with these stamps had been found at Phaistos, or at any other site in Crete or elsewhere. It would be very surprising if there were not somewhere more examples of the script waiting to be found, but the disk remains so far unique, and the suspicion must arise that it was an isolated object brought from some other area.

"This impression of foreign origin can be supported by two arguments. The work of cutting the stamps, whether made directly or perhaps more likely by making moulds into which metal was poured, is a technique very similar to gem-engraving. We might therefore expect the signs to bear a stylistic resemblance to those engraved on seal-stones. In fact the style of art is noticeably different. Secondly, some of the objects depicted by the signs have a distinctly foreign appearance to those familiar with Minoan art" (Chadwick, Linear B and Related Scripts  [1987]  57-58).

The Rigveda Circa 1,700 BCE – 1,100 BCE

One of the oldest extant texts in any Indo-European language, the Rigveda (Rig Veda) (Sanskrit: ऋग्वेद ṛgveda, a compound of ṛc "praise, verse" and veda "knowledge"), an ancient Indian sacred collection of Vedic Sanskrit hymns, was composed in the north-western region of the Indian subcontinent. 

"It is counted among the four canonical sacred texts (śruti) of Hinduism known as the Vedas. Some of its verses are still recited as Hindu prayers, at religious functions and other occasions, putting these among the world's oldest religious texts in continued use. The Rigveda contains several mythological and poetical accounts of the origin of the world, hymns praising the gods, and ancient prayers for life, prosperity, etc." (Wikipedia article on Rigveda, accessed 07-10-2011).

The date of composition of the Vedas is controversial. Some argue that the Rigveda was composed circa 3000 BCE, which would make it the oldest surviving literary work.

One of the Earliest Known Examples of Writing in Europe Circa 1,490 BCE – 1,390 BCE

On April 2, 2011 Michael Cosmopoulos of the University of Missouri-St. Louis reported the discovery at Iklaina, Greece of a clay tablet written in Linear B script. This tablet, 2 x 3 inches in size, was preserved when someone discarded it in a trash pit, burned the trash, and inadvertently fired the clay.

When the tablet was discovered it was one of the earliest examples of writing found on the mainland of Europe.

Archive of Egyptian Diplomatic Correspondence Written in the Diplomatic Language, Akkadian Cuneiform Circa 1,360 BCE – 1,330 BCE

ME E29785 of the British Museum: A letter from Burnaburiash, a king of the Kassite dynasty of Babylonia, to Amenhotep IV. The tablet is one of the Amarna Letters.

The Amarna Letters, or Correspondence, an archive of mostly diplomatic correspondence written on clay tablets, between the Egyptian administration and its representatives in Canaan and Amurru during the New Kingdom, was found around 1887 in Upper Egypt at Amarna, the modern name for the Egyptian capital of Akhetaten (Akhetaton), founded by pharaoh Akhenaten (Akhnaton), during the Eighteenth dynasty of Egypt.  

"The Amarna letters are unusual in Egyptological research, being mostly written in Akkadian cuneiform, the writing system of ancient Mesopotamia rather than ancient Egypt. The known tablets currently total 382 in number, 24 further tablets having been recovered since the Norwegian Assyriologist Jørgen Alexander Knudtzon's landmark edition of the Amarna correspondence, Die El-Amarna-Tafeln in two volumes (1907 and 1915).

"These letters, consisting of cuneiform tablets mostly written in Akkadian – the regional language of diplomacy for this period – were first discovered by local Egyptians around 1887, who secretly dug most of them from the ruined city (they were originally stored in an ancient building archaeologists have since called the Bureau of Correspondence of Pharaoh) and then sold them on the antiquities market. Once the location where they were found was determined, the ruins were explored for more. The first archaeologist who successfully recovered more tablets was William Flinders Petrie in 1891–92, who found 21 fragments. Émile Chassinat, then director of the French Institute for Oriental Archaeology in Cairo, acquired two more tablets in 1903. Since Knudtzon's edition, some 24 more tablets, or fragments of tablets, have been found, either in Egypt, or identified in the collections of various museums.

"The tablets originally recovered by local Egyptians have been scattered among museums in Cairo, Europe and the United States: 202 or 203 are at the Vorderasiatisches Museum in Berlin; 80 in the British Museum; 49 or 50 at the Egyptian Museum in Cairo; seven at the Louvre; 3 at the Pushkin Museum; and 1 is currently in the collection of the Oriental Institute at the University of Chicago.

"The full archive, which includes correspondence from the preceding reign of Amenhotep III as well, contained over three hundred diplomatic letters; the remainder are a miscellany of literary or educational materials. These tablets shed much light on Egyptian relations with Babylonia, Assyria, the Mitanni, the Hittites, Syria, Canaan, and Alashiya (Cyprus). They are important for establishing both the history and chronology of the period. Letters from the Babylonian king Kadashman-Enlil I anchor the timeframe of Akhenaten's reign to the mid-14th century BC. Here was also found the first mention of a Near Eastern group known as the Habiru, whose possible connection with the Hebrews remains debated. Other rulers include Tushratta of Mittani, Lib'ayu of Shehchem, Abdi-Heba of Jerusalem and the quarrelsome king Rib-Hadda of Byblos, who in over 58 letters continuously pleads for Egyptian military help" (Wikipedia article on Amarna letters, accessed 09-01-2009).

The Earliest Chinese Inscriptions that are Indisputably Writing Circa 1,200 BCE – 1,050 BCE

 

The oldest Chinese inscriptions that are indisputably writing are those in oracle bone script (Chinese: 甲骨文; pinyin: jiǎgǔwén; literally 'shell-bone-script') of the late thirteenth century BCE. It is not until the oracle-bone inscriptions that we find grammatically connected marks that certainly record language. Lack of archaeological evidence prevents addressing the related questions of how long before that time writing developed and in what contexts, whether writing in China developed gradually or rapidly, and whether it developed exclusively in a religious context or, as in the ancient Middle East, was tied to court administration.

Oracle bone script was

"first identified by scholars in 1899 on pieces of bone and turtle shell being sold as medicine, and by 1928, the source of the oracle bones had been traced back to modern Xiǎotún (小屯) village at Ānyáng in Hénán Province, where official archaeological excavations in 1928–1937 discovered 20,000 oracle bone pieces, about 1/5 of the total discovered. The inscriptions were records of the divinations performed for or by the royal Shāng household. The oracle bone script is a well-developed writing system, attested from the late Shang Dynasty (1200–1050 BC). Only about 1,400 of the 2,500 known oracle bone script logographs can be identified with later Chinese characters and thus deciphered by paleographers."

"The late Shāng oracle bone writings, along with a few contemporary characters in a different style cast in bronzes, constitute the earliest significant corpus of Chinese writing, which is essential for the study of Chinese etymology, as Shāng writing is directly ancestral to the modern Chinese script. It is also the oldest member and ancestor of the Chinese family of scripts.

"The oracle bone script of the late Shāng appears archaic and pictographic in flavor, as does its contemporary, the Shāng writing on bronzes. The earliest oracle bone script appears even more so than examples from late in the period (thus some evolution did occur over the roughly 200-year period). Comparing oracle bone script to both Shāng and early Western Zhōu period writing on bronzes, oracle bone script is clearly greatly simplified, and rounded forms are often converted to rectilinear ones; this is thought to be due to the difficulty of engraving the hard, bony surfaces, compared with the ease of writing them in the wet clay of the molds from which the bronzes were cast. The more detailed and more pictorial style of the bronze graphs is thus thought to be more representative of typical Shāng writing (as would have normally occurred on bamboo books) than the oracle bone script forms, and it is this typical style which continued to evolve into the Zhōu period writing and then into the seal script of the Qín state in the late Zhōu period.

"It is known that the Shāng people also wrote with brush and ink, as brush-written graphs have been found on a small number of pottery, shell and bone, and jade and other stone items, and there is evidence that they also wrote on bamboo (or wooden) books just like those which have been found from the late Zhōu to Hàn periods, because the graphs for a writing brush (聿 yù) and bamboo book (冊 cè, a book of thin vertical slats or slips with horizontal string binding, like a Venetian blind turned 90 degrees) are present in the oracle bone script. Since the ease of writing with a brush is even greater than that of writing with a stylus in wet clay, it is assumed that the style and structure of Shāng graphs on bamboo were similar to those on bronzes, and also that the majority of writing occurred with a brush on such books. Additional support for this notion includes the reorientation of some graphs, by turning them 90 degrees as if to better fit on tall, narrow slats; this style must have developed on bamboo or wood slat books and then carried over to the oracle bone script. Additionally, the writing of characters in vertical columns, from top to bottom, is for the most part carried over from the bamboo books to oracle bone inscriptions. In some instances lines are written horizontally so as to match the text to divinatory cracks, or columns of text rotate 90 degrees in mid stream, but these are exceptions to the normal pattern of writing, and inscriptions were never read bottom to top. The vertical columns of text in Chinese writing are traditionally ordered from right to left; this pattern is found on bronze inscriptions from the Shāng dynasty onward. Oracle bone inscriptions, however, are often arranged so that the columns begin near the centerline of the shell or bone, and move toward the edge, such that the two sides are ordered in mirror-image fashion" (Wikipedia article on Oracle bone script, accessed 07-11-2009).

Edward L. Shaughnessy, "The Beginnings of Writing in China" IN: Woods (ed) Visible Language. Inventions of Writing in the Ancient Middle East and Beyond (2010) 215-24.

1,000 BCE – 300 BCE

Possibly the Earliest Hebrew Inscription Circa 1,000 BCE

A shard of ancient pottery found in the Elah Fortress, bearing Proto-Canaanite script which might constitute the earliest known Hebrew inscription.

An ostracon (inscribed pottery shard) found in October 2008 at the Elah Fortress in Khirbet Qeiyafa, about 20 miles southwest of Jerusalem, and written in ink in Proto-Canaanite script, could be the earliest known Hebrew inscription, according to biblical archaeologist Yosef Garfinkel. Khirbet Qeiyafa is the earliest known fortified city of the biblical period of Israel. Other scholars urge caution in accepting that interpretation. The shard is one of only a dozen or so examples of Proto-Canaanite that have survived.

"The Israelites were not the only ones using proto-Canaanite characters, and other scholars suggest it is difficult - perhaps impossible - to conclude the text is Hebrew and not a related tongue spoken in the area at the time. Garfinkel bases his identification on a three-letter verb from the inscription meaning to do, a word he said existed only in Hebrew.

" 'That leads us to believe that this is Hebrew, and that this is the oldest Hebrew inscription that has been found,' he said.

"Other prominent Biblical archaeologists warned against jumping to conclusions.

"Hebrew University archaeologist Amihai Mazar said the inscription was very important, as it is the longest proto-Canaanite text ever found. But he suggested that calling the text Hebrew might be going too far" (http://www.haaretz.com/hasen/spages/1032929.html, accessed 08-30-2009).

The "Chicago Syllabary" Circa 900 BCE

The "Chicago Syllabary," a cuneiform lexical list of unknown provenance, dating from the first millennium BCE, preserved in The Oriental Institute of the University of Chicago, is thought to contain content compiled earlier, in the second millennium.

"The text gives the Sumerian and Akkadian pronunciations of various cuneiform signs along with their names. As such, the text provides unique insights into how the ancients understood and analyzed their languages and the cuneiform script. The list is organized by sign shape. The table consists of two halves, with each half divided into four columns. The first column gives the pronunciations of a given sign and the second column gives the corresponding graph. The third column gvies the name of the sign as given by the Babylonian compilers (in some cases a descriptive designation that blends Sumerian and Akkadian), while the fourth column gives the corresponding Akkadian pronunciation. In addition to the importance of its content, the text examplies the development of the cuneiform scriptin the first millenium BC" (Woods, Teeter, Emberling (eds) Visible Language. Inventions of Writing in the Ancient Middle East and Beyond [2010] No. 60).

The First Olympic Games 776 BCE

776 BCE is the date of the first Olympic Games according to ancient Greek records. Those records also reflect the adoption in Greece of the Phoenician alphabet, from which all other Western alphabets are descended.

The date is based on inscriptions, found at Olympia, of the winners of a foot race held every four years, starting in 776 BCE.

One of the Oldest Known Examples of Writing in Greek Circa 740 BCE – 720 BCE

The Cup of Nestor.

The so-called Cup of Nestor from Pithikoussai, a clay drinking cup (kotyle) found in 1954 during the excavation of a grave at the ancient Greek site of Pithikoussai on the island of Ischia in the Tyrrhenian Sea, at the northern end of the Gulf of Naples, bears a three-line inscription that was scratched on its side at a later time. This inscription and the so-called Dipylon inscription from Athens are the oldest known examples of writing in the Greek alphabet.

Pithikoussai was one of the earliest Greek colonies in the West. The cup is dated to the Geometric Period (c.750-700 BCE) and is believed to have been originally manufactured in Rhodes. It is preserved in the Villa Arbusto museum in the village of Lacco Ameno on the island of Ischia, Italy.

Both the Cup of Nestor and the Dipylon inscription have been linked to early writing on the island of Euboea.

The Marsiliana Tablet Abecedarium 700 BCE

The earliest Etruscan abecedarium, the Marsiliana d'Albegna tablet, which dates to c. 700 BCE.

It is not clear whether the process of adaptation of the Old Italic or Etruscan alphabet from the Greek alphabet took place in Italy in the city of Cumae, the first Greek colony on the mainland of Italy, or in Greece/Asia Minor. The Etruscan alphabet was a precursor of the Old Latin alphabet, the basis of the Latin alphabet.

"It was in any case a Western Greek alphabet. In the alphabets of the West, X had the sound value [ks], Ψ stood for [kʰ]; in Etruscan: X = [s], Ψ = [kʰ] or [kχ] (Rix 202-209).

"The earliest Etruscan abecedarium, the Marsiliana d'Albegna (near Grosseto) tablet which dates to c. 700 BCE, lists 26 letters corresponding to contemporary forms of the Greek alphabet which retained san and qoppa but which had not yet developed omega.

In transliteration: "A B G D E V Z H Θ I K L M N Ξ O P Ś Q R S T Y X Φ Ψ"

"21 of the 26 archaic Etruscan letters were adopted for Old Latin from the 7th century BCE, either directly from the Cumae alphabet, or via archaic Etruscan forms, compared to the classical Etruscan alphabet retaining B, D, K, O, Q, X but dropping Θ, Ś, Φ, Ψ, F (Etruscan U is Latin V, Etruscan V is Latin F).

In transliteration: "A B C D E F Z H I K L M N O P Q R S T V X"

(Wikipedia article on Old Italic alphabet, accessed 08-02-2009).

The Taylor Prism and the Sennacherib Prism 691 BCE – 689 BCE

The Taylor Prism, ME 91032 of the British Museum.

The Taylor Prism, a six-sided baked clay document (or prism), was discovered at the Assyrian capital Nineveh, in an area known today as Nebi Yunus, now in Iraq. It is named after Colonel R. Taylor, British Consul General at Baghdad, who acquired it in 1830. The British Museum purchased it from Taylor's widow in 1855.

One of the first major Assyrian documents discovered, the Taylor Prism played an important part in the decipherment of cuneiform script.

"The prism is a foundation record, intended to preserve King Sennacherib's achievements for posterity and the gods. The record of his account of his third campaign (701 BC) is particularly interesting to scholars. It involved the destruction of forty-six cities of the state of Judah and the deportation of 200,150 people. Hezekiah, king of Judah, is said to have sent tribute to Sennacherib. This event is described from another point of view in the Old Testament books of 2 Kings and Isaiah. Interestingly, the text on the prism makes no mention of the siege of Lachish which took place during the same campaign and is illustrated in a series of panels from Sennacherib's palace at Nineveh" (http://www.britishmuseum.org/explore/highlights/highlight_objects/me/t/the_taylor_prism.aspx, accessed 12-26-2009).

♦ Another version of the same text, produced in the same prism format and known as the Sennacherib Prism, was purchased by James Henry Breasted from a Baghdad antiquities dealer in 1919 for the Oriental Institute of the University of Chicago, where it is preserved. The two known complete examples of Sennacherib's inscription are nearly identical, although the dates on the prisms show that they were written sixteen months apart, the Taylor Prism in 691 BCE and the Oriental Institute prism in 689 BCE. There are also at least eight other fragmentary prisms preserving parts of this text, all in the British Museum, and most of them containing just a few lines.

The Rosetta Stone of Cuneiform Script 522 BCE – 486 BCE

The Behistun Inscription.

The Behistun Inscription (also Bisitun or Bisutun; Modern Persian: بیستون; Old Persian: Bagastana, meaning "the god's place or land"), a multilingual stone inscription approximately 15 meters high and 25 meters wide, is located on Mount Behistun in Kermanshah Province, near the city of Kermanshah in western Iran. It was written by Darius I, the Great, sometime between his coronation as Zoroastrian king of kings of the Achaemenid, or Persian, Empire in the summer of 522 BCE and his death in the autumn of 486 BCE.

" . . . the inscription begins with a brief autobiography of Darius I, the Great including his ancestry, lineage etc. Later in the inscription, Darius provides a lengthy sequence of events following the death of Cyrus the Great and Cambyses II in which he fought nineteen battles in a period of one year (ending in December of 521 BC) to put down multiple rebellions throughout the Persian Empire. Darius' inscription states in detail that the rebellions, which had resulted from the deaths of Cyrus the Great and his son Cambyses II, were orchestrated by several impostors and their co-conspirators in various cities throughout the empire, each of whom falsely proclaimed kinghood during the upheaval following Cyrus the Great's death. Darius the Great proclaimed himself victorious in all battles during the period of upheaval, attributing his success to the "grace of Ahuramazda (God)".

"The inscription includes three versions of the same text, written in three different cuneiform script languages: Old Persian, Elamite, and Babylonian. Babylonian was a later form of Akkadian: unlike Old Persian, they are Semitic languages. In effect, then, the inscription is to cuneiform what the Rosetta Stone is to Egyptian hieroglyphs: the document most crucial in the decipherment of a previously lost script.

"Translation of the text was a multi-step and multi-national effort based on earlier work done on the decipherment of the Old Persian script by Georg Friedrich Grotefend in the late 1700's when Grotefend discovered that, unlike Elamite and Babylonian texts, Old Persian text is alphabetic. In the following years, the efforts of [Eugène] Burnouf, [Christian] Lassen, and [Henry] Rawlinson (who had the remainder of the inscription transcribed in two parts, in 1835 and 1843) contributed to translating the Old Persian cuneiform text using the Zoroastrian book Avesta as a key, in addition to cross referencing with modern Persian and Vedic languages. With the Old Persian text deciphered, Rawlinson and others were able to then translate the Elamite and Babylonian texts (both of which were ancient translations of the Old Persian text) after 1843.

"The Inscription is . . . 100 metres up a limestone cliff from an ancient road connecting the capitals of Babylonia and Media (Babylon and Ecbatana, respectively). The mountainside was removed to make the inscription more visible after its completion. The Old Persian text contains 414 lines in five columns; the Elamite text includes 593 lines in eight columns, and the Babylonian text is in 112 lines. The inscription was illustrated by a life-sized bas-relief of Darius I, the Great, holding a bow as a sign of kingship, with his left foot on the chest of a figure lying on his back before him. The prostrate figure is reputed to be the pretender Gaumata. Darius is attended to the left by two servants, and ten one-metre figures stand to the right, with hands tied and rope around their necks, representing conquered peoples. Faravahar floats above, giving his blessing to the king" (Wikipedia article on Behistun Inscription, accessed 12-27-2009).

The Earliest Known Work on Descriptive Linguistics Circa 501 BCE

An Indian postage stamp, released in 2004, in honor of Panini.

Panini, an Indian grammarian from Pushkalavati (Sanskrit: पुष्कलावती), an ancient site in the Peshawar valley in the Khyber Pakhtunkhwa province (formerly NWFP) of Pakistan (then Gandhara), composed the Ashtadhyayi, his formulation of 3,959 rules of Sanskrit morphology. This was the earliest known work on descriptive linguistics. It included the concepts of the phoneme, the morpheme, and the root, as well as metarules, transformation, and recursion.

300 BCE – 30 CE

The Earliest Known Examples of Maya Script Circa 300 BCE

A vertical, columnar stone inscription roughly six inches long. Image: Boris Beltrán/Science.

The earliest stone inscription identifiably in Maya script (also called Maya glyphs or Maya hieroglyphs) was found in 2005 at the pre-Columbian Maya archaeological site of San Bartolo in the Department of Petén in northern Guatemala, northeast of Tikal and roughly fifty miles from the nearest settlement. This vertical column of ten glyphic words, roughly six inches long and dating from circa 300 BCE, "may be related to a nearby painted image of the maize god" (http://www.nytimes.com/2006/01/10/science/10maya.html?_r=1, accessed 03-23-2010). As of 2010 this inscription had not been deciphered.

The Earliest Surviving Monolingual Dictionary Circa 250 BCE

An edition of the Erya.

The earliest surviving monolingual dictionary is the Chinese dictionary called the Erya.

"The Erya has been described as a dictionary, glossary, synonymicon, thesaurus, and encyclopaedia. Karlgren (1931: 46) explains that the book 'is not a dictionary in abstracto, it is a collection of direct glosses to concrete passages in ancient texts.' The received text contains 2094 entries, covering about 4300 words, and a total of 13,113 characters. It is divided into nineteen sections, the first of which is subdivided into two parts. The title of each chapter combines shi ("explain; elucidate") with a term describing the words under definition. Seven chapters (4, 8, 9, 10, 12, 18, and 19) are organized into taxonomies. For instance, chapter 4 defines terms for: paternal clan (宗族), maternal relatives (母黨), wife's relatives (妻黨), and marriage (婚姻). The text is divided between the first three heterogeneous chapters defining abstract words and the last sixteen semantically-arranged chapters defining concrete words. The last seven – concerning grasses, trees, insects and reptiles, fish, birds, wild animals, and domestic animals – describe more than 590 kinds of flora and fauna. It is a valuable document of natural history and historical biogeography" (Wikipedia article on Erya, accessed 05-08-2008).

The Beginning of Latin Literature Circa 250 BCE

Roman dramatist and epic poet Livius Andronicus translated Homer's Odyssey into Latin, and translated and staged Greek comedies and tragedies in Rome.

This is considered the beginning of Latin literature.

30 CE – 500 CE

The Mensa Isiaca or Bembine Table of Isis Circa 50 CE

An elaborate bronze tablet with enamel and silver inlay mimicking Egyptian style, the Mensa Isiaca or Bembine Tablet or Bembine Table of Isis was probably created in Rome during the first century CE. It was discovered after the sack of Rome in 1527, soon after which Cardinal Pietro Bembo acquired it at an exorbitant price.

"After Bembo's death in 1547 the Tablet was acquired by the House of Mantua, remaining in their museum until the capture of Mantua in 1630 by Ferdinand II's troops. The Tablet eventually came into the hands of Cardinal Pava, who presented it to the Duke of Savoy, who in turn presented it to the King of Sardinia. With the French conquest of Italy in 1797 the Tablet came to Paris, and Alexandre Lenoir wrote in 1809 that it was on exhibition in the Bibliothèque Nationale. It was later returned to Italy after peace was established. Karl Baedeker in his Guide to Northern Italy mentions that the tablet was a central exhibit in Gallery 2 in the Museum of Antiquities at Turin, where it is today."

In the seventeenth century the fame of the Bembine Tablet was such that Athanasius Kircher used it as the primary source for his attempt to decipher Egyptian hieroglyphs, reproducing an engraving of the table in his misconceived Oedipus Aegyptiacus (1652-55).

The first scholarly study of the table was by the Paduan scholar and antiquarian Lorenzo Pignoria in Vetustissimae tabulae aeneae sacris Aegyptiorum simulachris coelatae accurata explicatio descriptio (Venice, 1605). This was the first detailed printed account of the table. In his description Pignoria compared the table to other known archeological objects, particularly Egyptian amulets and engraved gems. Unlike some of his contemporaries, who saw the table as a mystical relic from the dawn of creation, Pignoria concluded that the table was a Roman work of the Augustan period. The large folding plates of this edition were engraved by the Venetian engraver and publisher Giacomo Franco in 1600 to replicate the various parts of the Table, and were included, variously assembled and folded, in a handful of copies of the first edition, published by Franco in 1605. In later editions the large woodcuts were reproduced as copperplate engravings.

"Egypt held great appeal for the Romans, who eagerly absorbed the Isis cult. However, after the battle of Actium (31 BC) and the deaths of Cleopatra and Mark Antony (30 BC), the cult was persecuted until later in the first century AD when the Emperor Caligula (AD 12-41), descendant of Augustus and of Mark Antony, built a great temple to Isis Campus Martius: the Iseum Campense. Also it was sometime in the first century AD when this remarkable table was produced, probably in Rome. The hieroglyphs are nonsense and the cult scenes are Egyptianising, but do not depict true Egyptian rites. Some of the bizarre attributes make it unclear whether the figures are divinities or kings and queens, and whether or not a god, instead of the king, is depicted making an offering to another god. Egyptian motifs appear helter-skelter throughout. Nevertheless, the central figure in a chapel can be recognised as Isis, suggesting that the table comes from a place where the Isis cult was celebrated, possibly even the Iseum Campensis. The table is an important example of metallurgical knowledge in the ancient world, with its surface decoration of different colored precious (silver, gold, copper and gold with much) and base metals. Perhaps the most interesting color on the table is the black, usually incorrectly as described on niello. In fact, analysis on similarly inlaid black-Roman objects reveal that this was made by alloying copper and tin with small amounts of gold or silver (about 2%) and then 'pickling' the object in organic acid. Pliny (Natural. History) and Plutarch (Moralia) both described on a prestigious black bronze alloy, 'Corinthian bronze', which contained gold and silver" (http://www.museoegizio.org/pages/isiaca_en.jsp, accessed 02-19-2013).

James Stevens Curl, Egyptomania. The Egyptian Revival: a Recurring Theme in the History of Taste (1994) 57-58.

The New Testament Was Probably Written over Less than a Century Circa 65 CE – 150 CE

Unlike the Old Testament, which was written over several hundred years, the New Testament was written in a relatively narrow span of time, probably less than a century.

The 27 books of the New Testament were written by various authors at various times and places, probably in Koine Greek, the vernacular dialect in first-century Roman provinces.

"Koine Greek is not only important to the history of the Greeks for being their first common dialect . . ., but it's also important . . . for being the first 'international' form of speech, and eventually the chosen medium for the teaching and spreading of Christianity. Koine Greek was unofficially a first or second language in the Roman Empire."

The Earliest Runic Inscriptions Circa 150 CE

"The earliest runic inscriptions date from around A.D. 150. The characters were generally replaced by the Latin alphabet as the cultures that had used runes underwent Christianization by around A.D. 700 in central Europe and by around A.D. 1100 in Northern Europe. However, the use of runes persisted for specialized purposes in Northern Europe. Until the early 20th century runes were used in rural Sweden for decoration purposes in Dalarna and on Runic calendars.

"The three best-known runic alphabets are the Elder Futhark (around 150 to 800 AD), the Anglo-Saxon Futhorc (400 to 1100 AD), and the Younger Futhark (800–1100). The Younger Futhark is further divided into the long-branch runes (also called Danish, although they were also used in Norway and Sweden), short-branch or Rök runes (also called Swedish-Norwegian, although they were also used in Denmark), and the stavesyle or Hälsinge runes (staveless runes). The Younger Futhark developed further into the Marcomannic runes, the Medieval runes (1100 AD to 1500 AD), and the Dalecarlian runes (around 1500 to 1800 AD).

"The origins of the runic alphabet are uncertain. Many characters of the Elder Futhark bear a close resemblance to characters from the Latin alphabet. Other candidates are the 5th to 1st century BC Northern Italic alphabets: Lepontic, Rhaetic and Venetic, all of which are closely related to each other and descend from the Old Italic alphabet" (Wikipedia article Runic alphabet, accessed 10-26-2010).

The Earliest Known Runic Inscription Circa 160 CE

The Vimose Comb.

The Vimose Comb, from Vimose, Funen, Denmark, is considered the oldest datable Elder Futhark runic inscription in late Proto-Germanic or early Proto-Norse.

This inscription and other, slightly later items from Vimose dating from circa 200-300 CE are known collectively as the Vimose inscriptions.

The Most Widely Used Medieval Grammar Circa 350 CE

Roman grammarian and teacher of rhetoric Aelius Donatus wrote the Latin grammar book Ars grammatica. Donatus was the teacher of St. Jerome.

"His Ars grammatica, especially the section on the eight parts of speech, though possessing little claim to originality, and evidently based on the same authorities which were used by the grammarians Charisius and Diomedes, attained such popularity as a schoolbook that, in the Middle Ages, he became the eponym for a rudimentary treatise of any sort, called a donet. When books came to be printed in the 15th century, editions of the little book were multiplied to an enormous extent. It is also the only purely textual work to be printed in blockbook form (cut like a woodcut, not using movable type). It is in the form of an Ars Minor, which only treats of the parts of speech, and an Ars Major, which deals with grammar in general at greater length.

"Donatus was a proponent of an early system of punctuation, consisting of dots placed in three successively higher positions to indicate successively longer pauses, roughly equivalent to the modern comma, colon, and full stop. This system remained current through the seventh century, when a more refined system due to Isidore of Seville gained prominence" (Wikipedia article on Aelius Donatus, accessed 01-15-2011).

The Latest Known Inscription Written in Egyptian Hieroglyphs August 24, 394 CE

The latest known inscription written in Egyptian hieroglyphs is the Graffito of Esmet-Akhom (or Philae 436), inscribed in the temple of Isis at Philae, formerly an island in the First Cataract of the Nile. Having been relocated as a result of the Aswan Dam, the temple is now on an island in Lake Nasser, in southern Egypt. The inscription includes a relief of a Ptolemaic or Roman period pharaoh. "The hieroglyphs are crude in execution but are clear enough to read. On the left is a badly damaged figure of a king wearing an elaborate crown" (http://www.memphis.edu/egypt/philae2.php, accessed 02-19-2013).

The inscription is dated to the Birthday of Osiris, year 110 (of Diocletian) equivalent to August 24, 394 CE.

Surviving in Only One Deeply Corrupt Renaissance Manuscript Circa 450 CE

The "Alphabetical Collection of All Words" (Συναγωγὴ Πασῶν Λεξέων κατὰ Στοιχεῖον), written in the fifth century by the Greek grammarian Hesychius of Alexandria (Ἡσύχιος ὁ Ἀλεξανδρεύς), remains the richest surviving lexicon of unusual and obscure Greek words. It includes many words not found in surviving ancient Greek texts, and its explanations of many epithets and phrases also reveal important facts about the religion and social life of the ancients.

Hesychius's work survived in only one "deeply corrupt" 15th century manuscript preserved in the Biblioteca Marciana in Venice (Marc. Gr. 622). This manuscript, which belonged to the Mantuan scholar Giangiacomo Bordellone, was edited by Greek scholar and philosopher Marcus Musurus (Μάρκος Μουσοῦρος; Marco Musuro) and published for the first time in print by Aldus Manutius of Venice in 1514. In his preface to the first edition Aldus thanked Bordellone for loaning the manuscript to Musurus so that it could be published.

500 CE – 600

The Codex Argenteus, The Primary Surviving Example of the Gothic Language Circa 520

A page from the Codex Argenteus.

About 520 CE the Codex Argenteus (silver codex) was written in silver and gold letters on purple vellum, probably in Ravenna, or in the Po valley, or in Brescia, probably for the Ostrogothic ruler of Italy, Theodoric.

The Codex Argenteus contains fragments of the Four Gospels translated into Gothic by the fourth-century Bishop Ulfilas (Wulfila) of Nicopolis ad Istrum (now Northern Bulgaria). It is the primary surviving example of the Gothic language, an extinct Germanic language that was spoken by the Goths and set down in writing by Ulfilas, who devised the Gothic alphabet. Of the original 336 leaves only 188 are preserved at the Carolina Rediviva library at the University of Uppsala, Sweden, plus one separate leaf, discovered, remarkably, in 1970 in the cathedral of Speyer in Germany.

During the Ostrogothic rule of Italy there was a bilateral Gothic-Latin culture, of which the Codex Brixianus, also produced in Italy at approximately the same time, survives as a Latin counterpart to the Codex Argenteus. It is believed that the Latin version of the Bible in the Codex Brixianus may be the Latin text from which Ulfilas translated the Bible into Gothic.

"With the end of Gothic rule the Gothic manuscripts in Italy were rendered valueless; what remained of them (with the exception of the Codex Argenteus) became part of that waste material which in the seventh and eighth centuries was re-used in Bobbio" (Bischoff, Latin Palaeography: Antiquity and Middle Ages [1990] 186).

After about a thousand years during which the Codex Argenteus appeared in no inventories, it was rediscovered in the middle of the 16th century in the library of the Benedictine monastery of Werden in the Ruhr, near Essen in Germany (Werden Abbey). This abbey, whose abbots were imperial princes with a seat in the imperial diets, was among the richest monasteries of the Holy Roman Empire. The Dutch physician, humanist, and linguist Johannes Goropius Becanus published the first mention of the manuscript in his 1569 book Origines Antwerpianae. In 1665 Franciscus Junius the Younger published the editio princeps of the text as Quatuor D. N. Jesu Christi euangeliorum versiones perantiquae duae, Gothica scil. et Anglo-Saxonica (Dordrecht, 1665).

In 1597 Bonaventura Vulcanius, professor of Greek at Leiden, published portions of the Gothic Bible text from the Codex Argenteus in a collection of treatises on the Goths which he edited for publication by the Plantin Press. In his preface to one of these treatises, De literis et lingua Getarum sive Gothorum, Vulcanius wrote that it represented two brief dissertations by an unidentifiable scholar, the first of which he said was "concerned with the script and pronunciation, and the other with the Lombardic script, which the author said he copied from a manuscript codex of great antiquity which he called 'the Silver.'" This was the first publication in print of any Gothic text, and it gave the manuscript its name, Codex Argenteus. Vulcanius identified Ulfilas as the translator of the Gothic text of the Bible. Vulcanius's book included images of Gothic script as compared to other ancient languages.

"Later the manuscript became the property of the Emperor Rudolph II, and when, in July 1648, the last year of the Thirty Years' War, the Swedes occupied Prague, it fell into their hands together with the other treasures of the Imperial Castle of Hradcany. It was subsequently deposited in the library of Queen Christina in Stockholm, but on the abdication of the Queen in 1654 it was acquired by one of her librarians, the Dutch scholar Isaac Vossius. He took the manuscript with him to Holland, where, in 1662, the Swedish Count Magnus Gabriel De la Gardie bought the codex from Vossius and, in 1669, presented it to the University of Uppsala. He had previously had it bound in a chased silver binding, made in Stockholm from designs by the painter David Klöcker Ehrenstrahl" (http://www.ub.uu.se/arv/codexeng.cfm, accessed 11-22-2008).

Munkhammar, Lars. The Silver Bible: Origins and History of the Codex Argenteus. (Uppsala, 2011).

Uppsala University Library makes a digital facsimile of the codex available on line at: http://www.ub.uu.se/en/Collections/Manuscript-Collections/Silver-Bible/Codex-Argenteus-Online/, accessed 01-22-2013.

The Code of Justinian 529 – 533

Justinian.

Thinking that the curriculum was contrary to Christian teachings, Emperor Justinian I closed the last surviving classical school at Athens, causing Constantinople to become the capital of Greek culture.

About this time Justinian appointed a commission of scholars to codify 2000 volumes of legal works, some dating back about 1000 years.

This condensation formed the Codex Justinianus, later known as the Code of Justinian or, after a printed edition of 1583, as the Corpus Juris Civilis. The Corpus Juris Civilis became the basis for civil law in western Europe. It was written and distributed in Latin, which remained the official language of the government of the Empire even though the prevalent language of merchants, farmers, seamen, and other citizens was Greek. By the early 7th century, under the lengthy reign of Heraclius, the official government language of the Byzantine Empire had shifted from Latin to Greek.

"This code compiled, in the Latin language, all of the existing imperial constitutiones (imperial pronouncements having the force of law), back to the time of Hadrian. It used both the Codex Theodosianus and the fourth-century collections embodied in the Codex Gregorianus and Codex Hermogenianus, which provided the model for division into books that were divided into titles. These codices had developed authoritative standing."

"Justinian's Corpus Juris Civilis was distributed in the West but was lost sight of; it was scarcely needed in the comparatively primitive conditions that followed the secession of Italy from the Byzantine empire in 8th century. The only western province where the Justinianic code was effectively introduced was Italy, following its recovery by Byzantine armies (Pragmatic Sanction of 554), but a continuous tradition of Roman law in medieval Italy has not been proven. Historians disagree on the precise way it was recovered in Northern Italy about 1070: perhaps it was waiting unneeded and unnoticed in a library until the legal studies that were undertaken on behalf of papal authority that was central to the Gregorian Reform of Pope Gregory VII led to its accidental rediscovery. Aside from the Littera Florentina, a 6th-century codex of the Pandects that was preserved at Pisa, apparently without ever being publicly consulted, (and removed to Florence after Florence conquered Pisa in 1406), there may have been other manuscript sources for the text that began to be taught at Bologna, by Pepo and then by Irnerius. The latter's technique was to read a passage aloud, which permitted his students to copy it, then to deliver an excursus explaining and illuminating Justinian's text, in the form of glosses. Irnerius's pupils, the so-called Four Doctors of Bologna, were among the first of the "glossators" who established the curriculum of Roman law. The tradition was carried on by French lawyers, known as the Ultramontani, in the 13th century. 

"The merchant classes of Italian communes required law with a concept of equity and which covered situations inherent in urban life better than the primitive Germanic oral traditions. The provenance of the Code appealed to scholars who saw in the Holy Roman Empire a revival of venerable precedents from the classical heritage. The new class of lawyers staffed the bureaucracies that were beginning to be required by the princes of Europe. The University of Bologna, where Justinian's Code was first taught, remained the dominant centre for the study of law through the High Middle Ages" (Wikipedia article on Corpus Juris Civilis, accessed 01-02-2010).

700 – 800

The Oldest Surviving Book in the German Language 765 – 775

The Abrogans, or Codex Abrogans, a dictionary of synonyms or glossary or word-list from Latin into Old High German, is the oldest surviving book in the German language. Abrogans ("humble") is the first word on the word-list.

The codex is preserved in the library of the Abbey of St. Gall (St. Gall, Stiftsbibliothek, Codex 911). A digital facsimile is available as part of the Codices Electronici Sangallenses (CESG) virtual library project.

The First Sample of an Early Italian Language Circa 775 – 825

The parchment on which the Veronese Riddle is written.

The growth of a written vernacular allowed the development of a written culture outside the religious orders.

The Indovinello veronese or Veronese Riddle is a riddle, apparently half-Italian, half-Latin, written on the margin of the Verona Orational (also known as the Libellus Orationum), probably in the late eighth or early ninth century, by a monk from Verona, a city in the Veneto region in Northern Italy. It contains the first written sample of an early Italian language different from Late Latin.

"It is written in northern-Italian cursive minuscule. . . : 'Se pareba boves, alba pratalia araba, albo versorio teneba, negro semen seminaba', which can be translated more or less as 'In front of him (he) led oxen, White fields (he) plowed, A white plow (he) held, A black seed (he) sowed'. This can be easily interpreted as a representation of the act of holding a pen and writing on a white sheet." (Wikipedia article on Verona Orational, accessed 01-22-2012).

"Many more European documents seem to confirm that the distinctive traits of Romance languages occurred all around the same time (e.g. France's Serments de Strasburg). Though initially hailed as the earliest document in Italian in the first years following Schiapparelli's discovery, today the record has been disputed by many scholars from Migliorini to Segre and Bruni, who have placed it at the latest stage of Vulgar Latin, though this very term is far from being clear-cut and Migliorini himself considers it dilapidated. At present, however, the Placito Capuano (960 A.D.) (the first in a series of four documents dating 960-963 A.D. issued by a Capuan court) is considered to be the first document ever written in Italian, although Migliorini concedes that since the Placito was put on record as an official court proceeding (and signed by a notary), Italian must have been widely spoken for at least one century" (Wikipedia article on Veronese Riddle, accessed 06-22-2009).

800 – 900

The First Written Swedish Literature Circa 800

The Rök Runestone, believed to be the earliest Swedish writing, makes reference to the Ostrogothic king Theodoric the Great.

The Rök Runestone (Swedish: Rökstenen; Ög 136), one of the most famous runestones, features the longest known runic inscription in stone. Preserved by the church in Rök (between Mjölby and Ödeshög, close to the E4 and Lake Vättern), Östergötland, Sweden, it is considered the first piece of written Swedish literature. 

The First Surviving Book Written Entirely in English Circa 890

A manuscript of Gregory the Great's (Pope Gregory I) Liber regulae pastoralis, or Regula pastoralis, translated into English at the order of Alfred the Great, is the earliest surviving book written entirely in English. It is preserved in the Bodleian Library, Oxford.

"This earliest English manuscript book is the only surviving book that can be linked to the King; it is his translation from the Latin into Anglo-Saxon of Pope Gregory’s work and bears a unique preface about the decline of learning among his people in the form of a letter from the King to Waerferth, Bishop of Worcester.  

"The book represents a sophisticated approach to language and learning at a crucial time in the nation’s political and cultural development. It forms part of Alfred’s attempts to educate and rally his people through the use of edifying texts in the vernacular rather than in Latin, bringing them together in the face of an external (Viking) threat. This manuscript remained at Worcester until the late 17th century, when it was bought by the Bodleian from Robert Scot, as part of the collection of Christopher Hatton, 1st Baron Hatton" (http://www.bodleian.ox.ac.uk/news/2011_may_23, accessed 07-16-2011).

900 – 1000

Massive Byzantine Encyclopedic Dictionary Circa 950

The Suda, or Souda (Σοῦδα), a massive tenth-century Byzantine encyclopedic dictionary of the Mediterranean world written in Greek, contains 30,000 entries, many drawn from ancient sources that have since been lost. Little is known regarding its compilation except that it must have been compiled before the time of the 12th-century writer Eustathius of Thessalonica (Archbishop Eustathios of Thessalonike), who frequently quoted from it. Its title probably comes from the Byzantine Greek word souda, meaning "fortress" or "stronghold," with the alternate name, Suidas, stemming from an error made by Eustathius, who mistook the title for the proper name of the author.

"The Suda is somewhere between a grammatical dictionary and an encyclopedia in the modern sense. It explains the source, derivation, and meaning of words according to the philology of its period, using such earlier authorities as Harpocration and Helladios. There is nothing especially important about this aspect of the work. It is the articles on literary history that are valuable. These entries supply details and quotations from authors whose works are otherwise lost. They use older scholia to the classics (Homer, Thucydides, Sophocles, etc.), and for later writers, Polybius, Josephus, the Chronicon Paschale, George Syncellus, George Hamartolus, and so on.

"This lexicon represents a convenient work of reference for persons who played a part in political, ecclesiastical, and literary history in the East down to the tenth century. The chief source for this is the encyclopedia of Constantine VII Porphyrogenitus (912-59), and for Roman history the excerpts of John of Antioch (seventh century). Krumbacher (Byzantinische Literatur, 566) counts two main sources of the work: Constantine VII for ancient history, and Hamartolus (Georgios Monachos) for the Byzantine age" (Wikipedia article on Suda, accessed 02-02-2010).

Toward the very end of the 15th century humanist Demetrios Chalkokondyles (Demetrius Chalcondylas) edited the encyclopedic text and had it published for the first time in print in Greek as Lexicon graecum. Chalcondylas's edition was issued by Johannes Bissolus and Benedictus Mangius of Milan on November 15, 1499. This work, consisting of 516 leaves in folio, was the largest single-volume work printed in Greek in the fifteenth century. It was necessarily an expensive book, and included on folio 1a a dialogue by Stephanus Niger between a bookseller and a student, mentioning the price of three ducats. ISTC No. is00829000.

The most significant modern edition, and the first edition in English, is Suda On Line: Byzantine Lexicography. This online collaboration began in 1998, predating Wikipedia, which began in 2001.

"The purpose of the Suda On Line is to open up this stronghold of information by means of a freely accessible, keyword-searchable, XML-encoded database with translations, annotations, bibliography, and automatically generated links to a number of other important electronic resources. To date over 170 scholars have contributed to the project from eighteen countries and four continents. Of the 30,000-odd entries in the lexicon, over 25,000 have been translated as of this date, and more translations are submitted every day." 

1000 – 1100

The First Truly Recognizable Dictionary Circa 1040 – 1050

Between 1040 and 1050 Papias, an Italian sometimes known as Papias the Lombard or Papias the Grammarian, wrote Elementarium doctrinae rudimentum. This work, which was in circulation by 1053, has been called the "first fully recognizable dictionary."

"The Elementarium . . . is a landmark in the development of dictionaries as distinct from mere collections of glosses. Papias arranges entries alphabetically based on the first three letters of the word, and is the first lexicographer to name the authors or texts he uses as sources. Although most entries are not etymological, Papias laid the groundwork for derivational lexicography, which became firmly established only a century later. Papias seems to have been a cleric with theological interests, possibly living in Pavia. The name 'Papias' means 'the guide,' and may be a pseudonym or pen name. Bruno of Würzburg saw an early draft of the Elementarium before he died in 1045, but an unambiguous reference in the chronicle of Albericus Trium Fontium establishes that it was published by 1053" (Wikipedia article on Papias, accessed 11-22-2012).

Papias's Elementarium was first published in print under the title Vocabularium by Dominicus de Vespolate of Milan on December 12, 1476. ISTC No. ip00077500

The Invention of Movable Type in China Circa 1041 – 1048

A Chinese statue of Pi Sheng.

The Chinese alchemist Pi Sheng (Bi Sheng) invented movable type made of an amalgam of clay and glue hardened by baking, similar to Chinese porcelain. He composed texts by placing the types side by side on an iron plate coated with a mixture of resin, wax, and paper ash.

Because the Chinese writing system is primarily pictographic and ideographic rather than alphabetic, and therefore requires thousands of distinct characters, movable type did not advance in China at this time.

"Shen Kuo wrote that during the Qingli reign period (1041–1048), under Emperor Renzong of Song (1022–1063), an obscure commoner and artisan known as Bi Sheng (990–1051) invented ceramic movable type printing.

"Although the use of assembling individual characters to compose a piece of text had its origins in antiquity, Bi Sheng's methodical innovation was something completely revolutionary for his time. Shen Kuo noted that the process was tedious if one only wanted to print a few copies of a book, but if one desired to make hundreds or thousands of copies, the process was incredibly fast and efficient. Beyond Shen Kuo's writing, however, nothing is known of Bi Sheng's life or the influence of movable type in his lifetime.

"Although the details of Bi Sheng's life were scarcely known, Shen Kuo wrote:

" 'When Bi Sheng died, his fount of type passed into the possession of my followers (i.e. one of Shen's nephews), among whom it has been kept as a precious possession until now.'

"There are a few surviving examples of books printed in the late Song Dynasty using movable type printing. This includes Zhou Bida's Notes of The Jade Hall (玉堂雜記) printed in 1193 using the method of baked-clay movable type characters outlined in the Dream Pool Essays" (Wikipedia article on Shen Kuo, accessed 01-25-2012).

1200 – 1300

Perhaps the First Grammar of a Romance Language Circa 1240

Troubadours Uc "Faidit" (meaning "exiled" or "dispossessed"; also given as Uc de Saint Circ [San Sir] or Hugues [Hugh] de Saint Circq) and Raymond Vidal de Besaudun published Donatz proensals.

Troubadours (Occitan pronunciation: [tɾuβaˈðuɾ], originally [tɾuβaˈðoɾ], English /ˈtruːbədʊər/, French: [tʁubaduːʁ]) were composers and performers of Occitan lyric poetry during the High Middle Ages (1100–1350). Uc is considered the "inventor" (trobador) of troubadour poetry. It is thought that he may have taken the name "Faidit" (exiled or dispossessed) during his exile in Italy during the Albigensian Crusade. This grammar of the Occitan language may have been the first grammar of a Romance language.

Occitan, a Romance language spoken in southern France, Italy's Occitan Valleys, Monaco, and Spain's Val d'Aran (the regions sometimes known informally as Occitania), is also spoken in the linguistic enclave of Guardia Piemontese (Calabria, Italy). It is an official language in Catalonia, Spain, where it is known as Aranese in Val d'Aran. Modern Occitan is the closest relative of Catalan.

The manuscript was first published in print in 1840. A "revised, corrected and considerably augmented" edition by François Guessard entitled Grammaires provençales appeared in Paris in 1858.

The Oldest Surviving Literary Document in Yiddish 1272

Folio 54r of the Worms Mahzor, upon which, in the interstices of the first word in the Prayer for Dew, is inscribed the oldest known Yiddish text: a small blessing in the form of a rhymed couplet, directed towards those who are charged with the seemingly onerous task of carrying the heavy Mahzor from the house of the owner to the synagogue.

Yiddish originated in the Ashkenazi culture that developed from about the 10th century in the Rhineland and then spread to Central and Eastern Europe and eventually to other continents. The oldest surviving literary document in Yiddish dates from 1272. It is a blessing in the Mahzor Worms, a festival prayerbook in Hebrew according to the Ashkenazi rite of the Jews in Worms, Germany, for the use of hazanim (cantors) in the synagogue.

The manuscript is preserved in the Jewish National and University Library of the Hebrew University of Jerusalem. 

♦ You can download a digital facsimile of the Mahzor Worms from the Jewish National and University Library at this link: http://jnul.huji.ac.il/dl/mss/worms/a_eng.html, accessed 04-04-2010.

The First European Patrons of the Art of Printing? 1294

John of Monte Corvino.

John of Monte Corvino, the first missionary sent by the Pope to China, arrived in Cambaluc [medieval term for Peking] soon after Marco Polo left for Europe. John remained at Cambaluc as head of the mission until his death in 1328. This mission became the base for other Catholic missionary work in China.

"These missionaries, spending their lives in China, learning the language and mingling with the people, must have come in contact with printed literature at every turn. John of Monte Corvino in the first dozen years of his work, even before reinforcements had arrived, had already translated the New Testament and Psalter, and prepared pictures and text for the ignorant at just the time when in China it was the natural thing to have every important literary work printed. There is no question that the Chinese who were associated in the work of translation would have suggested that the translation and the pictures should be brought before the public in what to them was the usual and natural way. Whether the missionaries agreed and thus became the first European patrons of the art of printing, we have no means of knowing. That religious image prints, prepared, like the pictures of John of Monte Corvino, 'for the ignorant,' began to appear in Europe some time within the half century after these early missionaries laid down their work, may not be altogether a coincidence" (Carter, Invention of Printing in China 2nd ed [1955] 161-62.)

1300 – 1400

The Earliest Surviving Example of Old Polish Literature Circa 1375

Folio 3r of the Psałterz floriański.

The Psałterz floriański, an illuminated psalter consisting of parallel Latin, Polish and German texts created toward the end of the 14th century, is probably the earliest surviving example of literature in the Old Polish language. Sometimes also known as Hedwig Psałterz, its name comes from a village in Austria, Sankt Florian. The manuscript was discovered in 1827 and first published as a printed book in Vienna in 1834. It was acquired by Poland in 1931, and is preserved in the National Library of Poland in Warsaw.

1400 – 1450

The Earliest Grammar of a Romance Language 1437 – 1441

A statue of Leon Battista Alberti in the Uffizi museum.

Between 1437 and 1441 Italian author, artist, architect, poet, priest, linguist, philosopher, and cryptographer Leon Battista Alberti wrote Grammatica della lingua toscana. This was the earliest grammar of a Romance language. Also called the Grammatichetta vaticana, it is known from the only surviving manuscript copy included in the codex Reginense Latino 1370 preserved in Rome in the Vatican Library.

Alberti's Grammatica della lingua toscana was not published in print until 1908.

Lorenzo Valla Proves that the Donation of Constantine is a Forgery 1440

A depiction of the Donation of Constantine in the Apostolic Palace, Vatican City, by an artist of Raphael's studio.

Italian priest, humanist, rhetorician and orator Lorenzo Valla circulated in manuscript De falso credita et ementita Constantini Donatione declamatio, proving on historical and linguistic grounds that the Donation of Constantine was a forgery.  Because of church opposition the essay was not formally published in print until 1517. It became popular among Protestants, and an English translation was published for Thomas Cromwell in 1534. Valla's case was so convincingly argued that it still stands today, and the illegitimacy of the Donation of Constantine is generally conceded.

Valla showed that the document could not possibly have been written in the historical era of Constantine I (4th Century), as its vernacular style dated conclusively to a later era (8th Century). One of Valla's reasons was that the document contained the word satrap, which he believed Romans such as Constantine I would not have used.

The Donation, though met with great criticism at its introduction, was accepted as legitimate, in part owing to the beneficial nature of its content for the western church. The Donation of Constantine suggested that Constantine I "donated" the whole of the Western Roman Empire to the Roman Catholic Church as an act of gratitude for having been miraculously cured of leprosy by Pope Sylvester I. This would have obviously discounted Pepin the Short's own Donation of Pepin, which gave the papacy the Lombards' land to the north of Rome.

"Valla was motivated to reveal the Donation of Constantine as a fraud by his employer of the time, Alfonso of Aragon, who was involved in a territorial conflict with the Papal States, then under Pope Eugene IV. The Donation of Constantine had often been cited to support the temporal power of the Papacy, since at least the 11th century" (Wikipedia article on Lorenzo Valla, accessed 01-17-2009).

1450 – 1500

An Intermediate Form Between a Collection of Prints and a Blockbook Circa 1460 – 1465

It appears that no blockbooks (block books) in the literal sense were published in France in the 15th century. An example of an intermediate form between a collection of prints and a blockbook printed in France about 1465 was a collection of three woodcuts with text, printed on one side of three sheets, entitled Les neuf preux. This is known from a single copy preserved in the Bibliothèque nationale de France. 

"It consists of three sheets of paper, each of which contains an impression from a block containing three figures. They are printed by means of the frotton in light-coloured ink, and have been coloured by hand. The first sheet contains pictures of the three champions of classical times, Hector, Alexander, and Julius Caesar; the second the three champions of the Old Testament, Joshua, David, and Judas Maccabaeus; the third, the three champions of mediaeval history, Arthur, Charlemagne, and Godfrey of Boulogne. Under each picture is a stanza of six lines, all rhyming, cut in a body type.

"These leaves form part of the Armorial of Gilles le Bouvier, who was King-at-Arms to Charles VII of France; and as the manuscript was finished between 9th November 1454 and 22 September 1457, it is reasonable to suppose that the prints were executed in France, probably at Paris, before the latter date. The verses are, at any rate, the oldest printed specimen of the French language" (Duff, Early Printed Books (1893) 17-18).

Les neuf preux is described by Ursula Baurmeister in Catalogue des incunables de la Bibliothèque nationale de France (CIBN), Vol. 1, fascicule 1 (Xylographes) no. NN-1.

The Armorial of Gilles le Bouvier is BnF Ms. fr. 4985.

In "Prints in the Early Printing Shops," in Parshall (ed.), The Woodcut in Fifteenth-Century Europe (2009) 39-91, Paul Needham discusses publications related to Les neuf preux.

Gutenberg's Last Production? An Early Form of Stereotyping? 1460 – 1469

In 1460 an edition of the encyclopedic and lexicographical work by the 13th century Dominican of Genoa, Johannes Balbus (Giovanni Balbi), entitled the Summa grammaticalis quae vocatur Catholicon, was issued in Mainz by "the printer of the Catholicon" (ISTC No. ib00020000). This was the first printed book to name its place of printing. It was also called the first work printed that was not entirely religious in content, though in its non-religious aspects it was clearly preceded by the bloodletting calendar of 1456, of which only one copy survived.

From the standpoint of lexicography Balbi became "the first lexicographer to achieve complete alphabetization (from the first to the last letter of each word)" (Oxford History of English Lexicography [2008] 30). The first four sections of Balbi's work concerned orthography, prosody, word derivations and syntax, and figures of speech. Throughout his work Balbi quoted not only from the Bible and writings of the saints but also from the Latin classics. It remained the most widely used lexical resource during the 14th and 15th centuries, and had no serious rival until the early 16th century.

The colophon of this book reads in translation:

"This book was produced not with a reed, stylus, or quill, but by the admirable design, proportion, and adjustment of punches and matrices."

The means by which this book was printed continues to be the subject of research:

"As early as 1905 Gottfried Zedler recognized that the Catholicon edition dated Mainz 1460 exists in three impressions printed from a single setting of type but associated with three presses (with different pinhole patterns) and printed on three distinct paper stocks. In 1982 Paul Needham presented evidence that the three issues were printed at three different times, according to the datable use of their paper stocks: copies on Bull's Head paper (with which are classed the vellum copies) in 1460, copies on Galliziani paper ca. 1469, and copies on Crown and Tower papers ca. 1472. Moreover, Needham argued that the three impressions were produced, not from standing type, but from two-line 'slugs' cast from the type and capable of being reassembled for subsequent impressions. According to this theory, the first impression of the Catholicon was produced by Gutenberg himself in 1460; the 'slugs' then passed into the possession of Konrad Humery with Gutenberg's other typographic material after the latter's death in 1468 and were re-used by Humery, probably with the help of Peter Schoeffer, ca. 1469. In this view, which has aroused prolonged controversy among incunabulists, the 1460 Catholicon represents not only Gutenberg's last production but also his final achievement, the invention of an early form of stereotyping" (The Nakles Collection of Incunabula, Christie's New York, 17 April 2000, Lot 2).

"Three issues can be distinguished in spite of identical typesetting: a) printed on vellum or Bull's Head paper; b) on Galliziani paper; c) on Tower & Crown paper. This has given rise to the theory that issue a) was printed in 1460, issue b) in 1469 and issue c) about 1472; see P. Needham, in BSA 76 (1982) pp.395-456 and the articles "zur Catholicon-Forschung" in Wolfenbütteler Notizen zur Buchgeschichte 13 (1988) pp.105-232. For an alternative theory that all three states were printed about 1469, see L. Hellinga in Gb Jb 1989 pp. 47-96 and in The Book Collector (Spring 1992) pp. 28-54" (http://istc.bl.uk/search/search.html?operation=record&rsid=220621&q=0, accessed 12-28-2009).

Probably the First Printed Book with an Index November 10, 1470

Mammotrectus super Bibliam, an etymological analysis of the Bible written in the 14th or 15th century by Giovanni Marchesini (Johannes Marchesinus), an Italian Franciscan friar from the Reggio Emilia area, was first published in print in 1470 by Peter Schoeffer of Mainz (ISTC No.: im00232000) and by Helias Helye de Laufen in Beromünster, Switzerland (ISTC No.: im00233000). The work was a reference tool for preachers, and the Swiss edition appears to be the first printed book to contain a thematic index at the end of the text. The index is linked to sections in the text which are identified with letters. I saw the Swiss edition at the Lucerne Public Library in September 2012. In October 2012 a digital facsimile of this edition was available from the Bayerische Staatsbibliothek, München.

Roughly simultaneously with the Swiss edition, Peter Schoeffer issued an edition of Marchesini's text from Mainz. That edition, of which a digital facsimile was also available from the Bayerische Staatsbibliothek, does not appear to include the subject index found in the Swiss edition.

The First Basic Greek Grammar and the First Book Printed in Greek Circa 1471

Printer Adam de Ambergau of Venice published the first printed edition of Emanuel (Manuel) Chrysoloras's Erotemata in Greek and Latin, in the version edited by Chrysoloras's student, the Greek teacher Guarino da Verona.

This was the first basic Greek grammar published in Europe. Because the work bears a Greek title page and large parts of the text are in Greek, this may also be considered the first book printed in Greek. Ten printed editions of this text were published in the fifteenth century.

ISTC No. ic00492000

The First Technical Dictionary 1473 – 1474

Printer Günther Zainer of Augsburg, Germany, issued Vocabularius, with text in both Latin and German. ISTC no. iv00322000.

Vocabularius rerum was the first technical dictionary and, after the Vocabularius ex quo of 1467 (of which one copy, printed in Eltville, Germany, is recorded; ISTC no. iv00361700), the first bilingual dictionary. The work was "devoted entirely to technical terms, each with its own section, of medicine (four sections), culinary and medicinal herbs and food plants, zoology, mining and mineralogy, navigation, architecture, textiles, tanning and leather work, musical instruments, books and book production, cooking and kitchen utensils, baking, wine and viticulture, gambling, carpentry, horses and carriages, etc.

"Some of the words are highly technical, lexicographical rarities. In the section on scribes and book production we find definitions not only of the traditional scribal tools (calamus, stilus, graphius, pugillaris, etc.), but also of such specialist words as antipira (= the scribe's eye-shade, for protection against the fire or candle-light), corrosorium (= the mill or grinder to reduce chalk to a powder for the preparation of vellum), and epicausterium (= the table-cloth on which the parchment is laid for ease of writing). None of these last words occurs, for example, in Karen Gould's "Terms for Book Production in a Fifteenth-Century Latin-English Nominale", The Papers of the Bibliographical Society of America, 79 (1985), pp. 75-99. There is also an entry on the distinction between the words liber, volumen, and codex; likewise between exemplar and exemplum.' (Nicholas Poole-Wilson). . . ." (W. P. Watson Antiquarian Books, online description, accessed 08-09-2009).

"Possessed of a knowledge of names rather than of things, the mediaeval student had one urgent need - a dictionary. New words began to pour in—in Arabic, Syriac, Hebrew, and Greek—whose meanings he sought to know; and, for the medical student, there were new drugs, the composition and uses of which were essential to his practice. It is not surprising then to find books of the dictionary class among the first to be printed. . . . The Vocabularius . . . has four sections devoted to medicine: (1) De homine et de diversis membris, in which the parts of the body are defined in order, with the German equivalents; brief references to authors are given. (2) De nominibus balneatorum etc., containing all the terms relating to bathing, bleeding, and cupping. (3) De medicis et eorum que pertinent ad medicine artes. The definitions here are most interesting... Siringa is described as a metallic instrument with which a surgeon injects resolving medicines into the Virile member in order to dissolve calculi in the bladder. (4) De nominibus quorundam egritudinum, contains seven and a half folios of definitions of diseases." (Osler, Incunabula medica).

The Earliest Printing of Any Book of the Bible in Greek 1481

In 1481 printer Bonus Accursius of Milan issued Psalterium Graeco-latinum cum Canticis. Edited by Johannes Crastonus, a Carmelite lexicographer, this liturgical psalter, publishing the Septuagint version of the Psalms with parallel Latin translation, was the earliest printing of any book of the Bible in Greek. Appended canticles, including the Benedictus and the Magnificat, represented the first texts of the New Testament printed in Greek.

"The type in this psalter is similar to the very first Greek type ever cut. That first type was designed by calligrapher Demetrios Damilas, a Cretan of Milan. It was perhaps modelled on the hand of Michael Apostolis [Apostolios] (b. ca. 1422) a prolific scribe whose script was notable for its lack of ligatures (unlike, for example, the Greek types that would become favored by Aldus Manutius), making it an easy and readable handwriting to render into type. The type appeared first in an edition of Constantine Lascaris' Epitome (Milan, 1476), the first book printed entirely in Greek, and soon thereafter in two works issued by Bonus Accursius: a dictionary by Crastonus and an undated edition of Aesop. In about 1480, Accursius' books featured a new type, presumably because the earlier types were unavailable. This new type is a variant of the older one, but remains an upright cursive, relatively free of ligatures. The letters are larger, and there are many new letterforms introduced in this second version. This psalter is the fifth and last book Accursius printed with this type" (John E. Mustain, Monuments of Printing: Gutenberg Through the Book Arts Revival [2013] p. 30).

In 2012 Cornelia Linde published "Johannes Crastonus's 1481 Edition of the Psalms," The Library, XIII, Issue 2, pp. 147-63. doi: 10.1093/library/13.2.147. In this paper she argued that Crastonus issued this edition of the Psalms to facilitate the learning of Greek. "On the basis of Psalm 1 and some additional examples, it explores how he employed the layout and changed the Latin text in order to achieve his goal. Furthermore, this article argues that the combination of works produced by Crastonus and his publisher Bonus Accursius were designed to provide a complete corpus for self-instruction in Greek."

ISTC No.: ip01035000

The Earliest Work Printed in England to Contain Color Printing 1486

An unidentified printer, known as the "Schoolmaster Printer," issued the Book of Hawking, Hunting, and Heraldry (also known as The Boke of St. Albans) from the town of St. Albans, England.

This work on hawking, hunting, heraldry, and etiquette was the earliest book printed in England to include color printing. It is also the first English book on heraldry and sports, and among the earliest, if not the earliest, printed book written by a woman, whose name is usually given as Juliana Berners, though this attribution has been disputed. Little is known about the presumed authoress; some of the most basic information about her is given in the second edition of this work issued by Wynkyn de Worde from his press at Westminster in 1496. She is said to have been prioress of Sopwell nunnery near St Albans, and daughter of Sir James Berners, who was beheaded in 1388.

This work "was, in effect an etiquette manual, one of a number of books published at that time--a period of social and linguistic flux following the Hundred Years War (1337-1463)--that showed gentlemen the proper way to act. Thus, the preponderance of terms for birds and animals results from the fact that The Book of St. Albans was concerned largely with hunting, shooting, and the like. It provided instruction on how to comport oneself in the hunt, but also on how to kill, clean and cook fish and game, and in what seasons and times of the day to sally forth. The book concludes with a list of correct terms, so that one could safely say one was hunting a singular of boars--not, heaven forfend, a group of them. As such, it takes the typical function of jargon--defining and reinforcing an exclusive group--to poetical extremes that have lingered in the language since.

"Less remembered terms from the Boke reflect the social life of the age. Berners gives us the appropriate ways to speak about a group of maidens (a rage), housekeepers (a foresight), officers (an execution), and even jugglers (a neverthriving--the poor men!). The tension between religious and social freedom prevalent in that era is also palpable: the group term for nuns is a superfluity, and for monks it is an abominable sight. (Tellingly, for the Scottish Reformation to come, that country contains a disworship of Scots). There is no standard linguistic analysis of the different types of collective terms, but Lipton inventories them using six different categories: onomatopoeia, characteristic, appearance, habitat, commentary, and error. Appearance brings us a knot of toads, for example, while characteristic gives us a building of rooks, for how rooks build their nests. By far the most illuminating are those that develop via characteristic to comment on social behavior, such as the foresight of housekeepers mentioned above; an obeisance of servants; an impatience of wives; and more cheeringly, a cajolery of taverners. In the category of errors, on the other hand, stands the rage of maidens--not related to any anger on the part of virgins, but rather coming from an Old French ragier, or wantonness (making for an unintentionally ironic commentary on maidenhood in 1486)" (Gronlund, "Inventory / A Pedantry of Nouns," Cabinet-A Quarterly of Art and Culture, Issue 41, [Spring 2011] 10).

Lipton, An Exaltation of Larks or, The Venereal Game (1977).

ISTC no. ib01030000.

The First Printed Grammar of a Vernacular August 18, 1492

Spanish scholar Antonio de Lebrija (also known as Elio Antonio de Lebrija, Antonius Nebrissensis, Aelius Antonius Nebrissensis, and Antonio of Lebrix; given name: Antonio Martinez de Calá) published Gramática de la lengua castellana, also known as the Grammatica Nebrissensis, and, in 15th century Spanish, Grammatica dela Lingua Castellianna. 

This work of 68 pages was published in Salamanca by an unidentified printer.

This was the first work dedicated to the Castilian or Spanish language, its orthography, prosody and syllables, etymology, diction, and syntax. It was also the first printed grammar of a spoken language rather than of Latin, whose grammar was based on the written rather than the spoken language.

In July 2011 a digital facsimile of the first edition was available at the website of the Universidad Complutense de Madrid. 

ISTC No.: ia00902000

Illich, In the Vineyard of the Text (1993) 73.

1500 – 1550

The Transition from Latin to the Vernacular in the 16th Century Circa 1500 – 1600

"The well defined traditional groups of readers knew Latin, and many read it with ease and better than their own mother tongue. Books in the vernacular languages were for 'every man, as well rude as learned,' and the student of literacy and literary taste must be as much concerned with the 'rude' as with the learned. Latin, the language of the educated, was the international language throughout the Middle Ages; this fact is reflected by the book production. Slightly more than three-fourths of surviving incunables are in Latin, the rest in different vernacular languages. Throughout the XVIth century the percentage of books in the vernacular increased, caused in part by the mounting concern of authors, printers and publishers with the 'rude' (men, women and children who were able or willing to read books in their own tongue, but not in Latin). It is also true that the importance of Latin as the language of communication among the learned declined, in spite of the revival of learning and increased concern with the classics and their style. Already during the first half of the XVIth century books in Latin and those in the vernacular languages were much more evenly distributed, and by the end of the XVIth century the latter accounted probably for more than half of the total production. Latin had lost its international character except among the clergy (of the Catholic Church), a coterie of Neo-Latin writers, and limited groups of scholars and professionals. National languages had won the battle. The favorable reception of books in the mother tongue was only one of several causes. Political and religious ferment of this period involved an ever increasing number of persons. In order to reach the largest possible number, the leaders and the propagandists turned more and more to the vernacular. A third factor was the changing attitude of the educated towards their own native language" (Hirsch, Printing, Selling, Reading 1450-1550 [1967] 132).

The First Modern Dictionary: the Most Successful and Widely Reprinted Reference Work of the Early Modern Period 1502

In 1502 Italian lexicographer Ambrosio Calepino (or Calepio) issued  Ambrosii Calepini bergomatis eremitani dictionarium from Reggio Emilia, Italy at the press of Dionysio Bertochi.  This work became the most successful and most widely reprinted reference book of the early modern period, undergoing an astonishing 166 editions in the sixteenth century, followed by 32 in the seventeenth and 13 in the eighteenth.

Calepino, an Augustinian monk from Bergamo,

“devoted some thirty years to composing his dictionary, which focused on classical Latin usage and on encyclopedic information and literary examples from ancient culture. In the years after his death many, mostly anonymous editors made modifications, corrections, and especially additions, often borrowing from other dictionaries . . . In the early modern period the Calepino not only became the most widely recognized brand of dictionary, still active in the early twentieth century, but it also came to stand for the entire dictionary genre . . . At the same time the success of the Calepino solidified the association of the title ‘dictionarium’ with the dictionary genre—only a few major dictionaries were called by another title” (Ann M. Blair, Too Much to Know: Managing Scholarly Information before the Modern Age [2010] 122).

The first edition of the Dictionarium was in Latin with a few Greek equivalents, but in 1545 editions began to be published with vernacular equivalents, and later editions boasted up to eleven languages.

The Dictionarium’s enormous success as a reference work meant that copies were “read to death”; also, the fact that the work underwent numerous revisions during its long publishing history suggests that the earlier editions might not have been retained in scholarly libraries. In January 2013 the first edition of Calepino's Dictionarium was quite rare: OCLC and the Karlsruhe Virtual Catalogue cited 11 copies in libraries, only one of which (the Indiana State University copy) was in the United States. 

One of the First General Reference Works Produced for the Printed Book Market 1503

In 1503 Domenico Nani Mirabelli (Dominic Nannius Mirabellius) issued Polyanthea opus suavissimis floribus exornatum . . .  from Savona, Italy at the press of Francesco Silva. This encyclopedic work of roughly 680 pages was one of the first general reference works produced for the printed book market. It was also one of the most popular reference works printed in the sixteenth century.

Nani Mirabelli, rector of schools and archpriest of the cathedral in Savona, also served as papal secretary. The Polyanthea contained selections from the writings of over 150 authors from Aristotle to Dante, arranged in alphabetical order and covering subjects in the fields of classical antiquity, medieval history, natural history and medicine. In the preface to the work Nani Mirabelli

"boasted that he had selected the best of literature, appropriate for the moral edification of young and old and of both sexes, and desired it to 'be useful to as many people as possible.' Nani devoted his ode to the reader to praising the censorship value of his selections—which 'plucked gold from amid filthy squalor' —perhaps precisely because he had cast his net more widely than his predecessors and feared criticism for doing so. He listed 163 authors excerpted and acknowledged that some of these had mocked the Holy Scriptures and taken positions contrary to the Catholic truth. But thanks to his careful selection, Nani promised safe passage through the shoals of pagan literature—both the raciness of Ovid or Horace and the obscurity of Aristotle—for the moral edification of Christians; he included quotations from a few recent authors like Dante and Petrarch. This theme of religious edification and safety was underscored by the engravings present in the first two editions. The title page of the first edition featured the author seated at an altar reaching for a basket of flowers around which were clustered religious and other worthy figures; the image helped elucidate a Greek title that Nani also explained in his preface lest readers not understand it as synonym for florilegium.

"At the same time as he played up the religious themes, Nani identified his principal audience as young people studying rhetoric. For them especially, Nani was proud to offer definitions and descriptions; Latin translations of all Greek expressions; sentences of philosophers, historians, and poets in Latin and Greek; and a tabular outline of the larger topics. The early Polyanthea served in part as a dictionary of hard words, offering in addition to the major articles, many very short ones, with just a definition, a Greek etymology, and one or even no quotation as an example" (Ann M. Blair, Too Much to Know: Managing Scholarly Information before the Modern Age [2010] 177-178; see also 179-85).

The Polyanthea went through at least 41 editions between 1503 and 1681, nearly all of which were revised and expanded by their successive editors. Blair estimated the length of the first edition at 430,000 words but estimated the 1619-20 Lyon edition at no less than 2.5 million words. Like other popular reference works of the early modern period, copies of the Polyanthea tended to suffer hard usage, and relatively few copies of the first edition survived. Blair was able to locate 20 copies of the first edition cited in online library catalogues; most of these copies are in Italy. In January 2013 OCLC recorded 10 copies, only three of which (Newberry Library, Harvard and U. Chicago) were in the United States.

Collinson, Encyclopedias: Their History Throughout the Ages (1964), 76-77. University of Chicago, Encyclopedism from Pliny to Borges, No. 17.

1550 – 1600

The First Book Printed in a Goidelic Language April 24, 1567

Foirm na n-Urrnuidheadh (The Form of the Prayers), Bishop Séon Carsuel's (John Carswell's) translation into Gaelic of the Book of Common Order or "Knox's Liturgy", was published in Edinburgh at the press of Roibeard (Robert) Lekprevik. This was the first work printed in either Scottish Gaelic or Irish, or in any of the Goidelic languages.

"Its language has been characterised as 'exuberant, highly decorated classical common Gaelic', and helped forward the message of Scottish protestantism from the English-speaking south-east of the country into Gaelic-speaking Scotland. It was written in the traditional orthography of Irish Classical Common Gaelic, and Donald Meek has suggested that if it were not for Carsuel's training in this form of literacy and his decision to use it, Scottish Gaelic today may be employing, like the Manx language, a script with orthographic rules more similar to English and French than traditional Irish.

"It was also ground-breaking in its use of prose for non-heroic material, 'the first to use this type of formal Classical [Gaelic] prose'. And Carsuel had indeed complained in his work about earlier Gaelic writings, slamming the

'. . . darkness of sin and ignorance and design of those who teach and write and cultivate Gaelic, that they are more designed, and more accustomed, to compose vain, seductive, lying and worldly tales about the Tuatha De Danann and the sons of Mil and the heroes and Finn MacCoul and his warriors and to cultivate and piece together much else which I will not enumerate or tell here, for the purpose of winning for themselves the vain rewards of the world.'

"In the late 19th century, his skeleton was dug up; the skeleton measured seven feet in length, making Carsuel an extremely tall man by the standard of any era or geographical location (Wikipedia article on Séon Carsuel, accessed 12-11-2009).

Of the first edition of Foirm na n-Urrnuidheadh, only three copies—all imperfect—are known to exist. One is in Edinburgh University Library.

First Complete Slavic Bible July 20, 1580 – August 12, 1581

Ivan Fyodorov (also Fedorov or Fedorovych; Russian: Ива́н Федоров) printed the first complete Slavic Bible. Fyodorov's book is known as the Ostrog Bible (Ukrainian: Острозька Біблія; Russian: Острожская Библия), because it was printed on the estate of the Ukrainian/Lithuanian prince Konstanty Wasyl Ostrogski (Belarusian: Канстантын Васiль Астрожскi; Lithuanian: Konstantinas Vasilijus Ostrogiškis; Ukrainian: Костянтин-Василь Острозький) at Ostrog (Ostroh), Ukraine.

"The Ostrog Bible is unique among Church Slavonic Bibles in that the Old Testament was translated not from the (Hebrew) Masoretic text, but from the (Greek) Septuagint. This translation, comprising seventy-six books of the Old and New Testaments, was based on the Gennadius Bible and a manuscript of the Codex Alexandrinus. Some parts were based on Francysk Skaryna's translations.

"The Ostrog Bibles were printed on two dates: 12 July 1580, and 12 August 1581. The second version differs from the 1580 original in composition, ornamentation, and correction of misprints. In the printing of the Bible delays occurred, as it was necessary to remove mistakes, to search for correct textual resolutions of questions, and to produce a correct translation. The editing of the Bible detained printing. In the meantime, Fyodorov and his company printed other biblical books. The first were those which did not require correcting: the Psalter and the New Testament.

"The Ostrog Bible is a monumental publication of 1,256 pages, lavishly decorated with headpieces and initials, which were prepared especially for it. From the typographical point of view, the Ostrog Bible is irreproachable. This is the first Bible printed in Cyrillic type. It served as the original and model for further Russian publications of the Bible. The importance of the first printed Cyrillic Bible can hardly be overestimated. Prince Ostrogski sent copies to Pope Gregory XIII and tsar Ivan the Terrible, while the latter presented a copy to an English ambassador. When leaving Ostroh, Fyodorov took 400 books with him. Only 300 copies of the Ostrog Bible are extant today" (Wikipedia article on Ostrog Bible, accessed 01-03-2010).

The First Book Written by a European Printed in China 1583 – 1584

Jesuit Michele Ruggieri, missionary in China, and, along with Father Matteo Ricci, one of the first sinologists, had his Catechism (Tianzhu shilu, "True Account of God") printed in the Chinese language at Zhaoqing (Chao-ch’ing). Printed from wood blocks, Ruggieri's Catechism was the first book written in Chinese by a European, and the first book written by a European in China and printed in China. 1200 copies were printed, of which only two seem to have survived.

It is thought that during 1583-88 Ruggieri collaborated with Matteo Ricci "in creating a Portuguese-Chinese dictionary - the first ever European-Chinese dictionary, for which they developed a consistent system for transcribing Chinese words in Latin alphabet. A Chinese Jesuit Lay Brother Sebastiano Fernandez, who had grown up and [had] been trained in Macau, assisted in this work. Unfortunately, the manuscript was misplaced in the Jesuit Archives in Rome, and re-discovered only in 1934, by Pasquale d'Elia. This dictionary was finally published in 2001" (Wikipedia article on Michele Ruggieri, accessed 01-28-2012).

The Vigenere Cipher 1586

French diplomat and cryptographer Blaise de Vigenère published in Paris Traicté des chiffres ou secrètes manières d'escrire.

Vigenère's book described a text autokey cipher. The simpler repeating-keyword polyalphabetic cipher now known as the Vigenère cipher was misattributed to Vigenère in the 19th century; it had been described earlier by Giovan Battista Bellaso (1553).

“Vigenère became acquainted with the writings of Alberti, Trithemius, and Porta when, at the age of twenty-six, he was sent to Rome on a two year diplomatic mission. To start with, his interest in cryptography was purely practical and was linked to his diplomatic work. Then, at the age of thirty-nine, Vigenère decided that he had accumulated enough money for him to be able to abandon his career and concentrate on a life of study. It was only then that he examined in detail the ideas of Alberti, Trithemius, and Porta, weaving them into a coherent and powerful new cipher … The cipher is known as the Vigenère cipher in honour of the man who developed it into its final form. The strength of the Vigenère cipher lies in its using not one, but 26 distinct cipher alphabets to encode a message… To unscramble the message, the intended receiver needs to know which row of the Vigenère square has been used to encipher each letter, so there must be an agreed system of switching between rows. This is achieved by using a keyword… Vigenère’s work culminated in his Traicté des Chiffres, published in 1586. Ironically, this was the same year that Thomas Phelippes was breaking the cipher of Mary Queen of Scots. If only Mary’s secretary had read this treatise, he would have known about the Vigenère cipher, Mary’s messages to Babington would have baffled Phelippes, and her life might have been spared” (Singh, The Code Book. The Secret History of Codes and Codebreaking, 46-51).
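
The keyword mechanism Singh describes can be illustrated with a minimal Python sketch. It assumes the familiar repeating-keyword form of the cipher, a 26-letter Latin alphabet, and a purely alphabetic keyword; the function name `vigenere` and the sample message and keyword in the usage note are illustrative choices, not taken from Vigenère's treatise.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 letters heading the rows of the Vigenère square


def vigenere(text, keyword, decrypt=False):
    """Encipher (or decipher) text with a repeating keyword.

    Each keyword letter selects one row of the square, i.e. one shifted
    cipher alphabet. Assumes the keyword contains letters only.
    """
    sign = -1 if decrypt else 1
    result = []
    key_index = 0  # advances only when a letter is actually enciphered
    for ch in text.upper():
        if ch not in ALPHABET:
            result.append(ch)  # spaces and punctuation pass through unchanged
            continue
        shift = ALPHABET.index(keyword[key_index % len(keyword)].upper())
        result.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
        key_index += 1
    return "".join(result)
```

For example, `vigenere("ATTACKATDAWN", "LEMON")` yields `LXFOPVEFRNHR`, and calling the function again on that result with `decrypt=True` recovers the original message.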

The Vigenère cipher was regarded as unbreakable for nearly 300 years, until Charles Babbage and Friedrich Kasiski independently developed methods for deducing the length of the keyword from repetitions in the ciphertext, which made successful cryptanalysis possible.

Leaves CCCXXVII-CCCXXXVI of Vigenère's work contain the first representations of Chinese and Japanese writing in a European printed book.

Galland, An Historical and Analytical Bibliography of the Literature of Cryptography, 193.

1600 – 1650

The First Bibliography Published in the New World 1606

Franciscan Fray Juan Bautista published A Jesu Christo S.N. ofrece este Sermonario en lengua mexicana in Mexico, En casa de Diego Lopez Davalos. This was the second collection of sermons published in Nahuatl (Aztec), and it was prefaced with a two-page list of previously published works by Bautista. The listing of books was the first bibliography published in the Western Hemisphere.

"On signature **iii (recto and verso) is a list of 'las obras que hasta agora ha impresso el auctor' ('the works that until now the author has had published'). The list is not in chronological order nor is it alphabetical by title; nonetheless it is a bibliography and supplies us with information now known only because of its inclusion here. Of the 17 items listed, several have failed to survive in any known copy, including the second part of this sermonario: at the time of publication of part one 'de la sequnda parte esta ya impresso gran pedaço' ('of the second part a large piece is already printed')" (Szewczyk & Buffington, 39 Books and Broadsides Printed In America Before the Bay Psalm Book [1989] no. 19).

Descartes Discusses the Idea of an Artificial Language 1629

In a letter to theologian, philosopher, and mathematician Marin Mersenne, philosopher, mathematician and physicist René Descartes proposed an artificial universal language, with equivalent ideas in different tongues sharing one symbol:

"Et si quelqu’un avait bien expliqué quelles sont les idées simples qui sont en l’imagination des hommes, desquelles se compose tout ce qu’ils pensent, et que cela fût reçu par tout le monde, j’oserais espérer ensuite une langue universelle, fort aisée à apprendre, à prononcer et à écrire."

"The notion of a universal language was based upon the idea of precisely cataloging the elements of the human imagination. The great advantage of such a language would be that it would represent everything 'distinctement.' Yet, the great problem faced by someone who wanted to create such a language was the nature of the human imagination itself. Although separate from the mind and reason, which were the foundations of Cartesian thought, the imagination nevertheless played an important role for Descartes. As he wrote elsewhere in the Meditations, the imagination not only conceptualized external things but also considers them, 'as being present by the power and internal application of my mind.' Imagination, in other words, produced the illusion of presence, figures appearing so that can the person can 'look upon them as present with the eyes of my mind.' As a result, Descartes remains highly suspicious of the imagination because it can produce appearances that have no corresponding reality. Descartes concluded his letter to Mersenne by dismissing hopes for a universal language or a real character as only being possible in a 'terrestrial paradise' or 'fairyland' because of the confused nature of signification and the variation of human understanding.

"Mais n’espérez pas de la voir jamais en usage; cela présuppose de grands changements en l’ordre des choses, et il faudrait que tout le Monde ne fût qu’un paradis terrestre, ce qui n’est bon à proposer que dans le pays des romans.

 "A universal language that would work at the level of the imagination, describing the actual 'things' of the external world, could only produce uniform results in the perfection of Eden or the ideal of fiction. One should, instead, stick with the institution of geometry as a method of rationalizing nature, a divine language grounded upon the cogito’s transmission of being. Descartes ultimately remains skeptical about any possibility of using alternative language games aside from mathematics in the project of rationalizing the world" (Batchelor, The Republic of Codes: Cryptographic Theory and Scientific Networks in the Seventeenth Century [1999] http://www.stanford.edu/dept/HPS/writingscience/Cryptography.html, accessed 01-22-2010).

1650 – 1700

Possibly the Earliest Model for Machine Translation 1661

Physician and alchemist Johann Joachim Becher published Character, pro notitia linguarum universali in Frankfurt. This proposal for a universal language in numeric form may have, to some extent, anticipated the idea of machine translation.

“‘Becher constructed a Latin dictionary that was almost ten times more vast (10,000 items). [...] For each item in Becher’s dictionary there is an Arabic number: the city of Zurich, for example, is designated by the number 10283. A second Arabic number refers the user to grammatical tables which supply verbal endings, the endings for the comparative and superlative forms of adjectives, or adverbial endings. A third number refers to case endings. The dedication 'Inventum Eminentissimo Principi' is written 4442. 2770:169:3. 6753:3, that is, '(My) Invention (to the) Eminent + superlative + dative singular, Prince + dative singular'. Unfortunately Becher was afraid that his system might prove difficult for peoples who did not know the Arabic numbers; he therefore thought up a system of his own for the direct visual representation of numbers. The system is atrociously complicated and almost totally illegible. [However, together with Gaspar Schott’s Technica curiosa (1664), Becher’s system has been seen] as tentative models for future practices of computer translation. In fact, it is sufficient to think of Becher’s pseudo-ideograms as instructions for electronic circuits, prescribing to a machine which path to follow through the memory in order to retrieve a given linguistic term, and we have a procedure for a word-for-word translation (with all the obvious inconveniences of such a merely mechanical program)’ (Umberto Eco, The Search for the Perfect Language, pp. 201–3).” See Bernard Quaritch Ltd., Logic and Language [PDF] Autumn 2008, number 1.
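
The word-for-word lookup that Eco describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tables hold just the numbers mentioned in the passage above, the English glosses stand in for Becher's Latin entries, and the two kinds of grammatical number are merged into a single hypothetical `GRAMMAR` table for brevity. The names `LEXICON`, `GRAMMAR` and `decode` are invented for this sketch.

```python
# Lexical numbers cited in the passage, with English glosses (assumed stand-ins
# for the Latin dictionary entries).
LEXICON = {4442: "Invention", 2770: "Eminent", 6753: "Prince", 10283: "Zurich"}

# Grammatical numbers cited in the passage. Merging Becher's separate tables
# into one dictionary is a simplification made for this sketch.
GRAMMAR = {169: "superlative", 3: "dative singular"}


def decode(message):
    """Translate a Becher-style numeric message group by group."""
    words = []
    for group in message.split():
        numbers = [int(n) for n in group.strip(".").split(":")]
        parts = [LEXICON[numbers[0]]]               # first number: dictionary item
        parts += [GRAMMAR[n] for n in numbers[1:]]  # later numbers: grammar
        words.append(" + ".join(parts))
    return words
```

Applied to the dedication, `decode("4442. 2770:169:3. 6753:3")` returns `["Invention", "Eminent + superlative + dative singular", "Prince + dative singular"]`, the same gloss Eco gives, which also shows the limitation he notes: the procedure retrieves terms mechanically and knows nothing of word order or idiom.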

The First Complete Bible Published in the Western Hemisphere 1661 – 1663

John Eliot, an English Puritan clergyman and missionary in Roxbury, Massachusetts, and printers Samuel Green and Marmaduke Johnson in Cambridge, Massachusetts, issued The Holy Bible: Containing the Old Testament and the New, Translated into the Indian Language.

This was the first complete edition of the Bible published in the Western Hemisphere, and “the earliest example in history of the translation and printing of the entire Bible in a new language as a means of evangelization” (Darlow and Moule).

On July 27, 1649, the English Parliament enacted an "Ordinance for the Advancement of Civilization and Christianity Among the Indians." This act created The Society for the Propagation of the Gospel in New England, the first Protestant missionary society. Also in 1649 Eliot made the decision to attempt the translation of the Scriptures into the Algonquin language. Like other Native American languages, Algonquin had no written form, and it was considered one of the world's most difficult languages. The process of translation of the Bible into the Natick dialect of the region's Algonquin tribes took Eliot ten years, with the assistance of John Sassamon, a member of the local tribe, whose ability to speak and write English proved invaluable.

“When the manuscript was ready for publication, the Society for the Propagation of the Gospel in New England not only provided the funds to print it, but they also sent an English printer by the name of Marmaduke Johnson, a printing press, and a supply of paper. Johnson arrived in the New World and set to work with Samuel Green who had already started to print the New Testament. By 1661 they had completed the printing of fifteen hundred copies of the New Testament. One thousand of the New Testaments were reserved for binding with the Old Testament, when completed, to form an entire Bible. The remaining copies of the New Testament were distributed among the Algonquin tribe or sent to England as presentation copies.

"When the task of printing the New Testament was complete, Green and Johnson began printing one thousand copies of the Old Testament, which included a translation of the Metrical Psalms. The work proceeded quickly and by 1663 the printing was finished. The Old Testaments were bound with the reserved copies of the New Testament to produce one thousand copies of the entire Bible” (Samworth, John Eliot and America's First Bible, accessed 12-30-2008).

A Universal Language Based on a Classification Scheme or Ontology, and a Universal System of Measurement 1668

English clergyman and natural philosopher John Wilkins published in London An Essay towards a Real Character and a Philosophical Language.

In this work Wilkins attempted to create a universal, artificial language, based upon an innovative classification of knowledge, by which scholars and philosophers, as well as diplomats and merchants, could communicate. Wilkins intended his "universal language" as a supplement to rather than a replacement for existing "natural" languages. His scheme has been called ingenious but completely unworkable.

In this book Wilkins also called for the institution of a "universal measure" or "universal metre," which would be based on a natural phenomenon rather than royal decree, and would be decimal, in contrast to the various systems of multipliers, often duodecimal, that coexisted at the time. The meter or metre would not gain traction until after the French Revolution.

By "real character" Wilkins meant:

"an ingeniously constructed family of symbols corresponding to an elaborate classification scheme developed at great labor by Wilkins and his colleagues, which was intended to provide elementary building blocks from which could be constructed the universe's every possible thing and notion. The Real Character is emphatically not an orthography in that it is not a written representation of oral speech. Instead, each symbol represents a concept directly, without (at least in the early parts of the Essay's presentation) there being any way of vocalizing it at all; each reader might, if he wished, give voice to the text in his or her own tongue. Inspiration for this approach came in part from (partially mistaken) accounts of the Chinese writing system.

"Later in the Essay Wilkins introduces his "Philospophical Language," which assigns phonetic values to the Real Characters, should it be desired to read text aloud without using any of the existing national languages. (The term philosophical language is an ill-defined one, used by various authors over time to mean a variety of things; most of the description found at the article on "philosophical languages" applies to Wilkins' Real Character on its own, even excluding what Wilkins called his "Philosophical Language")

"For convenience, the following discussion blurs the distinction between Wilkins' Character and his Language. Concepts are divided into forty main Genera, each of which gives the first, two-letter syllable of the word; a Genus is divided into Differences, each of which adds another letter; and Differences are divided into Species, which add a fourth letter. For instance, Zi identifies the Genus of “beasts” (mammals); Zit gives the Difference of “rapacious beasts of the dog kind”; Zitα gives the Species of dogs. (Sometimes the first letter indicates a supercategory— e.g. Z always indicates an animal— but this does not always hold.) The resulting Character, and its vocalization, for a given concept thus captures, to some extent, the concept's semantics.

"The Essay also proposed ideas on weights and measure similar to those later found in the metric system. The botanical section of the essay was contributed by John Ray; . . .  

 "Jorge Luis Borges wrote a critique of Wilkins' philosophical language in his essay El idioma analítico de John Wilkins (The Analytical Language of John Wilkins). He compares Wilkins’ classification to the fictitious Chinese encyclopedia Celestial Emporium of Benevolent Knowledge, expressing doubts about all attempts at a universal classification. Modern information theory also suggests that it is a bad idea to have words with similar but distinct meanings also sound similar, because mishearings and the resulting confusion would be much more prominent than in real-world languages. In The Search for the Perfect Language, Umberto Eco catches Wilkins himself making this kind of mistake in his text, using Gαde (barley) instead of Gαpe (tulip)" (Wikipedia article on An Essay towards a Real Character and a Philosophical Language, accessed 06-16-2010).

Leibniz on Binary Arithmetic March 15, 1679 – 1705

A dated manuscript by Gottfried Wilhelm Leibniz, preserved in the Gottfried Wilhelm Leibniz Bibliothek (Niedersächsische Landesbibliothek), Hannover, “includes a brief discussion of the possibility of designing a mechanical binary calculator which would use moving balls to represent binary digits.”

Though Leibniz thought of the application of binary arithmetic to computing in 1679, the machine he outlined was never built, and he published nothing on the subject until his "Explication de l'arithmétique binaire, qui se sert des seuls caracteres 0 & 1; avec des remarques sur son utilité, & sur ce qu'elle donne le sens des anciens figures Chinoises de Fohy," published in Histoire de l'Académie Royale des Sciences année MDCCIII. Avec les mémoires de mathématiques, which appeared in print in 1705.

"The publication of the Explication was prompted by Leibniz's correspondence with Joachim Bouvet, a member of the Jesuit Mission in China. Leibniz had developed an interest in China, and in April 1697 he edited a collection of letters and essays by members of the Mission, entitled Novissima Sinica. A copy of this came into the hands of Bouvet, who wrote to Leibniz on 18 October 1697 expressing his commendation of the work. Thus began an extended correspondence between the two men which proved to be very important for the dissemination of Leibniz's ideas about binary arithmetic. The crucial exchange began on 15 February 1701, when Leibniz wrote to Bouvet describing for his correspondent the principles of his binary arithmetic, including the analogy of the formation of all the numbers from 0 and 1 with the creation of the world by God out of nothing. Bouvet immediately recognised the relationship between the hexagrams of the I ching and the binary numbers and he communicated his discovery in a letter written in Peking on 4 November 1701. This reached Leibniz, after a detour through England, on 1 April 1703. With this letter, Bouvet enclosed a woodcut of the arrangement of the hexagrams attributed to Fu-Hsi, the mythical founder of Chinese culture, which holds the key to the identification. Within a week of receiving Bouvet's letter, Leibniz had sent to Abbé Bignon for publication in the Mémoires of the Paris Academy his Explication de l'Arithmétique binaire,... & sue ce qu'elle donne le sens des anciens figures Chinoises de Fohy. Ten days later he sent a brief account to Hans Sloane, the Secretary of the Royal Society. Leibniz viewed binary arithmetic less as a computational tool than as a means of discovering mathematical, philosophical and even theological truths. He remarked to Tschirnhaus in 1682 that he anticipated from the use of binary numbers discoveries in number theory that other progressions could not reveal. 
It was at the same time a candidate for the characteristica generalis, his long sought-for alphabet of human thought. With base 2 numeration Leibniz witnessed a confluence of several intellectual strands in his world view, including theological and mystical ideas of order, harmony and creation. Fontanelle, secretary of the Paris Academy, wrote the unsigned review of Liebniz's paper for the Mémoires section of the volume. He noted that arithmetic could have different bases besides ten; bases such as 12, and two as in the case of Leibniz's binary system. He also noted that although the binary system was not practical for common use Leibniz thought that it would be of advantage in advanced mathematics" (W.P. Watson, antiquarian book description, http://www.ilabdatabase.com/db/detail.php?booknr=360538539, accessed 01-21-2010).
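
The arithmetic at issue, and the correspondence Bouvet noticed with the hexagrams, can be illustrated with a short Python sketch. It assumes that a solid line is read as 1 and a broken line as 0, and that the first line listed is the most significant digit; the reading order is an assumption of this sketch, and the function names are invented.

```python
def to_binary(n):
    """Write a non-negative integer using only the characters 0 and 1."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder gives the next binary digit
        n //= 2
    return "".join(reversed(digits))


def hexagram_value(lines):
    """Read six lines (True = solid = 1, False = broken = 0) as a binary number.

    The first element is treated as the most significant digit (an assumption).
    """
    value = 0
    for solid in lines:
        value = value * 2 + (1 if solid else 0)
    return value
```

Under these assumptions `to_binary(5)` gives `101`, six broken lines give 0, and six solid lines give 63, so the 64 hexagrams correspond to the numbers 0 through 63.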

This manuscript was first published in 1966 to commemorate the 250th anniversary of Leibniz's death as Herrn von Leibniz' Rechnung mit Null und Eins. That book included facsimiles of Leibniz's "Explication de l'arithmétique binaire" (1705), his two letters to Johann Christian Schulenburg on binary arithmetic (March 29 and May 17, 1698), published in the Opera Omnia of 1768, and historical articles and German translations.

1700 – 1750

Popol Vuh, The Book of the People, Known from a Single Manuscript 1701 – 2012

Between 1701 and 1703 Dominican priest, scholar and linguist Francisco Ximénez, serving in the parish at Santo Tomás Chichicastenango, a town in the El Quiché department of Guatemala, transcribed the corpus of mytho-historical narratives of the Postclassic K'iche' kingdom known as Popol Vuh (Popol Buj, "Book of the Community", "Book of the Council", "Book of the People"). All editions of this work, written in the Classical K'iche' language, are based on the single manuscript that Father Ximénez transcribed, which is preserved in the Newberry Library, Chicago. Ximénez's manuscript recorded parallel texts in K'iche' and Spanish. What Ximénez transcribed was presumably a codex written shortly after the Spanish conquest by a Quiché native, who had learned to read and write Spanish, containing cosmological concepts and ancient traditions of this aboriginal American people, their history and origin, and the chronology of their kings down to the year 1550. The fate of the original manuscript after its transcription by Ximénez is unknown.

Prior to its arrival at the Newberry, the manuscript passed through several hands. In 1855, French writer, ethnographer, historian and archaeologist Charles Étienne Brasseur de Bourbourg found Ximénez's writings in the university library in Guatemala City, and perhaps absconded with the volume and took it to France. In 1861 Brasseur de Bourbourg published in Paris a French translation of the text as Popol Vuh, Le livre sacré et les mythes de l'antiquité américaine. After Brasseur's death in 1874 the Mexico-Guatémalienne collection containing Popol Vuh passed to French explorer, philologist, and ethnographer Alphonse Pinart, through whom it was sold to businessman and collector Edward E. Ayer, who donated his vast library on the history of Native Americans in North and Central America to The Newberry Library in 1911.

Father Ximénez's manuscript was reproduced online, with K'iche' text and Spanish and English translations, by The Ohio State University.

The first English translation of Popol Vuh was made by Delia Goetz and Sylvanus Griswold Morley from a translation into Spanish by Adrián Recinos and published in 1950 as The Book of the People: Popol Vuh. The National Book of the Ancient Quiché Maya. In 1954 this edition was reissued by the Limited Editions Club, finely printed by Saul Marks at The Plantin Press, Los Angeles. Late in 2012 I acquired a copy of the LEC edition. In their introduction to the translation (p. xv) the translators state that:

"Besides the Mansuscrito de Chichicastenango, the following are the original original Quiché documents which are preserved:

"1. The original manuscript of the Historia Quiché by Don Juan de Torres, dated October 24, 1580, which differens from the manuscript which Fuentes y Guzmán cites and which contains the account of the kings and lords, chiefs of the Great Houses, and of the chinamitales or calpules of the Quiché;

"2. The Spanish translation of the Títulos de los antiquos nuestros antepasados, los que ganaron las tierras de Otzoyá, written apparently in 1524 and bearing the signature of Don Pedro de Alvarado;

"3. The Spanish translation of the Título de los Señores de Totonicapán, dated 1554; and

"4. The Papel de Origen de los Señores included in the Descripción de Zapolitlán y Suchitepec, año de 1579.

"Despite thier brevity, these documents contain interesting accounts of the origin, political organization, and history of the Quiché people, which supplement the information given in the Popol Vuh."

♦ I had been unaware of the Popol Vuh until the latter part of 2012 when various articles began appearing in the press concerning what was characterized as the 2012 phenomenon, "a range of eschatological beliefs that cataclysmic or transformative events would occur around 21 December 2012. This date was regarded as the end-date of a 5,125-year-long cycle in the Mesoamerican Long Count calendar, and as such, Mayan festivities to commemorate the end of the b'ak'tun 13 took place on 21 December 2012 in the countries that were part of the Mayan empire (Mexico, Guatemala, Honduras, and El Salvador), with main events at Chichén Itzá in Mexico, and Tikal in Guatemala." Because 12-21-12 happened to be my daughter Alex's 21st birthday, the topic became a source of amusement around our house.

The First Book Printed by Muslims Using Movable Type 1729

In 1729, two years after he received permission to print, Ibrahim Muteferrika founded the first printing press in Turkey, in his home at Constantinople. According to extant Ottoman documents, during the intervening two years Muteferrika cut his own punches and cast his own Arabic type, a typeface different from European typefaces of the period, closer to Naskh (Naskhi, Nesih) the standard book hand of the Muslim world. Muteferrika's first publication—the first book printed in Arabic by Muslims— was a Turkish translation of an Arabic dictionary in two thick volumes, the first containing 666 pages and the second containing 756 pages. The edition consisted of 1000 copies.

"Known as the Sahah ('The Correct"), it was composed in the 10th century by al-Jawhari, and is one of the classics of Arabic lexiocography. It contains more than 22,000 root words, and each usage is illustrated by quotations from the poets" (http://muslimheritage.com/topics/default.cfm?ArticleID=988, accessed 06-10-2012). 

1750 – 1800

The First Extensive Treatise on the Peruvian Knot-Based Counting Language, the Quipu 1750

In 1750 the Neapolitan polymath and inventor Raimondo di Sangro, Prince of Sansevero, issued Lettera apologetica dell'esercitato accademico della Crusca contenente la Difesa del Libro Intitolato Lettere d'una Peruana per rispetto alla supposizione de'Quipu from the press of Gennaro Morelli of Naples. This work, printed in color using a polychromatic printing process invented by the Prince, was the first extensive treatise on the Peruvian knot-based counting language, the Quipu.

Quipu used a decimal positional system: a knot in a row farthest from the main strand represented one, next farthest ten, etc.; the absence of knots on a cord implied zero. The colors of the cords, the way the cords are connected together, the relative placement of the cords, the spaces between the cords, the types of knots on the individual cords, and the relative placement of the knots are all important parts of the recording system. ‘Quipucamayocs,’ the accountants of the Inca Empire, created and deciphered the Quipu knots, and were also capable of performing simple mathematical calculations such as adding, subtracting, multiplying, and dividing. Quipu accounts were kept by court historians in Peru that covered hundreds of years of history, but after the Conquest, the Spaniards began to resent having this second set of record-keepers contradict them. The Quipu was classified as idolatrous at the Third Council of Lima (1581-3), and many examples were destroyed. Thus, by the time Raimondo di Sangro published his book the Quipu was no longer practiced, and attempting to understand the language was a research project in cryptanalysis.
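
The decimal positional reading described above can be expressed as a minimal Python sketch. It assumes a cord is summarized as a list of knot counts, one per row, starting with the row farthest from the main strand (the units); cord color, knot type and spacing, which the passage says also carried meaning, are ignored. The function name `cord_value` is hypothetical.

```python
def cord_value(knot_counts):
    """Return the number recorded on one cord.

    knot_counts[0] is the row farthest from the main strand (ones),
    knot_counts[1] the next row (tens), and so on. A row with no knots
    is written as 0; an empty cord therefore reads as zero.
    """
    return sum(count * 10 ** position
               for position, count in enumerate(knot_counts))
```

For instance, a cord with three knots in the units row, none in the tens row, and four in the hundreds row, `cord_value([3, 0, 4])`, reads as 403.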

"To date, no link has yet been found between a quipu and Quechua, the native language of the Peruvian Andes. This suggests that quipus are not a glottographic writing system and have no phonetic referent. Frank Salomon at the University of Wisconsin has argued that quipus are actually a semasiographic language, a system of representative symbols—such as music notation or numerals—that relay information but are not directly related to the speech sounds of a particular language. The Khipu Database Project (KDP), begun by Gary Urton, may have already decoded the first word from a quipu—the name of a village, Puruchuco, which Urton believes was represented by a three-number sequence, similar to a ZIP code. If this conjecture is correct, quipus are the only known example of a complex language recorded in a 3-D system. (Wikipedia article on Quipu, accessed 04-07-2013).

The Copiale Cipher is Decrypted: Initiation into a Secret Society of Oculists Circa 1760 – 1780

The Copiale Cipher, an encrypted manuscript preserved at the German Academy of Sciences at Berlin, consisting of 75,000 characters on 105 pages, was decoded in April 2011 by an international team led by Kevin Knight of the University of Southern California, using computer techniques.

The cipher employed in the manuscript consists of 90 different characters, from Roman and Greek letters, to diacritics and abstract symbols. Catchwords (preview fragments) of one to three or four characters are written at the bottom of left–hand pages. The plain-text letters of the message were found to be encoded by accented Roman letters, Greek letters and symbols, with unaccented Roman letters serving only to represent spaces.
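The scheme described above, in which several cipher symbols can stand for one plain-text letter and unaccented Roman letters mark spaces, can be sketched in a few lines. The key below (TOY_KEY) is entirely invented for illustration and is not the key recovered by Knight's team; the function name decode is likewise an assumption of this example.

```python
import string

# Hypothetical key, invented for this example; it is NOT the real Copiale key.
# Two different symbols map to "e", as happens in a homophonic cipher.
TOY_KEY = {
    "\u00e2": "e",  # a with circumflex
    "\u03c0": "e",  # Greek pi
    "\u03b4": "d",  # Greek delta
    "\u00ee": "r",  # i with circumflex
    "\u00ea": "n",  # e with circumflex
}


def decode(ciphertext):
    """Decode a toy ciphertext one symbol at a time."""
    plain = []
    for symbol in ciphertext:
        if symbol in string.ascii_letters:
            # Unaccented Roman letters carry no message; they mark a space.
            plain.append(" ")
        else:
            # Symbols missing from the toy key are shown as "?".
            plain.append(TOY_KEY.get(symbol, "?"))
    return "".join(plain)


# delta, a-circumflex, i-circumflex, plain "a", delta, pi, e-circumflex
sample = "\u03b4\u00e2\u00eea\u03b4\u03c0\u00ea"
print(decode(sample))
```

The sample decodes to two short German words separated by the space that the plain "a" stands for.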

"The researchers found that the initial portion of 16 pages describes an initiation ceremony for a secret society, namely the "high enlightened (Hocherleuchtete) oculist order" of Wolfenbüttel. A parallel manuscript is kept at the Staatsarchiv Wolfenbüttel. The document describes, among other things, an initiation ritual in which the candidate is asked to read a blank piece of paper and, on confessing inability to do so, is given eyeglasses and asked to try again, and then again after washing the eyes with a cloth, followed by an 'operation' in which a single eyebrow hair is plucked "(Wikipedia article on Copiale Cipher, accessed 12-11-2011).

Reforming the Teaching of English in the United States 1783 – 1785

In 1783 American lexicographer, textbook pioneer, English spelling reformer, political writer, editor, and prolific author Noah Webster issued from Hartford, Connecticut, the first part of A Grammatical Institute of the English Language. The complete work consisted of a speller (1783), a grammar first published in 1784, and a reader first published in 1785.

"The Speller was arranged so that it could be easily taught to students, and it progressed by age. From his own experiences as a teacher, Webster thought the Speller should be simple and gave an orderly presentation of words and the rules of spelling and pronunciation. He believed students learned most readily when he broke a complex problem into its component parts and had each pupil master one part before moving to the next. Ellis argues that Webster anticipated some of the insights currently associated with Jean Piaget's theory of cognitive development. Webster said that children pass through distinctive learning phases in which they master increasingly complex or abstract tasks. Therefore, teachers must not try to teach a three-year-old how to read; they could not do it until age five. He organized his speller accordingly, beginning with the alphabet and moving systematically through the different sounds of vowels and consonants, then syllables, then simple words, then more complex words, then sentences.

"The speller was originally titled The First Part of the Grammatical Institute of the English Language. Over the course of 385 editions in his lifetime, the title was changed in 1786 to The American Spelling Book, and again in 1829 to The Elementary Spelling Book. Most people called it the "Blue-Backed Speller" because of its blue cover, and for the next one hundred years, Webster's book taught children how to read, spell, and pronounce words. It was the most popular American book of its time; by 1837 it had sold 15 million copies, and some 60 million by 1890—reaching the majority of young students in the nation's first century. Its royalty of a half-cent per copy was enough to sustain Webster in his other endeavors. It also helped create the popular contests known as spelling bees.

"Slowly, edition by edition, Webster changed the spelling of words, making them "Americanized." He chose s over c in words like defense, he changed the re to er in words like center, and he dropped one of the Ls in traveler. At first he kept the u in words like colour or favour but dropped it in later editions. . . .

"Webster's Speller was entirely secular. It ended with two pages of important dates in American history, beginning with Columbus's in 1492 and ending with the battle of Yorktown in 1781. There was no mention of God, the Bible, or sacred events. 'Let sacred things be appropriated for sacred purposes,' wrote Webster. As Ellis explains, 'Webster began to construct a secular catechism to the nation-state. Here was the first appearance of 'civics' in American schoolbooks. In this sense, Webster's speller becoming what was to be the secular successor to The New England Primer with its explicitly biblical injunctions' " (Wikipedia article on Noah Webster, accessed 06-05-2012).

Foundation of Comparative Linguistics February 2, 1786 – 1788

Philologist William Jones delivered The third anniversary discourse . . . [On the Hindus] on February 2, 1786. This was first published in 1788 in Volume One of Asiatick Researches: Or, Transactions of the Society Instituted in Bengal, for Inquiring into the History and Antiquities, the Arts, Sciences and Literature, of Asia. In his paper, printed in Calcutta (Kolkata), India in the English language, Jones announced his discovery of the relationship between the Sanskrit, Greek, Latin, Gothic and Celtic languages, marking the foundation of comparative philology and historical linguistics. Jones’s “clear understanding of the basic principles of scientific linguistics provided the foundations on which Rask, Bopp and Grimm built the imposing structure of comparative Indo-European studies” (Carter & Muir, Printing and the Mind of Man [1967] no. 235).

The First Successful Speech Synthesizer 1791

Austro-Hungarian author and inventor Wolfgang von Kempelen published in Vienna Mechanismus der menschlichen Sprache nebst Beschreibung seiner sprechenden Maschine, in which he discussed the origins and development of languages, and described the first successful speech synthesizer.

Unlike von Kempelen’s fraudulent chess-playing Turk automaton, Kempelen's speech synthesizer actually worked. It was the first that produced not only some speech sounds, but also whole words and short sentences. He believed that it was possible to acquire skill in using the machine within three weeks, especially if one chose to synthesize sentences in Latin, French, or Italian. Von Kempelen considered German much more difficult to synthesize because of its many closed syllables and consonant clusters.

"The machine consisted of a bellows that simulated the lungs and was to be operated with the right forearm (uppermost drawing). A counterweight provided for inhalation. The middle and lower drawings show the 'wind box' that was provided with some levers to be actuated with the fingers of the right hand, the 'mouth', made of rubber, and the 'nose' of the machine. The two nostrils had to be covered with two fingers unless a nasal was to be produced. The whole speech production mechanism was enclosed in a box with holes for the hands and additional holes in its cover.

"The air flow was conducted into the mouth not only by way of an oscillating reed, but also through a narrow shunting tube. This allowed the air pressure in the mouth cavity to increase when its opening was covered tightly in order to produce unvoiced speech sounds. Driven by a spring, a small auxiliary bellows would then deliver an extra puff of air at the release.

"With the left hand, it was also possible to control the resonance properties of the mouth by varied covering of its opening. In this way, some vowels and consonants could be simulated in sufficient approximation. This was not really a simulation of natural articulation, since the shape of the mouth of the machine in itself remained constant. Some vowels and, especially, the consonants [d t g k] could not be simulated in this way, but only feigned, at best. An [l] could be produced by putting the thumb into the mouth.

"The function of the vocal cords was simulated by a slamming reed made of ivory (leftmost drawing). Although the effective length of the reed could be varied, this could not be done during speech production, so that the machine spoke on a monotone.

"Two of the levers to be actuated with the right hand served the production of the fricatives [s] and . . . as well as [z] and . . . by means of separate, hissing whistles (right drawing). A third one effectuated the production of a rattling [R] by dropping a wire on the vibrating reed (middle drawing)." (http://www.ling.su.se/staff/hartmut/kemplne.htm, accessed 12-14-2008).

Kempelen's final version of the machine, which differs slightly from the version shown in the book, is preserved in the Deutsches Museum, Munich, in the department of musical instruments.

Because Kempelen's speech synthesizer required a human for its operation, it was not literally an automaton, but it may be thought of as a forerunner of robotic or computer speech synthesizers.

The Rosetta Stone July 15, 1799

On July 15, 1799 Captain Pierre-François Bouchard, with Napoleon in Egypt, discovered a dark stone in the ruins of Fort St. Julien near the coastal city of Rosetta (Arabic: رشيد‎ Rašīd, French: Rosette), 65 kilometers east of Alexandria, on which was carved a decree from the Ptolemaic period in 196 BCE passed by a council of priests.

This stone was later understood to be one of a series of Ptolemaic decrees issued over the reign of the Hellenistic Ptolemaic dynasty, which ruled Egypt from 305 BCE  to 30 BCE, and put up in major temple complexes in Egypt. The Rosetta Stone affirmed the royal cult of the 13-year-old Ptolemy V as a living god on the first anniversary of his coronation. The decree was written in Egyptian Demotic script (the native script used for daily purposes), in classical Greek (the language of the administration), and in Egyptian hieroglyphs (suitable for a priestly decree). 

Following the death of Alexander the Great in 323 BCE, the Ptolemaic dynasty in Egypt had been established by the first Ptolemy, known as Ptolemy I Soter, one of Alexander's generals. Ignorant of the Egyptian language, the Ptolemies required their officials to speak Greek and made Greek the language of their administration, a requirement that remained in effect throughout their dynasty, which lasted for nearly three centuries. During their rule the Ptolemies made their capital city, Alexandria, the most advanced cultural center in the Greek-speaking world, for centuries second only to Rome. Among their most famous projects were the Royal Library of Alexandria and the Pharos Lighthouse, or Lighthouse of Alexandria, one of the Seven Wonders of the Ancient World.

Perhaps an indirect result of the Ptolemaic dynasty's replacement of hieroglyphics by Greek among the educated non-priestly class was that most educated Egyptians gradually lost the ability to read their ancient pictographic language. However, a more direct cause of this loss may have been the centuries of Roman and Byzantine rule following the Ptolemies, under which the temples whose priests retained the use of hieroglyphs were closed. Reconstructing knowledge of the ancient hieroglyphic language eventually became one of the greatest and most challenging problems for archeologists and linguists.

After its discovery in 1799 the three approximately parallel texts on the Rosetta Stone became key pieces of evidence in the research by Johan David Åkerblad and Thomas Young, culminating in Jean-François Champollion's translation of the hieroglyphic text on the stone in 1822.

The first publication on the Rosetta Stone was Antoine Isaac Silvestre de Sacy's pamphlet Lettre au Citoyen Chaptal . . . au sujet de l'inscription Égyptienne du monument trouvé à Rosette (Paris, 1802). In this brief work, illustrated with one transcription of a portion of the stone, the orientalist and linguist Sacy, a teacher of Champollion, made some progress in identifying proper names in the demotic inscription. Within the same year another student of Sacy, the Swedish diplomat and orientalist Johan David Åkerblad, published another "lettre" in which he described how he had managed to identify all proper names in the demotic text in just two months.

"He could also read words like "Greek", "temple" and "Egyptian" and found out the correct sound value from 14 of the 29 signs, but he wrongly believed the demotic hieroglyphs to be entirely alphabetic. One of his strategies of comparing the demotic to Coptic later became a key in Champollion's eventual decipherment of the hieroglyphic script and the Ancient Egyptian language" (Wikipedia article on Johan David Akerblad, accessed 12-27-2012).

The Rosetta Stone was forfeited to the English in 1801 under the terms of the Treaty of Alexandria. In 1802 it was placed in the British Museum, where it remains.

"At some period after its arrival in London, the inscriptions on the stone were coloured in white chalk to make them more legible, and the remaining surface was covered with a layer of carnauba wax designed to protect the Rosetta Stone from visitors' fingers. This gave a dark colour to the stone that led to its mistaken identification as black basalt. These additions were removed when the stone was cleaned in 1999, revealing the original dark grey tint of the rock, the sparkle of its crystalline structure, and a pink vein running across the top left corner. Comparisons with the Klemm collection of Egyptian rock samples showed a close resemblance to rock from a small granodiorite quarry at Gebel Tingar on the west bank of the Nile, west of Elephantine in the region of Aswan; the pink vein is typical of granodiorite from this region. The Rosetta Stone is now 114.4 centimetres (45 in) high at its highest point, 72.3 cm (28.5 in) wide, and 27.9 cm (11 in) thick. It weighs approximately 760 kilograms (1,700 lb). It bears three inscriptions: the top register in Ancient Egyptian hieroglyphs, the second in the Egyptian demotic script, and the third in Ancient Greek. The front surface is polished and the inscriptions lightly incised on it; the sides of the stone are smoothed, but the back is only roughly worked, presumably because this would have not been visible when it was erected" (Wikipedia article Rosetta Stone, accessed 06-10-2011).

♦ When I revised this database entry in October 2012 I noted that the Rosetta Stone was the most widely viewed object in the British Museum. Reflective of this intense interest, the British Museum shop then offered a remarkably wide range of products with the Rosetta Stone motif, ranging from umbrellas, to coffee mugs, mousepads, neckties, and iPhone cases.

1800 – 1850

Phasing Out Latin as the International Language 1800

Around the year 1800 publication of scientific and medical books in Latin— the international language of scholarship, religion, and science since the Roman Empire— gradually ceased. As the 19th century unfolded most scientific and medical books were published in their vernacular language of authorship, or in French, German or English. Works of scholarship or bibliography that involved Latin texts, and assumed knowledge of Latin, continued to be published in Latin mainly during the first half of the 19th century.

Webster's Dictionary 1806 – 1828

In 1806 American lexicographer, textbook pioneer, English spelling reformer, and writer Noah Webster published from Hartford and New Haven, Connecticut, A Compendious Dictionary of the English Language. In which Five Thousand Words are added to the number found in the Best English Compends; The Orthography, in some instances, corrected; the Pronunciation marked by an Accent or other suitable Direction; and the Definitions of many Words amended and improved. This small octavo volume was the first dictionary of American English. It was innovative in several ways: through the reform of spelling, through its guides to pronunciation, through its inclusion of etymologies, and through the modernity of its word selection and its definitions.  The work was designed to be both brief and portable. Its 400 pages were mostly divided into two columns and definitions were printed in small type, across one column each, and margins on each page were minimal.

Almost as soon as his first dictionary was published Webster began composition of an expanded and fully comprehensive dictionary, which took him 18 years to complete.  In 1828 when Webster was 70 years old his An American Dictionary of the English Language was finally published in 2 thick quarto volumes containing 70,000 entries. 2500 copies were printed at the high cost of $20 each. Copies sold slowly, and were not all bound at the same time, resulting in binding variants.

"To evaluate the etymology of words, Webster learned twenty-six languages, including Old English (Anglo-Saxon), German, Greek, Latin, Italian, Spanish, French, Hebrew, Arabic, and Sanskrit. Webster hoped to standardize American speech, since Americans in different parts of the country used different languages. They also spelled, pronounced, and used English words differently.

"Webster completed his dictionary during his year abroad in 1825 in Paris, France, and at the University of Cambridge. His book contained seventy thousand words, of which twelve thousand had never appeared in a published dictionary before. As a spelling reformer, Webster believed that English spelling rules were unnecessarily complex, so his dictionary introduced American English spellings, replacing "colour" with "color", substituting "wagon" for "waggon", and printing "center" instead of "centre". He also added American words, like "skunk" and "squash", that did not appear in British dictionaries . . ." (Wikipedia article on Noah Webster, accessed 06-05-2012).

Webster's original manuscript of his 1828 dictionary is preserved in the Morgan Library & Museum.

Deciphering the Hieroglyphs 1822

Having examined texts brought back from Egypt during Napoleon's Egyptian campaign, Jean-François Champollion published in Paris Lettre à M. Dacier relative à l'alphabet des hiéroglyphes phonétiques. In this 55-page work, in which the evidence of the Rosetta Stone played a key role, Champollion began to identify a relationship between hieroglyphic and non-hieroglyphic scripts, deciphering Egyptian hieroglyphs, the meaning of which had been lost for over 1500 years.

The First Indigenous Arabic Press in Egypt December 1822

In 1822 Muhammad Ali Pasha al-Mas'ud ibn Agha (Arabic: محمد علي باشا‎, Muḥammad ʿAlī Bāšā), self-declared Khedive of Egypt and Sudan, established a government press in Bulaq (Boulaq), Egypt, to print manuals for the military, an official manual for the administration, and textbooks for new schools.

This was the first indigenous Arabic press set up in Egypt by Muslims. It was also the first government press on the African continent, apart from the short-lived presses briefly established by Napoleon during his Egyptian campaign.

"In 1815 he [Muhammad Ali] sent Nicolas Musabiki to Rome and Milan to study type-founding and printing. Muhammad Ali also ordered three presses from Milan - along with the necessary paper and ink from Leghorn and Trieste - and, when Musabiki returned, made him manager of the Bulaq Press, working under 'Uthman Nur al-Din. The press itself, in the meantime, had been established in old Nile port of Bulaq, now a suburb of Cairo, and shortly afterwards, the second, and largest, student mission - it numbered 44 students - had returned from Paris. These men, under the leadship of Rifa'a Bey Rafi' al-Tahtawi, had studied French with a view to the translation of technical books into Arabic. The most prolific of these translators turned out to be al-Tahtawi himself. 

"Al -Tahtawi had been educated at al-Azhar University, then and now the most prestigious center for the study of the Islamic sciences in the Muslim world. There was apparently no opposition by the Shaikhs of al-Azhar to the innovation of printing. . . . Muhmmad Ali attached several professors from al-Azhar to the Bulaq Press to learn the art of printing; one became head of the foundry, another printer-in-chief, and others worked as compositors and proofreaders.

"Between 1822 and 1842, the press at Bulaq published 243 titles. . . . By far the largest number of books - 48 - were on military and naval subjects. Muhammad Ali had seen both the French and the English fleets in action, and realized how vulnerable Egypt was to invasion from the sea. He had also noted how successful the modern arms of the French had been against the antiquated weapons of the Mamluks.

"Interestingly though, the next largest category of books published by the Bulaq Press was poetry. Twenty-six works of poetry in Turkish, Persian and Arabic were published in the first 20 years of the press' operation; clearly the men associated with the Bulaq Press were as interested in traditional Islamic literature as they were in translation of European works on military tactics. After poetry comes grammar, with 21 titles, mathematics and mechanics with 16, medicine with 15 and veterinary medicine with 12. Thre rest of the books published by the press were on religion, botany, agriculture, political administration and so forth" (http://muslimheritage.com/topics/default.cfm?ArticleID=988, accessed 06-10-2012).

In December 1822 the Bulaq Press issued its first book, an Italian-Arabic dictionary by Raphael Antoine Zakhour, an Egyptian-born Roman Catholic monk from Aleppo, who had accompanied Napoleon's French expedition on its return to France as a translator:

Dizionario Italiano e Arabo che Contiene in Succinto Tutti Vocaboli che Sono Piu in Uso e Piu Necessari per Imparpar a Parlare de Due Lingue Correttamente Egli e Diviso in Due Parti. Part 1. De Dizionario Disposto Com il Solito Nell-ordine Alfabetico. Parte II. Che Contiene Una Breve Raccolta di Nomi e di Verbi li Piu Neccesari, e Piu Utili all Studio Dell Due Lingue. Bolacco: Dall Stamperio Reale, M.D.CCC.XXII.


In keeping with Muhammad Ali's idea of "openness toward Europe to achieve development," Egyptian delegations were sent to Italy, and Italian became the first foreign language taught in Egyptian schools.

By 1851 the Bulaq Press had issued 570 works.

Cheng-Hsiang Hsu, "A Survey of Arabic-character Publications Printed in Egypt during the Period of 1238-1267 (1822-1851)," Sadgrove (ed) History of Printing and Publishing in the Languages and Countries of the Middle East (2005) 1-16.

Deciphering the Hieroglyphs 1823

English physician, scientist and polymath Thomas Young published An Account of Some Recent Discoveries in Hieroglyphical Literature, and Egyptian Antiquities.

"Young was also one of the first who tried to decipher Egyptian hieroglyphs, with the help of a demotic alphabet of 29 letters built up by Johan David Åkerblad in 1802 (15 turned out to be correct), but Åkerblad wrongly believed that demotic was entirely alphabetic. 'Dr Young however showed that neither the alphabet of Akerblad, nor any modification of it which could be proposed, was applicable to any considerable part of the enchorial portion of the Rosetta inscription beyond the proper names.'  By 1814 Young had completely translated the "enchorial" (demotic, in modern terms) text of the Rosetta Stone (he had a list with 86 demotic words), and then studied the hieroglyphic alphabet but initially failed to recognise that the demotic and hieroglyphic texts were paraphrases and not simple translations. Some of Young's conclusions appeared in the famous article "Egypt" he wrote for the 1818 edition of the Encyclopædia Britannica.

"When the French linguist Jean-François Champollion in 1822 published a translation of the hieroglyphs and the key to the grammatical system, Young (and many others) praised his work. In 1823 Young published an Account of the Recent Discoveries in Hieroglyphic Literature and Egyptian Antiquities in order to have his own work recognised as the basis for Champollion's system. In this he made it clear that many of his findings had been published and sent to Paris in 1816. Young had correctly found the sound value of six signs, but had not deduced the grammar of the language. Champollion was unwilling to share the credit. In the ensuing schism, strongly motivated by the political tensions of that time, the British championed Young, while the French supported Champollion. Champollion maintained that he alone had deciphered the hieroglyphs, although his understanding of the hieroglyphic grammar showed the same mistakes made by Young. However, after 1826, when Champollion was a curator in the Louvre he did offer Young access to demotic manuscripts" (Wikipedia article on Thomas Young, accessed 07-28-2009).

Decipherment of the Mayan System of Counting 1832

Because of the destruction of most of the Maya codices in the sixteenth century, scholars had extremely limited access to the original texts. It was not until 1810 that the first reproduction of any Mayan codex (five pages from the Dresden Codex) was published by Alexander von Humboldt in his Vues des Cordillères, et monuments des peuples indigènes de l'Amérique. In 1832, working from this very limited reproduction, European-American autodidact polymath, mathematician, botanist, zoologist, and malacologist Constantine Samuel Rafinesque, then in Philadelphia, deciphered the Maya's system of numerals.

In 1832 Rafinesque published his discovery in his periodical, the Atlantic Journal, and Friend of Knowledge: A Cyclopedic Journal and Review of Universal Science and Knowledge: Historical, Natural, and Medical Arts and Sciences: Industry, Agriculture, Education, and Every Useful Information. He announced it in a three-part article addressed to Jean-François Champollion, whose name he misspelled, "on the Graphic systems of America, and the Glyphs of Otolum or Palenque, in Central America." In the second part of this article, on page 42, Rafinesque briefly explained his discovery of the meaning of the Maya bar and dot system in which a dot equals one and a bar equals five. 
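The bar-and-dot rule Rafinesque described (a dot equals one, a bar equals five) is simple enough to express as a short sketch. The function names and the ASCII notation, with "." for a dot and "-" for a bar, are assumptions made for this example. The limit of 1 to 19 for a single glyph reflects the usual modern description of Maya digits, not anything stated in Rafinesque's article.

```python
def bar_dot_value(glyph):
    """Return the value of a glyph written with '.' (dot) and '-' (bar)."""
    dots = glyph.count(".")
    bars = glyph.count("-")
    if dots + bars != len(glyph):
        raise ValueError("glyph may contain only '.' and '-'")
    # A dot counts as one, a bar as five.
    return dots + 5 * bars


def bar_dot_glyph(number):
    """Write a number from 1 to 19 using as many bars as possible."""
    if not 1 <= number <= 19:
        raise ValueError("this sketch only handles 1 through 19")
    bars, dots = divmod(number, 5)
    return "." * dots + "-" * bars


print(bar_dot_value("..--"))  # two dots and two bars
print(bar_dot_glyph(13))      # three dots and two bars
```

Writing the dots before the bars is only a convention of this sketch; the value of a glyph does not depend on the order in which its bars and dots are counted.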

 "Later findings proved him right and also revealed that the Maya even had a symbol for zero, which appeared on Mesoamerican carvings as early as 36 B.C. (Zero didn't appear in Western Europe until the 12th century)"  (http://www.pbs.org/wgbh/nova/mayacode/time-flash.html, accessed 10-10-2009).

Like most of Rafinesque's numerous other publications, his Atlantic Journal enjoyed very limited success, and folded after only eight issues.  Copies of the original edition are extremely rare.  My copy is a facsimile reprint issued by the Arnold Arboretum, Boston, in 1946.

Probably the First Book on a Secular Subject Printed in Arabic in the Middle East 1836

In 1836 a pocket-sized Arabic grammar was issued from the American Press in Beirut, Lebanon, in an edition of 1000 copies. This was probably the first book on a secular subject printed in Arabic in the Middle East. The work by Nasif al-Yaziji, Kitab fasl al-khitab fi usul lughat al-a'rab (The Conclusive Discourse of the Rules of the Arabs' Language)

". . . was printed by the Protestant missionaries of the 'American Board of Commissioners for Foreign Missions' (ABCFM) who had opened a printing shop in Beirut two years earlier in 1834. The author of the concise treatise on Arabic grammar was Nasif al-Yaziji (1800-1871) a local Greek Catholic scholar from a little village south of Beirut who later became one of the most celebrated Christian Arab authors of the nineteenth century. With his numerous philological works, but moreover with his poetry and rhyming prose he influenced a whole generation of Arab intellectuals and thus became a pioneer and outstanding protagonist of the so called Nahda, the renaissance of Arabic language and literature" (Lehrstuhl für Türkische Sprache, Geschichte und Kultur, Universität Bamberg, The Beginnings of Printing in the Near and Middle East: Jews, Christians and Muslims [2001] no. 5).

1850 – 1875

Origins of the Oxford English Dictionary (OED) 1857

Richard Chenevix Trench, Dean of Westminster, published On some Deficiencies in our English Dictionaries. Being the Substance of two Papers read before the Philological Society, Nov. 5 and November 19, 1857. Trench's speeches laid down the desiderata for a new English dictionary based on historical principles.  Two months later the Philological Society resolved that A New English Dictionary, as it was first called, should be compiled, readers were called for, and the project began.

In 1860 Trench published a revised and enlarged second edition of his pamphlet, including "A Letter to the Author from Herbert Coleridge, Esq. on the Progress and Prospects of the Society's New English Dictionary."

The Largest Dictionary in Book Form 1863

The first fascicule (A-Aanhaling) of the Woordenboek der Nederlandsche Taal (English: "Dictionary of the Dutch language") was published in The Hague in 1863. This became the largest dictionary in the world in print, eventually containing over 430,000 entries of Dutch words from 1500 to 1921 in 43 volumes and close to 50,000 pages. The last fascicule (Zuid-Zythum) was published in 1998. Three supplements containing modern Dutch words were published in 2001.

Since January 27, 2007, the dictionary has been available online. There is no charge for access but registration is required.

1875 – 1900

3,500,000 Quotations on Individual Slips of Paper 1882 – 1884

Scottish lexicographer and philologist James Murray, working in a corrugated iron outbuilding called "The Scriptorium" on the grounds of Mill Hill School, in Mill Hill, London, began the process of accumulating and organizing the data for what became known as the Oxford English Dictionary.

In the summer of 1884, Murray and his family moved to a large house on the Banbury Road in north Oxford. There Murray had a second corrugated iron Scriptorium, lined with bookshelves and 1,029 pigeon-holes for quotation slips, built in the back garden. It was a larger building than the first, with more storage space for the ever-increasing number of slips being sent to Murray and his team. Anything addressed to ‘Mr Murray, Oxford’ would always find its way to him, and such was the volume of post sent by Murray and his team that the Post Office erected a special post box outside Murray’s house. Each day Murray received 1000 quotations from contributors to A New English Dictionary on Historical Principles.

Murray eventually accumulated 3,500,000 quotations sent in by contributors, each on an individual slip of paper.

The OED Finally Begins Publication February 1, 1884

Twenty-three years after the project began, the first fascicule of A New English Dictionary on Historical Principles; Founded Mainly on the Materials Collected by The Philological Society was published, under the editorship of James Murray.

The 352-page volume, covering words from A to Ant, cost 12s.6d or U.S.$3.25. The total sales of this fascicule were 4000 copies. The dictionary was complete in 125 fascicules, the last of which was published on April 19, 1928. The name Oxford English Dictionary (OED) was first used for the work in 1895.

"Memory: A Contribution to Experimental Psychology" 1885

In 1885 German psychologist Hermann Ebbinghaus published Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie in Leipzig through Duncker & Humblot, publishers. As a result of this book Ebbinghaus was made professor at the University of Berlin. Almost 30 years after it was published Ebbinghaus's book was translated into English by Henry A. Ruger & Clara E. Bussenius as Memory: A Contribution to Experimental Psychology (New York, 1913), reflecting the continuing usefulness of his work.

". . . this monograph marked the beginning of programmatic experimental research on higher mental processes. Using himself as a subject, gathering data for over a year (1879-80), and then replicating the entire procedure (1883-4) before publishing, Ebbinghaus not only brought learning and memory into the laboratory, he set a standard for careful scientific work in psychology that has rarely been surpassed.

"In order to proceed with his research, Ebbinghaus had first to invent stimulus materials. These needed to be relatively simple, neutral as to meaning, and homogeneous. They needed to be available in large numbers and to allow quantitative manipulation of the amount of material to be retained. In answer to these needs, Ebbinghaus hit upon the idea of a 'nonsense syllable.' As he described it: 'Out of the simple consonants of the alphabet and our eleven vowels and diphthongs all possible syllables of a certain sort were constructed, a vowel sound being placed between two consonants. These syllables, about 2,300 in number, were mixed together and then drawn out by chance and used to construct series of different lengths, several of which each time formed the material for a test.'

"Next Ebbinghaus had to develop novel methods for controlling the degree of learning and measuring the amount of retention. At first glance, it would seem that the most obvious method for controlling learning would have been to standardize the number of learning trials. The problem with this method, however, is that the degree to which any given material is learned in a fixed number of trials may vary as a function of the material or the mental state (e.g., attention, fatigue) of the learner. To circumvent this limitation and assure that material was learned to approximately the same degree from test to test, Ebbinghaus introduced the method of learning to criterion. In learning to criterion, the subject repeated the material as many times as was necessary to reach an a priori level of accuracy (e.g., one perfect reproduction).  

"Measuring the amount of retention also presented Ebbinghaus with a puzzle. Because it is influenced by whole host of factors, conscious recall of material can vary from moment to moment even when the material has been well learned; worse yet, material may not be available to conscious recall at all even though it has been retained to some degree. To avoid this problem, Ebbinghaus invented the 'savings method'. Subtracting the number of repetitions required to relearn material to a criterion from the number originally required to learn the material to the same criterion provided an index of retention that was independent of whether the material could be consciously recalled.

"With these methods, Ebbinghaus obtained a remarkable set of results. He was the first to describe the shape of the learning curve. He reported that the time required to memorize an average nonsense syllable increases sharply as the number of syllables increases. He discovered that distributing learning trials over time is more effective in memorizing nonsense syllables than massing practice into a single session; and he noted that continuing to practice material after the learning criterion has been reached enhances retention.  

"Using savings as an index, he showed that the most commonly accepted law of association, viz., association by contiguity (the idea that items next to one another are associated) had to be modified to include remote associations (associations between items that are not next to one another in a list). He was the first to describe primacy and recency effects (the fact that early and late items in a list are more likely to be recalled than middle items), and to report that even a small amount of initial practice, far below that required for retention, can lead to savings at relearning. He even addressed the question of memorization of meaningful material and estimated that learning such material takes only about one tenth of the effort required to learn comparable nonsense material.  

"Finally, in the treatment of his results, Ebbinghaus made considerable use of mathematics. He not only assessed statistical significance but characterized his findings in mathematical terms. Given this quantitative treatment, Ebbinghaus's methodological innovations, and the care with which he carried out his research, it is not surprising that his results have stood the test of time. Indeed, in the century since the publication of his monograph, surprisingly little has been learned about rote learning and retention that was not already known to Ebbinghaus" (Robert A. Wozniak, Introduction to Memory, Hermann Ebbinghaus (1885/1913), accessed 12-30-2012).

1930 – 1940

The First Electronic Speech Synthesizer 1936 – 1939

Between 1936 and 1939 electronic and acoustic engineer Homer Dudley and a team of engineers at Bell Labs produced the first electronic speech synthesizer, called the Voder ("Voice Operation DEmonstratoR").

The Voder was demonstrated at the 1939 World's Fair in Flushing Meadows, New York and the 1939 Golden Gate International Exposition on Treasure Island, San Francisco Bay, by experts who used a keyboard and foot pedals to play the machine and emit speech.

1940 – 1950

Does Language Influence Thought? April 1940

In April 1940 American chemist, anthropologist and linguist Benjamin Lee Whorf published "Science and Linguistics," M.I.T.'s Technology Review, 42: no. 6 (April, 1940) 229-231, 247-248, in which he developed controversial ideas concerning linguistic relativity, the hypothesis that language influences thought.

The Earliest Work Leading toward Machine Translation 1947

Working at the Princeton IAS machine, Andrew D. Booth and Kathleen Britten wrote a program for realizing a translation dictionary on an electronic computing machine, provided that the necessary storage capacity was available.

This may be the earliest work leading toward machine or computer translation.

Warren Weaver Suggests Applying Cryptanalysis Techniques to Translation March 4 – May 9, 1947

On March 4, 1947 mathematician and Director of the Division of Natural Sciences at the Rockefeller Foundation in New York Warren Weaver sent the following letter to Norbert Wiener, suggesting that cryptanalysis techniques might be applied to translation, and that a computer could be built for the purpose. This letter, preserved at the Rockefeller Archives Center, may be the origin of efforts at machine translation:

"Dear Norbert:

I was terribly sorry, when in Cambridge recently, that I got unavoidably held up by several unexpected jobs, and did not get a chance to see you.

One thing I wanted to ask you about is this. A most serious problem, for UNESCO and for the constructive and peaceful future of the planet, is the problem of translation, as it unavoidably affects the communication between peoples. Huxley has recently told me that they are appalled by the magnitude and the importance of the translation job.

 Recognizing fully, even though necessarily vaguely, the semantic difficulties because of multiple meanings, etc., I have wondered if it were unthinkable to design a computer which would translate. Even if it would translate only scientific material (where the semantic difficulties are very notably less), and even if it did produce an inelegant (but intelligible) result, it would seem to me worth while.

Also knowing nothing official about, but having guessed and inferred considerable about, powerful new mechanized methods in cryptography - methods which I believe succeed even when one does not know what language has been coded - one naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say "This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode."

Have you ever thought about this? As a linguist and expert on computers, do you think it is worth thinking about?

Cordially,

Warren Weaver"

In his reply dated April 30, 1947 Wiener was not optimistic regarding the possibility of machine translation:

"Dear Warren:  

First, I want to thank you and The Rockefeller Foundation for the almost unlimited number of favors that I have been receiving. I think and hope, at any rate, that we shall be able to come across in such a way as to at least partly justify your expenditure.

Second - as to the problem of mechanical translation, I frankly am afraid the boundaries of words in different languages are too vague and the emotional and international connotations are too extensive to make any quasi mechanical translation scheme very hopeful. I will admit that basic English seems to indicate that we can go further than we have generally done in the mechanization of speech, but you must remember that in certain respects basic English is the reverse of mechanical and throws upon such words as "get," a burden, which is much greater than most words carry in conventional English. At the present time, the mechanization of language, beyond such a stage as the design of photoelectric reading opportunities for the blind, seems very premature. By the way, I have been fascinated by McCulloch's work on such apparatus, and, as you probably know, he finds the wiring diagram of apparatus of this kind turns out to be surprisingly like the microscopic analogy of the visual cortex in the brain.

"I have heard that your health is much better, and I certainly hope so. I shall try to look you up before I sail for France.  

"Sincerely yours,

"Norbert Wiener

Weaver, however, maintained his belief in the possibility of machine translation in spite of Wiener's pessimism, writing back on May 9, 1947:

"Dear Norbert:

Thank you for your letter of April 30. I am sure that Dr. Morrison and I will both be very glad to have you tell us, from time to time, about the progress of your collaborative program with Rosenblueth. And I will be most interested, after your return from France, to hear your comments on your trip there.

"I am disappointed but not surprised by your comments on the translation problem. The difficulty you mention concerning Basic seems to me to have a rather easy answer. It is, of course, true that Basic puts multiple use on an action verb such as "get." But even so, the two-word combinations such as "get up," "get over," "get back," etc., are, in Basic, not really very numerous. Suppose we take a vocabulary of 2,000 words, and admit for good measure all the two-word combinations as if they were single words. The vocabulary is still only four million: and that is not so formidable a number to a modern computer, is it?

Cordially,

Warren Weaver"

(http://www.mt-archive.info/Weaver-1947-original.pdf, accessed 10-25-2011).
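
Weaver's "four million" figure in the May 9 letter can be checked with a short calculation. This is a minimal sketch under the assumption that "all the two-word combinations" means every ordered pair of words from the 2,000-word vocabulary; the function name is hypothetical.

```python
def vocabulary_with_pairs(base_words: int) -> int:
    """Size of a vocabulary that treats every ordered two-word
    combination as if it were a single word, in addition to the
    original single words (the reading of Weaver's letter assumed here)."""
    pairs = base_words * base_words  # ordered pairs, repeats allowed
    return base_words + pairs


pairs_only = 2000 * 2000             # the two-word combinations alone
total = vocabulary_with_pairs(2000)  # single words plus combinations
```

Under this reading the two-word combinations alone number 4,000,000, and adding the 2,000 single words gives 4,002,000, consistent with Weaver's round figure.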

 

"Nineteen Eighty-Four" 1949

Eric Arthur Blair, under his pseudonym, George Orwell, published the dystopian novel, Nineteen Eighty-Four in London. "The story follows the life of one seemingly insignificant man, Winston Smith, a civil servant assigned the task of falsifying records and political literature, thus effectively perpetuating propaganda, who grows disillusioned with his meagre existence and so begins an ultimately futile rebellion against the system.

"The novel has become famous for its satirical portrayal of surveillance and society's increasing encroachment on the rights of the individual. Since its publication the terms Big Brother and Orwellian have entered the popular vernacular."

"Nineteen Eighty-Four's impact upon the English language is extensive; many of its concepts: Big Brother, Room 101 (the worst place in the world), the Thought Police, the memory hole (oblivion), doublethink (simultaneously holding and believing two contradictory beliefs), and Newspeak (ideological language), are common usages for denoting and connoting overarching, totalitarian authority; Doublespeak is an elaboration of doublethink; the adjective "Orwellian" denotes that which is characteristic and reminiscent of George Orwell's writings, specifically 1984. The practice of appending the suffixes "-speak" and "-think" (groupthink, mediaspeak) to denote unthinking conformity. Many other works, in various forms of media, have taken themes from Nineteen Eighty-four" (Wikipedia article on Nineteen Eighty-Four).

The Origin of Statistical Machine Translation July 15, 1949

Mathematician Warren Weaver, a student of Claude Shannon's information theory, and in charge of science grants at the Rockefeller Foundation, New York, circulated a memorandum entitled Translation, suggesting that language translation by computer might be possible.

Weaver's memorandum has been called the origin of statistical machine translation.

(See Reading 10.1.)

1950 – 1960

"Language and Communication" 1951

In 1951 American cognitive psychologist George Armitage Miller, then teaching at Harvard, published Language and Communication. Influenced by Claude Shannon's A Mathematical Theory of Communication (1948), this book

"used a probabilistic model imposed on a learning-by-association scheme borrowed from behaviorism, with Miller not yet attached to a pure cognitive perspective.The first part of the book reviewed information theory, the physiology and acoustics of phonetics, speech recognition and comprehension, and statistical techniques to analyze language. The focus was more on speech generation than recognition. The second part had the psychology: idiosyncratic differences across people in language use; developmental linguistics; the structure of word associations in people; use of symbolism in language; and social aspects of language use " (Wikipedia article on Goerge Armitage Miller, accessed 12-30-2012).

Decipherment of Linear B 1952 – 1953

English architect and classical scholar Michael Ventris and John Chadwick, an English linguist and classical scholar at Cambridge, deciphered Linear B, proving that the language written in this Mycenaean script is an early form of Greek.

Ventris & Chadwick, Documents in Mycenaean Greek (1956), chapters 1-2.

Chadwick, The Decipherment of Linear B (1958).

The Georgetown-IBM Experiment in Machine Translation January 7, 1954

Developed jointly by Georgetown University and IBM, the Georgetown-IBM experiment in computational linguistics involved completely automatic translation of more than sixty Russian sentences into English.

"Conceived and performed primarily in order to attract governmental and public interest and funding by showing the possibilities of machine translation, it was by no means a fully-featured system: It had only six grammar rules and 250 items in its vocabulary. Apart from general topics, the system was specialised in the domain of organic chemistry. The translation was done using a IBM 701 mainframe computer.

"Well publicized by journalists and perceived as a success, the experiment did encourage governments to invest in computational linguistics. The authors claimed that within three or five years, machine translation would be a solved problem."

"The Magical Number Seven, Plus or Minus Two. . . " April 15, 1955 – 1956

In 1956 American cognitive psychologist George Armitage Miller, then teaching at Harvard, published "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information," Psychological Review, Vol. 63, No. 2, 81-97. He had read the paper before the Eastern Psychological Association on April 15, 1955. 

"From the days of William James, psychologists had the idea memory consisted of short-term and long-term memory. While short-term memory was expected to be limited, its exact limits were not known. In 1956, Miller would quantify its capacity limit in the paper 'The magical number seven, plus or minus two'. He tested immediate memory via tasks such as asking a person to repeat a set of digits presented; absolute judgment by presenting a stimulus and a label, and asking them to recall the label later; and span of attention by asking them to count things in a group of more than a few items quickly. For all three cases, Miller found the average limit to be seven items. He had mixed feelings about the focus on his work on the exact number seven for quantifying short-term memory, and felt it had been misquoted often. He stated, introducing the paper on the research for the first time, that he was being persecuted by an integer. Miller also found humans remembered chunks of information, interrelating bits using some scheme, and the limit applied to chunks. Miller himself saw no relationship among the disparate tasks of immediate memory and absolute judgment, but lumped them to fill a one-hour presentation" (Wikipedia article on George Armitage Miller, accessed 12-30-2012). 

"The word ‘'chunking’' comes from a famous 1956 paper by George A. Miller, The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information. At a time when information theory was beginning to be applied in psychology, Miller observed that some human cognitive tasks fit the model of a 'channel capacity,' characterized by a roughly constant capacity in bits, but short-term memory did not. A variety of studies could be summarized by saying that short-term memory had a capacity of about "seven plus-or-minus two" chunks. Miller wrote that 'With binary items the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require (see also, memory span ). The span of immediate memory seems to be almost independent of the number of bits per chunk, at least over the range that has been examined to date.' Miller acknowledged that 'we are not very definite about what constitutes a chunk of information.' Miller noted that according to this theory, it should be possible to effectively increase short-term memory for low-information-content items by mentally recoding them into a smaller number of high-information-content items. 'A man just beginning to learn radio-telegraphic code hears each dit and dah as a separate chunk. Soon he is able to organize these sounds into letters and then he can deal with the letters as chunks. Then the letters organize themselves as words, which are still larger chunks, and he begins to hear whole phrases.' Thus, a telegrapher can effectively 'remember' several dozen dits and dahs as a single phrase. Naive subjects can only remember about nine binary items, but Miller reports a 1954 experiment in which people were trained to listen to a string of binary digits and (in one case) mentally group them into groups of five, recode each group into a name (e.g. "twenty-one" for 10101), and remember the names. 
With sufficient drill, people found it possible to remember as many as forty binary digits. Miller wrote: 'It is a little dramatic to watch a person get 40 binary digits in a row and then repeat them back without error. However, if you think of this merely as a mnemonic trick for extending the memory span, you will miss the more important point that is implicit in nearly all such mnemonic devices. The point is that recoding is an extremely powerful weapon for increasing the amount of information that we can deal with " (Wikipedia article on Chunking (pschology), accessed 12-30-2012).
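
The recoding scheme described in the passage above, grouping binary digits in fives and remembering one number per group, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, and the 40-digit string is made up rather than taken from the 1954 experiment.

```python
def recode(bits: str, group_size: int = 5) -> list[int]:
    """Split a string of binary digits into fixed-size groups and
    recode each group as the number it represents in base 2."""
    if len(bits) % group_size != 0:
        raise ValueError("length must be a multiple of group_size")
    return [
        int(bits[i:i + group_size], 2)
        for i in range(0, len(bits), group_size)
    ]


# The example from the text: the group 10101 is recoded as twenty-one.
single = recode("10101")

# A made-up string of 40 binary digits becomes only 8 chunks to remember.
forty_digits = "1010100110" * 4
chunks = recode(forty_digits)
```

Forty binary digits recoded this way yield eight chunks, which falls within the span of about seven plus or minus two.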

Chomsky's Hierarchy of Syntactic Forms September 1956

American linguist, philosopher, cognitive scientist, and activist Noam Chomsky published "Three Models for the Description of Language" in IRE Transactions on Information Theory IT-2, 113-24.

In the paper Chomsky introduced two key concepts, the first being “Chomsky’s hierarchy” of syntactic forms, which was widely applied in the construction of artificial computer languages.

“The Chomsky hierarchy places regular (or linear) languages as a subset of the context-free languages, which in turn are embedded within the set of context-sensitive languages also finally residing in the set of unrestricted or recursively enumerable languages. By defining syntax as the set of rules that define the spatial relationships between the symbols of a language, various levels of language can be also described as one-dimensional (regular or linear), two-dimensional (context-free), three-dimensional (context sensitive) and multi-dimensional (unrestricted) relationships. From these beginnings, Chomsky might well be described as the ‘father of formal languages’ ” (Lee, Computer Pioneers [1995] 164). 

The second concept Chomsky presented here was his transformational-generative grammar theory, which attempted to define rules that can generate the infinite number of grammatical (well-formed) sentences possible in a language, and sought to identify rules (transformations) that govern relations between parts of a sentence, on the assumption that beneath such aspects as word order a fundamental deep structure exists. As Chomsky expressed it in his abstract of the present paper,

"We investigate several conceptions of linguistic structure to determine whether or not they can provide simple and “revealing” grammars that generate all of the sentences of English and only these. We find that no finite-state Markov process [a random process whose future probabilities are determined by its most recent values] that produces symbols with transition from state to state can serve as an English grammar. We formalize the notion of “phrase structure” and show that this gives us a method for describing language which is essentially more powerful. We study the properties of a set of grammatical transformations, showing that the grammar of English is materially simplified if phrase-structure is limited to a kernel of simple sentences from which all other sentences are constructed by repeated transformation, and that this view of linguistic structure gives a certain insight into the use and understanding of language" (p. 113).

Minsky, "A Selected Descriptor-Indexed Bibliography to the Literature on Artificial Intelligence" in Feigenbaum & Feldman eds., Computers and Thought (1963) 453-523, no. 484. Hook & Norman, Origins of Cyberspace (2002) no. 531.

Chomsky's Syntactic Structures 1957

Noam Chomsky's Syntactic Structures was published in 's-Gravenhage (The Hague), Netherlands, by Mouton & Co. That it did not initially find an American publisher may have reflected the advanced nature of the contents. Through its numerous printings Syntactic Structures, a small book of 116 pages, was the vehicle through which Chomsky's innovative ideas first became more widely known.

Chomsky’s text was an expansion of the ideas first expressed in his “Three Models for the Description of Language,” in particular the concept of transformational grammar. The cognitive scientist David Marr, who developed a general account of information-processing systems, described Chomsky’s theory of transformational grammar as a top-level computational theory, in the sense that it deals with the goal of a computation, why it is appropriate, and the logic of the strategy used to carry it out (Anderson and Rosenfeld, Neurocomputing: Foundations of Research [1988] 470–72). Chomsky’s work had profound influence in the fields of linguistics, philosophy, psychology, and artificial intelligence.

Hook & Norman, Origins of Cyberspace (2002) no. 532.

Human Versus Machine Intelligence and Communication 1959

"Somewhat the same problem arises in communicating with a machine entity that would arise in communicating with a person of an entirely different language background than your own. A system of logical definition and translation would have to be available. In order that meanings should not be lost, such a system of translation would also need to be precise. We are all familiar with the unhappy results of language translations which are either lacking in precision or where suitable words of equivalent meaning cannot be found. Likewise, translating into a machine language cannot be anything but an exact operation. Machines even more than people must be addressed with clarity and unambiguity, for machines cannot improvise on their own or imagine that about which they have not been specifically informed, as a human might do within reasonable limits of error. . . .

"We must now ascertain how concepts are formulated within the framework of computer language. For analogy, let us first consider the manner in which instructions are usually given to a non-mechanical entity. When we instruct, for example, a human being, we are aided by the fact that the human is usually able to fill in gaps in our instructions through acumen acquired from his own past experiences. It is seldom necessary that instructions be either detailed or literal, although we may have lost sight of this fact.

"The computer in a correlate example is a mechanical 'being' which must be instructed at each and every step. But it can be given a very long list of instructions upon which it can be expected to subsequently act with great speed and accuracy and with untiring repetition. Machine traits are: low comprehension, high retention, extreme reliability, and tremendous speed. The use of superlatives here to describe these traits is not exaggerative. Since speed becomes in practice the equivalent of number, the machine might be, and has sometimes been, equated to legions — an army, if you will — of lowgrade morons whose conceptualization is entirely literal, who remember as long as is necessary or as you desire them to, whose loyalty and subservience is complete, who require no holidays, no spurious incentives, no morale programs, pensions, not even gratitude for past service, and who seemingly never tire of doing elementary repetitive tasks such as typing, accounting, bookkeeping, arithmetic, filling in forms, and the like. In about all these respects the machine may be seen to be the exact opposite of nature's loftiest creature, the intellligent human being, who becomes bored with the petty and repetitious, who is unreliable, who wanders from the task for the most trivial reasons, who gets out of humor, who forgets, who requires constant incentives and rewards, who improvises on his own even when to do so is impertinent to the objectives being undertaken, and who in summary (let's face it) is unsuitable to most forms of industry as the latter are ideally and practically conceived in our times. It becomes apparent in retrospect that the only excuse we might ever have had for employing him to do many of civilization's more literal and repetitious tasks was the absence of something more efficient with which to replace him!

"It is not the purpose of this volume to explore further the ramifications of the above statements of fact. . . ."(Nett & Hetzler, An Introduction to Electronic Data Processing [1959] 86-88).

Origins of Corpus Linguistics 1959

Randolph Quirk founded the Survey of English Usage, the first research center in Europe to carry out research in corpus linguistics.

"The original Survey Corpus predated modern computing. It was recorded on reel-to-reel tapes, transcribed on paper, filed in filing cabinets, and indexed on paper cards. Transcriptions were annotated with a detailed prosodic and paralinguistic annotation developed by Crystal and Quirk (1964) Sets of paper cards were manually annotated for grammatical structures and filed, so, for example, all noun phrases could be found in the noun phrase filing cabinet in the Survey. Naturally, corpus searches required a visit to the Survey.

"This corpus is now known more widely as the London-Lund Corpus (LLC), as it was the responsibility of co-workers in Lund, Sweden, to computerise the corpus" (Wikipedia article on Survey of English Usage, accessed 06-07-2010).

The First Digital Poetry 1959

In 1959 German computer scientist Theo Lutz from Hochschule Esslingen created the first digital poetry using a text-generating program called “Stochastische Texte” written for the ZUSE Z22 computer. The program consisted of only 50 commands but could theoretically generate over 4,000,000 sentences.

Working with his teacher, Max Bense, one of the earliest theorists of computer poetry, Lutz used a random number generator to create texts where key words were randomly inserted within a set of logical constants in order to create a syntax. The programme thus demonstrated how logical structures like mathematical systems could work with language.
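
The general technique described above, randomly chosen key words slotted between fixed logical constants, can be sketched as follows. This is a loose modern illustration in Python and not a reconstruction of Lutz's Z22 program: the word lists, the sentence template, and the function names are all hypothetical.

```python
import random

# Hypothetical vocabulary; Lutz's actual word lists are not given in the text.
SUBJECTS = ["HOUSE", "ROAD", "GUEST", "TOWER"]
PREDICATES = ["OPEN", "QUIET", "FAR", "OLD"]

# Fixed logical constants that give each line its syntax.
QUANTIFIERS = ["A", "EVERY", "NO", "NOT EVERY"]
CONNECTIVES = ["AND", "OR", "THEREFORE"]


def clause(rng: random.Random) -> str:
    """One elementary statement: quantifier + subject + IS + predicate."""
    return f"{rng.choice(QUANTIFIERS)} {rng.choice(SUBJECTS)} IS {rng.choice(PREDICATES)}"


def stochastic_sentence(rng: random.Random) -> str:
    """Two random clauses joined by a randomly chosen logical connective."""
    return f"{clause(rng)} {rng.choice(CONNECTIVES)} {clause(rng)}."


rng = random.Random(1959)  # fixed seed so a run can be repeated
poem = [stochastic_sentence(rng) for _ in range(4)]
```

Even this small template allows 64 distinct clauses and 64 x 3 x 64 = 12,288 distinct sentences, which suggests how a larger vocabulary could reach the millions of sentences mentioned above.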

Funkhouser, Prehistoric Digital Poetry: An Archaeology of Forms 1959-1995 (2007).

First Formal Definition of Hacker June 1959

Peter R. Samson, Public Relations Committee of the MIT Tech Model Railroad Club, defined the term "hack" in the Tech Model Railroad Club Dictionary as:

"1) an article or project without constructive end

"2) a project undertaken on bad self-advice

"3) an entropy booster

"4) to produce, or attempt to produce, a hack(3)."

Samson defined hacker as "one who hacks, or makes them."

Much of the Tech Model Railroad Club jargon was later incorporated into early computer culture. In 2005 Samson commented:

"I saw this as a term for an unconventional or unorthodox application of technology, typically deprecated for engineering reasons. There was no specific suggestion of malicious intent (or of benevolence, either). Indeed, the era of this dictionary saw some 'good hacks:' using a room-sized computer to play music, for instance; or, some would say, writing the dictionary itself" (http://www.gricer.com/tmrc/dictionary1959.html, accessed 06-01-2009).

1960 – 1970

The Viterbi Algorithm 1967

While a professor at UCLA, Italian-American electrical engineer and businessman Andrew Viterbi developed the Viterbi algorithm,  "as an error-correction scheme for noisy digital communication links, finding universal application in decoding the convolutional codes used in both CDMA and GSM digital cellular, dial-up modems, satellite, deep-space communications, and 802.11 wireless LANs. It is now also commonly used in speech recognition, keyword spotting, computational linguistics, and bioinformatics. For example, in speech-to-text (speech recognition), the acoustic signal is treated as the observed sequence of events, and a string of text is considered to be the "hidden cause" of the acoustic signal. The Viterbi algorithm finds the most likely string of text given the acoustic signal" (Wikipedia article on Viterbi algorithm, accessed 12-29-2009).
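
The idea of finding the most likely hidden sequence behind a series of observations can be shown with a small dynamic-programming sketch. The Python code below is a minimal illustration for a discrete hidden Markov model, not Viterbi's original formulation for convolutional codes. The two-state model and all of its probabilities are made-up toy values, and the function assumes at least one observation.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely sequence of hidden states for the
    given observations under a discrete hidden Markov model."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               the state that precedes s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Choose the predecessor that maximizes the path probability.
            prob, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (prob * emit_p[s][obs], back)
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model (all numbers are illustrative assumptions).
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
# path == ["Healthy", "Healthy", "Fever"]
```

In speech recognition the observations would be acoustic features and the hidden states would correspond to units of text, but the bookkeeping is the same: keep only the best path into each state at each step, then trace back from the best final state.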

"Computational Analysis of Present-Day American English" 1967

Henry Kucera (born Jindřich Kučera) of Brown University and Nelson Francis published Computational Analysis of Present-Day American English.

A founding work on corpus linguistics, this book "provided basic statistics on what is known today simply as the Brown Corpus. The Brown Corpus was a carefully compiled selection of current American English, totaling about a million words drawn from a wide variety of sources. Kucera and Francis subjected it to a variety of computational analyses, from which they compiled a rich and variegated opus, combining elements of linguistics, psychology, statistics, and sociology" (Wikipedia article on Brown Corpus, accessed 06-07-2010).

The Beginning of Automated Essay Scoring 1967

In 1964 Ellis Batten Page, an American educational psychologist at the University of Connecticut (Storrs), inspired by developments in computational linguistics and artificial intelligence, began research on automated essay scoring. Page published his initial research in 1967 as "Statistical and linguistic strategies in the computer grading of essays," Coling 1967: Conférence Internationale sur le Traitement Automatique des Langues, Grenoble, France, August 1967. The same year he also published "The imminence of grading essays by computer," Phi Delta Kappan, 47 (1967) 238-243. The following year he published, with Dieter H. Paulus, The analysis of essays by computer (Final report, Project No. 6-1318). Washington, D. C.: Department of Health, Education, and Welfare; Office of Education; Bureau of Research. Also in 1968 he published his successful work with a program he called Project Essay Grade (PEG) in "The Use of the Computer in Analyzing Student Essays," International Review of Education, 14(3), 253-263. Page's work is considered the beginning of automated essay scoring, the development of which could not become cost-effective until computing became far cheaper and more pervasive in the 1990s.

Later at Duke University, Page renewed his development and research in automated scoring and, in 1993, formed Tru-Judge, Inc., anticipating the potential for commercial applications of the software. In 2002, in declining health, Page sold the intellectual property assets of Tru-Judge to Measurement Incorporated, an educational company that provides achievement tests and scoring services for state governments, other testing companies and various organizations and institutions.

The First Dictionary Based on Corpus Linguistics 1969

Houghton Mifflin of Boston published The American Heritage Dictionary of the English Language.

"The AHD broke ground among dictionaries by using corpus linguistics for compiling word-frequencies and other information. It took the innovative step of combining prescriptive information (how language should be used) and descriptive information (how it actually is used). The descriptive information was derived from actual texts. Citations were based on a million-word, three-line citation database [the Brown Corpus] prepared by Brown University linguist Henry Kucera" (Wikipedia article on The American Heritage Dictionary of the English Language, accessed 06-07-2010).

1970 – 1980

Speech Recognition Technology 1971

IBM’s first operational application of speech recognition enabled customer engineers servicing equipment to “talk” to and receive “spoken” answers from a computer that could recognize about 5,000 words.

Launching "Messages in a Bottle" into the Cosmic Ocean 1977

The Voyager Golden Records were included on the Voyager 1 and 2 spacecraft as a kind of time capsule intended to communicate a story of our world to extraterrestrials.

Each was a 12-inch gold-plated copper disk-shaped phonograph record containing sounds and images selected to portray the diversity of life and culture on Earth. The contents of the record were selected for NASA by a committee chaired by Carl Sagan of Cornell University. Sagan and associates assembled 115 images and a variety of natural sounds, such as those made by surf, wind and thunder, birds, whales, and other animals. To this they added musical selections from different cultures and eras, spoken greetings in fifty-five languages, and printed messages from President Jimmy Carter and U.N. Secretary General Kurt Waldheim.

Because it was believed that the Voyager spacecraft would not encounter another solar system for 40,000 years, the production of these records seems to have involved a naive faith in the permanent accessibility of analog data, and in the ability of such data to survive over extremely long periods of time.

"Each record is encased in a protective aluminum jacket, together with a cartridge and a needle. Instructions, in symbolic language, explain the origin of the spacecraft and indicate how the record is to be played. The 115 images are encoded in analog form. The remainder of the record is in audio, designed to be played at 16-2/3 revolutions per minute. It contains the spoken greetings, beginning with Akkadian, which was spoken in Sumer about six thousand years ago, and ending with Wu, a modern Chinese dialect. Following the section on the sounds of Earth, there is an eclectic 90-minute selection of music, including both Eastern and Western classics and a variety of ethnic music. Once the Voyager spacecraft leave the solar system (by 1990, both will be beyond the orbit of Pluto), they will find themselves in empty space. It will be forty thousand years before they make a close approach to any other planetary system. As Carl Sagan has noted, 'The spacecraft will be encountered and the record played only if there are advanced spacefaring civilizations in interstellar space. But the launching of this bottle into the cosmic ocean says something very hopeful about life on this planet'" (http://voyager.jpl.nasa.gov/spacecraft/goldenrec.html, accessed 02-27-2011).

1980 – 1990

Keyboarding over 350,000,000 Characters 1983

Work began on computerizing the text of the Oxford English Dictionary, defining "414,825 words backed by five million quotations, of which some two million were actually printed in the dictionary text." This required retyping the entire text into a database.

"And so the New Oxford English Dictionary (NOED) project began. More than 120 keyboarders of International Computaprint Corporation in Tampa, Florida, and Fort Washington, Pennsylvania, USA, started keying in over 350,000,000 characters, their work checked by 55 proof-readers in England. Retyping the text alone was not sufficient; all the information represented by the complex typography of the original dictionary had to be retained, which was done by marking up the content in SGML. A specialized search engine and display software were also needed to access it. Under a 1985 agreement, some of this software work was done at the University of Waterloo, Canada, at the Centre for the New Oxford English Dictionary, led by F.W. Tompa and Gaston Gonnet; this search technology went on to become the basis for the Open Text Corporation. Computer hardware, database and other software, development managers, and programmers for the project were donated by the British subsidiary of IBM; the colour syntax-directed editor for the project, LEXX, was written by Mike Cowlishaw of IBM. The University of Waterloo, in Canada, volunteered to design the database."

The second edition of the OED was published on paper in 1989. 

The Perseus Digital Library Project 1985

The Perseus Digital Library Project began at Tufts University, Medford/Somerville, Massachusetts. Though the project was ostensibly about Greek and Roman literature and culture, it evolved into an exploration of the ways that digital collections could enhance scholarship with new research tools that took libraries and scholarship beyond the physical book.

"Since planning began in 1985, the Perseus Digital Library Project has explored what happens when libraries move online. Two decades later, as new forms of publication emerge and millions of books become digital, this question is more pressing than ever. Perseus is a practical experiment in which we explore possibilities and challenges of digital collections in a networked world.

"Our flagship collection, under development since 1987, covers the history, literature and culture of the Greco-Roman world. We are applying what we have learned from Classics to other subjects within the humanities and beyond. We have studied many problems over the past two decades, but our current research centers on personalization: organizing what you see to meet your needs.

"We collect texts, images, datasets and other primary materials. We assemble and carefully structure encyclopedias, maps, grammars, dictionaries and other reference works. At present, 1.1 million manually created and 30 million automatically generated links connect the 100 million words and 75,000 images in the core Perseus collections. 850,000 reference articles provide background on 450,000 people, places, organizations, dictionary definitions, grammatical functions and other topics."

WordNet Begins 1985

In 1985 psychologist and cognitive scientist George A. Miller and his team at Princeton began development of WordNet, a lexical database for the English language.

WordNet

"groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets. The purpose is twofold: to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications" (Wikipedia article on WordNet).

You can browse WordNet at http://wordnet.princeton.edu/.

WordNet has been used for a number of different purposes in information systems, including word sense disambiguation, information retrieval, automatic text classification, automatic text summarization, and even automatic crossword puzzle generation.

Critique of Computational Linguistics 1987

Integrational linguist Roy Harris published The Language Machine.

"This volume completes the trilogy which began with The Language-Makers (1980) and The Language Myth (1981). The Language Machine examines the impact of the electronic computer on modern conceptions of language and communication. When Swift wrote Gulliver’s Travels the notion that a machine could handle language was an absurdity to be satirized. Descartes regarded it as foolish to suppose that a robot could ever be built that would answer questions. But today it is widely assumed that mechanical speech recognition and automatic translation will be commonplace in tomorrow’s technology. Underlying these assumptions is a subtle shift in popular and academic conceptions of what a language is. Understanding a sentence is treated as a computational process. This in turn contributes powerfully to accepting a mechanistic view of human intelligence, and to the insulation of language from moral values" (http://www.royharrisonline.com/linguistic_publications/The_Language-machine.html, accessed 07-23-2010).

Foundation of Computational Stylistics 1987

John Burrows published Computation into Criticism: A Study of Jane Austen's Novels and an Experiment in Method. This work, which showed that a quantitative study of function word use can reveal subtle and powerful patterns in language, founded computational stylistics, and pioneered the application of Principal Component Analysis to language data.

The Unicode Universal Character Set August 29, 1988

Joseph D. Becker of Xerox Corporation, Rochester, New York, Lee Collins (also at Xerox) and Mark Davis of Apple developed a universal character set. Becker coined the word "Unicode" to cover the project in his report, Unicode 88:

"1.1. Abstract

"This document is a draft proposal for the design of an international/multilingual text character coding system, tentatively called Unicode.

"Unicode is intended to address the need for a workable, reliable world text encoding. Unicode could be roughly described as 'wide-body ASCII' that has been stretched to 16 bits to encompass the characters of all the world's living languages. In a properly engineered design, 16 bits per character are more than sufficient for this purpose.

"In the Unicode system, a simple unambiguous fixed-length character encoding is integrated into a coherent overall architecture of text processing. The design aims to be flexible enough to support many disparate (vendor-specific) implementations of text processing software.

"A general scheme for character code allocations is proposed (and materials for making specific individual character code assignments are well at hand), but specific code assignments are not proposed here. Rather, it is hoped that this document will evoke interest from many organizations, which could cooperate in perfecting the design and in determining the final character code assignments" (http://www.unicode.org/history/unicode88.pdf, accessed 01-29-2010).
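The draft's assumption that 16 bits per character would suffice can be checked against today's encodings with a small sketch. The helper name `utf16_code_units` is hypothetical; the example relies only on Python's built-in `utf-16-be` codec and on the fact that U+13000 (an Egyptian hieroglyph) lies outside the first 65,536 code points, so UTF-16 represents it with two 16-bit units (a surrogate pair) rather than one.

```python
def utf16_code_units(ch):
    """Return the 16-bit code units used to encode one character in UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


latin_a = "A"               # U+0041, fits in a single 16-bit unit
hieroglyph = "\U00013000"   # U+13000, beyond the original 16-bit range

print([hex(u) for u in utf16_code_units(latin_a)])
print([hex(u) for u in utf16_code_units(hieroglyph)])
```

The second character needing two code units shows why the "simple unambiguous fixed-length" 16-bit design described in the 1988 draft did not survive unchanged once the repertoire grew.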

1990 – 2000

The Unicode Standard: Now 107,000 Characters in 90 Scripts October 1991

The first volume of the Unicode standard 1.0 was published by the Unicode Consortium, Mountain View, California.

"Unicode is a computing industry standard allowing computers to consistently represent and manipulate text expressed in most of the world's writing systems. Developed in tandem with the Universal Character Set standard and published in book form as The Unicode Standard, the latest version [5.2, 2009] of Unicode consists of a repertoire of more than 107,000 characters covering 90 scripts [including Egyptian hieroglyphs], a set of code charts for visual reference, an encoding methodology and set of standard character encodings, an enumeration of character properties such as upper and lower case, a set of reference data computer files, and a number of related items, such as character properties, rules for normalization, decomposition, collation, rendering, and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as Arabic or Hebrew, and left-to-right scripts)" (Wikipedia article on Unicode, accessed 01-29-2010).

Development of Neural Networks 1993

Psychologist, neuroscientist and cognitive scientist James A. Anderson of Brown University, Providence, RI, published "The BSB Model: A simple non-linear autoassociative network," M. Hassoun (Ed), Associative Neural Memories: Theory and Implementation (1993).  Anderson's neural networks were applied to models of human concept formation, decision making, speech perception, and models of vision.

Anderson, J. A., Spoehr, K. T. and Bennett, D.J.  "A study in numerical perversity: Teaching arithmetic to a neural network,"  D.S. Levine and M. Aparicio (Eds.) Neural Networks for Knowledge Representation and Inference, (1994).

Statistical Machine Translation 1993

Peter F. Brown and colleagues at IBM's Thomas J. Watson Research Center, Yorktown Heights, NY, published "The Mathematics of Statistical Machine Translation: Parameter Estimation," Computational Linguistics, 19 (2) 263-311:

"We describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another. We define a concept of word-by-word alignment between such pairs of sentences. For any given pair of such sentences each of our models assigns a probability to each of the possible word-by-word alignments. We give an algorithm for seeking the most probable of these alignments. Although the algorithm is suboptimal, the alignment thus obtained accounts well for the word-by-word relationships in the pair of sentences. We have a great deal of data in French and English from the proceedings of the Canadian Parliament. Accordingly, we have restricted our work to these two languages; but we feel that because our algorithms have minimal linguistic content they would work well on other pairs of languages. We also feel, again because of the minimal linguistic content of our algorithms, that it is reasonable to argue that word-by-word alignments are inherent in any sufficiently large bilingual corpus."

"The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the ideas of applying Claude Shannon's information theory. Statistical machine translation was re-introduced in 1991 by researchers at IBM's Thomas J. Watson Research Center and has contributed to the significant resurgence in interest in machine translation in recent years. Nowadays it is by far the most widely-studied machine translation method" (Wikipedia article on Statistical machine translation, accessed 05-14-2010).
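The notion of estimating word-by-word alignments from sentence pairs alone can be sketched in a few lines. What follows is a simplified illustration in the spirit of the simplest of the five models (commonly called IBM Model 1), trained with expectation-maximization; it omits the NULL word and everything the later models add, the function names are hypothetical, and the two-sentence "corpus" is invented for the example.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate word translation probabilities t[(f, e)] with EM."""
    f_vocab = {f for fs, _ in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform starting point
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: spread each French word's mass over the English words
        # of the same sentence pair, in proportion to the current t.
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def align(fs, es, t):
    """Link each French word to its most probable English word."""
    return [(f, max(es, key=lambda e: t[(f, e)])) for f in fs]


# Invented toy corpus of (French, English) sentence pairs.
pairs = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]
t = train_model1(pairs)
print(align(["la", "maison"], ["the", "house"], t))
```

No dictionary or grammar is supplied: the link between "la" and "the" emerges only because the two words co-occur across sentence pairs, which is the sense in which such algorithms have "minimal linguistic content".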

Speech Recognition Technology from 6,700 Characters 1996

IBM introduced continuous speech recognition technology for Mandarin Chinese. In developing the product, researchers identified and classified thousands of vocal tones and homonyms, created an algorithm that deconstructed syllables into parts, and developed a new language model to transform spoken words into the right combination drawn from 6,700 Chinese characters.

IBM also announced software that gave people a hands-free way to dictate text and navigate the desktop with the power of natural speech.

Using Neural Networks for Word Sense Disambiguation 1998

Cognitive scientist / entrepreneur Jeffrey Stibel, physicist, psychologist, neural scientist James A. Anderson, and others from the Department of Cognitive and Linguistic Sciences at Brown University created a word sense disambiguator using George A. Miller's WordNet lexical database.

Stibel and others applied this technology in Simpli, "an early search engine that offered disambiguation to search terms. A user could enter in a search term that was ambiguous (e.g., Java) and the search engine would return a list of alternatives (coffee, programming language, island in the South Seas)."

"The technology was rooted in brain science and built by academics to model the way in which the mind stored and utilized language."

"Simpli was sold in 2000 to NetZero. Another company that leveraged the Simpli WordNet technology was purchased by Google and they continue to use the technology for search and advertising under the brand Google AdSense.

"In 2001, there was a buyout of the company and it was merged with another company called Search123. Most of the original members joined the new company. The company was later sold in 2004 to ValueClick, which continues to use the technology and search engine to this day" (Wikipedia article on Simpli, accessed 05-10-2009).

2000 – 2005

OED Online March 14, 2000

An old interface of the Oxford Dictionary Online where users could subscribe to the online dictionary

The Oxford English Dictionary Online (OED Online) became available to subscribers.

ECHO (European Cultural Heritage Online) is Founded December 1, 2002

On December 1, 2002 the ECHO initiative was announced in Berlin.  Funded by the European Commission, it was founded by the Max Planck Institute for the History of Art in Rome, by the Max Planck Institute for Psycholinguistics in Nijmegen, and by the Max Planck Institute for the History of Science in Berlin, together with their international partners.

"The new European Commission-funded project ECHO (European Cultural Heritage Online) to create an IT-based infrastructure for the humanities is taking shape today with its kick-off-meeting held in Berlin. With a budget of approximately 1.6 million Euros 16 partners from 9 European countries including candidate countries together with their subcontractors, the initiative aims at achieving four major goals, scientific, technological, cultural and political, until May 2004:  

"By 1) improving the situation for the humanities concerning the new information technologies through

"2) the fostering of a new IT-based infrastructure, adequate to future information technologies,

"3) cultural heritage in Europe will be brought online and

"4) be made freely accessible without any commercial constraints.

"The project, coordinated by the Max Planck Institute for the History of Science in Berlin, is highly welcomed by the EU commission as a chance to strengthen the competitiveness of European research by promoting an urgently needed concept for good practice in scholarly research in the humanities. In order to exploit the innovative potential of the new information technologies, the project will contribute to overcome the present fragmentation of approaches to transfer cultural heritage to the Internet.

"At present Europe lags behind in developing a large-scale infrastructure for the humanities adequate to the Internet age and competitive with similar ventures in the US. As a Europe-wide effort, ECHO aims at developing high-quality research in line with the ambition of the European Research Area and competitive with US and Japanese ventures. Only by overcoming the limitations of national perspectives can the critical mass be brought together that ensures the self-organisation of culture in the new medium.  

"If the new media comprises an adequate representation of human cultural diversity they can offer also new opportunities reflecting on possible links and similarities e.g. between European and non-European cultures. A culturally informed Web may thus even constitute a public think-tank in which cultural diversity drives rather than conflicts with communication.  

"The ECHO project is constituted by its main partners as well as by subcontractors. Even now, however, the informal network of actors willing to contribute extends far beyond the group of applicants. Some 25 academic, governmental, and private institutions from 15 European and 3 non-European countries (China, Mexico, and the USA) have declared their adherence to the project; they will be contacted during its first phase.  

"The single most important added European value offered by the project to the citizens of Europe is a contribution to the preservation of, and an improved and extended access to, their own European cultural heritage. Its enhanced availability on the Internet will also create new opportunities for shaping a polyvalent European identity, including a realisation of the non-European origins of essential presuppositions of European culture as well as an awareness of its historical pitfalls. Border-crossing technologies such as language tools adapted to cultural sources contribute to European integration by making these treasures accessible to all Europeans (e-Europe). ECHO will provide web-accessible multimedia content together with navigation facilities, hence making it attractive for researchers, teachers, students, journalists, and also for the general public.  

"In addition, the ECHO project will be directly concerned with copyright laws and open source policies. It will provide an opportunity for reflecting on the ongoing developments from a practical point of view and may lead to the definition of new policies encouraging the transfer of cultural heritage to the existing and new media.  

"The project is defined in three major steps.  

"• An assessment of the present situation in relation to bringing European cultural heritage online. In view of the fragmentation of endeavours presently undertaken, it is necessary to assess the implementation of Information Technology for preserving, sharing, and studying this heritage in different disciplines and nations.

"• The exploration of a novel IT-based cooperative research infrastructure. The project will create, within its limited scope, a model implementation of a new cooperative research infrastructure, that aims at mobilising and bringing together all relevant actors (universities, museums, libraries, archives, (national) research councils, digital heritage organisations, and companies) in the broad field of the humanities and cultural heritage in Europe.

"• A paradigmatic proof of the new potentials for research offered by this infrastructure. By taking up four paradigmatic content areas in the humanities, from the history of art, the history of science, language studies, and social and cultural anthropology, respectively, the project aims at demonstrating the innovative potential for research offered by this infrastructure.

"The highly ambitious ECHO project aims at the creation of a progressively growing agora, defining the management structure, data formats, tools and workflows. This in turn is intended to serve as a model for a larger-scale network within the 6th Framework Program of the EU. The subsequent project, possibly labelled ECHO 2, shall bring a major contribution to the preservation of Europe's cultural heritage as well as improved and extended access to this heritage for both scholars and the general public alike. This transformation of the Internet into a semantic web allowing the exchange and processing of information in the language of human culture within an emerging Open Library will serve as a framework for cooperative work on the sources and for the presentation of its results. It will also show socio-economic effects such as becoming a central resource of technology for storing and distributing information for institutions who lack such means; or for creating a basis for virtual tourism into the digitised realm of our rich cultural heritage in Europe." 

2005 – 2010

IBM Begins Development of the Watson Question Answering System 2007

David Ferrucci, leader of the Semantic Analysis and Integration Department at IBM’s Watson Research Center, Yorktown Heights, New York, and his team began development of Watson, a special-purpose computer system designed to push the envelope on deep question answering, deep analytics, and the computer's understanding of natural language.

Second Life is Used for Teaching Foreign Languages July 2007

According to an article in LeMonde.fr the virtual reality site, Second Life, was being used for teaching foreign languages.

The World Wide Telecom Web for Illiterate Populations August 2007

Arun Kumar and others at IBM Research - India, New Delhi, published "WWTW: The World Wide Telecom Web," describing an Internet designed for illiterate populations:

"our vision of a voice-driven ecosystem parallel to that of the WWW. WWTW is a network of interconnected voice sites that are voice driven applications created by users and hosted in the network. It has the potential to enable the underprivileged population to become a part of the next generation converged networked world. We present a whole gamut of existing technology enablers for our vision as well as present research directions and open challenges that need to be solved to not only realize a WWTW but also to enable the two Webs to cross leverage each other."

Towards the Open Advancement of Question Answering Systems April 22, 2009

David Ferrucci, leader of the Semantic Analysis and Integration Department at IBM's T. J. Watson Research Center, Yorktown Heights, New York, Eric Nyberg, and several co-authors published IBM Research Report: Towards the Open Advancement of Question Answering Systems.

Section 4.2.3. of the report includes an analysis of why the television game show Jeopardy! provides a good model of the semantic analysis and integration problem.

IBM's Watson Question Answering System Challenges Humans at Jeopardy April 27, 2009

IBM's Watson Question Answering (QA) System was to challenge humans on the television quiz show Jeopardy!

"IBM is working to build a computing system that can understand and answer complex questions with enough precision and speed to compete against some of the best Jeopardy! contestants out there.

"This challenge is much more than a game. Jeopardy! demands knowledge of a broad range of topics including history, literature, politics, film, pop culture and science. What's more, Jeopardy! clues involve irony, riddles, analyzing subtle meaning and other complexities at which humans excel and computers traditionally do not. This, along with the speed at which contestants have to answer, makes Jeopardy! an enormous challenge for computing systems. Code-named 'Watson' after IBM founder Thomas J. Watson, the IBM computing system is designed to rival the human mind's ability to understand the actual meaning behind words, distinguish between relevant and irrelevant content, and ultimately, demonstrate confidence to deliver precise final answers.

"Known as a Question Answering (QA) system among computer scientists, Watson has been under development for more than three years. According to Dr. David Ferrucci, leader of the project team, 'The confidence processing ability is key to winning at Jeopardy! and is critical to implementing useful business applications of Question Answering.'

"Watson will also incorporate massively parallel analytical capabilities and, just like human competitors, Watson will not be connected to the Internet, or have any other outside assistance.  

"If we can teach a computer to play Jeopardy!, what could it mean for science, finance, healthcare and business? By drastically advancing the field of automatic question answering, the Watson project's ultimate success will be measured not by daily doubles, but by what it means for society" (http://www.research.ibm.com/deepqa/index.shtml, accessed 06-16-2010).

On June 16, 2010 The New York Times Magazine published a long article by Clive Thompson on IBM's Watson's challenge of humans in Jeopardy! entitled, in the question response language of Jeopardy!, "What is I.B.M.'s Watson?"

♦ Link to FAQs concerning Watson and Jeopardy! on IBM's website, accessed 02-08-2011: http://www.research.ibm.com/deepqa/faq.shtml.

Wolfram|Alpha is Launched May 16, 2009

Stephen Wolfram and Wolfram Research, Champaign, Illinois, launched Wolfram|Alpha, a computational data engine with a new approach to knowledge extraction, based on natural language processing, a large library of algorithms and an NKS (New Kind of Science) approach to answering queries.

The Wolfram|Alpha engine differs from traditional search engines in that it does not simply return a list of results based on a query, but instead computes an answer.

Algorithm to Decipher Ancient Texts September 2, 2009

"Researchers in Israel say they have developed a computer program that can decipher previously unreadable ancient texts and possibly lead the way to a Google-like search engine for historical documents.

"The program uses a pattern recognition algorithm similar to those law enforcement agencies have adopted to identify and compare fingerprints.

"But in this case, the program identifies letters, words and even handwriting styles, saving historians and liturgists hours of sitting and studying each manuscript.

"By recognizing such patterns, the computer can recreate with high accuracy portions of texts that faded over time or even those written over by later scribes, said Itay Bar-Yosef, one of the researchers from Ben-Gurion University of the Negev.

" 'The more texts the program analyses, the smarter and more accurate it gets,' Bar-Yosef said.

"The computer works with digital copies of the texts, assigning number values to each pixel of writing depending on how dark it is. It separates the writing from the background and then identifies individual lines, letters and words.

"It also analyses the handwriting and writing style, so it can 'fill in the blanks' of smeared or faded characters that are otherwise indiscernible, Bar-Yosef said.

"The team has focused their work on ancient Hebrew texts, but they say it can be used with other languages, as well. The team published its work, which is being further developed, most recently in the academic journal Pattern Recognition due out in December but already available online. A program for all academics could be ready in two years, Bar-Yosef said. And as libraries across the world move to digitize their collections, they say the program can drive an engine to search instantaneously any digital database of handwritten documents. Uri Ehrlich, an expert in ancient prayer texts who works with Bar-Yosef's team of computer scientists, said that with the help of the program, years of research could be done within a matter of minutes. 'When enough texts have been digitized, it will manage to combine fragments of books that have been scattered all over the world,' Ehrlich said" (http://www.reuters.com/article/newsOne/idUSTRE58141O20090902, accessed 09-02-2009).

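The pixel-level steps described in the report above (assigning each pixel a number by darkness, separating writing from background, then locating individual lines) can be illustrated with a minimal sketch. This is not the Ben-Gurion team's algorithm, which the article does not specify; the fixed threshold of 128, the convention that higher values mean darker pixels, and the names `binarize` and `find_text_lines` are all assumptions made for illustration.

```python
def binarize(pixels, threshold=128):
    """Mark a pixel as ink (1) when its darkness value reaches the threshold.

    Assumption: values run from 0 (light) to 255 (dark).
    """
    return [[1 if value >= threshold else 0 for value in row] for row in pixels]


def find_text_lines(binary):
    """Return (first_row, last_row) pairs for each run of rows containing ink."""
    lines = []
    start = None
    for index, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = index                      # a new line of writing begins
        elif not has_ink and start is not None:
            lines.append((start, index - 1))   # the line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(binary) - 1))
    return lines


# A hypothetical 5 x 4 "manuscript" with two bands of dark pixels.
page = [
    [10, 20, 15, 12],
    [200, 30, 220, 25],
    [210, 190, 40, 18],
    [12, 14, 11, 16],
    [30, 180, 175, 22],
]
ink = binarize(page)
text_lines = find_text_lines(ink)
print(text_lines)
```

A real system would presumably choose the threshold adaptively and go on to segment letters and words; the sketch stops at line detection.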

The First Historical Thesaurus October 2009

Oxford University Press published as a printed book the Historical Thesaurus of the Oxford English Dictionary with Additional Material from A Thesaurus of Old English, edited by Christian Kay, Jane Roberts, Michael Samuels, and Irene Wotherspoon.

Forty years in the making, this 4448-page work was the first historical thesaurus to be compiled for any language, and the first to include almost the entire vocabulary of English, from Old English to the present. It was also the largest thesaurus resource in the world, covering more than 920,000 words and meanings, based on the Oxford English Dictionary.

The Historical Thesaurus listed synonyms with their dates of first recorded use in English, in chronological order, with the earliest synonyms first. For obsolete words, the Thesaurus also included the date of last recorded use.

The work used a specially devised thematic system of classification. Its comprehensive index enabled complete cross-referencing of nearly one million words and meanings. It contained a comprehensive sense inventory of Old English and a fold-out color chart which showed the top levels of the classification structure. 


ICANN Will Allow Web Addresses in Non-Latin Alphabets October 30, 2009

The Internet Corporation for Assigned Names and Numbers (ICANN) voted to allow Web addresses written completely in Chinese, Arabic, Korean and other languages using non-Latin alphabets.

"The decision is a 'historic move toward the internationalization of the Internet,' said Rod Beckstrom, Icann’s president and chief executive. 'We just made the Internet much more accessible to millions of people in regions such as Asia, the Middle East and Russia.' 

"This change affects domain names — anything that comes after the dot, including .com, .cn or .jp. Domain names have been limited to 37 characters — 26 Latin letters, 10 digits and a hyphen. But starting next year, domain names can consist of characters in any language. In some Web addresses, non-Latin scripts are already used in the portion before the dot. Thus, Icann’s decision Friday makes it possible, for the first time, to write an entire Internet address in a non-Latin alphabet.  

"Initially, the new naming system will affect only Web addresses with 'country codes,' the designators at the end of an address name, like .kr (for Korea) or .ru (for Russia). But eventually, it will be expanded to all types of Internet address names, Icann said.

"Some security experts have warned that allowing internationalized domain names in languages like Arabic, Russian and Chinese could make it more difficult to fight cyberattacks, including malicious redirects and hacking. But Icann said it was ready for the challenge.  'I do not believe that there would be any appreciable difference,' Mr. Beckstrom said in an interview. 'Yes, maybe some additional potential but at the same time, some new security benefits may come too. If you look at the global set of cybersecurity issues, I don’t see this as any significant new threat if you look at it on an isolated basis.'  

"The decision, reached after years of testing and debate, clears the way for Icann to begin accepting applications for non-Latin domain names Nov. 16. People will start seeing them in use around mid-2010, particularly in Arabic, Chinese and other scripts in which demand for the new 'internationalized' domain name system has been among the strongest, Icann officials say. Internet addresses in non-Latin scripts could lead to a sharp increase in the number of global Internet users, eventually allowing people around the globe to navigate much of the online world using their native language scripts, they said.  

"This is a boon especially for users who find it cumbersome to type in Latin characters to access Web pages. Of the 1.6 billion Internet users worldwide, more than half use languages that have scripts that are not based on the Latin alphabet." (http://www.nytimes.com/2009/10/31/technology/31net.html?hp)

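The 37-character limit mentioned in the article, and the way a non-Latin name can still be carried by systems that expect those characters, can be sketched as follows. The article does not describe the encoding mechanism; the sketch assumes the ASCII-compatible "xn--" form produced by Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The helper names `is_ldh_label` and `to_ascii_label` are hypothetical.

```python
import string

# The 37 characters cited in the article: 26 Latin letters, 10 digits and a hyphen.
LDH_CHARACTERS = set(string.ascii_lowercase + string.digits + "-")


def is_ldh_label(label):
    """True when every character of the label is a letter, digit or hyphen."""
    return bool(label) and all(ch in LDH_CHARACTERS for ch in label.lower())


def to_ascii_label(label):
    """Encode one label to its ASCII-compatible form with the 'idna' codec."""
    return label.encode("idna").decode("ascii")


native = "bücher"                 # a label with one non-ASCII letter
encoded = to_ascii_label(native)  # ASCII-only form of the same label

print(is_ldh_label(native), is_ldh_label(encoded))
print(encoded)
```

Decoding reverses the step: `encoded.encode("ascii").decode("idna")` gives back the native spelling, so the restricted alphabet remains the form used on the wire while users see their own script.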

The Film "Avatar" and Visions of Reality, Virtual and Otherwise December 10, 2009

Avatar, an American science fiction epic film written and directed by film director, producer, screenwriter, editor, and inventor James Cameron, and starring Sam Worthington, Zoe Saldana, Sigourney Weaver, Michelle Rodriguez and Stephen Lang, was first released in London by Twentieth Century Fox, headquartered in Century City, Los Angeles.

"The film is set in the year 2154 on Pandora, a moon in the Alpha Centauri star system. Humans are engaged in mining Pandora's reserves of a precious mineral, while the Na'vi—a race of indigenous humanoids—resist the colonists' expansion, which threatens the continued existence of the Na'vi and the Pandoran ecosystem. The film's title refers to the genetically engineered bodies used by the film's characters to interact with the Na'vi.

"Avatar had been in development since 1994 by Cameron, who wrote an 80-page scriptment for the film. Filming was supposed to take place after the completion of Titanic, and the film would have been released in 1999, but according to Cameron, 'technology needed to catch up' with his vision of the film. In early 2006, Cameron developed the script, as well as the language and culture of the Na'vi. He said sequels would be possible if Avatar was successful, and in response to the film's success, confirmed that there will be another two.

"The film was released in traditional 2-D, as well as 3-D, RealD 3D, Dolby 3D, and IMAX 3D formats. Avatar is officially budgeted at $237 million; other estimates put the cost at $280–310 million to produce and $150 million for marketing. The film is being touted as a breakthrough in terms of filmmaking technology, for its development of 3D viewing and stereoscopic filmmaking with cameras that were specially designed for the film's production.

"Avatar premiered in London, UK on December 10, 2009, and was released on December 18, 2009 in the US and Canada to critical acclaim and commercial success. It grossed $27 million on its opening day domestically (in the United States and Canada) and $77 million domestically on its opening weekend. It opened two days earlier internationally and grossed $232 million worldwide in its first five days of international release. Within three weeks of its release, with a worldwide gross of over $1 billion, Avatar became the second highest-grossing film of all time worldwide, exceeded only by Cameron's previous film, Titanic" (Wikipedia article on Avatar (2009 film), accessed 01-16-2010).

♦ From my perspective the most significant aspect of Avatar, apart from its breathtaking computer graphic animation, and the fascinating artificial culture and language of the Na'vi, was the convincing portrayal of a total virtual reality experience, and the interplay between virtual reality, the reality of earth-born humans, some of whom animated the avatars, and the different reality of the Na'vi. The film presented visions of a reality that I could not have imagined before viewing. In its presentation of new views of reality it is reminiscent of the 1982 film, Blade Runner, directed by Ridley Scott.

Another highly timely aspect of the film was its depiction of the struggle between the destructive exploitation of natural resources and living in harmony with nature.


2010 – 2011

"The Never-Ending Language Learning System" January 2010

Supported by DARPA and Google, Tom M. Mitchell and his team at Carnegie Mellon University initiated the Never-Ending Language Learning System, or NELL, in an effort to develop a method for machines to teach themselves semantics, or the meaning of language.

"Few challenges in computing loom larger than unraveling semantics, understanding the meaning of language. One reason is that the meaning of words and phrases hinges not only on their context, but also on background knowledge that humans learn over years, day after day" (http://www.nytimes.com/2010/10/05/science/05compute.html?_r=1&hpw). 

"NELL has been in continuous operation since January 2010. For the first 6 months it was allowed to run without human supervision, learning to extract instances of a few hundred categories and relations, resulting in a knowledge base containing approximately a third of a million extracted instances of these categories and relations. At that point, it had improved substantially its ability to read three quarters of these categories and relations (with precision in the range 90% to 99%), but it had become inaccurate in extracting instances of the remaining fourth of the ontology (many had precisions in the range 25% to 60%).  

"The estimated precision of the beliefs it had added to its knowledge base at that point was 71%. We are still trying to understand what causes it to become increasingly competent at reading some types of information, but less accurate over time for others. Beginning in June, 2010, we began periodic review sessions every few weeks in which we would spend about 5 minutes scanning each category and relation. During this 5 minutes, we determined whether NELL was learning to read it fairly correctly, and in case not, we labeled the most blatant errors in the knowledge base. NELL now uses this human feedback in its ongoing training process, along with its own self-labeled examples. In July, a spot test showed the average precision of the knowledge base was approximately 87% over all categories and relations. We continue to add new categories and relations to the ontology over time, as NELL continues learning to populate its growing knowledge base" (http://rtw.ml.cmu.edu/rtw/overview, accessed 10-06-2010).


Google Introduces Translation Feature for Google Goggles May 6, 2010

Google announced a translation feature for Google Goggles, an image recognition and search feature available on Android-based mobile devices.

"Here’s how it works:

"Point your phone at a word or phrase. Use the region of interest button to draw a box around specific words Press the shutter button

"If Goggles recognizes the text, it will give you the option to translate

"Press the translate button to select the source and destination languages."

"Today Goggles can read English, French, Italian, German and Spanish and can translate to many more languages. We are hard at work extending our recognition capabilities to other Latin-based languages. Our goal is to eventually read non-Latin languages (such as Chinese, Hindi and Arabic) as well."


The First Internet Addresses in Non-Latin Characters May 6, 2010

"Three Mideast countries have become the first to get Internet addresses entirely in non-Latin characters.  

"Domain names in Arabic for Egypt, Saudi Arabia and the United Arab Emirates were added to the Internet's master directories on Wednesday, following final approval last month by the Internet Corporation for Assigned Names and Numbers, or ICANN. It's the first major change to the Internet domain name system since its creation in the 1980s.

"Registrations for websites to use those names are to begin soon. On Thursday, Egypt granted three companies approval to register names using the country's new Arabic suffix" (http://hosted.ap.org/dynamic/stories/M/ML_EGYPT_ARAB_DOMAIN_NAMES?SITE=AP&SECTION=HOME&TEMPLATE=DEFAULT, accessed 05-16-2010).


2011 – 2013

Voice-Activated Translation on Cell Phones January 12, 2011

Google introduced an improved Google Translate for Android that included Conversation Mode:

"This is a new interface within Google Translate that’s optimized to allow you to communicate fluidly with a nearby person in another language. You may have seen an early demo a few months ago, and today you can try it yourself on your Android device.  

"Currently, you can only use Conversation Mode when translating between English and Spanish. In conversation mode, simply press the microphone for your language and start speaking. Google Translate will translate your speech and read the translation out loud. Your conversation partner can then respond in their language, and you’ll hear the translation spoken back to you. Because this technology is still in alpha, factors like regional accents, background noise or rapid speech may make it difficult to understand what you’re saying. Even with these caveats, we’re excited about the future promise of this technology to be able to help people connect across languages" (http://googleblog.blogspot.com/2011/01/new-look-for-google-translate-for.html?utm_source=feedburner&utm_medium=email&utm_campaign=Feed:+blogspot/MKuf+(Official+Google+Blog), accessed 01-14-2011.


IBM's Watson Question Answering System Defeats Humans at Jeopardy! February 14 – February 16, 2011

IBM's Watson question answering system, a supercomputer developed at IBM's T. J. Watson Research Center, Yorktown Heights, New York, and running DeepQA software, defeated the two best human Jeopardy! players, Ken Jennings and Brad Rutter. Watson's hardware consisted of 90 IBM Power 750 Express servers. Each server utilized a 3.5 GHz POWER7 eight-core processor, with four threads per core. The system operated with 16 terabytes of RAM.

The success of the machine underlines very significant advances in deep analytics and the ability of a machine to process unstructured data, and especially to interpret and speak natural language.

"Watson is an effort by I.B.M. researchers to advance a set of techniques used to process human language. It provides striking evidence that computing systems will no longer be limited to responding to simple commands. Machines will increasingly be able to pick apart jargon, nuance and even riddles. In attacking the problem of the ambiguity of human language, computer science is now closing in on what researchers refer to as the “Paris Hilton problem” — the ability, for example, to determine whether a query is being made by someone who is trying to reserve a hotel in France, or simply to pass time surfing the Internet.  

"If, as many predict, Watson defeats its human opponents on Wednesday, much will be made of the philosophical consequences of the machine’s achievement. Moreover, the I.B.M. demonstration also foretells profound sociological and economic changes.  

"Traditionally, economists have argued that while new forms of automation may displace jobs in the short run, over longer periods of time economic growth and job creation have continued to outpace any job-killing technologies. For example, over the past century and a half the shift from being a largely agrarian society to one in which less than 1 percent of the United States labor force is in agriculture is frequently cited as evidence of the economy’s ability to reinvent itself.  

"That, however, was before machines began to 'understand' human language. Rapid progress in natural language processing is beginning to lead to a new wave of automation that promises to transform areas of the economy that have until now been untouched by technological change.  

" 'As designers of tools and products and technologies we should think more about these issues,' said Pattie Maes, a computer scientist at the M.I.T. Media Lab. Not only do designers face ethical issues, she argues, but increasingly as skills that were once exclusively human are simulated by machines, their designers are faced with the challenge of rethinking what it means to be human.  

"I.B.M.’s executives have said they intend to commercialize Watson to provide a new class of question-answering systems in business, education and medicine. The repercussions of such technology are unknown, but it is possible, for example, to envision systems that replace not only human experts, but hundreds of thousands of well-paying jobs throughout the economy and around the globe. Virtually any job that now involves answering questions and conducting commercial transactions by telephone will soon be at risk. It is only necessary to consider how quickly A.T.M.’s displaced human bank tellers to have an idea of what could happen" (John Markoff,"A Fight to Win the Future: Computers vs. Humans," http://www.nytimes.com/2011/02/15/science/15essay.html?hp, accessed 02-17-2011).

♦ As a result of this technological triumph, IBM took the unusual step of building a colorful website concerning all aspects of Watson, including numerous embedded videos.

♦ A few of many articles on the match published during or immediately after it included:

John Markoff, "Computer Wins on 'Jeopardy!': Trivial, It's Not," http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?hpw

Samara Lynn, "Dissecting IBM Watson's Jeopardy! Game," PC Magazine, http://www.pcmag.com/article2/0,2817,2380351,00.asp

John C. Dvorak, "Watson is Creaming the Humans. I Cry Foul," PC Magazine, http://www.pcmag.com/article2/0,2817,2380451,00.asp

Henry Lieberman published a three-part article in MIT Technology Review, "A Worthwhile Contest for Artificial Intelligence" http://www.technologyreview.com/blog/guest/26391/?nlid=4132

♦ An article which discussed the weaknesses of Watson versus a human in Jeopardy! was Greg Lindsay, "How I Beat IBM's Watson at Jeopardy! (3 Times)" http://www.fastcompany.com/1726969/how-i-beat-ibms-watson-at-jeopardy-3-times

♦ An opinion column emphasizing the limitations of Watson compared to the human brain was Stanley Fish, "What Did Watson the Computer Do?" http://opinionator.blogs.nytimes.com/2011/02/21/what-did-watson-the-computer-do/

♦ A critical response to Stanley Fish's column by Sean Dorrance Kelly and Hubert Dreyfus, author of What Computers Can't Do, was published in The New York Times at: http://opinionator.blogs.nytimes.com/2011/02/28/watson-still-cant-think/?nl=opinion&emc=tya1


The Impact of Automation on Legal Research March 4, 2011

"Armies of Expensive Lawyers Replaced by Cheaper Software," an article by John Markoff published in The New York Times, discussed the use of "e-discovery" (ediscovery) software which uses artificial intelligence to analyze millions of electronic documents from the linguistic, conceptual and sociological standpoint in a fraction of the time and at a fraction of the cost of the hundreds of lawyers previously required to do the task.

"These new forms of automation have renewed the debate over the economic consequences of technological progress.  

"David H. Autor, an economics professor at the Massachusetts Institute of Technology, says the United States economy is being 'hollowed out.' New jobs, he says, are coming at the bottom of the economic pyramid, jobs in the middle are being lost to automation and outsourcing, and now job growth at the top is slowing because of automation.  

" 'There is no reason to think that technology creates unemployment,' Professor Autor said. 'Over the long run we find things for people to do. The harder question is, does changing technology always lead to better jobs? The answer is no.'

"Automation of higher-level jobs is accelerating because of progress in computer science and linguistics. Only recently have researchers been able to test and refine algorithms on vast data samples, including a huge trove of e-mail from the Enron Corporation. 

" 'The economic impact will be huge,' said Tom Mitchell, chairman of the machine learning department at Carnegie Mellon University in Pittsburgh. 'We’re at the beginning of a 10-year period where we’re going to transition from computers that can’t understand language to a point where computers can understand quite a bit about language.' "


Google Processes 1,000,000,000 Search Queries Per Day March 5, 2011

In March 2011 Google processed 1,000,000,000 search queries per day.

" . . . the future of search engines like Google and Microsoft’s Bing, according to computer scientists, will be to exploit advances in machine learning and language processing to become answer machines — to take a page from Watson, but as a consumer service. Both companies are already headed in that direction" (http://www.nytimes.com/2011/03/06/weekinreview/06lohr.html?pagewanted=2&hpw, accessed 03-06-2011)


100 Million Words Translated per Week by Google Translate December 8, 2011

According to an infographic released by Google, in December 2011 100 million words in 200 different languages were translated weekly by Google Translate. 


What Makes Spoken Lines in Movies Memorable April 30, 2012

Sentences that endure in the public mind are evolutionary success stories, comparing “the fitness of language and the fitness of organisms.” On April 30, 2012 Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee of the Department of Computer Science at Cornell University published "You had me at hello: How phrasing affects memorability," arXiv: 1203.6360v2 [cs.CL] 30 Apr 2012, (accessed 01-27-2013). Using the "memorable quotes" selected from the Internet Movie Database, or IMDb, and the number of times that a particular movie line appeared on the Internet, they compared the memorable lines to the complete scripts of the movies in which they appeared, about 1,000 movies in all.

"To train their statistical algorithms on common sentence structure, word order and most widely used words, they fed their computers a huge archive of articles from news wires. The memorable lines consisted of surprising words embedded in sentences of ordinary structure. 'We can think of memorable quotes as consisting of unusual word choices built on a scaffolding of common part-of-speech patterns,' their study said.  

"Consider the line 'You had me at hello,' from the movie 'Jerry Maguire.' It is, Mr. Kleinberg notes, basically the same sequence of parts of speech as the quotidian 'I met him in Boston.' Or consider this line from 'Apocalypse Now': 'I love the smell of napalm in the morning.' Only one word separates that utterance from this: 'I love the smell of coffee in the morning.'

"This kind of analysis can be used for all kinds of communications, including advertising. Indeed, Mr. Kleinberg’s group also looked at ad slogans. Statistically, the ones most similar to memorable movie quotes included 'Quality never goes out of style,' for Levi’s jeans, and 'Come to Marlboro Country,' for Marlboro cigarettes.  

"But the algorithmic methods aren’t a foolproof guide to real-world success. One ad slogan that didn’t fit well within the statistical parameters for memorable lines was the Energizer batteries catchphrase, 'It keeps going and going and going.'

"Quantitative tools in the humanities and the social sciences, as in other fields, are most powerful when they are controlled by an intelligent human. Experts with deep knowledge of a subject are needed to ask the right questions and to recognize the shortcomings of statistical models.  

“ 'You’ll always need both,' says Mr. [Matthew] Jockers, the literary quant. 'But we’re at a moment now when there is much greater acceptance of these methods than in the past. There will come a time when this kind of analysis is just part of the tool kit in the humanities, as in every other discipline' " (http://www.nytimes.com/2013/01/27/technology/literary-history-seen-through-big-datas-lens.html?pagewanted=2&_r=0&nl=todaysheadlines&emc=edit_th_20130127, accessed 01-27-2013).

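The observation that memorable lines pair unusual word choices with ordinary sentence structure can be illustrated with a toy calculation. This is not the Cornell group's model: the three-sentence background text stands in for the news-wire archive, the add-one smoothed unigram model is an assumed simplification, and the function name `surprisal` is hypothetical.

```python
import math
from collections import Counter

# Tiny stand-in for the news-wire archive; purely illustrative.
BACKGROUND = (
    "i love the smell of coffee in the morning . "
    "i met him in boston in the morning . "
    "the smell of coffee is in the kitchen ."
).split()

COUNTS = Counter(BACKGROUND)
TOTAL = sum(COUNTS.values())
VOCABULARY = len(COUNTS) + 1  # one extra slot for words never seen


def surprisal(sentence):
    """Average negative log-probability per word under an add-one unigram model."""
    words = sentence.lower().split()
    log_probs = [
        math.log((COUNTS[word] + 1) / (TOTAL + VOCABULARY)) for word in words
    ]
    return -sum(log_probs) / len(words)


memorable = surprisal("I love the smell of napalm in the morning")
ordinary = surprisal("I love the smell of coffee in the morning")
print(memorable > ordinary)
```

The two sentences share every word but one, so the difference in score comes entirely from "napalm" being absent from the background text while "coffee" is common in it.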