A study upon word spaces in the Voynich manuscript

This article is an attempt to answer the most basic of questions: are the spaces between Voynich words arbitrary or purposeful?

Despite the essential simplicity of this question, it’s still a burning issue. If we can prove that spaces are arbitrary, then it’s a push towards the theories that the text is encoded or gibberish. But if we can prove that the spaces are purposeful, that they separate words in the same way as our modern usage, then it’s a push towards a natural or artificial language.

But how can we prove this either way?

Are words “words”?

As I have argued before, the text of the manuscript is divided up into clearly defined word-like glyph groups (what we would call words if we could assign a sense unit to each glyph group). These glyph groups have a non-trivial internal structure which is manifest in the severe restrictions imposed upon the positioning of glyphs within the glyph groups. From now on I will refer to these glyph groups as “words” (I am not a fan of Stolfi’s terminology of token as I find it confuses people).

Voynichese has a very strict phonotactic structure for glyphs that appears to indicate that these words are assembled intentionally. They are bound together as if they were words.

We are used to the paradigm that words form a sentence with spaces between the words. The Voynich corpus (with the exception of labels, single words that are attached to images) appears to follow this paradigm (albeit with no punctuation). But it is possible that this is a deception. The spaces between words could be an encoded null character, or an arbitrary sp acet om akei t mo rediff cult for the uniniti atedto read.

If this were so, we would expect the words to have a low repeat value. Words would be broken up into sub-sections, or jumbled around, and this would mean that they would not repeat very often. On the other hand, if spaces are separating words, then we would expect words to be repeated throughout the corpus.

Knight and Reddy (What we know about the Voynich manuscript) prove that words are repeated throughout the VMS, and that furthermore the word frequency distribution of the manuscript follows Zipf’s law.

Furthermore, they note that Landini (2001) found that the corpus follows Zipf’s law of word lengths: there is an inverse relationship between the frequency and the length of a word.
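To make the Zipf claim concrete, here is a minimal sketch of how one might check it on a list of words. It is not the method used by Knight and Reddy or by Landini; the function name `zipf_slope` and the toy word list are my own assumptions. It fits a straight line to log(frequency) against log(rank), and a slope near -1 is what Zipf's law predicts.

```python
import math
from collections import Counter

def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    Hypothetical helper for illustration: a slope close to -1 is the
    classic Zipf signature.
    """
    counts = sorted(Counter(words).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy data (made up): frequencies 12, 6, 4, 3 are exactly 12 / rank.
toy_words = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
print(zipf_slope(toy_words))
```

On a real transcription you would first split each line into words (on the "." separator in EVA files) and feed the whole list in. A good fit only tells you the text is not purely random, nothing more.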

From a slightly different angle, let us look at how often labels repeat within the corpus, as this allows us to see if words are repeated through different contexts. If the labels truly function as “labels”, i.e. sense units denominating illustrations or objects, we would expect a fair number of them to be repeated within the main corpus. And indeed we do: MarcoP found that 70% of all labels appear within the main corpus (study here).
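The label check boils down to a set comparison. The sketch below is only an illustration of the idea, not MarcoP's actual procedure; `label_overlap` and the sample words are hypothetical.

```python
def label_overlap(labels, main_text_words):
    """Fraction of distinct labels that also occur in the main text.

    Hypothetical helper: both arguments are plain lists of transcribed words.
    """
    unique_labels = set(labels)
    if not unique_labels:
        return 0.0
    vocabulary = set(main_text_words)
    return len(unique_labels & vocabulary) / len(unique_labels)

# Made-up example: three of the four labels also occur in the running text.
labels = ["otedy", "okeey", "okal", "zzz"]
main_text = ["daiin", "okeey", "otedy", "chedy", "okal", "okeey"]
print(label_overlap(labels, main_text))
```

Counting distinct labels rather than label occurrences is a design choice; a study could reasonably do either.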

So we find that Voynich “words” obey frequency distribution laws; are repeated with a frequency which is normal for language; and furthermore that they are oft-repeated throughout different contexts.

These conclusions lead us towards the assertion that glyph groups are indeed words, and the spaces between said words are significant, serving to separate sense units.

 

How many glyphs are there in the Voynich alphabet?

Note: This page is a work in edit. Still fiddling. Comments and feedback more than welcome, they’re encouraged!

The very question itself is imbued with menace. Before we even get going, we have to first define the very semiotic basis of “glyph” within the manuscript. We first need to define a paradigm for what a glyph actually is.

Note: This page is mainly a compilation of work that is already out there. I wished to collate and to define the very basics of Voynichese before delving into some more complex topics, and to re-examine the assumptions that underpin all of our bigger theories. Most of this is NOT based on my own examination of Voynichese, but upon a compilation of what other people / working groups have observed, with sources, although I do attempt an analysis of certain combination glyphs further on. Certainly none of this is written “in stone” – the question, by its very nature, is subjective. And I can only work from previous work, from the transcription alphabets and the transcription corpus.

The alphabet of a language is the set of symbols, letters, or tokens (which in Voynichese are called glyphs) from which the strings of the language may be formed. The content strings (the signifier) formed from this alphabet are called words. A formal language is often defined by means of a formal grammar such as a regular grammar or context-free grammar, also called its formation rule.

Exploring evolving epizeuxis within the Voynich Manuscript text

There is a unique feature within the Voynich Manuscript, namely the occurrence of very similar repeated sequences of text. These are a progression from Timms Pairs, an effect defined as two very similar words appearing within the same paragraph, usually differing by an additional suffix or prefix. An evolving epizeuxis sequence, which I call a Jackson sequence, is a fragment of a sentence in which a word appears to be repeated several times with slight modifications. The effect has been described before, with D’Imperio and Currier providing the first comments on it. However, as far as I am aware there has been no attempt to develop or analyse the effect. This is not an attempt to formally describe the effect but a quick overview of the characteristics that form the phenomenon, plus a suggestion for automatically detecting these sequences in the transcription files. It builds up to a more formal description of what was driving the scribe who first penned this work.

Epizeuxis is a term from formal rhetoric which describes the rapid repetition of a single word with no other words in between, usually for the sake of emphasis. A classic example is from Macbeth: “O horror, horror, horror!”. There is, naturally enough, no term for repeating the same word with differences of spelling, as this makes no sense in natural language, outside of “stream of consciousness” writings which alliterate words in a quasi-poetic style, such as crumble trumble bumble mumble… Or, of course, one of those clever poems designed to show how difficult it is to speak English:

Just compare heart, beard, and heard,
Dies and diet, lord and word,
Sword and sward, retain and Britain.
(Mind the latter, how it’s written.) [Excerpted from here]

Notwithstanding that, the Voynich manuscript contains many examples of Jackson sequences, especially in the text heavy pages towards the back of the manuscript. They are infrequent or non-existent in the light illustrated pages (although Timms Pairs do appear there) but appear numerous times on the later pages. They appear much more often in Currier B pages, although this may just be because the text heavy pages are written in B.

Note: All EVA transcriptions are taken from the reading available in the VIB. However, the readings make more sense in the original document.

Let us look at a few examples:
<f112r.P.6;H> chedal.oteedy.okeey.qokeedy.olkeedy.oteey.oram
<f111r.P.3;H> dsheedy.lkeedy.chckhy.lchedy.qokeey.qokear.chal.qokeeas.cheokedy.sal.lokam

This example makes less sense in EVA, but if you start with the second word in this line and read along you can clearly see how the sequence evolves in a binary sequence. Words 1, 3 and 6 form what is essentially a Jackson sequence, as do words 2, 4, 5 and 7.

Here is another example from the same page. I have copied two lines here because there are some beautiful examples of Timms Pairs (qokeey) here!

In this example we see how okeeo morphs into olchedy, lchedy, qokeey, okeeedy, okain, followed by a repeated chedy.
<f111r.P.9;H> ycheeodai!n.okeeo.olchedy.lchedy.qokeey.okeeedy.okai!n.chedy.chedy.teey.dal.lam

There are many other such examples, but they are omitted here for brevity’s sake – a quick visual search on any of the text rich pages in Currier B will quickly bring your attention to them.

What we are clearly seeing here are words which are being repeated with modifications as the scribe writes. Duplicate pairs such as the chedy chedy cited in the above example can be dismissed as scribal errors, but this explanation does not explain away the nature of Jackson sequences.

Before delving into possible reasons for these sequences, is it possible to automatically detect them in the transcription files?

Well, the above sequences all share the same features. They are a succession of words, usually in a linear sequence, that are very similar. In a sentence where w x y z forms the Jackson sequence, the shorter of any two adjacent words will usually share at least 80% of its glyphs with its longer partner, and y, z will normally remove or insert the most visually striking glyph present or missing in w x.

The difference between words quickly forms a pattern. A word is generated. The next word either omits or adds a letter (or bigram): d becomes qo for example, usually at the front or back of the word. The subsequent word has a glyph modified (a becomes d, or ee becomes a benched gallows, for example), and if a prominent glyph is present this is dropped. The last word has any suffix dropped. Usually after four or five words the Jackson sequence is abandoned.

This rule suggests that Jackson sequences can be automatically found. However, the transcription files do make arbitrary differences between glyphs that can confuse the parser. For example, ch & ee are both visually very similar bigrams which should be treated as the same glyph by any parser.
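As a first stab at automatic detection, here is a minimal sketch of the 80% rule. It makes several assumptions that are mine rather than established practice: every EVA character counts as one glyph (so ch and ee are not merged), the "!" filler is stripped, only strictly adjacent words are compared (so interleaved sequences like words 1, 3, 6 are missed), and a run needs at least three words. The names `shared_fraction`, `find_runs` and `parse_line` are hypothetical.

```python
from collections import Counter

def shared_fraction(a, b):
    """Share of the shorter word's glyphs that also occur in the longer word."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0.0
    overlap = Counter(shorter) & Counter(longer)  # multiset intersection
    return sum(overlap.values()) / len(shorter)

def find_runs(words, threshold=0.8, min_len=3):
    """Return runs of adjacent words in which each neighbouring pair meets the threshold."""
    runs, current = [], []
    for word in words:
        if current and shared_fraction(current[-1], word) >= threshold:
            current.append(word)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = [word]
    if len(current) >= min_len:
        runs.append(current)
    return runs

def parse_line(line):
    """Drop the locus tag (<...>) and the '!' filler, then split on '.'."""
    text = line.split(">", 1)[-1].replace("!", "")
    return [w for w in text.strip().split(".") if w]

line = "<f112r.P.6;H> chedal.oteedy.okeey.qokeedy.olkeedy.oteey.oram"
print(find_runs(parse_line(line)))
```

On the f112r line this picks out the five words from oteedy to oteey. Note that exact duplicates such as chedy chedy also pass the test, so a real parser would want to treat those separately, and it would want a glyph-equivalence table before trusting the numbers.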

However, the discovery of a supposed pattern does not imply that there is a technical reason behind the formation of Jackson sequences. The short nature of these words – averaging 5.5 glyphs a word – means that removing or inserting just one or two glyphs is enough to meet, or come close to, the 80% rule.

The following possible reasons occur to me:

  1. This is an effect of any possible encryption process. For example, a similar effect would be found when a simple cipher is applied to similar words, e.g. the Atbash substitution cipher or the “pig latin” code.
  2. The lines are poetic in nature and we are seeing alliteration in action.
  3. The text is actually phonetic in nature and we are seeing similar sounding words, as in the extract from the poem above.
  4. The text is random, and this is an unintended effect created by the scribe. It would thus be the same phenomenon that produces Timms Pairs – the scribe, intending to produce natural language like text, is copying previous words and modifying them as he goes along to make them look different. This would explain the appearance or disappearance of the most visually striking glyphs.
  5. Anton Alipov makes an interesting suggestion below.
  6. Conjugation of verbs (see more below).

Comments welcome!

It’s worth mentioning that to counter balance the above, there are repeating sequences throughout the Voynich. The longest are as follows (from Petr Kazil) and are added here for future contemplation:

The text contains a sequence which is repeated not just twice, but four
or more times. Significantly, all of the occurrences are in ``Author B''
text:
                                       **************************
<f84r.10>
4OPS*89.9FCC89.4OFAEOE.ZC89.4OFC89.4OFCC89.4OFCC89.SC89.RAM.SC9.OPAR.8
<f43r.12>
8OR.ZOE.4OFOE.ZC89.4OPC89.4OFCC89.4OFO89.OFCC89.OPC89.ZC89.UP9.9P9.89.
<f75v.21>
4OFAN.OEZC*9.4OFAN.8AR.OE.ZC89.4OFC89.4OFCC89.4OPAR.OEZC89.OE89-
<f84r.3>
4OFCC9.8AR.ZC89.4OFC89.4OFCC89.4OFC89.SC89.OFAM.SC9.4OFC89.8AR.OEAO89-
                                           ***********************
<f84r.3>
4OFCC9.8AR.ZC89.4OFC89.4OFCC89.4OFC89.SC89.OFAM.SC9.4OFC89.8AR.OEAO89-
<f79v.12>
BZ89.OVS89.4OFC89.4OPCC89.4OFC89.4OEPC89.4OPC89.OF9-
<f77r.34>
4OFCC89.4OPC89.4OFCC89.4OFCC89.4OFCC9.RAM.AE-8SCCOE.SCC89.4OPAM.4OPCC89.4OPC
89.RAM-
<f83r.7> 2OEZC8.EZCC89.4CCC89.4OF9.O4OE.RZCC89.4OFC89.4OPCC89.4OPCC89-
<f84r.10>
4OPS*89.9FCC89.4OFAEOE.ZC89.4OFC89.4OFCC89.4OFCC89.SC89.RAM.SC9.OPAR.8

There are also a few twice-only repeats:

                                               ***********************
<f112v.33>
OR.SCCOR.OFCC89.4OFC89.4OFCC89.SC8AM.OFCCC89.OPAM.SCCFC9.SOE-
<f82v.5>
2OESC89.ESC89.8OEZC89.4OFAE.ZCX9.ZC9.PCCOE.OPCC89.4OFC89-4OPC9.4OFAE.ZCF9.4O
FAE.SC89.4OPAESC89-


                       ***********************
<f26r.4> 4OFC89.SCO2.9PC89.4OFC89.9PC89.SCFC89.8AM.O8AJ.2AE89-
<f81v.12>  4OE.OE.S89.ZC89.4OFC89.9PC89.SCPC89.EFC8C9.9PC89-
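Exact repeats like these are easy to hunt for mechanically. The sketch below is an assumption-laden illustration, not Petr Kazil's method: it indexes every run of n consecutive words by the locus it appears in and keeps only runs seen in more than one locus. It finds exact matches only, so the near-repeats above (4OFC89 against 4OPC89, say) would need a fuzzier comparison. The function name `repeated_ngrams` is hypothetical.

```python
from collections import defaultdict

def repeated_ngrams(lines, n=3):
    """Map each n-word sequence to the set of loci containing it, repeats only.

    `lines` maps a locus such as "f84r.3" to its dot-separated transcription.
    """
    seen = defaultdict(set)
    for locus, text in lines.items():
        # A trailing "-" marks the line end in this transcription; drop it.
        words = [w for w in text.rstrip("-").split(".") if w]
        for i in range(len(words) - n + 1):
            seen[tuple(words[i:i + n])].add(locus)
    return {gram: loci for gram, loci in seen.items() if len(loci) > 1}

lines = {
    "f84r.10": "4OPS*89.9FCC89.4OFAEOE.ZC89.4OFC89.4OFCC89.4OFCC89.SC89.RAM.SC9.OPAR.8",
    "f75v.21": "4OFAN.OEZC*9.4OFAN.8AR.OE.ZC89.4OFC89.4OFCC89.4OPAR.OEZC89.OE89-",
    "f84r.3": "4OFCC9.8AR.ZC89.4OFC89.4OFCC89.4OFC89.SC89.OFAM.SC9.4OFC89.8AR.OEAO89-",
}
for gram, loci in repeated_ngrams(lines).items():
    print(".".join(gram), sorted(loci))
```

With these three lines the only shared three-word run is ZC89.4OFC89.4OFCC89, which sits inside the starred region of the quoted extract.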

Later….

Anton Alipov suggests that:

Don’t know how in English with its quite simple grammar, but, for example, in Russian this kind of repetition well might be not for the sake of emphasis, but just in the regular course of declension. For example:

Косил косой косой косой.

Here we have three identically looking words, and the first word is also similar to three others. This is a valid sentence and it means: “The boss-eyed [person] mowed [something] with a crooked scythe”. The first word “косил” is past tense, masculine gender for the verb “косить” (to mow). The second word “косой” is a designation of a person (like a nickname) and actually means “one who is suffering from strabismus”. The fourth word “косой” is ablative case, singular number for the feminine gender noun “коса” (scythe). The third word “косой” is feminine gender, singular case adjective “косая” relating to the noun and thus put in ablative. Probably all four words share the common etymology, but actually they are all different in terms of meaning, except that the adjective “косая” and the nickname “косой” share the common meaning like “not straight”, and also the verb “косить” and the noun “коса” are semantically related: you usually mow (“косить”) with a scythe (“коса”) and, alternatively, what you usually do with the help of the scythe (“коса”) is that you mow (“косить”) with it.

Russian is not like the Voynich Manuscript in terms of abundance of such repetitions, but again this is a valid and natural linguistic example.

Anton’s comments also made me think of conjugation. Conjugation of verbs would show a similar effect, not in English (run, run, ran), but in most of the Romance languages or indeed in Latin itself, where each verb has four principal parts (e.g. currō, currere, cucurrī, cursus: to run, to race).

Is it worth trying to work out what the plants in the Voynich Manuscript are?

There are many “plants” (herbs if you will, although I doubt all of them are) in the Voynich Manuscript. Is it worthwhile trying to identify them?
For any identification attempt is a two edged sword that can easily lead us astray.
First off, we have to consider whether
a) the plants are drawn in the traditional sense, or
b) they are the result of an individual working off their own experience,
or c)…… that they don’t actually have a meaning.
If a),

then they are being copied from earlier sources, and hence will correspond to the bulk of the literary tradition in Europe. If we assume they are, then there will be many clues that give us access to their identifications as their use will be symbolised. Remember that there are many herbals in existence – most of them, as Don on the mailing list has been discovering, are just copies of earlier or contemporaneous works, following set patterns, even if the individual monastery did add commentaries to the “official” text.

People simply did not want innovation in their herbs – we are talking about medicine here. Without going deeply into the subject, the literary tradition of medicine was institutionalised, it was traditional. Herbals were part of a tradition from the past, based usually on the doctrine of signatures, medicine that was assumed to work, and nobody wanted to be the guinea pig for some quack with new ideas.
Herbals of the age followed the tradition. We obviously cannot know what local doctors (wise women or men, leeches, hedge magicians, call them what you will) knew or thought, for they left no written record, but it seems a safe bet that oral teaching would filter out from the monasteries, communicating their knowledge, and that this knowledge would be passed between villages and medics. We know that the common name for herbs changes drastically from region to region, even village to village in old England, but their essential purpose remains the same.
As an example, the Old English Herbarium, an Anglo-Saxon turn of the millennium work, is a herbal written in Old English in the continental style, translating the original continental works. However, most of the herbs depicted are unrecognisable, which led scholars to assume that the scribes who translated the work didn’t have access to any original illustrations (many of the herbs are, in any case, not native to the British Isles). The assumption was that the scribes had no real life models, and so after several editions of the work had been copied, the original illustrations had morphed unrecognisably. Not so: Voigts in her 1979 work proved that the herbs are depicted in their dried form, the only way that Brits would have had access to them (via trade to central and southern Europe), and far more useful a depiction to them than their fresh form. The scribes had kept the knowledge and power of the authoritative written text, but had changed the illustration to fit their needs.
But the symbolised “clues” are still there. Basilicia, adderwort, a herb assumed to protect against adders continues to have its association with the three snakes and so can be recognised. Adderwort without the snake & basilisk association serves no point!
So if we assume a), we can then go ahead and look for symbolic clues in the Voynich. Let us look at 49r. A plant with multi colour golden (well, reddish) bulbs and snakes around the roots. Ah ha! It’s Adderwort.
Or is it?
Well, adderwort traditionally has three snakes, not two as depicted in the Voynich. The snakes are usually called Eriseos, Stillatus & Hematites (or Crysofalus) according to Pollington, at least in the old English tradition, with their associated characteristics that give the plant its power (I skip over the details here). So why does the Voynich only have two? And are they really snakes? Where are their fangs, or the vertical stripes showing that these are indeed the poisontooth snakes of antiquity, the adder family?
So the symbology does not help us. Either the symbology is adhered to as per tradition, or it is thrown out of the window and a new schematic is inserted. We cannot pick on one half recognised detail and expand it to the rest of the material without proof.
Let’s consider b).

The Voynich is the work of someone not following the traditional patterns.

Well, in this case, we cannot assume. We must be sure. And how can we be sure if the text is not there to describe what we are seeing?
Ah ha! We think. This is a rose. No, replies the author, it’s a dog rose, or a badly drawn daisy. How dare you think it is a rose.
Ah ha! This is Adderwort. Look at the snakes. No, replies the author, for that is the medicine of the old guard, not the new exciting stuff I am developing and anyway those are worms showing that these flowers grow in the decay of waste, signifying a phoenix like revival from the ashes of our waste. Or whatever.
We cannot match these illustrations to plants, for the simple reason that the genre is just too large.
Yes, it looks like a red onion. But why should it be a red onion? It could be that the author is referring to a specific type of potato… no wait, potatoes came in later. You know what I mean. Maybe a fat carrot or any other tuber of a specific shape.
But there is a further problem with b). The fact that it doesn’t fit in with our accepted understanding of how later medieval medicine would work.
Early / middle medieval thought discarded original thought. Biblical teachings said that the Ancients possessed all knowledge as granted by God, and that human hubris had led to this information being lost. Therefore, there was no point in poking around thinking up new things for yourself, you had to rely on the teachings of the Ancients.

That’s not to say that people weren’t curious, of course they were. It’s to say that in “formal” discussion and argument, rhetoric based on the arguments of the ancients was standardised and would overturn any original thought, even when the ancient information was clearly wrong. There is a story that Aristotle claimed the honeybee has eight legs, when any fool can see that it only has six – but this was accepted as fact right up until the Renaissance!

Monasteries copied books because they, in some way, transmitted information as revealed by God in the past and it was their duty to do so. They modified the useful bits of them as they went along, but the essential knowledge was protected – it was their duty to protect the holy knowledge of times past, and of course, they believed implicitly in it.

That’s part of the reason Rudolph II was revered by the early European intellectual – he was the original Renaissance patron, hunting out new information. He was living right at the time when new access to information and greater literacy was starting to evolve thought into the Renaissance, but the old regime continued with their medieval mindset elsewhere. His Spanish Uncle for example was most dismissive of his nephew and his intellectual mindset – it wasn’t something that was “done”. The Italian princes had been doing it for years, by the way, but they were never Holy Roman Emperors – Rudolph main-streamed this rather eccentric pastime.

And look at Paracelsus. He is known now not for any innovation in medicine (his cures were as claptrap as the ones they were replacing) but because he broke with tradition and urged innovation, trial and error, experimentation and actually discarding old knowledge that didn’t lead anywhere. That’s why he was revolutionary. He was the first figure to become famous for such work, in the same way that his contemporaries such as Martin Luther would become famous for defying the Catholic Church. OK, neither of them was the first to advocate such a movement, but they were the first to actually create movements. Which, I understand, does not imply that the VM cannot have been an earlier attempt, some visionary who realised that medicine was claptrap and attempted to create his own medicine. But this is a circular argument – for since we cannot read the text, we return to the beginning of this argument!
But all this came after the VM, in the middle 16th century.
There is a c).

That the content in the book simply doesn’t lead anywhere. That the illustrator had access to herbals but no understanding (or interest) of medicine or their purpose, and so just used them as a basis for his work as he went along. Which explains why we only have two snakes instead of three, the illustrator was unaware of the significance of three snakes.

So….

No matter which of the three arguments we choose, there isn’t a lot of point in trying to identify the plants in the VM, since we know (after decades of trying) that they aren’t real life representations.

We can build up logical arguments pointing to this plant or the other, but we cannot be sure. We cannot know the true intention of the artist, because we have no textual confirmation. And so far, we have never been able (Prof Bax aside, ah hem) to use a plant ID to identify words.

A logical consideration of the Voynich Manuscript

A recent discussion with H.R. SantaColoma on the VMS discussion list about the mathematical standard of proof needed to prove the VM is a “forgery” set me to thinking – is it possible to construct a logical inductive argument to investigate the “reality” of the VMS, and hence to say what it is, on probability, after weighing the available evidence?

There is more than enough work “out there” on the web on the Voynich. In fact, every time I have an idea, I find that somebody else has already had that idea and has investigated it. Surely there is enough evidence out there to start getting a glimpse of the truth? We have over 100 years of very intelligent people looking, prodding and fiddling with the manuscript, and nobody knows what it is?

Even if nothing useful comes out of this little project, maybe it will help direct further research effort / or just be useful for the links within.

The full essay is here in pdf format.

Here’s a summary.

I want to test the following hypothesis, or its null (contrary) hypothesis.

  • H0 : The Voynich Manuscript contains understandable content.
  • H1 : The Voynich Manuscript does not contain understandable content.

Conclusions

You can read the full essay to see how I reach the following Conclusions. The number at the end of each Conclusion indicates the page number in the essay.

  • C1: The parchment of the manuscript is from the first half of the 15th century. 5
  • C2: The ink used on the parchment was available at this period in time. 5
  • C3: The ink used for the writing and the drawings is essentially the same. 5
  • C4: The ink used for the numbering of the quires / pages, and the Latin alphabet, differ from one another and the writing / drawing. 5
  • C5: The illustrations were sketched first and then the text added afterwards. 6
  • C6: The quire and page numbers do not correspond to the true pagination of the VM as indicated by the illustrations and text flow. 7
  • C7: There is no consensus on the Voynich alphabet. 8
    • Corollary to C7: Analysis using only transcripts must attempt to adjust for erroneous input. 9
  • C8: The VM is not written in a known writing system. 10
  • C9: The VM is not written in shorthand alone. 10
  • C10: The VM is not a natural language. 14
  • C11: The VM may be a restricted constructed language. 14
  • C12: The VM appears (remember C7!) to have a strong underlying pattern that hints at a generator. 14
  • C13: The VM was written in a fluent and confident manner. 15
  • C14: The VM is not a simple substitution cipher nor code. 16
  • C15: The VM is not written in a sophisticated cipher nor code. 16
  • C16: A simple and fast process to generate random text for the VM was available. 17

Looking at the above list of conclusions and applying to our Hypothesis test, what do we find?

I would say, having weighed over 100 years worth of study of the VM in my hand, that we have three possibilities:

  • The VM is written in a long lost script and language. In which case, unless we find more examples or a key, we’ll never know what it says.

  • The VM is written in an amazingly sophisticated code. In which case, it’s probably a modern day hoax.

  • The VM is gibberish.

So the null (contrary) hypothesis, H1, is to be favoured:

the VM does not contain understandable content.

Here’s a theory on what happened (just a possible solution).

Benford’s law as applied to the Voynich (paragraphs)

I was musing on the implementation of Zipf’s Law to the Voynich. Several people have carried out studies into this, and all come back saying it falls within parameters. I did my own Zipf study and generally got the same results.

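Now, Zipf’s law says that in a written language a word’s frequency is roughly inversely proportional to its frequency rank, so that frequency plotted against rank falls on a straight line on a log-log scale. The Voynich text does. Doesn’t prove anything other than the text is not purely random.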

For more, and a full explanation of how Zipf can be used to analyse the text, see Sravana Reddy and Kevin Knight in section 4 of their paper What We Know About The Voynich Manuscript.

But I then tried to turn Zipf around. After all, Zipf’s “law” is a close relative of Benford’s law, which states that in many large datasets the leading digits 1-9 follow a logarithmic distribution.

You might think that in any large dataset, 1 would appear as the leading digit 11% of the time (it being one of 9 possible digits). It usually doesn’t. It appears about 30% of the time. Strange, eh?

So the percentages of the times each number appears results in a chart like this:

1 should appear 30.1%, 2 17.6%, etc
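For anyone who wants the numbers behind the chart, Benford's law gives the expected share of leading digit d as log10(1 + 1/d). A minimal sketch (the function name `benford_expected` is my own):

```python
import math

def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9) under Benford's law."""
    if digit not in range(1, 10):
        raise ValueError("digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

for d in range(1, 10):
    # First two rows print 30.1% and 17.6%, matching the chart caption.
    print(d, f"{100 * benford_expected(d):.1f}%")
```

The nine shares add up to exactly 100%, since the logarithms telescope to log10(10).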

Does Benford’s Law apply for, let us say, paragraph length? If it does for word distribution, why not for paragraph length?
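One way to put the question to the test is to take the length of every paragraph, keep only its leading digit, and compare the observed shares with the Benford percentages. The sketch below assumes paragraph length is measured in words, which is my assumption; glyph counts would work the same way. The function name and the sample lengths are made up for illustration.

```python
from collections import Counter

def leading_digit_shares(values):
    """Observed share of each leading digit 1-9 among the positive integers in `values`."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    if not digits:
        return {d: 0.0 for d in range(1, 10)}
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Made-up paragraph lengths, in words.
paragraph_lengths = [12, 19, 105, 23, 7]
shares = leading_digit_shares(paragraph_lengths)
print(shares[1], shares[2], shares[7])
```

Bear in mind that Benford's law only tends to show up when the values span several orders of magnitude, so a manuscript whose paragraphs are all of similar length may fail the test for boring reasons.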

Positions of glyphs in the VM

Sean B Palmer wrote an interesting little app to measure the positions of VM glyphs within words. See it here.

Now, he measured glyph position within “words” using the VM 101 transcription. (See his page for details).

Sean explains that the chart must be interpreted as follows:

Explanation

It is well known that certain Voynich MS glyphs appear always in certain positions in words. The most obvious of these are 4, which always comes at the beginning of a word, and * and p, which always come at the end. In fact, in these cases the important unit appears to be the line rather than the word.

It has also been noticed that glyphs in general appear to be quite well ordered within words. This diagram is a measure of how strong that ordering is. A pure ordering would have only blue in the top right triangle (◥), and only red in the bottom left triangle (◣), a mirror image of the top right triangle. As we can see, this rule is very closely adhered to, meaning that most glyphs are in fact ordered within a word. There are, however, some notable exceptions.

For example, if you compare H to f, you will find that it gives a red square in a sea of blue, or vice versa. This means that although H tends to come later in a word than f normally does, when H and f appear together, then the H comes first. This may suggest some kind of connection between these glyphs.

Metric

“Comes before” and “comes after” are measured in a binary way, meaning that distance and number of occurrences are ignored. So for example “ab” ranks as “b-after-a” just as much as “acccbbb”. The strength or dullness of the colours indicates the frequency of the beforeness or afterness. So if one glyph comes before another 100%, it will be a bright colour. As you get towards 50%, it becomes dull. It becomes bright again as it moves to 0%, because this indicates the reverse: i.e. coming before another glyph 0% of the time is the same as coming after it 100% of the time.

So dullness means that the relative positioning between the two glyphs is quite balanced, that they tend not to be strictly ordered. Bright colours indicate strong ordering. Since most of the chart is strongly coloured, most glyph combinations are strongly ordered. An example of a dull squared glyph is 9, which tends overall, as an average, to come towards the end of words. But since its squares are more dull than its surroundings, it is more “moveable” than most other glyphs. Other relatively dull squared glyphs include y and e.
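To make the metric concrete, here is a minimal sketch of a binary "comes before" measure. It is not Sean's script: I am assuming that order is decided by the first occurrence of each glyph in a word, and that each character of the transcription is one glyph. The names `before_fractions` and `strength` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def before_fractions(words):
    """For each glyph pair (a, b) with a < b alphabetically, return the fraction of
    words containing both in which a's first occurrence precedes b's."""
    before = defaultdict(int)
    together = defaultdict(int)
    for word in words:
        first = {}
        for pos, glyph in enumerate(word):
            first.setdefault(glyph, pos)  # remember first position only
        for a, b in combinations(sorted(first), 2):
            together[(a, b)] += 1
            if first[a] < first[b]:
                before[(a, b)] += 1
    return {pair: before[pair] / together[pair] for pair in together}

def strength(fraction):
    """0.0 for a balanced pair (dull square), 1.0 for a strictly ordered one (bright)."""
    return abs(fraction - 0.5) * 2

fractions = before_fractions(["ab", "acccbbb", "ba"])
print(fractions[("a", "b")], strength(fractions[("a", "b")]))
```

As in the explanation above, "ab" and "acccbbb" each count once as a-before-b, regardless of distance or repetition.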

In some cases we find that the glyph contains many exceptions within its squares. A good example is i, which has plenty of red in the blue side, and blue in the red side. It is clear that the ordering of the glyphs in general follows the typology of the glyphs themselves. We start with the “gallows” characters such as g, then the variations of 1, followed by variations of c and finally variations of i. But i itself comes relatively early within words overall, whilst also showing one of the greatest number of exceptions. The typological ordering isn’t a strict rule, in any case, as glyphs such as 9 provide exceptions.

What glyph comes after another in the VM?

 
I’ve recreated this graph from his data independently, and they seem to match:

Fair enough, thought I. Can I compare this to English?

So I modified his scripts to accept English instead of VM and ran it on an article of some 2,400 words. (I lowercased it and ignored all punctuation / numbers). Here’s the result:

What comes where in English

Substantially different, as I supposed. What about Spanish?

Well, I grabbed a legal article (the Law on Printing for some reason), added in á,é,í,ó,ú,ñ to the mix and ran the script. I was surprised to see those white lines – but double checking, I realised that in 4784 words, neither K nor W appeared even once! Which is a salutary lesson on assuming that the rules of English apply to everyone else.

Spanish legal article
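The preprocessing for the English and Spanish runs (lowercase everything, throw away punctuation and numbers, allow the accented letters) can be sketched as follows. This is a reconstruction under my own assumptions, not the modified script itself; `letter_words`, `unused_letters` and the sample sentence are hypothetical.

```python
import re

SPANISH_EXTRAS = "áéíóúñ"

def letter_words(text, extra_letters=""):
    """Lowercase the text and return runs of letters, ignoring punctuation and digits.

    `extra_letters` should contain plain letters only, e.g. SPANISH_EXTRAS.
    """
    pattern = f"[a-z{extra_letters}]+"
    return re.findall(pattern, text.lower())

def unused_letters(words, alphabet):
    """Letters of `alphabet` that never occur in `words` (the white lines on the chart)."""
    seen = set("".join(words))
    return sorted(set(alphabet) - seen)

words = letter_words("El año 2015, ¡qué día!", SPANISH_EXTRAS)
print(words)
print(unused_letters(words, "abcdefghijklmnopqrstuvwxyz" + SPANISH_EXTRAS))
```

Running the second function before plotting would have flagged the missing K and W straight away.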

And finally, I tracked down an example of machine stenography, which as it’s based on verbal usage of English I thought would make a good comparison:

stenography example

Here’s all four tests together on one page:

All four trials together (original VM bottom left)

So… doesn’t look like any sort of alphabet. Anyone got any abbreviated texts out there with lots of repetition? 🙁

oMar9 – a label that pops up on several pages and leads to new labels

My attention was first drawn to oMar9 when I spotted it on f57/v, and I there casually labelled it as one of the four ordinal points on a compass.

f57/v – who are these old coots? NW,NE,SW,SE Greek gods of the winds?

But I then spotted the same label on the Rosette page pullout, in what looks to be a compass on the bottom left of the pullout.

The Rosette “compass” from bottom left

Here are the two labels side by side:

Image colours manipulated for easier comparison

I’ve now found the label next to a nymph at the top of f80r.

Third from right

Pedants will notice that in one of the labels, the gallows glyph varies. I tend to the school that says this extra little loop on the left stroke doesn’t matter and is just the scribe.

There is also confusion over the two middle letters between the gallows and the r. I cannot say if they are truly different glyphs, or the same glyphs which just run together. Can you?

But then the labels on f80r continue to repeat on subsequent pages:

Note oMaP top left (f80r)…
…and oMaP top left (f82r)

Looks like the same girl on both pages – long hair and a braid on top. The second from left nymph is labelled oMaR89… very nearly oMaR9.

And the labels continue – for example, some of these nymph labels are reused on f71v, etc.

 
