A logical consideration of the Voynich Manuscript

A recent discussion with H.R. SantaColoma on the VMS discussion list, about the mathematical standard of proof needed to show that the Voynich Manuscript (VM) is a “forgery”, set me thinking: is it possible to construct a logical inductive argument to investigate the “reality” of the VM, and hence to say what it most probably is, after weighing the available evidence?

There is more than enough work “out there” on the web on the Voynich. In fact, every time I have an idea, I find that somebody else has already had it and investigated it. Surely there is enough evidence out there to start getting a glimpse of the truth? We have had over 100 years of very intelligent people looking at, prodding and fiddling with the manuscript, and still nobody knows what it is?

Even if nothing useful comes out of this little project, maybe it will help direct further research effort, or just be useful for the links within.

The full essay is here in pdf format.

Here’s a summary.

I want to test the following hypothesis, H0, against its contrary, H1.

  • H0 : The Voynich Manuscript contains understandable content.
  • H1 : The Voynich Manuscript does not contain understandable content.


You can read the full essay to see how I reach the following conclusions. The number at the end of each conclusion indicates the page number in the essay.

  • C1: The parchment of the manuscript is from the first half of the 15th century. 5
  • C2: The ink used on the parchment was available at this period in time. 5
  • C3: The ink used for the writing and the drawings is essentially the same. 5
  • C4: The inks used for the quire / page numbering and for the Latin alphabet differ from one another and from the ink of the writing / drawings. 5
  • C5: The illustrations were sketched first and then the text added afterwards. 6
  • C6: The quire and page numbers do not correspond to the true pagination of the VM as indicated by the illustrations and text flow. 7
  • C7: There is no consensus on the Voynich alphabet. 8
    • Corollary to C7: Analysis using only transcripts must attempt to adjust for erroneous input. 9
  • C8: The VM is not written in a known writing system. 10
  • C9: The VM is not written in shorthand alone. 10
  • C10: The VM is not written in a natural language. 14
  • C11: The VM may be written in a restricted constructed language. 14
  • C12: The VM appears (remember C7!) to have a strong underlying pattern that hints at a generator. 14
  • C13: The VM was written in a fluent and confident manner. 15
  • C14: The VM is not written in a simple substitution cipher or code. 16
  • C15: The VM is not written in a sophisticated cipher or code. 16
  • C16: A simple and fast process to generate random text for the VM was available. 17
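To give a feel for what C12 and C16 have in mind, here is a minimal sketch of a table-based word generator. It is not the process described in the essay: the fragment tables, the function names `make_word` and `make_line`, and the three-part word structure are all invented for illustration. The point is only that combining a few short fragments at random gives word-like text with a strong, repetitive underlying pattern, and that doing so is quick.

```python
import random

# Hypothetical fragment tables, invented for this illustration only.
# They are not taken from the essay or from any VM transcript.
PREFIXES = ["qo", "o", "ch", "sh", ""]
MIDDLES = ["ke", "te", "a", "o"]
SUFFIXES = ["dy", "in", "ol", "y"]


def make_word(rng):
    """Build one pseudo-word by picking one fragment from each table."""
    return rng.choice(PREFIXES) + rng.choice(MIDDLES) + rng.choice(SUFFIXES)


def make_line(rng, n_words=8):
    """Build a line of n_words pseudo-words separated by spaces."""
    return " ".join(make_word(rng) for _ in range(n_words))


# A seeded generator makes the output repeatable from run to run.
rng = random.Random(0)
sample_line = make_line(rng)
print(sample_line)
```

With only 5 x 4 x 4 = 80 possible words, the output repeats itself heavily, which is the sort of regularity a statistical analysis of a transcript might pick up (bearing C7 in mind).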

Looking at the above list of conclusions and applying them to our hypothesis test, what do we find?

I would say, having weighed over 100 years’ worth of study of the VM in my hand, that we have three possibilities:

  • The VM is written in a long-lost script and language. In which case, unless we find more examples or a key, we’ll never know what it says.

  • The VM is written in an amazingly sophisticated code. In which case, it’s probably a modern-day hoax.

  • The VM is gibberish.

So the contrary hypothesis, H1, is to be favoured:

the VM does not contain understandable content.
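One way to make this kind of weighing explicit is to treat each conclusion as a piece of evidence that shifts the odds between the two hypotheses. The sketch below is not the method used in the essay, which argues informally. The function name `posterior_probability` is hypothetical, the pieces of evidence are assumed to be independent, and the ratios in the example are placeholders rather than estimates for any actual conclusion.

```python
from math import prod


def posterior_probability(prior_h1, likelihood_ratios):
    """Return P(H1) after weighing a list of likelihood ratios.

    Each ratio is P(evidence | H1) / P(evidence | H0). The pieces of
    evidence are assumed to be independent, which is a strong assumption.
    """
    if not 0.0 < prior_h1 < 1.0:
        raise ValueError("prior_h1 must be strictly between 0 and 1")
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Placeholder ratios only: they are NOT estimates for C1 to C16.
placeholder_ratios = [2.0, 1.5, 0.5]
p_h1 = posterior_probability(0.5, placeholder_ratios)
print(round(p_h1, 3))
```

A ratio above 1 pushes towards H1, a ratio below 1 pushes towards H0, and a ratio of exactly 1 leaves the odds unchanged. The hard part, of course, is agreeing on the ratios.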

Here’s a theory on what happened (just a possible solution).

