The Cainteoir Text-to-Speech Engine is split into several groups that relate to the different phases of the speech synthesis process. These phases are:

  1. Document Processing
  2. Text To Words
  3. Words To Phonemes
  4. Phoneme Morphology
  5. Phoneme Synthesis

Each phase is designed to run independently of the others. This allows, for example, the process to start or end at a phoneme transcription.
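As a rough illustration of running only part of the pipeline, the sketch below selects a contiguous range of phases. The `Phase` enum, `phase_names` table and `phases_to_run` function are hypothetical names made up for this example; they are not part of the Cainteoir Engine API.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names for illustration only; not the Cainteoir Engine API.
enum class Phase : std::size_t {
    DocumentProcessing,
    TextToWords,
    WordsToPhonemes,
    PhonemeMorphology,
    PhonemeSynthesis,
};

// Phase names in pipeline order, matching the list above.
constexpr std::array<const char *, 5> phase_names = {
    "Document Processing",
    "Text To Words",
    "Words To Phonemes",
    "Phoneme Morphology",
    "Phoneme Synthesis",
};

// Return the names of the phases from `first` to `last` (inclusive), in order.
// A reversed range yields an empty result.
std::vector<std::string> phases_to_run(Phase first, Phase last) {
    std::vector<std::string> selected;
    const auto begin = static_cast<std::size_t>(first);
    const auto end = static_cast<std::size_t>(last);
    for (std::size_t i = begin; i <= end && i < phase_names.size(); ++i) {
        selected.emplace_back(phase_names[i]);
    }
    return selected;
}
```

Under these assumptions, `phases_to_run(Phase::DocumentProcessing, Phase::WordsToPhonemes)` selects the first three phases, which corresponds to stopping once a phoneme transcription has been produced.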

The Cainteoir Engine currently uses the eSpeak API to handle phases 2 to 5. The intention is to implement these phases within the Cainteoir Engine itself, so that it can provide more advanced functionality than eSpeak does.


  1. Languages, Voices, Accents and Dialects.
  2. Voice Quality.