Graham Toal wrote:
My only claim to being *possibly* on-topic is the age of the article
(1974) and the fact that it inspired many of the early phoneme-driven
speech synthesizers (Votrax, etc.).
Does anyone have access to a suitably good engineering
library with a copy of:
McIlroy, M D, "Synthetic English Speech by Rule",
Bell Telephone Labs, CSTR #14, 1973 (though I have
also seen it referenced as 1974!)
or:
Ainsworth, W A, "A System for Converting English Text
to Speech", IEEE Trans Audio & Electroacoustics AU-21 #3
pp 288-290, 1973
The former is far more interesting to me than the
latter :-(
Just maybe I have either or both; I'll check when I get home.
Excellent! I would be surprised if you *did*. I have
pretty much resigned myself to a trip to the university's
engineering library (though I think I will wait until some
of this heat and humidity disappears...)
You know about the old post on net.sources?
Have a look at some of the stuff in here:
http://www.gtoal.com/wordgames/text2speech/
This seems to be the NRL ruleset embellished (for use
with a spell-checker? -- if so, you might want to look
at things like double metaphone for the approach you
are/were taking...)
It's the same vintage, may be of interest.
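(If it's the spell-checker angle you're after, the usual trick is to
collapse each word to a rough phonetic key so that near-misses hash
together. Just to show the shape of the idea -- this is a toy key
function, NOT the real Double Metaphone rules:)

/* Toy phonetic-key function, loosely in the spirit of metaphone:
 * collapse a word to a crude sound-alike key so a spell-checker can
 * bucket "foto" with "photo".  The rules are illustrative only --
 * this is NOT the real Double Metaphone table.
 */
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static void phonetic_key(const char *word, char *key, size_t keylen)
{
    size_t k = 0;
    char prev = '\0';

    for (size_t i = 0; word[i] && k + 1 < keylen; i++) {
        char c    = (char)toupper((unsigned char)word[i]);
        char next = (char)toupper((unsigned char)word[i + 1]);

        if (!isalpha((unsigned char)c))
            continue;

        /* a few digraph rules, purely as examples */
        if (c == 'P' && next == 'H')      { c = 'F'; i++; }  /* PH -> F         */
        else if (c == 'C' && next == 'K') { i++; }           /* CK -> K (below) */
        else if (c == 'G' && next == 'H') { i++; }           /* GH -> G         */

        if (c == 'C') c = 'K';                               /* hard C -> K     */

        /* drop vowels after the first letter; they carry little signal */
        if (i > 0 && strchr("AEIOUY", c))
            continue;

        /* collapse runs of the same letter */
        if (c == prev)
            continue;

        key[k++] = c;
        prev = c;
    }
    key[k] = '\0';
}

int main(void)
{
    char k1[16], k2[16];

    phonetic_key("photograph", k1, sizeof k1);
    phonetic_key("fotograf",   k2, sizeof k2);
    printf("%s %s\n", k1, k2);      /* both collapse to FTGRF */
    return 0;
}

Anything that keys a dictionary on that sort of signature will pull up
"photo" when the user types "foto", which is really all the
spell-checker wants from the rules.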
Yes, I have a version of the NRL code that I translated from SNOBOL
~25 years ago. But it has the same limitations (in terms of
pronunciation accuracy) as the original ruleset.
I was hoping a peek at McIlroy's and Ainsworth's rules
would shed some additional insights not immediately
discernible from the Elovitz et al. paper.
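(For anyone following along: an Elovitz-style rule is A[B]C=D --
rewrite fragment B as phoneme string D when it is preceded by context
A and followed by context C, with wildcard classes like # and ^ for
the contexts. Reduced to literal-only contexts, the matcher is just
this; the rules and names below are illustrative, not the actual NRL
table:)

/* Stripped-down sketch of NRL-style letter-to-sound matching.
 * Real rules (Elovitz et al.) use wildcard context classes (#, ^, :, ...);
 * here contexts are literal strings only, just to show the control flow.
 */
#include <stdio.h>
#include <string.h>

struct rule {
    const char *left;   /* required left context  ("" = don't care) */
    const char *body;   /* grapheme fragment to consume              */
    const char *right;  /* required right context ("" = don't care)  */
    const char *phone;  /* phoneme string to emit ("" = silent)      */
};

/* A handful of made-up example rules; longer fragments listed first. */
static const struct rule rules[] = {
    { "",  "igh", "", "AY" },
    { "",  "th",  "", "TH" },
    { "c", "k",   "", ""   },   /* the k in "ck" is silent */
    { "",  "a",   "", "AE" },
    { "",  "c",   "", "K"  },
    { "",  "t",   "", "T"  },
    /* ... a real table has a catch-all for every letter ... */
};

static void to_phonemes(const char *word)
{
    size_t pos = 0, len = strlen(word);

    while (pos < len) {
        const struct rule *hit = NULL;

        for (size_t r = 0; r < sizeof rules / sizeof rules[0]; r++) {
            const struct rule *ru = &rules[r];
            size_t bl = strlen(ru->body), ll = strlen(ru->left);

            if (strncmp(word + pos, ru->body, bl) != 0)
                continue;                               /* fragment  */
            if (ll > pos || strncmp(word + pos - ll, ru->left, ll) != 0)
                continue;                               /* left ctx  */
            if (strncmp(word + pos + bl, ru->right, strlen(ru->right)) != 0)
                continue;                               /* right ctx */
            hit = ru;
            break;
        }

        if (hit) {
            if (hit->phone[0])
                printf("%s ", hit->phone);
            pos += strlen(hit->body);
        } else {
            pos++;                /* no rule for this letter: skip it */
        }
    }
    putchar('\n');
}

int main(void)
{
    to_phonemes("thick");   /* prints: TH K   (no 'i' rule in the toy set) */
    to_phonemes("cat");     /* prints: K AE T                              */
    return 0;
}

The accuracy problems all live in the table, not in that loop -- which
is why I want to see what rules McIlroy and Ainsworth chose.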
More modern synthesizers suffer from big-time code bloat
(e.g., flite can easily grow to 10-20MB while executing;
festival ten times that...). For "doing things on the cheap"
you need to look back in time :-(
Also, I hacked the Navy code around a bit to make it more accurate
and to assist with using a large phonetic word list. And to
parameterize the tables from an editable data file rather than having
them hard-coded in the C source.
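(Nothing fancy on the data-file side -- one rule per line, fields
split by a delimiter, '-' for an empty field. A rough sketch of that
kind of loader; the format and names are just illustrative:)

/* Sketch of loading the rules from an editable text file instead of
 * compiling them into the C source.  One rule per line:
 *     left|body|right|phoneme        (use '-' for an empty field)
 * The layout and names here are illustrative, not the real format.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct rule { char *left, *body, *right, *phone; };

static char *field(const char *s)
{
    /* '-' in the file stands for an empty field (e.g. no context) */
    return strdup(strcmp(s, "-") ? s : "");
}

static struct rule *load_rules(const char *path, size_t *count)
{
    FILE *fp = fopen(path, "r");
    struct rule *tab = NULL;
    size_t n = 0;
    char line[128];

    if (!fp)
        return NULL;

    while (fgets(line, sizeof line, fp)) {
        char *l = strtok(line, "|");
        char *b = strtok(NULL, "|");
        char *r = strtok(NULL, "|");
        char *p = strtok(NULL, "|\r\n");

        if (!l || !b || !r || !p)
            continue;               /* skip blank or malformed lines */

        tab = realloc(tab, (n + 1) * sizeof *tab);
        tab[n].left  = field(l);
        tab[n].body  = field(b);
        tab[n].right = field(r);
        tab[n].phone = field(p);
        n++;
    }
    fclose(fp);
    *count = n;
    return tab;
}

A line like -|th|-|TH then takes the place of a hard-coded
initializer, so the ruleset can be tweaked without recompiling.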
I'd already done that. As well as shrinking the tables
considerably (I think my table is less than 2.5KB including
delimiters, pointers, etc.). There are also other efficiency
hacks you can do to speed up the searches, etc.
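(The big win is indexing the table by the first letter of the rule
fragment, so a lookup only walks that letter's slice of one packed
string. Something along these lines -- the delimiters, example rules,
and offsets are illustrative, not my actual encoding:)

/* Sketch of a packed rule table: every rule for a given letter is
 * stored back-to-back in one string, fields split by '/' and rules
 * by ';', with a 26-entry offset index so a lookup only scans that
 * letter's slice.  Delimiters, rules, and offsets are illustrative.
 */
#include <stdio.h>

/* left '/' body '/' right '/' phoneme ';'  -- empty field = no context */
static const char packed[] =
    "/a//AE;"             /* 'a' rules start at offset 0  */
    "/ch//CH;/c//K;"      /* 'c' rules start at offset 7  */
    "/th//TH;/t//T;";     /* 't' rules start at offset 21 */

/* offset of each letter's first rule in packed[]; -1 = no rules */
static const short index26[26] = {
     0, -1,  7, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
    -1, -1, -1, -1, -1, -1, 21, -1, -1, -1, -1, -1, -1,
};

/* Return the slice of rules for lowercase letter c, or NULL. */
static const char *rules_for(char c)
{
    if (c < 'a' || c > 'z' || index26[c - 'a'] < 0)
        return NULL;
    return packed + index26[c - 'a'];
}

int main(void)
{
    /* The matcher walks the slice rule by rule (longest fragment
     * first) and stops when the fragments no longer begin with the
     * key letter.  Here we just show where each slice starts.
     */
    printf("'c' rules: %.14s\n", rules_for('c'));
    printf("'t' rules: %s\n",    rules_for('t'));
    return 0;
}

Twenty-six short offsets plus one string is the whole table, and a
lookup never touches the rules for any other letter.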
The algorithm is considerably improved if you subject the words to
TeX's hyphenation algorithm before applying the grapheme->phoneme
rewrite rules. Hyphenation points roughly correspond to phoneme
boundaries, and stop words like 'haphazard' from sounding half-assed.
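Concretely, it's Liang's pattern scheme: each pattern carries digits
between its letters, an odd digit votes for a break at that spot, an
even digit vetoes one, and the highest digit wins across all matching
patterns. Split at the odd positions, then feed each chunk through the
rewrite rules. A toy sketch -- the four patterns below are made-up
stand-ins chosen to split 'haphazard', not TeX's real pattern set:

/* Sketch of the hyphenate-first idea: run a (tiny, made-up) Liang-style
 * pattern set over the word and mark the odd-scored positions as break
 * points; each chunk would then go through the letter-to-sound rules
 * separately.  The patterns below are stand-ins, not TeX's real set.
 */
#include <stdio.h>
#include <string.h>

/* Digits between a pattern's letters vote on a break at that spot:
 * odd = break, even = don't, highest digit wins.  '.' anchors the
 * pattern to the start or end of the word.
 */
static const char *patterns[] = { "p1h", "z1a", "a2z", ".h2a" };

static void hyphen_points(const char *word, int *score)
{
    char padded[64];
    size_t wl;

    snprintf(padded, sizeof padded, ".%s.", word);
    wl = strlen(padded);
    memset(score, 0, (wl + 1) * sizeof *score);

    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; p++) {
        const char *pat = patterns[p];

        for (size_t start = 0; start < wl; start++) {
            size_t i = start, j;
            int ok = 1;

            /* try to match the pattern's letters here, skipping digits */
            for (j = 0; pat[j]; j++) {
                if (pat[j] >= '0' && pat[j] <= '9')
                    continue;
                if (padded[i] != pat[j]) { ok = 0; break; }
                i++;
            }
            if (!ok)
                continue;

            /* it matched: record its digit votes, keeping the maximum */
            for (i = start, j = 0; pat[j]; j++) {
                if (pat[j] >= '0' && pat[j] <= '9') {
                    if (pat[j] - '0' > score[i])
                        score[i] = pat[j] - '0';
                } else {
                    i++;
                }
            }
        }
    }
}

int main(void)
{
    const char *word = "haphazard";
    int score[64];

    hyphen_points(word, score);

    /* print the word with '-' wherever the score is odd: hap-haz-ard */
    for (size_t i = 0; word[i]; i++) {
        if (i > 0 && (score[i + 1] & 1))   /* +1: score[] is on ".word." */
            putchar('-');
        putchar(word[i]);
    }
    putchar('\n');
    return 0;
}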
Ah, that's a clever idea! Though it depends on how much
overhead that adds to the complexity of the algorithm. I
am REALLY squeezing hard to get this, a Klatt-style
synthesizer, OS, etc. into a small application-specific
CPU core so every byte has to pay for itself :>
Thanks!
--don