OT-ish? McIlroy's "Synthetic English Speech by Rule"

26 Jul 2006

Greetings!

Graham Toal wrote:

...
  You know about the old post on net.sources?

 Have a look at some of the stuff in here:

 http://www.gtoal.com/wordgames/text2speech/

 It's the same vintage, may be of interest.

 Also, I hacked the navy code around a bit to make it more
 accurate and to assist with using a large phonetic
 word list.  And to parameterize the tables from an editable
 data file rather than being hard coded in the C source.

I finally got a chance to dig through most of this code
(though some of your tables/wordlists are puzzling...
maybe "work in progress"?).

Did you derive the rulesets *directly* from the NRL
code or was it something you "inherited"?  I find
several discrepancies between your rules and mine
and wonder if they are typographical errors *or*
improvements/enhancements you derived (my rules
mimic the NRL rules exactly).

The problem with this sort of algorithm is you're never
sure when it's "right" -- unlike an algorithm that
adds a column of numbers and prints a total!  :>

...
 The algorithm is considerably improved if you subject the words
 to TeX's hyphenation algorithm before applying the grapheme->phoneme
 rewrite rules.  Hyphenation points roughly correspond to phoneme
 boundaries, and stop words like haphazard from sounding half-assed.

This makes perfect sense!  Treat the word as "word-lets"
since sounds can't span a hyphen.
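
A minimal sketch of that idea, under loud assumptions: the hyphenation
points are typed in by hand (TeX's algorithm is not implemented here),
and the rule table is a made-up toy, not the NRL rule set.  The names
RULES, to_phonemes and speak are hypothetical.

```python
# Toy letter-to-sound rules as (grapheme, phoneme) pairs.
# Multi-letter graphemes come first so they win over single letters.
# ASSUMPTION: illustrative placeholders only, not the NRL rules.
RULES = [
    ("ph", "f"),
    ("a", "ae"),
    ("d", "d"),
    ("h", "h"),
    ("p", "p"),
    ("r", "r"),
    ("z", "z"),
]


def to_phonemes(fragment):
    """Greedy left-to-right rewrite of one fragment using RULES."""
    out = []
    i = 0
    while i < len(fragment):
        for grapheme, phoneme in RULES:
            if fragment.startswith(grapheme, i):
                out.append(phoneme)
                i += len(grapheme)
                break
        else:
            # No rule matched: pass the letter through unchanged.
            out.append(fragment[i])
            i += 1
    return out


def speak(hyphenated):
    """Rewrite each hyphen-delimited fragment on its own, then join.

    ASSUMPTION: the caller supplies the hyphenation points, e.g.
    "hap-haz-ard"; a word with no hyphens is treated as one fragment.
    """
    phonemes = []
    for fragment in hyphenated.split("-"):
        phonemes.extend(to_phonemes(fragment))
    return phonemes


print(speak("haphazard"))    # the "ph" rule fires across the boundary
print(speak("hap-haz-ard"))  # "p" and "h" are rewritten separately
```

Because each fragment is rewritten separately, a multi-letter rule
such as "ph" can never match across a hyphenation point.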

Where do I find this algorithm?

(sigh)  Amazingly complicated for such a "simple" task...  :>

Thanks!
--don
