Fixing the dictionary file generated from the Sphinx Knowledge Base Tool

The dictionary file generated by the Sphinx Knowledge Base Tool lists every word in uppercase, which causes a silent problem when the file is used with PocketSphinx. To resolve this, convert the word at the start of each line to lowercase and leave the phonemes unchanged. The following example reads in a dictionary file, lowercases the words, and saves the result under the same file name with a 2 appended.

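filepath = '/home/james/jamesrobertson.eu/qbx/r/isabella/shopping.dic'
buffer = File.read filepath
# lowercase the word at the start of each line (including apostrophes and
# variant markers such as "(2)"); the phonemes stay uppercase
rows = buffer.lines.map {|x| x.sub(/^\S+/) {|w| w.downcase }}
File.write filepath + '2', rows.join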

shopping.dic (extract):

APPLES  AE P AH L Z
BEANS   B IY N Z
BISCUITS    B IH S K AH T S
BREAD   B R EH D
BROWN   B R AW N
BUTTER  B AH T ER
CARROTS K AE R AH T S
CARROTS(2)  K EH R AH T S
CEREAL  S IH R IY AH L

shopping.dic2 (extract):

apples  AE P AH L Z
beans   B IY N Z
biscuits    B IH S K AH T S
bread   B R EH D
brown   B R AW N
butter  B AH T ER
carrots K AE R AH T S
carrots(2)  K EH R AH T S
cereal  S IH R IY AH L
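
Because the problem is silent, it may be worth checking the converted file before handing it to PocketSphinx. The following is a minimal sketch of such a check; the method name uppercase_entries and the sample text are made up for illustration, and it assumes the word is the first whitespace-separated field on each line.

```ruby
# Hypothetical helper: returns the lines whose word field (the first
# whitespace-separated field) still contains an uppercase letter.
# The phoneme columns are expected to remain uppercase and are ignored.
def uppercase_entries(dic_text)
  dic_text.lines.select do |line|
    word = line.split(/\s+/, 2).first.to_s
    word.match?(/[A-Z]/)
  end
end

# Illustrative sample with one entry left unconverted
sample = "apples  AE P AH L Z\nBEANS   B IY N Z\ncarrots(2)  K EH R AH T S\n"
puts uppercase_entries(sample)
```

To check a real file, pass in its contents, for example uppercase_entries(File.read(filepath + '2')); an empty array means every word was converted.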

Resources