Preliminary homophone finder written in Perl

I wrote this little Perl script to find homophones in text documents. If you save your Word document as a text file, you can, in theory, find every homophone in it. I’m using a list of 943 homophones and running the first part of a Winston Churchill speech through it 🙂

To run the script, you’d need to know a little Perl and how to run a script with it, so for most people this isn’t particularly user-friendly. It’s more for fun, and a proof of concept for a hypothetical tool writers could use to keep silly mistakes out of their writing. While the script runs, it pauses at each line that contains one or more homophones; hit ‘enter’ to move on to the next one.

The script needs:

  • words.txt – a list of homophones, one per line (it makes sense to remove any words you’d never mix up, for example, “I” vs. “eye” or “were” vs. “whirr”)
  • ms.txt – your manuscript saved as a text file

First, the code:

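#!/usr/bin/perl
use strict;
use warnings;

# Read the homophone list, one word per line, and drop blank lines.
open(my $words_fh, '<', 'words.txt') or die "Can't open words.txt: $!\n";
chomp(my @words = <$words_fh>);
close($words_fh);
@words = grep { /\S/ } @words;

open(my $ms_fh, '<', 'ms.txt') or die "Can't open ms.txt: $!\n";

while (my $aline = <$ms_fh>) {
   chomp($aline);
   my $match = 0;
   foreach my $word (@words) {
      # Matching is case-sensitive and needs whitespace on both sides, so a
      # word at the start or end of a line, or next to punctuation, is skipped.
      # \Q...\E keeps any regex metacharacters in the word literal.
      if ($aline =~ /\s+\Q$word\E\s+/) {
         $match = 1;
         my $uppercase = uc($word);
         $aline =~ s/\s+\Q$word\E\s+/ *$uppercase* /g;
      }
   }

   # Pause on every line that contains at least one homophone.
   if ($match) {
      print "$aline\n";
      print "[ hit enter to continue ]\n";
      my $ans = <STDIN>;   # wait for Enter
   }
}
close($ms_fh);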

Here’s what happens to the first part of this famous speech:

I spoke the other day of the colossal military disaster *WHICH* occurred when the French High Command failed *TO* withdraw the northern Armies from Belgium at the moment when they *KNEW* that the French front was decisively broken at Sedan and on the Meuse. This delay entailed the loss of fifteen *OR* sixteen French divisions and *THREW* out of action *FOR* the critical period the whole of the British Expeditionary Force. Our Army and 120,000 French troops *WERE* indeed rescued *BY* the British Navy from Dunkirk *BUT* only with the loss of *THEIR* cannon, vehicles and modern equipment. This loss inevitably took *SOME* weeks *TO* repair, and *IN* the first *TWO* of those weeks the battle *IN* France has *BEEN* lost. When *WE* consider the heroic resistance *MADE* *BY* the French Army against heavy odds *IN* this battle, the enormous losses inflicted upon the enemy and the evident exhaustion of the enemy, it may well *BE* the thought that these 25 divisions of the best-trained and best-equipped troops *MIGHT* have turned the scale. However, General Weygand had *TO* fight without them. Only three British divisions *OR* *THEIR* equivalent *WERE* able *TO* stand *IN* the line with *THEIR* French comrades. They have suffered severely, *BUT* they have *FOUGHT* well. We *SENT* every man *WE* could *TO* France as fast as *WE* could re-equip and transport *THEIR* formations.
[ hit enter to continue ]

I am *NOT* reciting these facts *FOR* the purpose of recrimination. That *I* judge *TO* *BE* utterly futile and even harmful. We cannot afford it. *I* recite them *IN* order *TO* explain why it was *WE* did *NOT* have, as *WE* could have had, between twelve and fourteen British divisions fighting *IN* the line *IN* this *GREAT* battle instead of only three. Now *I* put *ALL* this aside. *I* put it on the shelf, from *WHICH* the historians, when they have time, will select *THEIR* documents *TO* tell *THEIR* stories. We have *TO* think of the future and *NOT* of the past. This also applies *IN* a small *WAY* *TO* *OUR* own affairs at home. There are many who *WOULD* hold an inquest *IN* the House of Commons on the conduct of the Governments-and of Parliaments, *FOR* they are *IN* it, too-during the years *WHICH* *LED* up *TO* this catastrophe. They seek *TO* *INDICT* those who *WERE* responsible *FOR* the guidance of *OUR* affairs. This also *WOULD* *BE* a foolish and pernicious process. There are *TOO* many *IN* it. Let each man search his conscience and search his speeches. *I* frequently search mine.
[ hit enter to continue ]
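
The output above shows two gaps in the matching: “We” at the start of a sentence is never flagged, and neither is the “too” in “it, too-during”, because the script only matches lowercase list entries with whitespace on both sides. As a rough sketch of one way to close those gaps (not what the script above does), the matching could use word boundaries and ignore case. The helper name flag_homophones and the three-word list are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the line with
# every listed homophone uppercased and wrapped in asterisks, plus a hit count.
sub flag_homophones {
    my ($line, $words) = @_;

    # Build one alternation from the list, longest entries first, with any
    # regex metacharacters escaped.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @$words;
    return ($line, 0) unless length $alternation;

    # \b matches at word boundaries, so punctuation and line ends no longer
    # hide a word; /i ignores case, so "We" matches the entry "we".
    my $hits = ($line =~ s/\b($alternation)\b/'*' . uc($1) . '*'/gie);
    return ($line, $hits || 0);
}

my @words = qw(we too their);   # stand-in for the contents of words.txt
my ($flagged, $hits) =
    flag_homophones('We have to think of it, too-during the years.', \@words);
print "$flagged\n";   # prints: *WE* have to think of it, *TOO*-during the years.
```

The trade-off is more noise: with word boundaries, the “we” in a contraction such as “we’re” would also be flagged, so the word list may need more pruning.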

Filed under Grammar/Punctuation, Tools for Writers