using perl in editing | John L. Monk

I recently switched to Scrivener for writing my documents. Much more enjoyable interface than Word, with lots of nifty features for writers. One big issue: I’m still getting used to Scrivener’s spellchecker. Microsoft Word finds doubled words right out of the box, but Scrivener does not.

The script below is written in Perl, which comes pre-installed on Macs. Paste it into a text file, make the file executable, and run it from a directory that contains a file called “infile.txt” (a cut/paste from Word into that file will do nicely). It will report your doubled words.

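*update* – the script won’t catch doubled words wrapped in quotation marks, such as “bang bang”. The quote marks stay attached to the words, so the first word (with its opening quote) and the second word (with its closing quote) look like two different words. Working on it 🙂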
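One possible way around the quote problem, offered as a sketch rather than the finished fix: strip punctuation from both ends of each word before comparing. The helper name `doubled_words` and the sample lines are made up for illustration. The trade-off is that a pair straddling a sentence boundary ("stuff. Stuff") would be reported too.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script below):
# returns the doubled words found in a single line.
sub doubled_words {
    my ($line) = @_;
    my @found;
    my $prev = '';
    foreach my $token (split ' ', $line) {
        my $word = lc $token;
        # Drop punctuation at either end, so a quoted pair compares as bang / bang.
        $word =~ s/^\W+//;
        $word =~ s/\W+$//;
        # A token that was nothing but punctuation (e.g. * * * section breaks)
        # is skipped and also breaks any run of repeated words.
        if ($word eq '') {
            $prev = '';
            next;
        }
        push @found, $word if $word eq $prev;
        $prev = $word;
    }
    return @found;
}

# Made-up sample lines for illustration.
foreach my $line ('He yelled "bang bang" twice.', '* * *', 'Nothing doubled here.') {
    print "$_ $_ ---->  $line\n" for doubled_words($line);
}
```

Because section breaks reduce to empty strings here, this version needs no special case for `*`.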

Example input (infile.txt):
This is a line
This is another line
And yet another line
wow I sure do a lot of lines, "Don't I?" he said (in a funny voice)...
Wow it sure is is fune typing all this
I like dogs and cats and stuff.
Big big is funner than small people.
how are are the dodgers doing this year? Nobody knows.
more lines and stuff...
etc. etc.
good things come to those who write scripts in perl and post them on the internet

Example Output:

is is ---->  Wow it sure is is fune typing all this
big big ---->  Big big is funner than small people.
are are ---->  how are are the dodgers doing this year? Nobody knows.
etc. etc. ---->  etc. etc.

And now the script: rep.pl

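#!/usr/bin/perl
# rep.pl: report every line of infile.txt that contains a doubled word.
use strict;
use warnings;

open(my $fh, '<', 'infile.txt') or die "Can't open infile.txt: $!";
my $section_break = '*';  # I have * * * as section breaks. The script sees them as words and should ignore them.
while (my $line = <$fh>) {
   chomp $line;
   my $prev = '';   # no previous word at the start of a line
   # split ' ' breaks on runs of whitespace, so double spaces don't produce empty words
   foreach my $word (map { lc($_) } split ' ', $line) {
      if ($word eq $prev && $word ne $section_break) {
         print "$prev $word ---->  $line\n";
      }
      $prev = $word;
   }
}
close($fh);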
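
The script above expects its input in a file named infile.txt. A variant that takes file names on the command line instead might look like the sketch below. The helper name `report_doubles` and the example file name are assumptions for illustration, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads lines from an open filehandle and returns one
# report string per doubled word, in the same format rep.pl prints.
sub report_doubles {
    my ($fh) = @_;
    my @report;
    while (my $line = <$fh>) {
        chomp $line;
        my $prev = '';
        foreach my $word (map { lc($_) } split ' ', $line) {
            # '*' is skipped so that * * * section breaks are not reported.
            if ($word eq $prev && $word ne '*') {
                push @report, "$prev $word ---->  $line";
            }
            $prev = $word;
        }
    }
    return @report;
}

# Check every file named on the command line, e.g.: ./rep.pl chapter1.txt
foreach my $file (@ARGV) {
    open(my $fh, '<', $file) or die "Can't open $file: $!";
    print "$_\n" for report_doubles($fh);
    close($fh);
}
```

Run with no arguments, this version prints nothing. With one or more file names it reports the doubled words in each file in turn.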

Finding doubled words using perl
