Last week we discussed the basics of using wildcards to find text in a Microsoft Word document. You can read last week's newsletter here:
http://www.topica.com/lists/editorium/read/message.html?mid=1705963026
This week we'll talk about how to combine wildcards, which will let you get pretty fancy about the stuff you want to find. Basically, you just need to know that you *can* combine wildcards. Then you can get as crazy as you like.
Last week we used the "?" wildcard to find every three-letter combination starting with b and ending with t--"bet," "but," "bit," "bat," and so on--by searching for "b?t" with "Use Pattern Matching" turned on in the Find dialog box.
Now let's say we wanted to find the same characters but add others as well. For example, we might want to find every three-letter combination starting with b and ending with d--"bed," "bud," "bid," "bad," and so on--in *addition* to the combinations ending in t. Can we really do that? Sure!
After bringing up the Find dialog (Edit > Find) and turning on "Use Pattern Matching," we'll start by entering the letter b into the "Find What" box, telling Microsoft Word to find that letter.
Next, we'll enter the ? wildcard, which tells Microsoft Word to find any single character.
Finally, we'll enter a new wildcard: [td]. Microsoft Word will find any *one* of the characters specified in the brackets.
Altogether, the string of characters looks like this--
b?[td]
--and there we are, doing wildcard combinations! This particular combination tells Microsoft Word to find the letter b, followed by any single character, followed by t or d.
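If you'd like to try the same idea outside of Word, here's a rough Python sketch. It assumes Python's re module as a stand-in for Word's pattern matching (the two engines are similar but not identical), with "." playing the part of Word's "?" wildcard. The sample sentence is made up for illustration.

```python
import re

# Assumption: Python's "." approximates Word's "?" (any single character).
# Word's "b?[td]" therefore becomes "b.[td]" here.
pattern = re.compile(r"b.[td]")

# Invented sample text, not from any real manuscript.
sample = "The bat bit the bud on the bed, but the bug got away."

matches = pattern.findall(sample)
print(matches)  # "bug" is skipped because g is not in [td]
```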
How will something like this help you in editing? Suppose you're working on a manuscript in which the author has misspelled a name in nearly every way possible. You could comb through the manuscript over and over, hoping to catch all the variations. Or, you could be *sure* to catch them all by searching with wildcards. For example, let's say your manuscript is a book about India and the name in question is Gandhi. Your author has misspelled it as "Ghandi," "Gahndi," and "Ganhdi." (Not possible? Hah!) You can find every last one of them with the following string:
G[andh][andh][andh][andh]i
Then, if you've put the correct spelling, "Gandhi," in the "Replace With" box, you can find and replace each wrong spelling with the right one in a single pass, which is much more efficient than finding and replacing each variation separately.
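The same find-and-replace pass can be sketched in Python. This is only an analogy: re.sub stands in for Word's "Replace All," and the sample sentence is invented. The bracketed sets work the same way in both.

```python
import re

# Four bracketed sets, one for each letter between G and i.
pattern = re.compile(r"G[andh][andh][andh][andh]i")

# Invented sample containing the three misspellings plus the correct form.
text = "Ghandi, Gahndi, and Ganhdi all refer to Gandhi."

# Replace every match (including the already-correct spelling) in one pass.
fixed = pattern.sub("Gandhi", text)
print(fixed)
```

Notice that the correct spelling matches the pattern too; replacing it with itself does no harm.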
You may be wondering why you couldn't just use the * wildcard to represent the whole string of letters, like this:
G*i
You could. But remember, the * wildcard represents *any* string of characters--including spaces. It's not limited to characters within a word (and neither are other wildcards). That means, in addition to finding the misspelled names, it will find the first 14 characters of the following phrase: "Go to the officer's hall." So be careful, especially if you're planning to use "Replace All" rather than finding and replacing one item at a time.
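Here's a small Python sketch of that overreach. It assumes the lazy ".*?" is a fair approximation of Word's "*" wildcard, since both stop at the first i they can reach. Python's re module is only a stand-in for Word here.

```python
import re

phrase = "Go to the officer's hall."

# Assumption: lazy ".*?" approximates Word's "*" (any string of characters,
# spaces included, ending at the first available "i").
lazy_star = re.compile(r"G.*?i")

hit = lazy_star.search(phrase)
print(repr(hit.group()), len(hit.group()))  # the match runs across three spaces
```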
There is a way to simplify the wildcard combination, however. Consider this string:
G[andh]{4}i
It's functionally the same as G[andh][andh][andh][andh]i. The {4} tells Word to find exactly four occurrences of the previous "expression," which is [andh].
But now a complication: Suppose that our slapdash author has also spelled Gandhi's name as "Gandi." Uh-oh. Our original string won't catch that, because this new misspelling is one character shorter than our string specifies. But consider this:
G[andh]{3,4}i
The {3,4} tells Word to find from 3 to 4 occurrences of the previous expression, so this string will catch all of our misspelled variations so far.
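Counting occurrences is easy to get wrong, so here's a quick Python check. The number in curly braces is the total count of the bracketed set, not a count of extra repeats. Python's re module is used as a stand-in for Word's pattern matching, and the variable names are mine.

```python
import re

names = ["Gandhi", "Ghandi", "Gahndi", "Ganhdi", "Gandi"]

four_sets = re.compile(r"G[andh][andh][andh][andh]i")  # written out longhand
exactly_four = re.compile(r"G[andh]{4}i")              # same thing, shorter
three_or_four = re.compile(r"G[andh]{3,4}i")           # also allows "Gandi"

# Record, for each name, whether each pattern matches the whole name.
results = {
    name: (
        bool(four_sets.fullmatch(name)),
        bool(exactly_four.fullmatch(name)),
        bool(three_or_four.fullmatch(name)),
    )
    for name in names
}

for name, flags in results.items():
    print(name, flags)
```

Only "Gandi," with three letters between G and i, is missed by the first two patterns and caught by the third.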
What if we want to allow for more or fewer characters, being particularly unsure of our author? We can use this string:
G[andh]@i
The @ wildcard tells Microsoft Word to find *one or more* occurrences of the previous expression. That ought to cover nearly anything our author throws at us. If we want to get a little more specific, we can use {2,}, which tells Word to look for *at least* two occurrences of the previous expression.
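To see the difference between "one or more" and "at least two," here's a Python sketch. It assumes "+" is a reasonable stand-in for Word's "@" (the engines differ in how much they grab, but not in what this pattern can match). The candidate strings are invented test cases.

```python
import re

one_or_more = re.compile(r"G[andh]+i")      # stand-in for Word's G[andh]@i
at_least_two = re.compile(r"G[andh]{2,}i")  # same syntax as Word's {2,}

# Invented candidates with 4, 3, 2, 1, and 0 letters between G and i.
candidates = ["Gandhi", "Gandi", "Gadi", "Gai", "Gi"]

found = {
    word: (bool(one_or_more.fullmatch(word)), bool(at_least_two.fullmatch(word)))
    for word in candidates
}

for word, flags in found.items():
    print(word, flags)
```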
By this time you've probably noticed a pattern to these wildcards, but if not, I'll summarize:
A question mark ? finds any single character.
An asterisk * finds any string of characters.
Square brackets [] find any one of the characters listed inside them.
Curly braces {} specify how many occurrences of the preceding character or expression to find.
{n} finds exactly n occurrences (such as 2) of the preceding character or expression.
{n,} finds at least n occurrences (such as 3) of the preceding character or expression.
{n,m} finds from n to m occurrences (such as 3 to 5) of the preceding character or expression.
@ finds one or more occurrences of the preceding character or expression.
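For the curious, the summary above can be sketched as a tiny translator from these wildcards to Python's regular expressions. This is a hypothetical helper of my own (word_wildcards_to_regex is not part of Word or Python). It covers only the wildcards in the list and assumes "." for "?", lazy ".*?" for "*", and "+" for "@".

```python
import re

def word_wildcards_to_regex(pattern: str) -> str:
    """Translate the wildcards summarized above into a Python regex.

    Hypothetical helper for illustration; handles only ? * [] {} and @.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            out.append(".")      # any single character
        elif ch == "*":
            out.append(".*?")    # any string of characters (lazy)
        elif ch == "@":
            out.append("+")      # one or more of the preceding expression
        elif ch in "[{":
            # Copy a bracketed set or a brace count through unchanged.
            close = "]" if ch == "[" else "}"
            end = pattern.index(close, i)
            out.append(pattern[i:end + 1])
            i = end
        else:
            out.append(re.escape(ch))  # ordinary text is matched literally
        i += 1
    return "".join(out)

for wild in ["b?[td]", "G*i", "G[andh]{3,4}i", "G[andh]@i"]:
    print(wild, "->", word_wildcards_to_regex(wild))
```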
Here's a parting tip: What would happen if we put a lowercase rather than a capital G at the beginning of our string? Word wouldn't find the misspelled names. Why? Because with "Use Pattern Matching" turned on, Word automatically matches case--a useful thing to know.
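Python's regular expressions happen to behave the same way by default, so the tip is easy to try out. In this sketch, re.IGNORECASE is Python's own switch for ignoring case; it's shown only for contrast and isn't a Word option.

```python
import re

name = "Ghandi"

# A lowercase g will not match the capital G, because matching is case-sensitive.
lower_hit = re.fullmatch(r"g[andh]{4}i", name)
upper_hit = re.fullmatch(r"G[andh]{4}i", name)

# Python-only contrast: explicitly ask the engine to ignore case.
ignore_case_hit = re.fullmatch(r"g[andh]{4}i", name, flags=re.IGNORECASE)

print(lower_hit is None, upper_hit is not None, ignore_case_hit is not None)
```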
That brings us to the subject of finding a range of characters--something we'll talk about next week.