Wildcard Combinations

Last week we discussed the basics of using wildcards to find text in a Microsoft Word document. You can read last week's newsletter here:

http://www.topica.com/lists/editorium/read/message.html?mid=1705963026

This week we'll talk about how to combine wildcards, which will let you get pretty fancy about the stuff you want to find. Basically, you just need to know that you *can* combine wildcards. Then you can get as crazy as you like.

Last week we used the "?" wildcard to find every three-letter combination starting with b and ending with t--"bet," "but," "bit," "bat," and so on--by searching for "b?t" with "Use Pattern Matching" turned on in the Find dialog box.

Now let's say we wanted to find the same characters but add others as well. For example, we might want to find every three-letter combination starting with b and ending with d--"bed," "bud," "bid," "bad," and so on--in *addition* to the combinations ending in t. Can we really do that? Sure!

After bringing up the Find dialog (Edit > Find) and turning on "Use Pattern Matching," we'll start by entering the letter b into the "Find What" box, telling Microsoft Word to find that letter.

Next, we'll enter the ? wildcard, which tells Microsoft Word to find any single character.

Finally, we'll enter a new wildcard: [td]. Microsoft Word will find any *one* of the characters specified in the brackets.

Altogether, the string of characters looks like this--

b?[td]

--and there we are, doing wildcard combinations! This particular combination tells Microsoft Word to find the letter b followed by any single character followed by t or d.
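If you'd like to try the idea outside Word, here is a minimal sketch in Python. It assumes that Word's b?[td] corresponds roughly to the regular expression b.[td] in Python's re module; the sample sentence is made up for illustration, and Python's regex engine is only a stand-in for Word's own pattern matching.

```python
import re

# Assumption: Word's "?" (any single character) maps to "." in Python's re,
# and the bracket wildcard [td] works the same way in both.
pattern = re.compile(r"b.[td]")

# Hypothetical sample text, not taken from any real manuscript.
sample = "bet but bid bad big bot"

# findall returns every non-overlapping match, left to right.
matches = pattern.findall(sample)
print(matches)  # ['bet', 'but', 'bid', 'bad', 'bot']
```

Notice that "big" is skipped, because g is not one of the characters listed in the brackets.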

How will something like this help you in editing? Suppose you're working on a manuscript in which the author has misspelled a name in nearly every way possible. You could comb through the manuscript over and over, hoping to catch all the variations. Or, you could be *sure* to catch them all by searching with wildcards. For example, let's say your manuscript is a book about India and the name in question is Gandhi. Your author has misspelled it as "Ghandi," "Gahndi," and "Ganhdi." (Not possible? Hah!) You can find every last one of them with the following string:

G[andh][andh][andh][andh]i

Then, if you've put the correct spelling, "Gandhi," in the "Replace With" box, you can find and replace each wrong spelling with the right one in a single pass, which is much more efficient than finding and replacing each variation separately.
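The single-pass replacement can be sketched in Python as well. This is an illustration under assumptions, not Word's own mechanism: the sample sentence is invented, and re.sub plays the role of "Replace All."

```python
import re

# Hypothetical sentence containing the three misspellings from the article.
text = "Ghandi, Gahndi, and Ganhdi all refer to Gandhi."

# Same shape as the Word string: G, four characters drawn from a/n/d/h, then i.
pattern = re.compile(r"G[andh][andh][andh][andh]i")

# Replace every match with the correct spelling in one pass.
fixed = pattern.sub("Gandhi", text)
print(fixed)  # Gandhi, Gandhi, and Gandhi all refer to Gandhi.
```

The pattern also matches the correct spelling, which is harmless here because it is simply replaced with itself.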

You may be wondering why you couldn't just use the * wildcard to represent the whole string of letters, like this:

G*i

You could. But remember, the * wildcard represents *any* string of characters--including spaces. It's not limited to characters within a word (and neither are other wildcards). That means, in addition to finding the misspelled names, it will find the first 14 characters of the following phrase: "Go to the officer's hall." So be careful, especially if you're planning to use "Replace All" rather than finding and replacing one item at a time.
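Here is a small Python sketch of that over-matching. It assumes Word's * behaves roughly like the lazy regex .*? (stop at the first place the rest of the pattern can match), which is an approximation rather than a statement about Word's internals.

```python
import re

phrase = "Go to the officer's hall."

# Assumption: Word's G*i is approximated by the lazy pattern G.*?i.
match = re.search(r"G.*?i", phrase)

found = match.group()
print(repr(found), len(found))  # 'Go to the offi' 14
```

The match runs straight across three spaces, which is exactly what you don't want when you're hunting for a single misspelled name.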

There is a way to simplify the wildcard combination, however. Consider this string:

G[andh]{4}i

It's functionally the same as G[andh][andh][andh][andh]i. The {4} tells Word to find exactly four occurrences of the previous "expression," which is [andh].

But now a complication: Suppose that our slapdash author has also spelled Gandhi's name as "Gandi." Uh-oh. Our original string won't catch that, because this new misspelling is one character shorter than our string specifies. But consider this:

G[andh]{3,4}i

The {3,4} tells Word to find from 3 to 4 occurrences of the previous expression, so this string will catch all of our misspelled variations so far.
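One way to check a range like this is to count the letters that sit between the G and the i in each variant. The sketch below does that in Python, assuming the curly-brace counts mean the same thing in Python's re module as they do in Word; the list of variants is just the misspellings mentioned so far.

```python
import re

# The misspellings mentioned in the article so far.
variants = ["Ghandi", "Gahndi", "Ganhdi", "Gandi"]

# Letters between the leading G and the trailing i.
inner_lengths = {name: len(name) - 2 for name in variants}
print(inner_lengths)  # {'Ghandi': 4, 'Gahndi': 4, 'Ganhdi': 4, 'Gandi': 3}

# A range of 3 to 4 covers every count above.
pattern = re.compile(r"G[andh]{3,4}i")
caught = [name for name in variants if pattern.fullmatch(name)]
print(caught)  # ['Ghandi', 'Gahndi', 'Ganhdi', 'Gandi']
```

Since the counts are 4, 4, 4, and 3, the range has to reach from 3 up to 4 to catch all of them.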

What if we want to allow for more or fewer characters, being particularly unsure of our author? We can use this string:

G[andh]@i

The @ wildcard tells Microsoft Word to find *one or more* occurrences of the previous expression. That ought to cover nearly anything our author throws at us. If we want to get a little more specific, we can use {2,}, which tells Word to look for *at least* two occurrences of the previous expression.
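To see the difference between "one or more" and "at least two," here is a hedged Python sketch. It assumes Word's @ corresponds to + in Python's re module; the sample strings (including the unlikely "Gai" and "Ghaandhi") are invented to show the edges.

```python
import re

# Assumption: Word's @ (one or more) is written + in Python's re.
one_or_more = re.compile(r"G[andh]+i")
at_least_two = re.compile(r"G[andh]{2,}i")

# Hypothetical test strings, chosen to probe the boundaries.
samples = ["Gandhi", "Gandi", "Gai", "Gi", "Ghaandhi"]

loose = [s for s in samples if one_or_more.fullmatch(s)]
stricter = [s for s in samples if at_least_two.fullmatch(s)]

print(loose)     # ['Gandhi', 'Gandi', 'Gai', 'Ghaandhi']
print(stricter)  # ['Gandhi', 'Gandi', 'Ghaandhi']
```

"Gi" is never matched, because both forms need at least one bracketed character, and "Gai" drops out once two are required.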

By this time you've probably noticed a pattern to these wildcards, but if not, I'll summarize:

A question mark ? finds any single character.

An asterisk * finds any string of characters.

Square brackets [] find any one of the characters listed inside them.

Curly braces {} specify how many occurrences of the characters to find.

{n} finds exactly n occurrences (such as 4) of the preceding character or expression.

{n,} finds at least n occurrences (such as 3) of the preceding character or expression.

{n,m} finds from n to m occurrences (such as 3 to 5) of the preceding character or expression.

@ finds one or more occurrences of the preceding character or expression.
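For readers who like to experiment, the summary above can be turned into a rough translator from these Word wildcards to Python regular expressions. The function name word_wildcard_to_regex is hypothetical, it handles only the wildcards covered in this article, and the mappings for ?, *, and @ are approximations rather than a full description of how Word behaves.

```python
import re

def word_wildcard_to_regex(pattern: str) -> str:
    """Translate a small subset of Word wildcards into a Python regex.

    Hypothetical helper: covers only ?, *, @, [], and {} as described above.
    """
    out = []
    in_brackets = False
    for ch in pattern:
        if in_brackets:
            # Inside [...] every character is copied as-is.
            out.append(ch)
            if ch == "]":
                in_brackets = False
        elif ch == "[":
            in_brackets = True
            out.append(ch)
        elif ch == "?":
            out.append(".")      # any single character
        elif ch == "*":
            out.append(".*?")    # any string of characters (lazy, an assumption)
        elif ch == "@":
            out.append("+")      # one or more of the preceding expression
        elif ch in "{},":
            out.append(ch)       # occurrence counts use the same notation
        else:
            out.append(re.escape(ch))  # ordinary text is matched literally
    return "".join(out)

print(word_wildcard_to_regex("b?[td]"))        # b.[td]
print(word_wildcard_to_regex("G*i"))           # G.*?i
print(word_wildcard_to_regex("G[andh]@i"))     # G[andh]+i
print(word_wildcard_to_regex("G[andh]{2,}i"))  # G[andh]{2,}i
```

The brackets and curly braces pass through unchanged, which is why those two wildcards look identical on both sides.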

Here's a parting tip: What would happen if we put a lowercase rather than a capital G at the beginning of our string? Word wouldn't find the misspelled names. Why? Because with "Use Pattern Matching" turned on, Word automatically matches case--a useful thing to know.
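Python's re module happens to be case-sensitive by default too, so the same experiment can be sketched there. This only mirrors the behavior described above; it says nothing about how Word implements it.

```python
import re

name = "Ghandi"

# Lowercase g: no match, because the comparison is case-sensitive.
lower_hit = re.search(r"g[andh]+i", name)

# Capital G: matches the whole misspelled name.
upper_hit = re.search(r"G[andh]+i", name)

print(lower_hit)          # None
print(upper_hit.group())  # Ghandi
```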

That brings us to the subject of finding a range of characters--something we'll talk about next week.


  • The Fine Print

    Thanks for reading Editorium Update (ISSN 1534-1283), published by:

    The EDITORIUM, LLC
    http://www.editorium.com

    Articles © on date of publication by the Editorium. All rights reserved. Editorium Update and Editorium are trademarks of the Editorium.

    You may forward copies of Editorium Update to others (but not charge for it) and print or store it for your personal use. Any other broadcast, publication, retransmission, copying, or storage, without written permission from the Editorium, is strictly prohibited. If you’re interested in reprinting one of our articles, please send an email message to editor@editorium.com

    Editorium Update is provided for informational purposes only and without a warranty of any kind, either express or implied, including but not limited to implied warranties of merchantability, fitness for a particular purpose, and freedom from infringement. The user (you) assumes the entire risk as to the accuracy and use of this document.

    The Editorium is not affiliated with Microsoft Corporation or any other entity.

    We do not sell, rent, or give our subscriber list to anyone. Period.
