Using the "Find What Expression" Wildcard

For the past few weeks we've been talking about using wildcards to find and replace text in Microsoft Word. Last week I introduced the "Find What Expression" wildcard (\n) and promised to show you how to use it to move things around.

Let's say you've got a list of authors, like this:

Emily Dickinson

Ezra Pound

Willa Cather

Ernest Hemingway

and you need to put last names first, like this:

Dickinson, Emily

Pound, Ezra

Cather, Willa

Hemingway, Ernest

You can use the "Find What Expression" wildcard to do this in a snap.

Start the Replace dialog (Edit > Replace) and put a check in the "Use wildcards" or "Use Pattern Matching" box (you may need to click the "More" button before this is available). Then, in the "Find What" box, enter this:

^013([A-z]@) ([A-z]@)^013

If you've been reading Editorium Update, you'll probably understand these codes and wildcards:

^013 represents a paragraph mark.

[A-z] represents any single alphabetic character, from uppercase A to lowercase z.

@ represents any additional occurrences of the previous character--in this case, any single alphabetic character, from uppercase A to lowercase z.

() groups [A-z]@ together as an "expression" representing an author's first name. (This grouping is the key to using the "Find What Expression" wildcard in the "Replace With" box.)

The space after the first ([A-z]@) expression represents the space between first name and last name.

The next ([A-z]@) group represents the author's last name.

The final ^013 represents the paragraph mark after the name.

Now, in the "Replace With" box, enter this:

^p\2, \1^p

The ^p codes represent paragraph marks. "Wait a minute," you say. "You just used ^013 for a paragraph mark. Why the change?"

Excellent question. The answer has two parts:

1. If we could use ^p in the "Find What" box, we would. But since Word won't let us do that when using wildcards (it displays an error message), we have to resort to the ANSI code, ^013, instead. You can learn more about this here:

http://www.topica.com/lists/editorium/read/message.html?mid=1703875043

2. If we use ^p in the "Replace With" box, Word retains the formatting stored in the paragraph mark (a good thing). If we use ^013, Word loses the formatting for the paragraph (a bad thing). In a list of author names, this probably doesn't matter, but you'll need to know this when finding and replacing with codes in more complicated settings.

Continuing with our example, ^p\2, \1^p:

\2 is the "Find What Expression" wildcard for our *second* expression (hence the 2) in the "Find What" box--in other words, it represents the last name of an author in our list.

The comma follows this wildcard because we want a comma to follow the author's last name.

A space follows the comma because we don't want the last and first names mashed together, like this: "Pound,Ezra."

\1 is the "Find What Expression" wildcard for our *first* expression (hence the 1) in the "Find What" box--in other words, it represents the first name of an author in our list.

Now click the "Replace All" button. The authors' names will be transposed:

Dickinson, Emily

Pound, Ezra

Cather, Willa

Hemingway, Ernest
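
If it helps to see the same idea outside of Word, here is a minimal sketch in Python. It is only an analogue, not what Word does internally: Python's regular expressions stand in for Word's wildcards, [A-Za-z]+ is assumed to be a fair substitute for [A-z]@, line anchors replace the ^013 codes, and the function name last_name_first is made up for illustration.

```python
import re

# Two parenthesized groups separated by a space, one name per line.
# [A-Za-z]+ stands in for Word's [A-z]@ (an assumption of this sketch).
FIND = re.compile(r"^([A-Za-z]+) ([A-Za-z]+)$", re.MULTILINE)

def last_name_first(text: str) -> str:
    """Swap 'First Last' to 'Last, First' on every line that matches."""
    # \2 and \1 refer back to the second and first groups in FIND,
    # just as they do in Word's "Replace With" box.
    return FIND.sub(r"\2, \1", text)

authors = "Emily Dickinson\nEzra Pound\nWilla Cather\nErnest Hemingway"
print(last_name_first(authors))
```

As in Word, a line that doesn't fit the pattern (a three-part name, for example) is simply left alone.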

You've always wondered how to do that, right? But now you're wondering about middle initials. And middle names. And Ph.D.s.

All of those make things more complicated. But here, in a nutshell, are the Find and Replace strings you'll need for some common name patterns (first last, first middle last, first initial last, and so on). First comes the name pattern, then the Find string, and finally the Replace string, like this:

NAME PATTERN

FIND WHAT

REPLACE WITH

William Shakespeare

^013([A-z]@) ([A-z]@)^013

^p\2, \1^p

Alfred North Whitehead

^013([A-z]@) ([A-z]@) ([A-z]@)^013

^p\3, \1 \2^p

Philip K. Dick

^013([A-z]@) ([A-Z].) ([A-z]@)^013

^p\3, \1 \2^p

L. Frank Baum

^013([A-Z].) ([A-z]@) ([A-z]@)^013

^p\3, \1 \2^p

G. B. Harrison, Ph.D.

^013([A-Z].) ([A-Z].) ([A-z]@,) (*)^013

^p\3 \1 \2, \4^p

J.R.R. Tolkien

^013([A-Z].)([A-Z].)([A-Z].) ([A-z]@)^013

^p\4, \1\2\3^p

That list doesn't show every pattern you'll encounter, but it should provide enough examples so you'll understand how to create new patterns on your own--which is the whole point of this article. Once you've created all of the patterns you need, you could record all of that finding and replacing in a single macro that you could run whenever you need to transpose names in a list.
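
The "one macro that runs every pattern" idea can be sketched in Python as a list of find/replace pairs tried in order. This is a rough analogue under some assumptions, not a Word macro: Python regular expressions stand in for Word's wildcards, each name is handled as its own line (so no ^013 codes are needed), and the names W, I, RULES, and transpose are invented for this illustration.

```python
import re

W = r"[A-Za-z]+"  # a whole word, standing in for Word's [A-z]@
I = r"[A-Z]\."    # a capital initial followed by a literal period

# (find, replace) pairs in the same order as the list above.
RULES = [
    (rf"^({W}) ({W})$", r"\2, \1"),                    # William Shakespeare
    (rf"^({W}) ({W}) ({W})$", r"\3, \1 \2"),           # Alfred North Whitehead
    (rf"^({W}) ({I}) ({W})$", r"\3, \1 \2"),           # Philip K. Dick
    (rf"^({I}) ({W}) ({W})$", r"\3, \1 \2"),           # L. Frank Baum
    (rf"^({I}) ({I}) ({W},) (.*)$", r"\3 \1 \2, \4"),  # G. B. Harrison, Ph.D.
    (rf"^({I})({I})({I}) ({W})$", r"\4, \1\2\3"),      # J.R.R. Tolkien
]

def transpose(name: str) -> str:
    """Apply the first rule that matches; return the name unchanged if none do."""
    for find, replace in RULES:
        new_name, count = re.subn(find, replace, name)
        if count:
            return new_name
    return name

for name in ["Alfred North Whitehead", "G. B. Harrison, Ph.D.", "J.R.R. Tolkien"]:
    print(transpose(name))
```

Stopping at the first rule that matches keeps a name from being transposed twice, and a name that fits none of the patterns passes through untouched so you can spot it and write a new rule.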

_________________________________________

READERS WRITE

After reading last week's newsletter, Mary L. Tod (mtod@earthlink.net) wrote:

In your Editorium Update for today, is it necessary to enclose the space in parentheses? Since it isn't being replaced by itself, can't the expression in the Find box be reduced to

(^013[0-9]@.)

(with just the space entered after the first expression)?

Mary is absolutely right about this. I put the space in parentheses because I wanted to briefly introduce the idea that you could have more than one "Find What Expression" wildcard--in this case, \2. For that to work, the space has to be in parentheses so it's recognized as an expression. But I didn't actually *use* the \2 in the example, so a simple space would have worked just fine.

Mary continued:

In a related question, does the @ symbol in the wildcard field also allow for no repeats of the previous character? Otherwise, it would start the list at 10, wouldn't it?

2. followed by a number ([0-9])

3. followed by one or more numbers (@)

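Again, this is right on the mark. The @ really means "followed by one or more numbers *if there are any.*" A more technical way to put it is "followed by *zero* or more numbers." Taken together, [0-9]@ finds one or more digits in a row, so a single-digit number such as 1 is found just as 10 is.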
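
Here is a small Python sketch of that behavior. It assumes the pattern under discussion is meant to find numbered list items such as "1. " and "10. ", and it uses the regular-expression "+" (one or more) as a stand-in for Word's @; the name numbered is made up for illustration.

```python
import re

# One or more digits, a literal period, then a space.
# "+" plays the role of Word's @ here (an assumption of this sketch).
numbered = re.compile(r"^[0-9]+\. ")

for line in ["1. First item", "10. Tenth item", "x. Not numbered"]:
    found = numbered.match(line) is not None
    print(line, "->", "found" if found else "not found")
```

Both the single-digit and the two-digit items are found, which is the point of Mary's question: the list does not have to start at 10.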

Thanks to Mary for her astute comments.