Lyonizing Word: We Can Do This the Easy Way, or . . .

We Can Do This the Easy Way,
or We Can Do This the Hard Way

by Jack Lyon

American Editor Rich Adin called me recently with a puzzle. He was editing a list of citations that looked like this:

Lyon J, Adin R, Poole L, Brenner E, et al: blah blah blah.

But his client wanted the citations to look like this:

Lyon J, Adin R, Poole L, et al: blah blah blah.

In other words, many of the citations included one author name too many; the client wanted a limit of three rather than four. And there were hundreds of citations. Rich really didn’t want to remove the superfluous names by hand; it would have taken hours to do, and hours are money. And so, Rich queried, “Is there a way to remove the fourth name automatically?”

There’s nearly always a way. Rich had already tried using a wildcard search, but without success. Microsoft Word kept telling him, “The Find What pattern contains a Pattern Match expression which is too complex.”

The Too-Complex Find What

I’m not sure what wildcard search Rich tried to use, but it might have looked like this:

Find what:

([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )(et al:)

Replace with:

1235

That’s definitely too complex for Word to handle. Here’s what it means:

Find a capital letter ([A-Z])
followed by a lowercase letter ([a-z])
repeated any number of times (@)
followed by a space
followed by a capital letter ([A-Z])
followed by a comma
followed by a space
with all of that in parentheses to form a “group.”

All of that is repeated three more times, then followed by “et al:” in parentheses to form a group.

The “Replace with” string tells Word to replace what it finds with the contents of groups 1, 2, 3, and 5 — in other words, with the first three names followed by “et al:”.

What’s the Handle?

If Word could handle it, that should work. But Word can’t handle it, so we’ll need to simplify. So we ask ourselves, “What, besides letters, do all of the names have in common?” In other words, “What’s the handle? What can we grab onto?” Well, that’s easy — each name is followed by a comma and a space. That’s our handle!

(For more on this, please see my article "What’s Your Handle?" (2003) at the Editorium Update.)

The Find That Works

The handle means we can simplify our wildcard search string to something like this:

Find what:

([!^013]@, [!^013]@, [!^013]@, )[!^013]@, (et al:)

Replace with:

12

Here’s what that means:

Find any characters except a carriage return ([!^013])
repeated any number of times (@)
followed by a comma
followed by a space
with all of that repeated three times
and enclosed in parentheses to form a “group.”
Then it’s repeated one more time, ungrouped
and followed by “et al:” in parentheses to form a group.

The “Replace with” string tells Word to replace what it finds with the contents of groups 1 and 2 — in other words, with the first three names (group 1) followed by “et al:” (group 2). The fourth name is simply ignored.

To Group or Not to Group Using Parens

Rich ran the new find and replace, then replied, “Thanks, Jack, that works like a charm. Why isn’t the second ‘group’ grouped, that is, in parentheses? I thought that was necessary.”

I replied, “No, it’s not necessary. You group only the items that you want to reference (by 1, 2, etc.) in the ‘Replace with’ box. You could group the other item, in which case you would use ‘13’ in the ‘Replace with’ box. But there’s no need to do so.”

Note that this method of finding the names offers another advantage. Not only will it find names that look like this:

Lyon J,

it will also find names that look like this:

Lyon JM,

or even this:

Lyon JMQ

It will even find names like this:

Thaler-Carter Ruth,

or this:

Harrison G.B.H.,

In fact, it will find anything (except a carriage return) followed by a comma and a space.

Why the Carriage Return?

“Why,” you may be wondering, “specify anything but a carriage return? Why not specify letters instead?” Well, we could have done that, using something like this:

Find what:

([A-z ]@, [A-z ]@, [A-z ]@, )[A-z ]@, (et al:)

Replace with:

12

That means:

any capital or lowercase letter or space ([A-z ])
repeated any number of times (@)
followed by a comma
followed by a space
And so on.

Such a wildcard string would find names like this:

Lyon J,

but not this:

Thaler-Carter R,

Yes, we could add a hyphen to our string, but then we start to wonder about other characters we might need to include, and then things get complicated again. And besides, it’s true that we don’t want to include carriage returns in our search, so it makes sense to exclude them. If we tried to simplify too far, we might use this:

Find what:

(*, *, *, )*, (et al:)

Replace with:

12

The problem with using the asterisk wildcard (*) is that it finds any character any number of times, including tabs, spaces, carriage returns, and everything else you can think of. Sometimes that’s useful, but more often it just leads to confusion. We want to keep things simple but not too simple.

Why Wildcard

To return to our original problem: Rich could have removed all those extra names one at a time, by hand, which is doing it the hard way and eats into the profit line — remember that time is money. Microsoft Word includes powerful tools for doing things the easy way, so why not learn them and use them? If you’ve read this far, you’re doing that, so congratulations.

If you’d like to learn more about how to use wildcard searches, you can download my free paper “Advanced Find and Replace in Microsoft Word.” Working through the paper requires some thought and effort, but the payoff is huge.

Coming Soon

I hope you’ll watch for my forthcoming Wildcard Cookbook for Microsoft Word. I’m still trying to find more real-life examples for the book, so if you have some particularly sticky problems that might be solved using a wildcard search, I hope you’ll send them my way. Maybe I can save you some work and at the same time figure out solutions that will help others in the future. Thanks for your help!

For EditTools Users

If you are a user of EditTools, you can manually create the find and replace strings in the Wildcard Find & Replace macro and then save the macro for future use. However, to do so you need to enter the Find string slightly differently:

Find Field #1: [!^013]@, [!^013]@, [!^013]@,
Find Field #2: [!^013]@,
Find Field #3: et al:

Note that you omit the parens for grouping because EditTools automatically inserts them, which means that you break the string into its group components. (IMPORTANT: Be sure to include in Find Fields 1 and 2 the ending space, i.e., the space following the final comma, which is not visible above.)

Because EditTools treats each of the three fields as a group, your Replace string is:

Replace Field #1: 1
Replace Field #2: 3

After manually entering the information in each of the fields, click Add to WFR Dataset and save this macro for future use. Next time you need it, just click Retrieve from WFR Dataset, retrieve this string, and run it. That is one of the advantages to using EditTools' Wildcard Find & Replace — you can write a wildcard macro once and reuse it as many times as you need without having to recreate the macro each time.

Jack Lyon (editor@editorium.com) owns and operates the Editorium, which provides macros and information to help editors and publishers do mundane tasks quickly and efficiently. He is the author of Microsoft Word for Publishing Professionals and of Macro Cookbook for Microsoft Word. Both books will help you learn more about macros and how to use them.

Looking for a Deal?

You can buy EditTools in a package with PerfectIt and Editor's Toolkit at a special savings of $78 off the price if bought individually. To purchase the package at the special deal price, click Editor's Toolkit Ultimate.

 

The Business of Editing: Wildcarding for Dollars

Freelancers often lack mastery of tools that are available to us. This is especially true of wildcarding. This lack of mastery results in our either not using the tools at all or using them to less than their full potential. These are tools that could save us time, increase accuracy, and, most importantly, make us money. Although we have discussed wildcard macros before (see, e.g., The Only Thing We Have to Fear: Wildcard Macros, The Business of Editing: Wildcard Macros and Money, and Macro Power: Wildcard Find & Replace; also see the various Lyonizing Word articles), after recent conversations with colleagues, I think it is time to revisit wildcarding.

Although wildcards can be used for many things, the best examples of their power, I think, are references. And that is what we will use here. But remember this: I am showing you one example out of a universe of examples. Just because you do not face the particular problem used here to illustrate wildcarding does not mean wildcarding is not usable by you. If you edit, you can use wildcarding.

Identifying the What Needs to Be Wildcarded

We begin by identifying what needs a wildcard solution. The image below shows the first 3 references in a received references file. This was a short references file (relatively speaking; I commonly receive references files with 500 to 1,000 references), only 104 entries, but all done in this fashion.

references as received

references as received

The problems are marked (in this essay, numbers in parens correspond to numbers in the images): (1) refers to the author names and the inclusion of punctuation; (2) shows the nonitalic journal name followed by punctuation; and (3) shows the use of and in the author names. The following image shows what my client wants the references to look like.

references after wildcarding

references after wildcarding

Compare the numbered items in the two images: (1) the excess punctuation is gone; (2) the journal title is italicized and punctuation free; and (3) the and is gone.

It is true that I could have fixed each reference manually, one-by-one, and taken a lot of time to do so. Even if I were being paid by the hour (which I'm not; I prefer per-page or project fees), would I want to make these corrections manually? I wouldn't. Not only is it tedious, mind-numbing work, but it doesn't meet my definition of what constitutes editing. Yes, it is part of the editing job, but I like to think that removing punctuation doesn't reflect my skills as a wordsmith and isn't the skill for which I was hired.

I will admit that in the past, in the normal course, if the reference list were only 20 items long, I would have done the job manually. But that was before EditTools and its Wildcard macro, which enables me to write the wildcard string once and then save it so I can reuse it without rewriting it in the future. In other words, I can invest time and effort now and get a reoccurring return on that investment for years to come. A no-brainer investment in the business world.

The Wildcard Find

CAUTION: Wildcard macros are very powerful. Consequently, it is recommended that you have a backup copy of your document that reflects the state of the document before running wildcard macros as a just-in-case option. If using wildcard macros on a portion of a document that can be temporarily moved to its own document, it is recommended that you move the material. Whenever using any macro, use caution.

Clicking Wildcard in EditTools brings up the dialog shown below, which gives you options. If you manually create Find and Replace strings, you can save them to a wildcard dataset (1) for future recall and reuse. If you already have strings that might work, you can retrieve them (2) from an existing wildcard dataset. And if you have taken the next step with Wildcards in EditTools and created a script, you can retrieve the script (3) and run it. (A script is simply a master macro that includes more than 1 string. Instead of retrieving and running each string individually, you retrieve a script that contains multiple strings and run the script. The script will go through each string it contains automatically in the order you have entered the strings.)

Wildcard Interface

Wildcard Interface

As an example, if I click Retrieve from WFR dataset (#2 above), the dialog shown below opens. In this instance, I have already created several strings (1) and I can choose which string I want to run from the dropdown. Although you can't see it, this particular dataset has 40 strings from which I can choose. After choosing the string I want to run, it appears in the Criteria screens (2 and 3), divided into the Find portion of the string and the Replace portion. I can then either Select (4) the strings to be placed in primary dialog box (see Wildcard Interface above) or I can Edit (5) the strings if they need a bit of tweaking.

Wildcard Dataset Dialog

Wildcard Dataset Dialog

If I click Select (4 above), the strings appear in the primary Wildcard dialog as shown below (1 and 2). Because it can be hard to visualize what the strings really look like when each part is separated, you can see the strings as they will appear to Microsoft Word (3). In addition, you know which string you chose because it is identified above the criteria fields (purple arrow). Now you have choices to make. You can choose to run a Test to be sure the criteria work as expected (4), or if you know the criteria work, as would be true here, you can choose to Find and Replace one at a time or Replace All (5).

The Effect of Clicking Select

The Effect of Clicking Select

I know that many readers are saying to themselves, "All well and good but I don't know how to write the strings, so the capability of saving and retrieving the strings isn't of much use to me." Even if you have never written a wildcard string before, you can do so quickly and easily with EditTools.

Creating Our String

Let's begin with the first reference shown in the References as Received image above. We need to tackle this item by item. Here is what the author names look like as received:

Kondo, M., Wagers, A. J., Manz, M. G., Prohaska, S. S., Scherer, D. C., Beilhack, G. F. et al.:

What we have for the first name in the list is

[MIXED case multiletter surname][comma][space][single UPPERCASE letter][period][comma]

which makes up a unit. That is, a unit is the group of items that need to be addressed as a single entity. In this example, each complete author name will constitute a unit.

This first unit has 6 parts to it (1 part = 1 bracketed item) and we have identified what each part is (e.g., [MIXED case multiletter surname]). To find that first part we go to the Wildcard dialog, shown below, click the * (1) next to the blank field in line 1. Clicking the * brings up the Select Wildcard menu (2) from which we choose we choose Character Menu (3). In the Character Menu we choose Mixed Case (4) because that is the first part of the unit that we need to find.

Wildcard First Steps

Wildcard First Steps

When we choose Mixed Case (4 above), the Quantity dialog below appears. Here you tell the macro whether there is a limit to the number of characters that fit the description for this part. Because we are dealing with names, just leave the default of no limit. However, if we knew we only wanted names that were, for example, 5 letters or fewer in length, we would decheck No Limit and change the number in the Maximum field to 5.

How many letters?

How many letters?

Clicking OK in the Quantity results in entry of the first portion of our string in the Wildcard dialog (1, below). This tells the macro to find any grouping of letters — ABCd, Abcde, bCdaefTg, Ab, etc. — of any length, from 1 letter to 100 or more letters. Thus we have the criteria for the first part of our Find unit even though we did not know how to write wildcardese. In the dialog, you can see how the portion of the string really looks to Microsoft Word (2) and how, if you were to manually write this part using Microsoft Word's Find & Replace, it would need to be written.

How this part looks in wildcardese

How this part looks in wildcardese

The next step is to address the next part, which can be either [comma] alone or [comma][space]. What we need to be careful about is that we remember that we will need the [space] in the Replace string. If we do [comma][space] and if we do not have just a [space] entry, we will need to provide it. For this example, I will combine them.

Because these are simple things, I enter the [comma][space] directly in the dialog as shown below. With my cursor in the second blank field (1), I simply type a comma and hit the spacebar. You can verify this by looking below in the Find line of wildcardese (2), where you can see (, ):

Manually adding the next part

Manually adding the next part

The remaining parts to do are [single UPPERCASE letter][period][comma]. They would be done using the same techniques as the prior parts. Again, we would have to decide whether the [period] and [comma] need to go on separate lines or together on a single line. Why? Because we want to eliminate the [period] but keep the [comma]. If they are done together as we did [comma][space], we will manually enter the [comma] in the Replace.

For the [single UPPERCASE letter], we would follow the steps in Wildcard First Steps above except that instead of Mixed Case, we would select UPPER CASE, as shown here:

Selecting UPPER CASE from the Characters Menu

Selecting UPPER CASE from the Characters Menu

This brings up the Quantity dialog where we decheck No Limit and, because we know it is a single letter we want found, use the default Minimum 1 and Maximum 1, as shown here:

A Quantity of 1

A Quantity of 1

Clicking OK takes us to the main Wildcard dialog where the criteria to find the [single UPPERCASE letter] has been entered (1, below). Looking at the image below, you can see it in the string (2). For convenience, the image also shows that I manually entered the [period][comma] on line 4 (3 and 4).

The rest of the Find criteria

The rest of the Find criteria

The Wildcard Replace

The next step is to create the Replace part of the string. Once again, we need to analyze our Find criteria.

We have divided the Find criteria into these 4 parts, which together make up the Find portion of the string:

  1. [MIXED case multiletter surname]
  2. [comma][space]
  3. [single UPPERCASE letter]
  4. [period][comma]

The numbers represent the numbers of the fields that are found in the primary dialog shown above (The Rest of the Find Criteria). What we need to do is determine which fields we want to replace and in what order. In this example, what we want to do is remove unneeded punctuation, so the Replace order is the same as the Find order. We want to end up with this:

  1. [MIXED case multiletter surname]
  2. [space]
  3. [single UPPERCASE letter]
  4. [comma]

The way we do so is by filling in the Replace fields. The [space] and the [comma] we can enter manually. You can either enter every item manually or you can let the macro give you a hand. Next to each field in the Replace column is an *. Clicking on the * brings up the Select Wildcard dialog:

Select Wildcard

Select Wildcard

Because what we need is available in the Find Criteria, we click on Find Criteria. However, the Select Wildcard dialog also gives us options to insert other items that aren't so easy to write in wildcardese, such as a symbol. When we click Find Criteria, the Use Find Criteria dialog, shown below, appears. It lists everything that is found in the Find criteria by line.

Use Find Criteria dialog

Use Find Criteria dialog

Double-clicking the first entry (yellow highlighted) places it in the first line of the Replace, but by a shortcut — 1 — as shown in the image below (1). If we wanted to reverse the order (i.e., instead of ending up with Kondo M, we want to end up with M Kondo,), we would select the line 3 entry in the Use Find Criteria Dialog above, and double-click it. Then 3 would appear in the first line of Replace instead of 1.

The completed wildcard macro

The completed wildcard macro

For convenience, I have filled the Replace criteria (1-4) as The Completed Wildcard Macro image above shows. The [space] (2) and the [comma] (4) I entered manually using the keyboard. The completed Replace portion of the string can be seen at (5).

The next decision to be made is how to run the string — TEST (6) or manual Find/Replace (7) or auto Replace All (8). If you have not previously tried the string or have any doubts, use the TEST (6). It lets you test and undo; just follow the instructions that appear. Otherwise, I recommend doing a manual Find and Replace (7) at least one time so you can be certain the string works as you intend. If it does work as intended, click Replace All (8).

You will be asked whether you want to save your criteria; you can preempt being asked by clicking Add to WFR dataset (9). You can either save to an existing dataset or create a new dataset. And if you look at the Wildcard Dataset dialog above (near the beginning of this essay), you will see that you can not only name the string you are saving, but you can provide both a short and a detailed description to act as reminders the next time you are looking for a string to accomplish a task.

Spend a Little Time Now, Save Lots of Time Later

Running the string we created using Replace All on the file we started with, will result in every instance of text that meets the Find criteria being replaced. I grant that the time you spend to create the string and test it will take longer than the second and subsequent times you retrieve the string and run it, but that is the idea: spend a little time now to save lots of time later.

I can tell you from the project I am working on now that wildcarding has saved me more than 30 hours of toiling so far. I have already had several chapters with 400 or more references that were similar to the example above (and a couple that were even worse). Wildcarding let me clean up author names, as here, and let me change cites from 1988;52(11):343-45 to 52:343, 1988 in minutes.

As you can see from this exercise, wildcarding need not be difficult. Whether you are an experienced wildcarder or new to wildcarding, you can harness the power of wildcarding using EditTools' WildCard Find & Replace. Let EditTools' WildCard Find & Replace macro system help you. Combine wildcarding with EditTools' Journals macro and references become quicker and easier.

Richard Adin, An American Editor

Related An American Editor essays are:

_________________________

Looking for a Deal?

You can buy EditTools in a package with PerfectIt and Editor's Toolkit at a special savings of $78 off the price if bought individually. To purchase the package at the special deal price, click Editor's Toolkit Ultimate.