
Wildcard Secrets Revisited

A few weeks ago I sent out an article called "Three Wildcard Secrets." I thought they were pretty good secrets, too! You can see them here.

In a nutshell, here are the first two:

The wildcard range [A-z], meant to find any uppercase or lowercase letter, will not find accented letters. You have to use [A-Za-z] instead. So I suggested using [!A-z] (not A-z) to find any characters that are accented.

Similarly, if you need to find any unspecified Unicode character, you can use the not range [!^000-^255]. That should work, as 255 is the upper limit on ANSI characters, so anything the range finds must be Unicode.

Then I received a corrective email from macro expert Paul Beverley. The nerve! Here's what Paul had to say about secret #1:

You see the problem? It did what you asked, not what you wanted. It finds any character at all, except A-z.

And Paul is right! The range [!A-z] finds not just accented characters but also spaces, punctuation, and anything else that isn't a letter, something I would have realized if I'd actually thought about it. You can solve the problem by adding more things that you want to skip. Here's an example:

[!A-z 0-9.,;:\-\?\!^001-^064]

(For more information, see my Wildcard Cookbook for Microsoft Word.)
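If you'd like to see Paul's point outside of Word, here is a minimal sketch in Python. Please note the assumptions: this uses Python's regular-expression engine, not Word's wildcard engine, so "!" becomes "^" and a code like ^064 becomes \x40, and Word may order characters differently than Python does. The sample text and the variable names are mine, purely for illustration.

```python
import re

# Python regex stand-ins for the Word wildcard ranges (an approximation,
# not Word's own engine): "!" is written "^", and ^001-^064 is \x01-\x40.
NOT_A_TO_Z = re.compile(r"[^A-z]")                     # like [!A-z]
NARROWED = re.compile(r"[^A-z 0-9.,;:\-?!\x01-\x40]")  # like the longer range

# Hypothetical sample: "café, naïve!" written with escapes so the accented
# letters are single, precomposed characters.
sample = "caf\u00e9, na\u00efve!"

broad_hits = NOT_A_TO_Z.findall(sample)
narrow_hits = NARROWED.findall(sample)

print(broad_hits)   # ['é', ',', ' ', 'ï', '!']
print(narrow_hits)  # ['é', 'ï']
```

The first pattern picks up the comma, the space, and the exclamation point along with the accented letters. The second one, with the extra exclusions, leaves only the accented letters.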

Next, Paul had this to say about secret #2:

On my PC [!^000-^255] throws up an error.

Now technically, I was right about the range being [!^000-^255]. The problem is that Microsoft Word wants [!^001-^255] instead. And to make things even worse, that wildcard range correctly skips the ASCII characters (numbered 0-126) but incorrectly finds the extended ASCII characters (numbered 127-255), even though we've told it not to. Microsoft strikes again!

But wait, there's more!

  • The range [!^128-^255] gives us the same error message as [!^000-^255].
  • The range [!^127-^255] finds Unicode characters (which it should) and extended ASCII characters (which it should not).
  • The range [!^127-^254] skips extended ASCII characters (which it should) and Unicode characters (which it should not).

All of this weirdness seems to hinge on the points where ASCII becomes extended ASCII, and extended ASCII ends.
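To make those boundaries concrete, here is a small Python sketch that sorts characters into the three bands by character number. It follows the numbering used in this article (0-126, 127-255, and everything above 255); strictly speaking, standard ASCII also includes character 127. The function name band is hypothetical, and the sketch says nothing about how Word itself decides what to match.

```python
def band(ch: str) -> str:
    """Classify one character using this article's numbering (an assumption)."""
    code = ord(ch)
    if code <= 126:
        return "ASCII"
    if code <= 255:
        return "extended ASCII"
    return "Unicode"


# Hypothetical sample: a, ~ (126), é (233), ÿ (255), Ā (256), and a long dash (8212).
sample = "a~\u00e9\u00ff\u0100\u2014"
bands = [(ord(ch), band(ch)) for ch in sample]

for code, name in bands:
    print(code, name)
```

Characters 126 and 255 are the last members of their bands, which is exactly where the odd behavior described above shows up.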

Might any of this be useful in your editing work? Yes, if you're using wildcard searches:

  • Use the range [!^127-^255] to find Unicode and extended ASCII characters.
  • Use the range [!^127-^254] to skip Unicode and extended ASCII characters.

That should work, at least until Microsoft decides to fix these problems.
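If you ever want a second opinion on what one of these searches found, one option is to save the text as plain text and check it outside of Word. Here's a rough Python sketch under that assumption. The function name non_ascii_report and its threshold parameter are made up for this example, and the cutoff of 126 simply follows the numbering used above.

```python
import unicodedata


def non_ascii_report(text, threshold=126):
    """List (position, character, number, name) for characters above threshold."""
    rows = []
    for pos, ch in enumerate(text):
        if ord(ch) > threshold:
            # Some characters (control codes, for example) have no Unicode
            # name, so supply a fallback instead of raising an error.
            rows.append((pos, ch, ord(ch), unicodedata.name(ch, "UNNAMED")))
    return rows


# Hypothetical sample containing an accented letter and curly quotation marks.
rows = non_ascii_report("na\u00efve \u201cquote\u201d")
for row in rows:
    print(row)
```

For this sample, the report lists the accented i at position 2 and the two curly quotation marks at positions 6 and 12, which you can compare against what Word's search stopped on.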

Many thanks to Paul Beverley for his valuable feedback. If you'd like a bunch of free editing macros with instructions on how to use them, you'll want to download Paul's book Macros for Editors.