Replacing a needle in a haystack, but not in links and images


A client wanted a glossary of terms on a WordPress site and was looking for automated links of glossary entries.

Trouble is, each “needle” could be a word or phrase on its own (possibly in brackets or with punctuation), or a word or phrase inside an anchor tag or img tag. Nightmare!

str_replace and the case-insensitive str_ireplace just won’t do! So preg_replace it has to be.
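A quick sketch of why the plain string functions fall over. The glossary URL on example.com is made up for the illustration; the point is only that str_ireplace has no idea what a tag is, so it rewrites the href as happily as the visible text.

```php
<?php
// Hypothetical glossary link, used only for this illustration.
$link = '<a href="https://example.com/glossary#needle">needle</a>';

$html = 'A needle, and a link: <a href="https://needle.com">needle.com</a>';

// str_ireplace() knows nothing about tags, so the URL inside href
// gets a link pasted into it too, which breaks the markup.
$result = str_ireplace('needle', $link, $html);

echo $result;
```

The existing link ends up with an anchor tag nested inside its own href attribute.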

Here be dragons and days of work…

The regex for a needle not in an anchor tag, link text or img tag

/\bneedle\b(?!(?:(?!<\/?a\b[^>]*>).)*?<\/a>)(?!(?:(?!<\/?img\b[^>]*>).)*?\/>)/

When used in

$search='/\bneedle\b(?!(?:(?!<\/?a\b[^>]*>).)*?<\/a>)(?!(?:(?!<\/?img\b[^>]*>).)*?\/>)/';
$replace='haystack';

$content='Needle, needle
<a href="https://needle.com">needle.com</a>
<p><img class="needle" width="50" height ="50" src="https://www.needle.com/needle.jpg"/></p>
<p>Nails can be quite sharp, needles are too, espcially just one needle.</p>';


$content = preg_replace($search, $replace,$content);
echo $content;

To my great relief, this gives the output

Needle, haystack
<a href="https://needle.com">needle.com</a>
<p><img class="needle" width="50" height ="50" src="https://www.needle.com/needle.jpg"/></p>
<p>Nails can be quite sharp, needles are too, espcially just one haystack.</p>
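To see what each part is doing, here is a sketch of the same idea laid out with PCRE's x flag, which lets the pattern carry whitespace and comments. The ~ delimiter and the layout are just choices for this illustration (with ~ the slashes in the closing tags need no escaping), and it assumes the repeats inside the lookaheads are * and *?.

```php
<?php
$search = '~
    \bneedle\b                      # the term as a whole word
    (?!                             # ...but not when followed by:
        (?:(?!</?a\b[^>]*>).)*?     #   a run of characters in which no <a> or </a> tag starts
        </a>                        #   and then a closing </a>, meaning we are inside a link
    )
    (?!                             # ...and not when followed by:
        (?:(?!</?img\b[^>]*>).)*?   #   a run of characters in which no img tag starts
        />                          #   and then a self-closing tag end, meaning we are inside an img tag
    )
~x';

$content = 'Needle, needle
<a href="https://needle.com">needle.com</a>
<p><img class="needle" width="50" height ="50" src="https://www.needle.com/needle.jpg"/></p>
<p>Nails can be quite sharp, needles are too, espcially just one needle.</p>';

$result = preg_replace($search, 'haystack', $content);
echo $result;
```

Two things worth knowing about this approach: the dot does not cross line breaks without the s flag, so each lookahead only looks to the end of the current line, and the second lookahead will also skip a needle that is followed on the same line by any other self-closing tag, not just an img.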

Regex tester page

The pattern above is not case-insensitive (the capitalised “Needle” is left alone), but that was achieved with a foreach loop working through an array of search keys and replace values. Note: sort the array by decreasing key length first, otherwise a longer phrase may never be matched because a shorter term it contains has already been replaced.

Here’s how to sort an array by descending key length

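// Sort by key length, longest first.
uksort($array, "sort_callback");
function sort_callback($a, $b){
    // Positive when $b is longer, so longer keys move to the front.
    return strlen($b) - strlen($a);
}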
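Putting the sort and the loop together might look something like the sketch below. The function name glossary_replace, the glossary entries and the /glossary/ URLs are all invented for the example, and the pattern uses ~ as the delimiter; treat it as an outline rather than the code running on the client's site.

```php
<?php
// Hypothetical helper: $glossary maps each term to its replacement HTML.
function glossary_replace(string $content, array $glossary): string
{
    // Longest keys first, so "sewing needle" is tried before "needle".
    uksort($glossary, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    foreach ($glossary as $term => $replacement) {
        // preg_quote() escapes regex metacharacters in the term;
        // the second argument also escapes the ~ delimiter.
        $quoted = preg_quote($term, '~');
        $search = '~\b' . $quoted
            . '\b(?!(?:(?!</?a\b[^>]*>).)*?</a>)(?!(?:(?!</?img\b[^>]*>).)*?/>)~';
        $content = preg_replace($search, $replacement, $content);
    }

    return $content;
}

// Made-up glossary entries, deliberately listed shortest first.
$glossary = [
    'needle'        => '<a href="/glossary/needle">needle</a>',
    'sewing needle' => '<a href="/glossary/sewing-needle">sewing needle</a>',
];

$out = glossary_replace('A sewing needle and a needle.', $glossary);
echo $out;
```

Because the lookaheads skip anything already inside a link, the “needle” in the freshly inserted “sewing needle” link is left alone on the second pass. If case-insensitive matching is wanted, adding the i flag after the closing ~ is one option, though the replacement text would then need to cope with the original capitalisation.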


ABOUT THE AUTHOR - ANDY MOYLE

Andy Moyle is a church leader and web developer. His biggest project is the Church Admin WordPress plugin and app. He also runs, mainly so he can eat pizza.