Tuesday, April 12, 2011

PHP: Encode Text Into Numeric HTML Entities While Keeping HTML

If you need to encode the special characters in some text as numeric HTML entities, but the text also contains HTML markup that you do not want to encode, here is a function that does it.


Why Numeric HTML Entities?

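If you've worked with RSS before, I'm sure you understand the dilemma. For those of you who haven't: RSS is XML, and XML predefines only five named entities (&amp;, &lt;, &gt;, &quot; and &apos;). A feed that contains any other named entity, which is the normal output you get from htmlentities(), fails validation, while numeric entities such as &#233; are always accepted. More headaches, more XML fun!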
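
To see the difference for yourself, here is a minimal sketch that asks PHP's XML parser whether a fragment is well-formed. The helper name is_well_formed_xml() and the <description> wrapper element are made up for illustration, and the sketch assumes the SimpleXML extension is available.

```php
<?php
// Hypothetical helper: reports whether $fragment is well-formed XML
// once it is wrapped in a single root element.
function is_well_formed_xml($fragment) {
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string('<description>' . $fragment . '</description>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc !== false;
}

var_dump(is_well_formed_xml('caf&eacute;')); // named HTML entity, not defined in XML
var_dump(is_well_formed_xml('caf&#233;'));   // numeric entity for the same character
```

The parser should reject the first fragment and accept the second, which is the same complaint a feed validator makes about named entities.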


Why Keep HTML?

Well, to be honest, it's just because we have a special scenario right now. But I'm sure some people will find this helpful.



*Note* The foreach loop that builds the numeric HTML entities is based on Michael Krenz's xml_character_encode function. Thank you, sir!


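function htmlentities_keephtml($text) {
    // Pass the flags and encoding explicitly: the defaults differ between PHP versions.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401, 'UTF-8');

    // Leave the markup characters (and their entities) alone so the HTML survives.
    unset($table['"'], $table['<'], $table['>'], $table['&']);

    $map = array();
    foreach ($table as $char => $named) {
        // mb_ord() (PHP 7.2+, mbstring) returns the full code point;
        // ord() only reads the first byte of a multibyte character.
        $numeric = '&#' . mb_ord($char, 'UTF-8') . ';';
        $map[$char]  = $numeric; // raw character -> numeric entity
        $map[$named] = $numeric; // named entity  -> numeric entity
    }

    // strtr() works in a single pass, so replaced text is never scanned again,
    // and &amp;, &lt;, &gt; and &quot; are not decoded into live markup.
    return strtr($text, $map);
}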
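
If you only need to deal with raw characters (not named entities that are already in the text), the mbstring extension can do the numeric encoding by itself. This is a different technique from the function above, shown here as a rough sketch: the name numeric_entities_keephtml() is made up for illustration, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical alternative: encode every non-ASCII character as a decimal
// numeric entity. ASCII, including <, >, & and quotes, passes through unchanged.
function numeric_entities_keephtml($text) {
    // Map format: first code point, last code point, offset, mask.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// "\xC3\xA9" is the UTF-8 byte sequence for e-acute.
echo numeric_entities_keephtml("<p>Caf\xC3\xA9 &amp; bar</p>");
// prints: <p>Caf&#233; &amp; bar</p>
```

The upside is that it also covers characters that have no named HTML entity at all. The downside is that an existing named entity such as &eacute; is left exactly as it is, so it does not replace the function above when the input already contains named entities.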
