Simple scripts always ship.

Simple needs, simple deeds.

A need for a simple one-off came up Monday evening at work. Some non-engineer coworkers needed to be shown how to convert some non-Latin-1 text into HTML entities.

Basically, they needed a 5-line PHP script.

Right before sending them to any number of sites that already do this 5-line operation, I decided: what the hell, I’ll just make one of my own. And so, that night after unwinding, I did.

Here’s my HTML Entitizer. Suitable for all your HTML entitizing needs, large or small!

PHP in Escape from L.A. (jokesfornerds)

One of the reasons PHP is so popular is the fact that it’s got so many handy little functions. In this case, htmlentities() is the blade of PHP’s swiss army knife that we’re using. Of course, indiscriminate use of functions like this causes problems like double-escaped characters (&amp;amp;) showing up in databases.

PHP, the swiss-army knife.

PHP’s solution to this sort of thing is generally to throw new corkscrews and toothpicks onto its knife’s ever-growing pile of tools. If you look at the documentation page for htmlentities(), you can see that as of version 5.2.3 of PHP, another optional parameter was added: double_encode. Setting it to false tells htmlentities() to leave existing entities alone rather than encode them a second time.
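To make the double-escaping problem concrete, here’s a minimal sketch. The sample string and variable names are made up for illustration; the only real API here is htmlentities() and its fourth parameter, double_encode.

```php
<?php
// Illustrative input only; any string containing "&" behaves the same way.
$raw = 'Tom & Jerry';

// Escaping once gives the entity the browser expects.
$once = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Escaping the already-escaped string again mangles it.
$twice = htmlentities($once, ENT_QUOTES, 'UTF-8');

// With double_encode set to false, existing entities are left alone.
$guarded = htmlentities($once, ENT_QUOTES, 'UTF-8', false);

echo $once, "\n";    // Tom &amp; Jerry
echo $twice, "\n";   // Tom &amp;amp; Jerry
echo $guarded, "\n"; // Tom &amp; Jerry
```

The flag papers over the symptom. It doesn’t tell you where the first round of escaping happened.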

Getting the right escape order can be hard, especially for novices or small teams inheriting code from other small teams. To date I don’t think I’ve inherited work on a webapp whose database wasn’t littered with extraneous &amp;’s and rogue \\’s. It’s only by virtue of verge-rpg.com having a single developer who is really, really annoyed by this problem, to the point of neuroticism, that &amp; never appears in the database except in posts that actually wanted to display &amp;.

Although even this level of OCD-database-cleansing didn’t prevent escaping-related errors on the first go-round.
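One common way to keep entities out of the database is to store text exactly as typed and escape only at output time. This is a rough sketch of that convention, not the actual verge-rpg.com code; the array standing in for a database table and the render_post() helper are both hypothetical.

```php
<?php
// Hypothetical stand-in for a database table: text is stored exactly as typed.
$stored_posts = array('AT&T says "hi" <b>loudly</b>');

// Hypothetical helper: escape only at the moment the text goes into HTML.
function render_post($raw) {
    return '<p>' . htmlspecialchars($raw, ENT_QUOTES, 'UTF-8') . '</p>';
}

foreach ($stored_posts as $raw) {
    echo render_post($raw), "\n";
}
```

Because the stored copy is never escaped, there is exactly one place where escaping can happen, so it can’t happen twice.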

Teh (sic) Implementation.

So, the only time-consuming part of getting this script into my WordPress site (other than me taking a bloody half hour to blog about a 2-minute job) was convincing WordPress that it should allow <?php fragments into my posts.

Luckily, other people before me have wanted this very thing. WordPress’s biggest strength is again one of PHP’s: if you want it, it’s already been made for you. In this case the Exec-PHP module will let admin-level users of your blog post for-reals, actual PHP into your posts! You’d better hope your admins don’t have their accounts compromised! (…one moment. Changing my password.)

In PHP’s case, the code that you want that’s already been written for you is in every single function’s talkback thread. These functions may not be hyper-optimized or the best solution for any given problem, but they generally are what you want to prove a point. You scan the docs, you grab the code, you put it in your site, and you move on to the next problem in your own app.

Quick and dirty, the way PHP likes it.

For the curious, here’s the solution I spat out (copy included):

<h1>Escape html</h1>
<p>This is a simple script to give the escaped codes for some html. Useful
for making foreign languages play nice with html, regardless of how the
server handles string encoding.</p>
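<?php
if( isset($_POST['escape_me']) ) {
// Note the trailing space after "<div" so the style attribute stays separate.
echo('<h2>Your escaped html</h2><div '.
     'style="background-color: #ddd; padding: 8px;">');
// stripslashes() undoes magic-quotes backslashes; htmlentities() makes the
// entities; the str_replace() escapes each "&" so the codes show as text.
echo(str_replace('&','&amp;',htmlentities(stripslashes($_POST['escape_me']),
     ENT_NOQUOTES,'UTF-8')));
echo( '</div>' );
}
?>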
<form action='/html-entitizer/' method='POST'>
Text to escape:
<textarea name='escape_me' style='width: 550px; height: 200px;'></textarea>
<input type='submit' value='Create my html entities'>
</form>

(weird formatting so it doesn’t run off the side of the screen; I don’t actually code like that. Mostly. ;)
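If you want to poke at the core of it without the form, the same two calls fit in a small function. The entitize() name and the sample input are mine, made up for this sketch; the é is written as raw UTF-8 bytes so the result doesn’t depend on how the file is saved.

```php
<?php
// Hypothetical helper wrapping the same calls as the script above.
function entitize($text) {
    // Turn characters into named entities where PHP knows one.
    $entities = htmlentities($text, ENT_NOQUOTES, 'UTF-8');
    // Escape the ampersands so a browser shows the entity codes as text.
    return str_replace('&', '&amp;', $entities);
}

// "\xC3\xA9" is the UTF-8 encoding of é.
echo entitize("caf\xC3\xA9 <b>"), "\n"; // caf&amp;eacute; &amp;lt;b&amp;gt;
```

In a browser that output displays as caf&eacute; &lt;b&gt;, which is the text you’d paste into your HTML. Characters that have no named entity pass through htmlentities() unchanged.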

Coda

This tiny amount of effort got me two really sincere “thank yous” from the non-engineers. It’s worth remembering that things that are mindlessly trivial to a programmer can be tedious tasks to someone without the power to bully the computer into doing labor for them. One of the guys, a computer-savvy guy who had access to lists of what each escaped character translated to, insisted that he’d just search/replace the offending characters by hand.

Ostensibly, this was offered to save me work.

How much time did my 2 minutes of code save him?

(Don’t forget a google for “html entity converter” would’ve saved even that, had I cared to not make the tiny toy script of my own. I’ll ramble on about re-using other people’s work more later, but it’s key to remember even what little work I did end up doing was extraneous and largely an exercise in vanity.)
