Posts tagged HTML
Whether you’re scraping content from a website, or simply dealing with the “tag soup” generated from your own site’s WYSIWYG, you probably know that reliably parsing HTML is a pain at best and extremely difficult at worst. Not only do you have to contend with unpredictable content, but there’s no guarantee that the content you try to parse will be well-formed.
In my first pass at CFGloss, I made a pretty big error in parsing the documentation from ColdFusion: I tried to do it with regular expressions. Ultimately, it worked out *okay*, but it was inefficient, unpredictable, and resulted in miles and miles of regular expression soup. Now that I’m overhauling CFGloss, I want to revisit how I parsed the scraped content and try to make it better.
Enter jsoup. This is an awesome little Java library that takes the headache out of parsing HTML. Besides turning unpredictable and potentially malformed HTML into something usable, jsoup is packed with some awesome features, most notably CSS/jQuery-esque selectors for manipulating parsed HTML content.
An Example
To get a feel of the tip-of-the-iceberg of what jsoup can do, let’s take an example from the Railo documentation. Take a look at the source of this page: http://railodocs.org/index.cfm/function/each/version/current. Our More >
In keeping with my New Year’s resolution to not get involved in “black-hole” projects so that I have more flexibility to pursue fun stuff on a whim, I whipped up a fun experiment in CSS3.
The Main Idea
If you design a lot of forms, you know that radio and checkbox lists are hard to style. And no matter how perfectly you lay them out, they are still just plain boring. I thought it would be fun to make checkbox and radio lists a bit more fun by going an entirely different route, and by getting a bit more 3D.
Enter “Chunky Checker.” This set of styles takes an ordinary list of checkboxes or radio buttons and converts them into a super-chunky 3D list. As the radios/checkboxes are activated, the particular segment of the list transforms, creating a “depressed” look, sort of as if a button were physically pressed.
The Markup
<fieldset>
  <legend>Favorite Color:</legend>
  <input type="checkbox" name="color" value="Red" id="red" class="red" />
  <label for="red">Red</label>
  <input type="checkbox" name="color" value="Orange" id="orange" class="orange" />
More >
At work, we recently kicked off an effort to make sure that all of the emails coming from our applications are Outlook 2010 compliant, meaning that they render in a reasonably acceptable manner, at least as emails go. On the surface, this doesn’t appear to be that big of a deal. However, for us it is, because the prior version for which we tested emails was Outlook 2003.
So what’s the problem? Well, if you didn’t hear, Microsoft made the brilliant move of deciding to switch their email-rendering engine from IE (Outlook 2003 and before) to Word. In short, this means a massive decrease in support for HTML and CSS standards. In fact, the move is so counter-standards that an entire movement has been started to try to pressure Microsoft into fixing the problem. So far, nothing.
But arguments about the “smartness” of this change aside, the facts on the ground still require that my co-workers and I fix some emails. So far, my experience has been that in terms of layout, not a huge amount of revision has been necessary. What has caused issues, however, is the lack of CSS support. If you ever find yourself needing to get an email looking decent in Outlook More >
If you use one or more of the billion and seven content management systems out there, you're probably more than familiar with a what-you-see-is-what-you-get (WYSIWYG) text editor. These *fun* little textareas allow users to mimic, with decent accuracy, the styles and organization that they would normally apply in word-processing documents.
The original intention of WYSIWYGs, of course, was to put control of text styling and organization for web-based content into the hands of the people managing the content, so that every bit of text-editing doesn't have to go through the web developer. The result, however, is a disaster.
Why a disaster? Well, before discussing the reasons why, try this little experiment. The next time you're around a web designer/developer, say nothing to them. Simply shift your eyes, turn your head, and randomly blurt out "WYSIWYG!!!". They will either 1.) punch you in the mouth out of built-up frustration or 2.) cower in a corner as they are reminded of how they are abused by WYSIWYGs on a daily basis.
So what's the big deal?
Well, to begin, most content managers have no business even thinking the word "design," much less actively participating in it. Giving them the reins to "make the More >
Okay, so at some point as a web designer you're going to come across a project where you have to design some kind of gallery with image thumbnails.
In an ideal world (e.g., Photoshop), all of the thumbnails will be precisely the same size, so plugging them into nice little skins (the "pretty" that you put around them) is cake. Unfortunately, the real world ain't like that. In the real world, you have hundreds of thumbnails to deal with, none of which will probably ever be exactly the same size.
Now, of course, the thumbnail skins still have to work. So what should you do?
The first thing to NOT do is hard-code the "width" and "height" attributes of the <img> tag. Super bad idea. Okay, not a TERRIBLE idea, but it will inevitably lead to some funny-looking thumbnails, as the img tag will stretch or shrink your image to fit these values.
So what's the alternative? Well, ideally, you'd be using ColdFusion 8 and could use the super-cool built-in image manipulation tools to properly scale and crop every image to be the same size. Ah, but we're not in an ideal world!
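For anyone who can preprocess thumbnails on the server, here is a rough sketch of what "scale and crop to the same size" boils down to. It is plain Java using only the JDK, not ColdFusion's image functions, and the class name `ThumbnailSketch` and method `scaleAndCrop` are made up for illustration. The idea: scale the image until it completely covers the target box, then trim the overflow evenly so nothing gets stretched.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of any library.
public class ThumbnailSketch {

    // Scale the source so it fully covers the target box, then crop the
    // overflow evenly so the result is exactly targetW x targetH.
    public static BufferedImage scaleAndCrop(BufferedImage src, int targetW, int targetH) {
        // Use the larger of the two ratios so both dimensions cover the box.
        double scale = Math.max(
                (double) targetW / src.getWidth(),
                (double) targetH / src.getHeight());

        int scaledW = (int) Math.ceil(src.getWidth() * scale);
        int scaledH = (int) Math.ceil(src.getHeight() * scale);

        // Zero or negative offsets center the scaled image over the box;
        // whatever falls outside the canvas is the cropped part.
        int offsetX = (targetW - scaledW) / 2;
        int offsetY = (targetH - scaledH) / 2;

        BufferedImage out = new BufferedImage(targetW, targetH, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, offsetX, offsetY, scaledW, scaledH, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    public static void main(String[] args) {
        // A blank 400x200 stand-in for a real photo.
        BufferedImage wide = new BufferedImage(400, 200, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = scaleAndCrop(wide, 100, 100);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()); // prints 100x100
    }
}
```

Compare this with hard-coding "width" and "height" on the <img> tag: there the browser squeezes the whole 400x200 image into the 100x100 box, while here the sides are trimmed and the proportions survive.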
The second alternative I've found that works to a limited More >