When I first got into web development, one of my earliest projects was to create a custom blog for myself. Apart from actually needing a blog at the time, I embarked on this coding journey because I had read somewhere that developing a blog would provide a good introduction to the nitty-gritty of application design.

While this wasn’t 100% accurate, it also was not terrifically far from the truth. Through many struggles and achievements, I finally wound up with my own custom blog, complete with a commenting system, RSS delivery, and eventually automatic posting to Twitter.

This modest blog served me pretty well for a few months, but I quickly outgrew it. I found myself pouring precious hours into little development projects to try to get it to do cool stuff that I came across in other, more robust systems.

Eventually, however, I ran out of time and motivation. First, the continual development stopped. Then, out of total laziness, important things like bug fixes and comment security fell by the wayside. While I still loved to blog, the sheer effort of posting (my interface was a bit clunky…) was a big hindrance, so the posts became quite sparse.

Then I got my iPhone. More than anything, this killed my blogging life because my site was just not robust enough to talk to my phone, and I was tired of having to boot up my dinosaur of a PC to make a post.

Finally, I decided to make the move to WordPress, and so here I am.

Now, as you probably know, WordPress offers a pretty nice selection of import tools, and there are plenty of third-party add-ons available for platforms that WordPress doesn’t officially provide import scripts for. What no one really offers, however, is custom blog importing. So the problem that faced me was how, exactly, to import not just one, but two custom blogs into WordPress without the nightmare of moving each entry post by post (not to mention comments!).

Well, it turns out that while WordPress doesn’t have an automagic custom blog importer, its custom XML schema is pretty darn smart, and it’s super-easy to roll a custom script for your own blog that writes out an XML file the WordPress importer will understand.

So enough with the blah-blah-blah. Let me show you what I did.

First, you need to know the important stuff about the WordPress XML file. The best approach I found was to create a throw-away WordPress site and make a few posts. For each of these, add both categories and tags (if these are something you have on your custom blog), and make a few comments as well. Then export the site (Tools > Export in the admin) to get an XML dump to study.

Your dump (well, the important bits at least) should look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" .....[namespace declarations go here].....>
     <channel>
          <title>Myblog's Blog</title>
          <description>Just another WordPress.com site</description>
          <pubDate>Sat, 29 May 2010 09:42:53 +0000</pubDate>
          <cloud domain='existdissolvetest.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
          <image>
               <title>Myblog's Blog</title>
          </image>
          <atom:link rel="search" type="application/opensearchdescription+xml" href="http://existdissolvetest.wordpress.com/osd.xml" title="Myblog's Blog" />
          <atom:link rel='hub' href='http://existdissolvetest.wordpress.com/?pushpress=hub'/>
          .....[rest of posts/comments go here].....
     </channel>
</rss>

Pretty straightforward. This is the guts of your import: it tells WordPress everything it needs to know about your blog. I’m not sure which of these elements are actually required; I just left them alone for my purposes. The most important part is what comes next: populating the posts, categories, tags, and comments from your custom blog.
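If ColdFusion isn’t your thing, here is a rough Python sketch of writing that header yourself. The function name `wxr_envelope` is made up for illustration, and the namespace URIs are the ones I understand WXR 1.0 exports to use, so treat them as an assumption and check them against the top of your own throw-away dump.

```python
from xml.sax.saxutils import escape

# Namespace URIs as I understand them for WXR 1.0 exports (an assumption;
# compare with the <rss ...> tag at the top of your own dump).
WXR_NAMESPACES = {
    "excerpt": "http://wordpress.org/export/1.0/excerpt/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wfw": "http://wellformedweb.org/CommentAPI/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.0/",
}


def wxr_envelope(title, description, pub_date, items_xml):
    """Wrap already-rendered <item> markup in the rss/channel header."""
    # One xmlns attribute per line, like the exported file.
    ns_attrs = "".join(
        f'\n     xmlns:{prefix}="{uri}"' for prefix, uri in WXR_NAMESPACES.items()
    )
    header = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<rss version="2.0"{ns_attrs}>\n'
        "<channel>\n"
        f"     <title>{escape(title)}</title>\n"
        f"     <description>{escape(description)}</description>\n"
        f"     <pubDate>{escape(pub_date)}</pubDate>\n"
    )
    # items_xml is trusted to be well-formed; it is inserted as-is.
    return header + items_xml + "</channel>\n</rss>\n"
```

Everything else from the dump (the cloud, image, and atom:link elements) could be pasted into the header string the same way; I only kept the three fields that obviously describe the blog.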

Let’s start with the posts. I’m using ColdFusion, so here’s what I did to create the “posts” section of this XML file:

<cfoutput query="getposts">
     <!--- URL-friendly slug: trim first so stray spaces don't turn into hyphens --->
     <cfset urltitle = lcase(replace(trim(posttitle),' ','-','all')) />
     <!--- RSS-style publication date; assumes postdate is stored in UTC --->
     <cfset pubdate = "#dateformat(postdate,'dd mmm yyyy')# #timeformat(postdate,'HH:mm:ss')# +0000" />
     <!--- posttags is a comma-delimited list; write a pair of category elements per tag --->
     <cfloop from="1" to="#listlen(posttags)#" index="i">
          <category domain="tag"><![CDATA[#trim(listgetat(posttags,i))#]]></category>
          <category domain="tag" nicename="#lcase(replace(trim(listgetat(posttags,i)),' ','-','all'))#"><![CDATA[#trim(listgetat(posttags,i))#]]></category>
     </cfloop>
     <guid isPermaLink="false"></guid>
     <cfset postdate = "#dateformat(postdate,'dd mmm yyyy')# #timeformat(postdate,'HH:mm:ss')#" />
     ......[comments go here].....
</cfoutput>

Nothing terribly interesting here.  I will note, however, that there are a few of the “meta” key/values that I left out.  I didn’t see that they were necessary, and it turns out they weren’t :)
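For anyone following along in another language, here is roughly how the slug, date, and tag pieces of the post loop might look in Python. This is a sketch, not a port of my actual script: the helper names (`slugify`, `cdata`, `tag_elements`, `rss_pub_date`) are invented for the example, and it assumes your post dates are already in UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.sax.saxutils import quoteattr


def slugify(text):
    """Trim, lowercase, and hyphenate spaces (same idea as the urltitle line)."""
    return text.strip().lower().replace(" ", "-")


def cdata(text):
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def tag_elements(post_tags):
    """Render the pair of <category> elements for each tag in a comma-delimited list."""
    lines = []
    for raw in post_tags.split(","):
        tag = raw.strip()
        if not tag:
            continue  # skip empty entries such as a trailing comma
        lines.append(f'<category domain="tag">{cdata(tag)}</category>')
        lines.append(
            f'<category domain="tag" nicename={quoteattr(slugify(tag))}>'
            f"{cdata(tag)}</category>"
        )
    return lines


def rss_pub_date(post_date):
    """Format a naive datetime (assumed UTC) in the style of the dump's <pubDate>."""
    return format_datetime(post_date.replace(tzinfo=timezone.utc))
```

The `cdata` helper is a small extra I didn’t need in my own import, but it guards against a post or tag that happens to contain `]]>`.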

Finally, the last thing you’ll want to do is loop over your comments (if you have any).  Basically, comments for posts are added immediately after the post meta tags, each collection of comments nested within each post “item.”  My final CF code for comments looked like this:

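<!--- cc is the running comment_id; set it to 1 once, before the posts loop --->
<!--- cfqueryparam assumes postid is an integer key --->
<cfquery name="qcomments" datasource="#dsn#">
     select *
     from comments c
     join users u on c.userid_fk = u.userid
     where postid_fk = <cfqueryparam value="#postid#" cfsqltype="cf_sql_integer">
</cfquery>
<cfif qcomments.recordcount>
<cfloop query="qcomments">
     <cfset commdate = "#dateformat(commentdate,'dd mmm yyyy')# #timeformat(commentdate,'HH:mm:ss')#" />
     <!--- emit this comment's elements here, with cc as the comment_id --->
     <!--- then bump the counter so the next comment gets a unique id --->
     <cfset cc = cc + 1>
</cfloop>
</cfif>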

As before, nothing amazing going on. One note I will make, however, is about the “comment_id”. While it doesn’t seem that WP cares what the ID is, it does seem to care that an ID of some sort is provided. I tried without it, and the importer only grabbed the last comment for every post…and I had to start all over :(

So, I simply created an auto-incrementing number and used it as the comment_id. All of the comments imported and were properly associated with their respective posts.
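Here is the same counter trick sketched in Python, in case it helps to see it outside of ColdFusion. The `render_comments` function and the `author`, `date`, and `content` keys are hypothetical stand-ins for whatever your own comments table holds; the point is only that one counter is shared across every post, so no two comments in the file get the same ID.

```python
from itertools import count
from xml.sax.saxutils import escape


def render_comments(comments, next_id):
    """Render one <wp:comment> per comment, drawing a unique ID from next_id."""
    lines = []
    for comment in comments:
        lines.append("<wp:comment>")
        # next() advances the shared counter, so IDs never repeat.
        lines.append(f"<wp:comment_id>{next(next_id)}</wp:comment_id>")
        lines.append(f"<wp:comment_author>{escape(comment['author'])}</wp:comment_author>")
        lines.append(f"<wp:comment_date>{escape(comment['date'])}</wp:comment_date>")
        lines.append(f"<wp:comment_content>{escape(comment['content'])}</wp:comment_content>")
        lines.append("</wp:comment>")
    return lines


# Create the counter once, outside the posts loop, and pass it to every call.
comment_ids = count(1)
first_post = render_comments(
    [{"author": "Ann", "date": "2010-05-29 09:42:53", "content": "Nice post"}],
    comment_ids,
)
second_post = render_comments(
    [{"author": "Bob", "date": "2010-05-30 10:00:00", "content": "Me too"}],
    comment_ids,
)
```

Creating a fresh counter inside the posts loop would restart the numbering for each post, which is exactly the kind of duplicate-ID situation I was trying to avoid.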

The End

That’s about it. Importing into WordPress from a custom blog is very simple; sometimes it just takes a bit of playing around to get the right answer. I hope this is helpful to someone, and please let me know if there is any way I can improve this walk-through.