= Wordpress-import
This little project is an importer for WordPress XML dumps into Rails.

It's been somewhat customized for one particular project; you probably want to fork this and modify it to fit your app's schema.
It's a fork of Marc Remolt's Refinerycms-wordpress-import (https://github.com/mremolt/refinerycms-wordpress-import).

You can find the source code on GitHub: https://github.com/zyphlar/wordpress-import

Keep in mind that links to other pages of your blog are copied verbatim, because WordPress exports them as plain <a> tags.
If your new site (blog) uses a different URL structure, these links WILL break! For example, if you used
the popular WP blog URL structure "YYYY-MM/slug", be warned that Refinery just uses "blog/slug",
so your internal site links will still point to the old WP URLs.
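
If you need to repair such links after an import, something along these lines may help. This is only a sketch and not part of this gem: the helper name, the host name and the exact old URL pattern (http://host/YYYY-MM/slug) are assumptions you will have to adapt to your own blog.

```ruby
# Hypothetical helper, not part of this gem.
# Assumes old permalinks look like http://<old_host>/YYYY-MM/slug
# and that the new site serves posts under /blog/slug.
def rewrite_internal_links(html, old_host)
  pattern = %r{https?://#{Regexp.escape(old_host)}/\d{4}-\d{2}/([^"'/\s]+)/?}
  # Keep only the slug and point it at the new blog path.
  html.gsub(pattern) { "/blog/#{Regexp.last_match(1)}" }
end

html = '<a href="http://www.mysite.com/2011-05/hello-world/">Hello</a>'
puts rewrite_internal_links(html, "www.mysite.com")
```

Links to other hosts do not match the pattern and are left alone.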
== Prerequisites
TODO
== Installation
Just add the gem to your project's Gemfile:

  gem 'wordpress-import'

Or, if you want to stay on the bleeding edge:

  gem 'wordpress-import', :git => 'git://github.com/zyphlar/wordpress-import.git'

and run

  bundle

== Usage
Importing the XML dump is done via rake tasks:

  rake wordpress:reset_blog

This task deletes all data from the blog-relevant tables (taggings, tags, blog_comments,
blog_categories, blog_posts, blog_categories_blog_posts).
Run it first if you want a clean import of your old blog.

  rake wordpress:import_blog[file_name]

This task does all the heavy work of parsing the dump and importing the data into the Refinery tables.
The parameter is the path to the dump file. A Mac user reported that ~ didn't work
in the path. I'll have a look at it, but until then, please don't use it.
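
One likely cause is that shells only expand ~ at the start of a word, so inside the square brackets it reaches rake as a literal character. A possible workaround, shown here as a sketch with a hypothetical helper that is not part of this gem, is to expand the path in Ruby before opening the file:

```ruby
# Hypothetical helper, not part of this gem: turn a user-supplied path
# into an absolute one, expanding a leading ~ to the home directory.
def absolute_dump_path(file_name)
  File.expand_path(file_name)
end

puts absolute_dump_path("~/dumps/wordpress.xml")
```

Until something like this is in place, passing an absolute path to the task avoids the problem.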

If you don't want to import draft posts, set the environment variable ONLY_PUBLISHED to true:

  rake wordpress:import_blog[file_name] ONLY_PUBLISHED=true

The task will then skip all posts that are not published.
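
Rake copies NAME=value arguments from the command line into ENV, so a task can read the flag from there. The following sketch illustrates the idea only; the Post struct and the method name are made up for this example, and the gem's real code may look different. The "publish" value is the status WordPress writes into its export for published posts.

```ruby
# Hypothetical sketch, not the gem's actual code.
Post = Struct.new(:title, :status)

# Returns the posts that should be imported, honoring ONLY_PUBLISHED.
# env is injectable so the method can be tried out without touching ENV.
def posts_to_import(posts, env = ENV)
  only_published = env["ONLY_PUBLISHED"] == "true"
  only_published ? posts.select { |post| post.status == "publish" } : posts
end

posts = [Post.new("Hello", "publish"), Post.new("Unfinished", "draft")]
puts posts_to_import(posts, { "ONLY_PUBLISHED" => "true" }).map(&:title).inspect
```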

  rake wordpress:reset_and_import_blog[file_name]

This task combines the two previous tasks (reset_blog and import_blog).

If you also want to import the CMS part of WordPress, three more rake tasks manage
the import into RefineryCMS pages:

  rake wordpress:reset_pages

This task deletes all data from the CMS tables, ensuring a clean import. Otherwise existing
pages could break the import because of duplicate IDs.

  rake wordpress:import_pages[file_name]

This task imports all the WordPress pages into Refinery. The page structure (parent - child)
is preserved.

If you want to skip the draft pages, add the ONLY_PUBLISHED parameter to this task,
just like with wordpress:import_blog:

  rake wordpress:import_pages[file_name] ONLY_PUBLISHED=true

If you want to clean the tables and import in one task:

  rake wordpress:reset_and_import_pages[file_name]

Finally, if you want to reset and import all data including media (see below):

  rake wordpress:full_import[file_name]

== Importing media files
The WP XML dump contains absolute links to media files linked inside posts, like:

  www.mysite.com/wordpress/wp-content/uploads/2011/05/cv.txt

The dump does NOT contain the files themselves! To get them imported, this gem downloads the files
from the given URLs and imports them into Refinery. So for a working media import, the old site with
the media URLs must still be online.

After importing the files, this gem replaces the old links in pages and blog posts with the
newly generated ones. It parses all existing records, searching for the right pattern. That
means you have to import pages and posts FIRST to get the URLs replaced.
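
To illustrate the replacement step, here is a minimal sketch. It is not the gem's actual implementation: the method name is hypothetical and the new path /system/resources/cv.txt is made up for the example.

```ruby
# Hypothetical sketch, not the gem's actual code: swap legacy media URLs
# in a post or page body for the locations of the imported files.
def replace_media_links(body, url_map)
  url_map.reduce(body) do |text, (old_url, new_url)|
    # Block form so the new URL is inserted literally.
    text.gsub(old_url) { new_url }
  end
end

url_map = {
  "http://www.mysite.com/wordpress/wp-content/uploads/2011/05/cv.txt" =>
    "/system/resources/cv.txt" # assumed new path, for illustration only
}
body = '<a href="http://www.mysite.com/wordpress/wp-content/uploads/2011/05/cv.txt">CV</a>'
puts replace_media_links(body, url_map)
```

Because the replacement can only touch records that already exist, posts and pages imported after the media step would keep their old links.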

Now to the rake tasks for media import:

  rake wordpress:reset_media

This task deletes all data from the media tables (images and resources), ensuring a clean import.

  rake wordpress:import_and_replace_media[file_name]

This task imports all the WordPress media into Refinery. After the import, it parses all
pages and blog posts, replacing the legacy links with the current Refinery ones.

If you want to clean the tables and import in one task:

  rake wordpress:reset_import_and_replace_media[file_name]

== Usage on ZSH
One more hint for users of zsh (like myself):
The square brackets following the rake task need to be escaped on zsh, as they have a
special meaning there. So the syntax is:

  rake wordpress:reset_and_import_blog\[file_name\]

Ugly, but it works. This is the case for all rake tasks by the way, not just mine.

== Feedback
This is still a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample posts.
The first feedback is quite good, so it seems the gem doesn't eat the machines it is installed on.

If you want to help make it even more stable, please throw your own WP dumps at it
and see what happens. If you encounter any bugs, please file a bug report here on GitHub.
A sample dump that breaks this gem would be really helpful in that case.

For extra karma, fork it, fix it yourself and send a pull request! ;-)