It is now possible to ignore draft posts on import.

* refactored the code sent by SustainableWebsites
* made the ignore feature optional via ENV variable
* added docs
Marc Remolt 2011-06-03 11:35:11 +02:00
parent 39b482f99f
commit 29bc2534b6
5 changed files with 62 additions and 28 deletions

@@ -1,10 +1,14 @@
 = Refinerycms-wordpress-import
-This project is an importer for WordPress XML dumps into refinerycms(-blog).
+This litte project is an importer for WordPress XML dumps into refinerycms(-blog).
 So far, only blog-relevant data gets imported, I'm working on the cms pages part.
-Draft posts and posts with duplicate post titles are ignored (for now)
+== Prerequisites
+As refinerycms-wordpress-import is an addon for RefineryCMS, is shares the prerequisites with it.
+So you'll first need a running installation of refinerycms and refinerycms-blog. Make sure
+the site is running, all migrations are run and you created the first refinery user.
 == Installation
@@ -12,18 +16,44 @@ As there is no official release out yet, just add this repos to your projects Ge
 gem 'refinerycms-wordpress-import', :git => 'git://github.com/mremolt/refinerycms-wordpress-import.git'
+and run
+bundle
 == Usage
-There are 3 rake tasks
-* wordpress:reset_blog - This one basically deletes all data from blog relevant tables (taggings, tags, blog_comments, blog_categories, blog_posts). Use this one first, if you want a clean import of your old blog.
-* wordpress:import_blog[file_name] - This one does all the heavy work of parsing the dump and importing the data into refinery tables. The param is a path to the dump file.
-* wordpress:reset_and_import_blog[file_name] - This one combines the two previous tasks.
+Importing the XML dump is done via 3 rake tasks:
+rake wordpress:reset_blog
+This one basically deletes all data from blog relevant tables (taggings, tags, blog_comments,
+blog_categories, blog_posts, blog_categories_blog_posts).
+Use this one first, if you want a clean import of your old blog.
+rake wordpress:import_blog[file_name]
+This one does all the heavy work of parsing the dump and importing the data into refinery tables.
+The parameter is the path to the dump file. Got a report from a Mac user, that the ~
+didn't work in the path. I'll have a look at it, but till then, don't use it please.
+If you don't want to import draft posts, you can set the ENV variable ONLY_PUBLISHED to true:
+rake wordpress:import_blog[file_name] ONLY_PUBLISHED=true
+The task will then skip all posts that are not published.
+rake wordpress:reset_and_import_blog[file_name]
+This one combines the two previous tasks.
 == Feedback
-This is a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample data.
-If you want to help make it more stable, please throw your own WP dumps against it and see what happens. If you encounter any bugs, please file a bug report with a sample dump that breaks this gem.
+This is still a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample posts.
+The first feedback is quite good, so it seems, the gem doesn't eat the machines it is installed on.
+If you want to help make it even more stable, please throw your own WP dumps against it
+and see what happens. If you encounter any bugs, please file a bug report here on github.
+A sample dump that breaks this gem would be really helpful in that case.
 For extra karma, fork it, fix it yourself and send a pull request! ;-)

@@ -18,11 +18,13 @@ namespace :wordpress do
   dump = Refinery::WordPress::Dump.new(params[:file_name])
   dump.authors.each(&:to_refinery)
-  dump.posts.each(&:to_refinery)
+  only_published = ENV['ONLY_PUBLISHED'] == 'true' ? true : false
+  dump.posts(only_published).each(&:to_refinery)
   ENV["MODEL"] = 'BlogPost'
   Rake::Task["friendly_id:redo_slugs"].invoke
-  ENV["MODEL"] = nil
+  ENV.delete("MODEL")
 end
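
A note on the flag parsing in this hunk: comparing against the string 'true' already yields a boolean, so the ternary is redundant. The following is a minimal sketch, assuming a hypothetical helper named only_published? (not part of the gem) that accepts the environment as a Hash-like argument so it can be tried without touching the real ENV:

```ruby
# Hypothetical helper, not part of refinerycms-wordpress-import.
# `env` defaults to the process environment; any Hash-like object works.
def only_published?(env = ENV)
  # String#== returns true or false, and nil == 'true' is false,
  # so an unset variable simply disables the filter.
  env['ONLY_PUBLISHED'] == 'true'
end

only_published?({ 'ONLY_PUBLISHED' => 'true' })  # flag enabled
only_published?({})                              # flag disabled (variable unset)
```

Only the exact lowercase value true switches the filter on; values such as TRUE or 1 leave it off, which matches the comparison used in the rake task.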

@@ -25,10 +25,12 @@ module Refinery
   end
 end
-def posts
-  doc.xpath("//item[wp:post_type = 'post']").collect do |post|
+def posts(only_published=false)
+  posts = doc.xpath("//item[wp:post_type = 'post']").collect do |post|
     Post.new(post)
   end
+  posts = posts.select(&:published?) if only_published
+  posts
 end
 def tags

@@ -71,6 +71,10 @@ module Refinery
   status != 'publish'
 end
+def published?
+  ! draft?
+end
 def ==(other)
   post_id == other.post_id
 end
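
To illustrate how published? interacts with the new only_published parameter, here is a small self-contained sketch. PostStub and filter_posts are hypothetical stand-ins for Refinery::WordPress::Post and Dump#posts, and it assumes that the context line status != 'publish' is the body of draft?, which the hunk does not show explicitly:

```ruby
# Hypothetical stand-in for Refinery::WordPress::Post; only the
# status field is modelled here.
PostStub = Struct.new(:title, :status) do
  # Assumption: this mirrors the gem's draft? method.
  def draft?
    status != 'publish'
  end

  def published?
    !draft?
  end
end

# Hypothetical stand-in for Dump#posts: returns every post unless
# only_published is set, in which case drafts are filtered out.
def filter_posts(posts, only_published = false)
  only_published ? posts.select(&:published?) : posts
end

posts = [
  PostStub.new('Hello', 'publish'),
  PostStub.new('Work in progress', 'draft'),
  PostStub.new('Awaiting review', 'pending')
]

filter_posts(posts, true).map(&:title)  # => ["Hello"]
filter_posts(posts).size                # => 3
```

Under that assumption, any status other than publish (for example pending or private) is treated like a draft and is skipped when ONLY_PUBLISHED=true.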

@@ -36,9 +36,6 @@ module Refinery
 unless user
 begin
-unless draft?
-#p "creating post " + title + " Draft status: " + draft?.to_s
 post = ::BlogPost.new :title => title, :body => content_formatted,
   :draft => draft?, :published_at => post_date, :created_at => post_date,
   :author => user, :tag_list => tag_list
@@ -55,7 +52,6 @@ module Refinery
 comment.save
 end
 end
-end
 rescue ActiveRecord::RecordInvalid
 # if the title has already been taken (WP allows duplicates here,
 # refinery doesn't) append the post_id to it, making it unique
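
The rescue branch described in the comment above can be sketched without ActiveRecord. Everything in this example is a hypothetical stand-in: TitleTaken plays the role of ActiveRecord::RecordInvalid, TitleStore the role of the blog_posts table with its uniqueness validation, and the "-<post_id>" suffix format is an assumption, since the hunk does not show how the gem appends the id:

```ruby
# Hypothetical stand-in for ActiveRecord::RecordInvalid.
class TitleTaken < StandardError; end

# Hypothetical stand-in for a table with a unique title constraint.
class TitleStore
  def initialize
    @titles = []
  end

  # Stores the title, or raises if it is already present.
  def save!(title)
    raise TitleTaken, title if @titles.include?(title)

    @titles << title
    title
  end
end

# Tries the original title first; on a clash, retries once with the
# WordPress post_id appended (suffix format assumed for illustration).
def import_title(store, title, post_id)
  store.save!(title)
rescue TitleTaken
  store.save!("#{title}-#{post_id}")
end

store = TitleStore.new
import_title(store, 'Hello', 1)  # => "Hello"
import_title(store, 'Hello', 2)  # => "Hello-2"
```

Because post_id is unique within a WordPress dump, a single retry is enough in this sketch; a second failure would propagate to the caller.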