It is now possible to ignore draft posts on import.

* refactored the code sent by SustainableWebsites
* made the ignore feature optional via ENV variable
* added docs
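
As a rough illustration of the behaviour this commit introduces, here is a minimal, self-contained sketch of the opt-in filter. It is not the gem's code: `SketchPost`, `select_posts`, `only_published?` and `SAMPLE_POSTS` are hypothetical stand-ins, and no XML is parsed. It only assumes what the diff shows, namely that a post counts as a draft whenever its status is not `publish`, and that the filter is enabled only when `ONLY_PUBLISHED` is exactly the string `true`.

```ruby
# Hypothetical stand-in for the importer's post wrapper (no XML parsing here).
SketchPost = Struct.new(:title, :status) do
  # Assumption taken from the diff: anything not 'publish' is a draft.
  def draft?
    status != 'publish'
  end

  def published?
    !draft?
  end
end

# Same shape as the changed posts method: take every post,
# then drop the unpublished ones only if the caller asks for it.
def select_posts(posts, only_published = false)
  only_published ? posts.select(&:published?) : posts
end

# Only the exact string "true" switches the filter on;
# an unset variable or any other value keeps the old import-everything behaviour.
def only_published?(env = ENV)
  env['ONLY_PUBLISHED'] == 'true'
end

# Illustrative data, not taken from a real WordPress dump.
SAMPLE_POSTS = [
  SketchPost.new('Hello world', 'publish'),
  SketchPost.new('Half-finished idea', 'draft'),
  SketchPost.new('Awaiting review', 'pending')
]
```

Calling `select_posts(SAMPLE_POSTS, only_published?)` returns all three sample posts unless the process was started with `ONLY_PUBLISHED=true`, in which case only 'Hello world' remains. Note that under this rule a `pending` post is skipped as well, since it is not published.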
2011-06-03 11:35:11 +02:00
commit 29bc2534b6
parent 39b482f99f
5 changed files with 62 additions and 28 deletions
--- a/README.rdoc
+++ b/README.rdoc
@@ -1,10 +1,14 @@
 = Refinerycms-wordpress-import
-This project is an importer for WordPress XML dumps into refinerycms(-blog). 
+This litte project is an importer for WordPress XML dumps into refinerycms(-blog). 
 So far, only blog-relevant data gets imported, I'm working on the cms pages part. 
-Draft posts and posts with duplicate post titles are ignored (for now)
+== Prerequisites
 As refinerycms-wordpress-import is an addon for RefineryCMS, is shares the prerequisites with it.
 So you'll first need a running installation of refinerycms and refinerycms-blog. Make sure
 the site is running, all migrations are run and you created the first refinery user. 
 == Installation
@@ -12,18 +16,44 @@ As there is no official release out yet, just add this repos to your projects Ge
  gem 'refinerycms-wordpress-import', :git => 'git://github.com/mremolt/refinerycms-wordpress-import.git'
 and run
  bundle
 == Usage
-There are 3 rake tasks
+Importing the XML dump is done via 3 rake tasks:
-* wordpress:reset_blog - This one basically deletes all data from blog relevant tables (taggings, tags, blog_comments, blog_categories, blog_posts). Use this one first, if you want a clean import of your old blog. 
+  rake wordpress:reset_blog 
-* wordpress:import_blog[file_name] - This one does all the heavy work of parsing the dump and importing the data into refinery tables. The param is a path to the dump file.
+
-* wordpress:reset_and_import_blog[file_name] - This one combines the two previous tasks.
+This one basically deletes all data from blog relevant tables (taggings, tags, blog_comments, 
 blog_categories, blog_posts, blog_categories_blog_posts). 
 Use this one first, if you want a clean import of your old blog. 
  rake wordpress:import_blog[file_name] 
 This one does all the heavy work of parsing the dump and importing the data into refinery tables. 
 The parameter is the path to the dump file. Got a report from a Mac user, that the ~
 didn't work in the path. I'll have a look at it, but till then, don't use it please. 
 If you don't want to import draft posts, you can set the ENV variable ONLY_PUBLISHED to true:
  rake wordpress:import_blog[file_name] ONLY_PUBLISHED=true
 The task will then skip all posts that are not published.
  rake wordpress:reset_and_import_blog[file_name]
 This one combines the two previous tasks. 
 == Feedback
-This is a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample data. 
+This is still a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample posts. 
 The first feedback is quite good, so it seems, the gem doesn't eat the machines it is installed on. 
-If you want to help make it more stable, please throw your own WP dumps against it and see what happens. If you encounter any bugs, please file a bug report with a sample dump that breaks this gem. 
+If you want to help make it even more stable, please throw your own WP dumps against it 
 and see what happens. If you encounter any bugs, please file a bug report here on github.
 A sample dump that breaks this gem would be really helpful in that case. 
 For extra karma, fork it, fix it yourself and send a pull request! ;-)
--- a/lib/tasks/wordpress.rake
+++ b/lib/tasks/wordpress.rake
@@ -18,11 +18,13 @@ namespace :wordpress do
    dump = Refinery::WordPress::Dump.new(params[:file_name])
    dump.authors.each(&:to_refinery)
-    dump.posts.each(&:to_refinery)
+    
    only_published = ENV['ONLY_PUBLISHED'] == 'true' ? true : false
    dump.posts(only_published).each(&:to_refinery)
    ENV["MODEL"] = 'BlogPost'
    Rake::Task["friendly_id:redo_slugs"].invoke
-    ENV["MODEL"] = nil
+    ENV.delete("MODEL")
  end
--- a/lib/wordpress/dump.rb
+++ b/lib/wordpress/dump.rb
@@ -25,10 +25,12 @@ module Refinery
        end
      end
-      def posts
+      def posts(only_published=false)
-        doc.xpath("//item[wp:post_type = 'post']").collect do |post|
+        posts = doc.xpath("//item[wp:post_type = 'post']").collect do |post|
          Post.new(post)
        end
        posts = posts.select(&:published?) if only_published
        posts
      end
      def tags
--- a/lib/wordpress/page.rb
+++ b/lib/wordpress/page.rb
@@ -71,6 +71,10 @@ module Refinery
        status != 'publish'
      end
      def published?
        ! draft?
      end
      def ==(other)
        post_id == other.post_id
      end
--- a/lib/wordpress/post.rb
+++ b/lib/wordpress/post.rb
@@ -36,9 +36,6 @@ module Refinery
          unless user
        begin
          unless draft?
            #p "creating post " + title + " Draft status: " + draft?.to_s
          post = ::BlogPost.new :title => title, :body => content_formatted, 
            :draft => draft?, :published_at => post_date, :created_at => post_date, 
            :author => user, :tag_list => tag_list
@@ -55,7 +52,6 @@ module Refinery
              comment.save
            end
          end
          end
        rescue ActiveRecord::RecordInvalid 
          # if the title has already been taken (WP allows duplicates here,
          # refinery doesn't) append the post_id to it, making it unique