Importing CSV Data with Feeds

Importing or migrating data into Drupal 7 can be a daunting task. The Feeds and Feeds Tamper modules make importing data a lot simpler. This article walks you through the process of using these modules to import a CSV file as nodes.

For starters, you may download these modules here:

http://drupal.org/project/feeds
http://drupal.org/project/feeds_tamper

These modules require the Chaos Tool Suite (CTools) and Job Scheduler modules to work.

Once the modules are installed and enabled, go to /admin/structure/feeds, where you can add new importers. If you’d like an example to start with, enable the Feeds Import module, which ships with Feeds and contains an example node importer and user importer.

Feeds can import several file types without any additional modules: XML feeds (Atom, RSS 1, and RSS 2), CSV, OPML, and XML sitemaps. Other file types, such as Excel files, can be imported with the help of additional modules.

Feed Setup

Click on “Add importer” to get started. Name the new importer and give it a description. Then click “continue.”

Under “Basic settings,” click “settings.” Here you can update the name and description. It is important to make the correct choice for “Attach to content type.” Choose “Use standalone form” for our purposes. A standalone form is typically used to import all the data at once. You can also choose to attach the form to a content type. This option adds an import section to each item of that content type, which is useful if you want to import the data for a single article and attach it to a new article. “Periodic import” sets up a cron process that reruns the import on a schedule, which is very useful if you’re importing a feed that changes over time. In this tutorial, select “Off.” Check “Import on submission” to run the import immediately. Finally, check “Process in the background” if the import is large or if you’re unsure how long it will take. If an import is too large to finish in a single request, the server will time out and the import will not complete properly.

Under the “Fetcher” tab, you’ll find File upload and HTTP fetcher. In our case, choose “File upload.” Under “File upload,” you can change the import file types or set the file directory to search for a file. We don’t need to change anything here.

Go to the “Parser” tab and select “CSV parser” for this tutorial.

Under the “CSV parser” tab, set the delimiter for your import file. Also check the box if your file has no header row.
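
Before choosing these settings, it can help to look at the file itself. The following is a minimal Python sketch, run outside Drupal, that guesses the delimiter from the first line and separates the header from the data rows. The sample data and column names are hypothetical, and the guess is naive: it simply counts characters, so a delimiter that appears inside a quoted first-line value would mislead it.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
SAMPLE = (
    "guid,title,body,published\n"
    "1001,First post,Hello world,03-15-2013\n"
    "1002,Second post,More text,03-16-2013\n"
)


def guess_delimiter(first_line, candidates=(",", ";", "\t", "|")):
    """Return the candidate character that occurs most often in the line."""
    return max(candidates, key=first_line.count)


def read_rows(text, delimiter):
    """Parse CSV text into a list of rows, each a list of strings."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


first_line = SAMPLE.splitlines()[0]
delimiter = guess_delimiter(first_line)
rows = read_rows(SAMPLE, delimiter)
header, data = rows[0], rows[1:]

print("delimiter:", repr(delimiter))
print("header:", header)
print("data rows:", len(data))
```

If the first row printed as the header looks like data rather than column names, the file has no header row and the corresponding box should be checked.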

“Processor” is where you select which type of content you are importing. In this case, we’ll be importing nodes.

Node processor

Next, open “settings” under the “Node processor” tab. The “Update existing nodes” setting controls what happens when an imported item matches a node that already exists. The options are to not update existing nodes, to replace them, or to update them. If this is a fresh import, it won’t matter which you choose. But if you have overlapping data, choose not to update nodes to keep the old data, or choose to replace nodes to bring in the new data. The update option is slower than replacement, so there is rarely a reason to choose it unless you need to keep values in fields that the importer does not map.

Use the “Text format” selector to set which format to use for the text body, and choose the “Content type” to be imported. You may set all the new nodes to have the same author; you can also set the author node by node later when mapping the fields. If you want your nodes to expire and be deleted automatically, set the “Expire nodes” option.

Mapping Fields

Click on “mapping” to begin aligning your CSV data to the node fields. It is important to have a column in the CSV whose values are unique per row; map it to the GUID target and mark it as unique. A unique field lets Feeds recognize items it has already imported, so you can update earlier imports with newer ones at a later date. If your data has no unique field to use as the GUID, Feeds Tamper has a plugin called “Calculate hash” that can generate one.
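
As an alternative to generating the identifier inside Drupal, a GUID column can be added to the file before upload. The sketch below is an assumption-laden illustration in Python, not the code of the “Calculate hash” plugin: the function names and the choice of SHA-256 are hypothetical. It hashes every value in a row and writes the result into a new "guid" column.

```python
import csv
import hashlib
import io


def row_guid(row):
    """Build a stable identifier from all values of one row (a dict)."""
    # Join values in sorted key order so the result does not depend on
    # column order; the unit separator avoids accidental collisions.
    joined = "\x1f".join(row[key] for key in sorted(row))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def add_guid_column(csv_text):
    """Return the CSV text with a leading 'guid' column added."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["guid"] + reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row_with_guid = {"guid": row_guid(row)}
        row_with_guid.update(row)
        writer.writerow(row_with_guid)
    return out.getvalue()


print(add_guid_column("title,body\nFirst post,Hello\nSecond post,World\n"))
```

One limitation of any content-derived identifier: if a row’s values change between imports, its hash changes too, so the edited row looks like a new item rather than an update to the old one. A real identifier from the source system is preferable when one exists.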

The basic idea here is to type the name of the CSV field into the “source” box exactly as it appears in the CSV; then select the correct field on your node and “Add” the map.
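
Because a mistyped source name is easy to miss, it can be worth comparing the names you plan to enter against the file’s header row. This is a small Python sketch with hypothetical column names. It uses an exact string comparison, the strictest reading of “exactly as it appears”; whether Feeds is more forgiving about case or surrounding whitespace is not something the sketch relies on.

```python
import csv
import io


def missing_sources(csv_text, expected_sources, delimiter=","):
    """Return the expected source names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    return [name for name in expected_sources if name not in header]


# Hypothetical file: note the trailing space after "title" in the header.
sample = "guid,title ,body\n1001,First post,Hello\n"
print(missing_sources(sample, ["guid", "title", "body"]))  # ['title']
```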

Feeds Tamper

Once you’ve mapped all your fields, go to the “Tamper” tab at the top to alter any data that needs to be adjusted for Drupal. For example, you may want a date stored in Drupal as a Unix timestamp instead of a human-readable m-d-Y style format. This is where Feeds Tamper comes in.
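
To make the date example concrete, here is what that conversion amounts to, sketched in Python rather than taken from Feeds Tamper. The function name is hypothetical, and the sketch assumes the date means midnight UTC; a site configured for another time zone would produce a different number.

```python
from datetime import datetime, timezone


def mdy_to_timestamp(value):
    """Convert an 'm-d-Y' date string to a Unix timestamp (midnight UTC)."""
    parsed = datetime.strptime(value, "%m-%d-%Y")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(mdy_to_timestamp("03-15-2013"))  # 1363305600
```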

Feeds Tamper includes around 20 plugins. These plugins can manipulate dates, HTML, lists, numbers, and text in many ways. I can’t cover them all in this article, but I’ve personally found date to timestamp, list explode, find/replace, and trim quite useful.
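
For readers who have not used these plugins, the following Python sketch shows the kind of transformation that trim, find/replace, and explode each perform. It is an illustration of the ideas only, with hypothetical function names and sample values, and does not reproduce the plugins’ code or options.

```python
def trim(value):
    """Remove leading and trailing whitespace."""
    return value.strip()


def find_replace(value, find, replace):
    """Replace every occurrence of one substring with another."""
    return value.replace(find, replace)


def explode(value, separator=","):
    """Split one string into a list of values, trimming each piece."""
    return [trim(part) for part in value.split(separator)]


print(trim("  Drupal  "))                  # Drupal
print(find_replace("N/A", "N/A", ""))      # prints an empty line
print(explode("news, events , blog"))      # ['news', 'events', 'blog']
```

Exploding a value is what makes it possible to fill a multi-value field, such as a set of tags, from a single CSV cell.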

Importing Data

Now it’s time to import the actual data. Go to /import and select your new Feeds import. The first page will tell you that you’ve imported no items (unless, of course, you have). You can again select the delimiter and select whether the file has a header row. Then click “Choose file” to select your CSV and upload your file.

If there are no problems with your CSV data and the fields are mapped properly, Feeds will import the data and create new nodes. If you’ve chosen to run the import in the background, you’ll need to run cron repeatedly until all data is imported. When the import is complete, the page will tell you how many items have been imported. Finally, verify the imported data: compare that count against the number of rows in your CSV and spot-check a few of the new nodes.
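
One simple check is to count the data rows in the file and compare that number with the item count Feeds reports. A minimal Python sketch, with a hypothetical function name and sample data:

```python
import csv
import io


def count_data_rows(csv_text, delimiter=",", has_header=True):
    """Count non-empty rows in CSV text, excluding the header if present."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip completely blank lines
    if has_header:
        return max(len(rows) - 1, 0)
    return len(rows)


sample = "guid,title\n1001,First post\n1002,Second post\n\n"
print(count_data_rows(sample))  # 2
```

A mismatch is not necessarily an error. Rows that share a GUID with an existing node may be skipped or counted as updates rather than new items, depending on the “Update existing nodes” setting.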