FileHelpers: the .Net way to import text files

I am working on a project where I have to import data from thousands of text files. These text files vary in formats: they use different delimiters; they use quoted identifiers never, sometimes, or always; they sometimes have missing end-of-row columns. Importing these took a combination of custom scripts (to clean up and reformat) and XML format files for SQL’s BCP (bulk copy) command… until I found FileHelpers.

FileHelpers by Marcos Meli is a .Net library which provides services to import and export text in delimited and fixed-length file formats. It takes a unique approach to the file import problem:

  • You write a class, giving it properties to store the data in the text file.
  • You apply attributes to the class and properties, describing the format of the file and applying import rules.
  • You import the file using a single command, which creates an array of objects populated with the data from the text file.

With support for custom converters, null types, optional fields, before/after import events, extensive parsing rules, and so much more — not to mention a web site with gobs of examples and documentation — FileHelpers has single-handedly driven me away from the script and batch file approach to a much more powerful .Net application to import my files.

The fact that FileHelpers uses regular .Net classes allows you to use the same class to both import data from a file and persist it to a database. For example, consider the following sample text file:

1,Chris,Smith
2,Bob,Jones
3,"Mary Jane",Brown

Clearly we have three fields in this comma-delimited file, so we write the following class to hold the data. Attributes provide instructions to FileHelpers — specifying that it is a comma-delimited file, and that the quotes in the first and last name fields are optional. (Note that you can use fields or properties; I use fields below for brevity.)

[DelimitedRecord(",")]
public class Person
{
	public int PersonID;
	[FieldQuoted('"', QuoteMode.OptionalForRead)]
	public string FirstName;
	[FieldQuoted('"', QuoteMode.OptionalForRead)]
	public string LastName;
}

Importing this file now becomes as easy as writing two lines of code.

FileHelperEngine<Person> engine = new FileHelperEngine<Person>();
Person[] records = engine.ReadFile(@"c:\myfile.csv");

You now have an array of Person objects loaded with data from the text file. Now, let’s say you wanted to persist these objects to a database. You can use the same Person object to persist to your database with an O/R mapper. If you were using Paul Wilson’s O/R Mapper and the WilsonORWrapper, you would decorate your class as follows:

[DelimitedRecord(","), IgnoreInheritedClass]
public class Person : WilsonORWrapper.Entities.EntityBase<Person>
{
	public int PersonID;
	[FieldQuoted('"', QuoteMode.OptionalForRead)]
	public string FirstName;
	[FieldQuoted('"', QuoteMode.OptionalForRead)]
	public string LastName;
}

Note I added the base class and the IgnoreInheritedClass attribute, which tells FileHelpers to ignore any inherited fields and properties (otherwise, the IsReadOnly property from EntityBase would be included).

With that, I can now persist all the records I read using FileHelpers back to my database using the O/R mapper.

FileHelperEngine<Person> engine = new FileHelperEngine<Person>();
Person[] records = engine.ReadFile(@"c:\myfile.csv");
foreach ( Person p in records )
{
	Data<Person>.Track(p);
}
Data<Person>.Save(records);

Incredible! We can even add supporting logic. Let’s say I wanted to only insert new persons into the database. I could do this.

FileHelperEngine<Person> engine = new FileHelperEngine<Person>();
Person[] records = engine.ReadFile(@"c:\myfile.csv");
foreach ( Person p in records )
{
	Person pers = Data<Person>.Retrieve("PersonID = " + p.PersonID.ToString());
	if (pers == null)
	{
		Data<Person>.Track(p);
		Data<Person>.Insert(p);
	}
}

FileHelpers makes this and so much more possible. It’s the backbone of a project I’m writing, and it’s making some heavy data-parsing much easier. To top it all off, FileHelpers is open source — you can download the code from their Subversion repository, and post patches on their support forums. Patches are well-received — I’ve already had two (1, 2) I’ve sent and had applied.

Thanks again to Marcos (and recently-added co-developer, Matt Campbell) for providing this fantastic library. For regular updates, be sure to read the FileHelpers Library Blog.

One thought on “FileHelpers: the .Net way to import text files”

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>