Deleting all tables and constraints in a database

We all need to do it from time to time: erase a bunch of tables in a SQL database. Sure, you can do this through Enterprise Manager, one table at a time, or write a bunch of DROP TABLE statements, but either way you also need to ensure all foreign key constraints (i.e. table dependencies) are dropped first.

There’s an easier way: a script that drops the foreign key constraints first, then drops the tables. Mine was derived from a few posts scattered around the Internet. (Sorry, it’s been a while, and I don’t remember what they were!)
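
A minimal sketch of the approach, assuming SQL Server 2005 or later (it builds the DROP statements from the sys catalog views and executes them):

-- drop all foreign key constraints, then all tables
-- (minimal sketch; run it in the target database)
declare @sql nvarchar(max)
set @sql = ''

select @sql = @sql + 'alter table [' + schema_name(t.schema_id) + '].[' + t.name
	+ '] drop constraint [' + fk.name + '];'
from sys.foreign_keys fk
join sys.tables t on fk.parent_object_id = t.object_id

exec sp_executesql @sql

-- with the dependencies gone, the tables can be dropped in any order
set @sql = ''

select @sql = @sql + 'drop table [' + schema_name(t.schema_id) + '].[' + t.name + '];'
from sys.tables t

exec sp_executesql @sql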


Separating SQL script files generated by Microsoft SQL (by type)

There’s one great feature in SQL Server Management Studio: the “Generate Scripts” command. Unfortunately, it has one limitation: the default filenames of the generated scripts look something like this:

dbo.fnc_PlayerValue.UserDefinedFunction.sql
dbo.UserSelect.StoredProcedure.sql

I’d much prefer each file to be named for the object alone, without the owner (‘dbo’) prefix or the object type in the filename, sorted into a folder per object type. In other words, I’d prefer the above two files to end up like this:

UserDefinedFunction\fnc_PlayerValue.sql
StoredProcedure\UserSelect.sql

How do we get from point A to point B without a lot of manual file copies and renames? We use the FOR command!

First, create a subdirectory for each object type (Table, StoredProcedure, UserDefinedFunction, View, Schema, Trigger, and User), then run the following from a command prompt.

for %i in (*.User.sql) do for /f "tokens=1-3 delims=." %j in ("%i") do move %i %k\%j.%l
for %i in (*.Schema.sql) do for /f "tokens=1-3 delims=." %j in ("%i") do move %i %k\%j.%l
for %i in (*.Trigger.sql) do for /f "tokens=1-3 delims=." %j in ("%i") do move %i %k\%j.%l
for %i in (*.sql) do for /f "tokens=1-4 delims=." %j in ("%i") do move %i %l\%k.%m

The commands above look for each file with a .sql extension in the current directory. Each file is moved to the directory named for its object type and renamed to include only the object name and the .sql extension. (The first three commands handle the three-part filenames of User, Schema, and Trigger scripts; the last handles the usual four-part, owner-qualified names. If you run these from a batch file rather than a command prompt, double each percent sign: %%i.) So much cleaner now!

Note that this will not work if any folder in the full path to the SQL files has a period in it!

Updated 2009.11.10 to support Trigger, User, and Schema scripts.

How to use REPLACE() within NTEXT columns in SQL Server

SQL Server has an incredibly useful function, REPLACE(), which replaces all occurrences of a specified string with another string, returning a new string. It works great with all forms of NCHAR and NVARCHAR fields. It does not, however, work with NTEXT fields.

Fear not: there’s an easy workaround, thanks to type-casting and SQL 2005’s NVARCHAR(max) datatype. Here’s the process in a nutshell.

  1. Cast the NTEXT field to the NVARCHAR(max) datatype using the CAST function.
  2. Perform your REPLACE on the output of #1.
  3. Cast the output of #2 back to NTEXT. (Not really required, but it does get us back to where we started.)

A simple SQL query illustrates this.

select cast(replace(cast(myntext as nvarchar(max)),'find','replace') as ntext)
from myntexttable
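
If you want to write the result back rather than just select it, the same pair of casts works in an UPDATE; a quick sketch against the same table:

-- replace 'find' with 'replace' in place, touching only rows that contain it
update myntexttable
set myntext = cast(replace(cast(myntext as nvarchar(max)),'find','replace') as ntext)
where cast(myntext as nvarchar(max)) like '%find%'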

If you’re using SQL 2000, you’re out of luck, as NVARCHAR(max) first appeared in SQL 2005. However, if your NTEXT field is less than 4,000 characters, you can cast it to NVARCHAR(4000), the largest possible NVARCHAR size in SQL 2000, to accomplish the same. (Casting to VARCHAR(8000) doubles the available length, but only if you can afford to lose Unicode characters.)

[Note #1: This solution will also work with TEXT fields. Simply replace NTEXT with TEXT, and NVARCHAR with VARCHAR.]

[Note #2: NTEXT fields are deprecated in SQL 2005 in favor of NVARCHAR(max), so avoid using NTEXT and you’ll avoid this problem altogether in the future.]

Five simple rules for creating delimited text files

Here are a few tips for those who provide raw data in text files (think CSV files and the like).

  1. Surround all text fields in quotes, even if a given value doesn’t contain a comma. If the quotes are treated as “optional,” you lose field-delimiter consistency from row to row, and every field has to be parsed separately to strip them.
  2. Use consistent line endings. Pick one of (CR/LF), (LF), or (CR), and use the same in all your files.
  3. Put column headings in the first row only. This is more a convenience than a necessity, but if you include column headings, make sure they appear in the first row and nowhere else.
  4. Every row past the first should have an identical schema. Don’t try to be fancy and mix row types in one file. Each row should have the same number and sequence of columns.
  5. Provide delimiters for all columns, even when a row does not have data for them. For example, in a comma-delimited file with five columns, if a row has data in only the first two columns, make sure the row ends with three commas. Otherwise the row reads as missing or incomplete data.

When text files follow these guidelines, I can write a script to import them into a SQL table (using BCP and a simple batch file) in a few minutes. Each guideline that is broken requires additional cleanup and more complex import steps, adding significant development (and debugging) time that shouldn’t be necessary.
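
As an illustration, a file that follows these rules can be loaded in one statement. Here’s a minimal sketch using BULK INSERT, the T-SQL cousin of the bcp utility (the table name and file path are hypothetical):

-- minimal sketch: load a well-formed, comma-delimited file
-- (table name and file path are hypothetical)
bulk insert dbo.players
from 'C:\data\players.csv'
with (
	fieldterminator = ',',	-- rules 4 and 5: every row has every delimiter
	rowterminator = '\n',	-- rule 2: one consistent line ending
	firstrow = 2		-- rule 3: skip the single row of column headings
)

(Consistently quoted text fields, per rule 1, are handled with a format file; generating those is the subject of the next post.)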

Automatically generate (partial) XML format files for BCP

I’ve been working with a lot of raw data files lately — data files that come in some delimited format which I must move into a SQL database. Most of the files are CSV files, and some of them have over 100 columns.

Using BCP to import CSV files into SQL is a pleasure. The only annoyance is writing those XML format files. Actually, the annoyance is not writing them, it’s going through the CSV files (which, of course, are undocumented) to determine what delimiters are used in each row.

Here’s a sample of what I mean:

2934,128321,2782,"2007-04-32","Excluded",2321,,22

Fortunately, most CSV files are consistent in their use of quotes, but going through dozens of columns to determine the terminators is a pain. The terminators in the above example aren’t all plain commas; some include quotes. Column three, for example, is terminated by a comma followed by a quote.

To generate the <record> section of my format files, I wrote the following script, which reads the first line of a text file, finds each comma, determines if it is preceded or followed by a quote, and generates the appropriate XML.

//get filename
var args = WScript.Arguments;
if (args.length == 0)
{
	WScript.Echo('Usage: cscript getdelims.js <filename>');
	WScript.Quit();
}

var filename = args(0);

//read file
var fso = new ActiveXObject("Scripting.FileSystemObject");
var file = fso.OpenTextFile(filename,1);
var contents = file.ReadLine();
file.Close();
file = null;
fso = null;

//find commas
var cnt = 0;
for (var i = 0; i < contents.length; i++)
{
	if ( contents.substr(i,1) != ',' ) continue;
	cnt++;
	var delim = ",";	//terminator for this field; may gain quotes below
	if ( i > 0 && contents.substr(i-1,1) == '"' )
		delim = '&quot;,';
	if ( i+1 < contents.length && contents.substr(i+1,1) == '"' )
		delim += '&quot;';
	WScript.Echo('\t<FIELD ID="' + cnt + '" xsi:type="CharTerm" TERMINATOR="' + delim + '" />');
}

The output can be copy/pasted right into your format file. The example content above would generate the following.

        <field ID="1" xsi:type="CharTerm" TERMINATOR="," />
        <field ID="2" xsi:type="CharTerm" TERMINATOR="," />
        <field ID="3" xsi:type="CharTerm" TERMINATOR=",&quot;" />
        <field ID="4" xsi:type="CharTerm" TERMINATOR="&quot;,&quot;" />
        <field ID="5" xsi:type="CharTerm" TERMINATOR="&quot;," />
        <field ID="6" xsi:type="CharTerm" TERMINATOR="," />
        <field ID="7" xsi:type="CharTerm" TERMINATOR="," />

A few minutes writing a script, and I won’t be looking at CSV files with too many columns ever again (at least, not for the reason of writing XML format files).

Scripting SQL database builds

A project I’m working on requires importing data from hundreds of megabytes of delimited text files into a staging database, then converting and migrating the data into an operations database. (Forgive the naming conventions. The “staging” database is used as an intermediary between the text files and the final database. The “operations” database is the one the application uses.)

What’s the best way to automate this? Ideally, I need to be able to re-run the import at any time, preferably with a single command. The process should purge the staging database, rebuild tables (since schemas may change), reload all data, and run the necessary migration scripts. Since the final product is merely a web interface to the imported data, there is no harm in wiping and recreating everything from scratch, as there is no “user data” of note. (Actually there is, but it is isolated from this process.)

The solution I came up with uses a series of SQL scripts executed using sqlcmd (part of MS SQL 2005; use osql for older versions) and data-import tasks using bcp (also part of MS SQL), all tied together with batch files.
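
At the core is a set of scripts the batch file can re-run at will. A minimal sketch of the purge-and-rebuild step for one staging table (the database and table names are hypothetical):

-- minimal sketch of the staging rebuild, run via sqlcmd
-- (database and table names are hypothetical)
use staging

if object_id('dbo.players_raw') is not null
	drop table dbo.players_raw

create table dbo.players_raw (
	player_id	int not null,
	player_name	nvarchar(100) not null,
	team_id		int null
)

-- bcp then loads the delimited text files into tables like this one,
-- and migration scripts move the data into the operations database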

On disk space and defragmentation

My online baseball game, CSFBL, is a data hog to the tune of about 100GB. Yes, 100GB. That’s how much disk space it takes to store the play-by-play results of some 5 million baseball games (over 2,500 seasons) along with the related player data, historical statistics, and other fun stuff.

The server that powers this beast has three 146GB SCSI drives in a RAID5 configuration. That’s about 250GB of available disk space. Excluding data requirements for such necessities as Windows, SQL Server, tempdb files, web files, and the like, we typically have between 50GB and 100GB of free disk space.

Of course, backing up that database is another story entirely, and if I’m not diligent in keeping the disk clean (by deleting old and unneeded data files), the server can quickly run out of space. This has happened from time to time over the past few years, but a quick cleanup resolves the issue.

Over the weekend I decided to take on major maintenance, defragmenting all SQL indexes, cleaning up old data, removing unused applications, and similar fun weekend chores. During this, I checked the level of fragmentation on the SQL database file.

Over 40,000 file fragments.

Whoa! How long does it take to defragment that? About two days. Of course, a lot of that time was spent first defragmenting other files to make room for one big file. The built-in Windows defragger didn’t do the trick at all (it fails miserably unless you have gobs of free disk space or no large files to defragment). What did do the trick was an evaluation version of Raxco‘s PerfectDisk. Fortunately for my wallet, the eval version is full-featured for 30 days, and since my defrag needs are done, I don’t have to buy it — at least, not until I have to defrag the database again.

Then again, I’m planning ahead this time. I set the SQL database to have about 30GB of free space, so the only fragmentation should be internal, not external. By the time I need to worry about a fragmented SQL data file again, there will be another 1.5 million baseball games simulated, and hopefully a little more ad revenue coming in. 😉

Reordering forums in CommunityServer 1.x

Yes, I know that CommunityServer 2.0 was released months ago, but considering the effort involved in upgrading (and my lack of time), I’m still using CommunityServer 1.x on CSFBL. It does a fine job for what I need at this time.

However, one thing perturbs me: new forums in a forum group aren’t added in the correct place when you want alphabetical sorting. There’s a good reason for this: CommunityServer manages sort order using a SortOrder value in the database, allowing you to choose any ordering you like. That flexibility works against you, though, when all you want is plain old alphabetical sorting.

The solution is to run a SQL script to update the SortOrder value of each forum such that they will be displayed alphabetically. The following SQL script will show you the forums, current sort order, and new sort order for all forums in a given group.

select sectionid, name, sortorder,
	( select count(*) from cs_sections
	  where groupid = cs.groupid and name < cs.name ) as new_sortorder
from cs_sections cs
where groupid = 8
order by sortorder asc

Pretty simple. You can get the GroupID by looking at the same table, or by looking at the ForumGroupID in the URLs on your web site. To apply these changes:

update cs
set sortorder = ( select count(*) from cs_sections
	where groupid = cs.groupid and name < cs.name )
from cs_sections cs
where cs.groupid = 8

Simple and relatively harmless, and a heck of a lot easier than trying to use the (buggy) reordering of forums in the admin console!

Copying forums between databases in CommunityServer

I’ve used Telligent‘s CommunityServer product for about a year and a half now, and it’s proven to be a good (but not great) forums product. (Note: Version 2.0, recently released, seems to close much of that gap.)

While doing some maintenance on the forums as installed for my baseball game, CSFBL, I deleted a forum that was still needed. To correct the problem, I restored a copy of a recent backup as a separate database and ran the following SQL scripts to copy the forum from the backup into the production database.

In the following source code, productionforum is the production database name, and productionforum_old is the old (backup) database name. Note that I haven’t moved every column over (such as read flags, attachment links, edit notes, etc.), but the process for those would be pretty much the same.

On to the SQL…

declare @sectionID int
set @sectionID = 0 --replace with the sectionID to move

--1. Move section
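-- (identity_insert lets us keep the original SectionID values from the backup;
-- the select * below assumes both databases have the same cs_sections schema)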
set identity_insert productionforum..cs_sections on

INSERT INTO productionforum..cs_Sections(SectionID, SettingsID, IsActive, ParentID, GroupID, Name, NewsgroupName, Description, DateCreated, Url, IsModerated, DaysToView, SortOrder, TotalPosts, TotalThreads, DisplayMask, EnablePostStatistics, EnableAutoDelete, EnableAnonymousPosting, AutoDeleteThreshold, MostRecentPostID, MostRecentThreadID, MostRecentThreadReplies, MostRecentPostSubject, MostRecentPostAuthor, MostRecentPostAuthorID, MostRecentPostDate, PostsToModerate, ForumType, IsSearchable, ApplicationType, ApplicationKey, PropertyNames, PropertyValues, Path)
select *
from productionforum_old..cs_sections
where sectionid = @sectionID

set identity_insert productionforum..cs_sections off

--2. Move permissions
INSERT INTO productionforum..cs_SectionPermissions(SettingsID, SectionID, RoleID, AllowMask, DenyMask)
select *
from productionforum_old..cs_sectionpermissions
where sectionid = @sectionID

--3. Move moderators
INSERT INTO productionforum..cs_Moderators(UserID, SectionID, DateCreated, EmailNotification, PostsModerated, SettingsID)
select *
from productionforum_old..cs_Moderators
where sectionid = @sectionID

--4. Move threads
set identity_insert productionforum..cs_threads on

INSERT INTO productionforum..cs_Threads(threadid, SectionID, UserID, PostAuthor, PostDate, ThreadDate, LastViewedDate, StickyDate, TotalViews, TotalReplies, MostRecentPostAuthorID, MostRecentPostAuthor, MostRecentPostID, IsLocked, IsSticky, IsApproved, RatingSum, TotalRatings, ThreadEmoticonID, ThreadStatus, SettingsID)
select threadid, SectionID, UserID, PostAuthor, PostDate, ThreadDate, LastViewedDate, StickyDate, TotalViews, TotalReplies, MostRecentPostAuthorID, MostRecentPostAuthor, MostRecentPostID, IsLocked, IsSticky, IsApproved, RatingSum, TotalRatings, ThreadEmoticonID, ThreadStatus, SettingsID
from productionforum_old..cs_threads
where sectionid = @sectionID

set identity_insert productionforum..cs_threads off

--5. Move posts
set identity_insert productionforum..cs_Posts on

INSERT INTO productionforum..cs_Posts(postid, ThreadID, ParentID, PostAuthor, UserID, SectionID, PostLevel, SortOrder, Subject, PostDate, IsApproved, IsLocked, IsIndexed, TotalViews, Body, FormattedBody, IPAddress, PostType, EmoticonID, PropertyNames, PropertyValues, SettingsID, AggViews, PostConfiguration)
select postid, ThreadID, ParentID, PostAuthor, UserID, SectionID, PostLevel, SortOrder, Subject, PostDate, IsApproved, IsLocked, IsIndexed, TotalViews, Body, FormattedBody, IPAddress, PostType, EmoticonID, PropertyNames, PropertyValues, SettingsID, AggViews, PostConfiguration
from productionforum_old..cs_Posts
where sectionid = @sectionID

set identity_insert productionforum..cs_Posts off