SplitFile: splitting large text files

I have been working on a project for the last three months where large script files with millions lines of rows is scripted from a database. These script files often contains errors due to data from the old database that don’t fit into the new database.

The largest script files is up till 6GB and difficult to handle with with editors I have available – even in UltraEdit. Therefore, I have been looking for a proper tool for splitting these large files into smaller chunks. 10 years ago, I had some windows toools ported from UNIX, but I couldn’t find them any more, and had to write the tool myself.

I needed a tool that splitted a large file into smaller files with a number of lines or had a maximum size. In addition, it was desirable to be able to just pick a line range from the large file and write these lines into a new file.

Here is example executions with command line parameters:

SplitFile /INFILE:"C:\TFS\SplitFile\Lines.txt" /MAXSIZE:250
SplitFile /INFILE:"C:\TFS\SplitFile\Lines.txt" /MAXLEN:100000
SplitFile /INFILE:"C:\TFS\SplitFile\Lines.txt" /PICKLINES:100,150

Download source code: SplitFile.zip

Advertisements

SQLCMD large script – part 2

I’m in my finale stage in my cleanup project of the 26GB database for a customer. Today, I planned for working late from home. My problem was that my data script had a “use <db name> – included by a mistake. The script files were too large to edit with available tool on the database servere and my broadband connection isn’t good enough to transfer the files between the server and my laptop that has the proper tools.

The solution was a quick look at the SQLCMD command line options. After some frustrating minutes, I spotted the -d <db name> option. I tried it on my local machine and it work perfectly.

As a result, i could load my large script (8gb for largest table) and override the use statement in the script file.

Syntax:

sqlcmd -d <database> -i <script file>

 

Run large SQL command files

I just had a problem with importing large amount of binary data into an SQL server 2005 Express database used for a project at work.

It is not a large database in number of tables, views and stuff, but the document table contains pretty much data. The file with insert statements was on 200MB and I had problems importing the script in SQL Server Management Studio. Therefore, I needed a command line tool to do the work for me.

It didn’t take me long to find the correct SQL server syntax:

  1. Open a cmd window
  2. Type "sqlcmd -i c:\temp\script.sql" and press ENTER

It took some time to run, but after a couple of hours and heavy server load, the script completed successfully 🙂