How to remove the last n lines from every CSV file in a directory.

Tuesday, March 24th, 2009

First, install Cygwin.
Next, add the cygwin\bin directory to your PATH (Control Panel -> System -> Advanced settings -> Environment Variables).
Create a script that iterates over the files and invokes the head command on each one, and put it in the \cygwin\bin directory.
Call it truncate.sh.
Add the shebang (#!c:/cygwin/bin/bash) as the first line of the file.
Make it executable by running chmod a+x c:\cygwin\bin\truncate.sh (this is probably not necessary, since the script is passed to bash explicitly when it is run).
Open a command window in the directory containing your files to be truncated.
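In truncate.sh, set the starting and ending variables to the first and last filenames in the range you want to process.
Set the number after -n to the last row you want to keep in each data file.
Save truncate.sh.
Run dos2unix c:\cygwin\bin\truncate.sh to convert the script's line endings to Unix style.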
Type bash c:\cygwin\bin\truncate.sh
Your files will now be truncated and saved as [filename].bak. You can use a file renaming utility such as Batch File Renamer to rename them all to .csv.
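
The post does not include truncate.sh itself, so here is a minimal sketch of what the steps above describe, not the original script. It assumes the data files are named by number (1.csv, 2.csv, and so on), which is only a guess based on the starting and ending variables. The function name truncate_numbered and its three arguments are made up for illustration.

```shell
#!/bin/bash
# Sketch of truncate.sh (hypothetical reconstruction).
# With the Cygwin setup described above, the shebang would be
# #!c:/cygwin/bin/bash instead.

# truncate_numbered START END ROWS
# Assumes files named START.csv .. END.csv in the current directory.
# Writes the first ROWS lines of each file to <name>.csv.bak and
# leaves the original file untouched.
truncate_numbered() {
    local start=$1 end=$2 rows=$3 i
    for (( i = start; i <= end; i++ )); do
        # Skip numbers that have no matching file.
        [ -f "$i.csv" ] || continue
        # head -n ROWS prints only the first ROWS lines.
        head -n "$rows" "$i.csv" > "$i.csv.bak"
    done
}

# Example call (uncomment and adjust the numbers):
# truncate_numbered 1 50 100
```

With the example call, 1.csv through 50.csv would each be cut down to their first 100 rows, with the results written next to the originals as 1.csv.bak through 50.csv.bak.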

Improvements that could be made:
Counting the number of files and determining the right starting and ending filename.
Taking user input for the row to truncate after.
Writing the files with the right extension, but into a \truncated directory.
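
A rough sketch of how these three improvements might look in bash. This is one possible approach, not a tested replacement for the original script. Looping over *.csv removes the need for starting and ending values, the row count is taken as an argument instead of being hard-coded, and the output goes into a truncated subdirectory under the original filename. The function name truncate_all is hypothetical.

```shell
#!/bin/bash
# Sketch only: truncate_all is a made-up name for illustration.

# truncate_all ROWS
# Keeps the first ROWS lines of every .csv file in the current
# directory and writes the result to truncated/<same name>.
truncate_all() {
    local rows=$1 f
    # Reject a missing or non-numeric row count.
    case $rows in
        ''|*[!0-9]*)
            echo "usage: truncate_all ROWS" >&2
            return 1
            ;;
    esac
    mkdir -p truncated
    for f in *.csv; do
        # If nothing matches, the glob stays as the literal "*.csv".
        [ -f "$f" ] || continue
        head -n "$rows" "$f" > "truncated/$f"
    done
}

# Example call (uncomment to use), taking the row count from the
# command line:
# truncate_all "$1"
```

Because the output files keep the .csv extension and the originals are not modified, no renaming step is needed afterwards.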