An example of the input file:
empno ,ename ,job ,hiredate ,sal
------,---------,-----------,----------,---------
7782 ,CLARK ,MANAGER ,09-JUN-81 ,2450.00
7839 ,KING ,PRESIDENT ,17-NOV-81 ,5000.00
7934 ,MILLER ,CLERK ,23-JAN-82 ,1300.00
The script looks at the second line of each file (i.e. the hyphenated line) to dynamically determine the start and end of each field:
#!/usr/local/bin/gawk -f
BEGIN {
FS=","
RS="\n"
OFS=","
}
# Determine fixed length positions of each field by
# looking at second record (i.e. header row) and output
# each field separated by chosen delimiter.
NR==1{
widthstring = (length($1)+1)
for(i=2; i < NF; i++) {
widthstring = widthstring " " (length($i)+1)
if(i!=(NF-1))
printf("%s",OFS ltrim(rtrim(rmlastcomma($i))))
}
i=NF-1
print(OFS ltrim(rtrim(rmlastcomma($i))))
FIELDWIDTHS = widthstring
}
# Now read from line 3 (skipping first two rows) using newly
# determined field lengths and remove whitespaces. Output
# each field separated by chosen delimiter.
NR>2{
for(i=2; i < NF; i++)
printf("%s",OFS ltrim(rtrim(rmlastcomma($i))))
print(OFS ltrim(rtrim(rmlastcomma($NF))))
}
function ltrim(v) { gsub(/^[ ]+/, "", v); return v}
function rtrim(v) { gsub(/[ ]+$/, "", v); return v}
function rmlastcomma(v) { sub(/[,]+$/, "", v); return v}
The resulting output will be a regular CSV file:
7782,CLARK,MANAGER,09-JUN-81,2450.00
7839,KING,PRESIDENT,17-NOV-81,5000.00
7934,MILLER,CLERK,23-JAN-82,1300.00
Example command to start gawk in Windows (reading in all files):
for %n in (*.txt) do gawk -f fixed2csv.awk %n >%n.csv
Yes, I did this in Windows!