Friday, May 25, 2012

Split large text file into pieces with defined number of lines each

Use the split command, which is standard on *nix systems and can also be installed on Windows.
First, count the number of lines in the file:


$ wc -l allsk.csv
103677 allsk.csv

We decide to split it into parts of 50000 lines each (-l 50000), using the prefix parts.csv and numeric suffixes (-d) instead of the default alphabetic ones.

$ split -l 50000 -d allsk.csv parts.csv
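As a quick sanity check, the number of parts to expect can be worked out from the line count before looking at the result. The following is only a sketch using plain shell arithmetic; the variable names are made up for illustration, and the numbers are the ones from this post.

```shell
# Line count reported by wc -l, and the chosen size of each part
total_lines=103677
lines_per_piece=50000

# Ceiling division: full parts, plus one more if there is a remainder
pieces=$(( (total_lines + lines_per_piece - 1) / lines_per_piece ))

# Whatever is left over after the full parts ends up in the last part
last_piece=$(( total_lines - (pieces - 1) * lines_per_piece ))

echo "pieces: $pieces, lines in last piece: $last_piece"
```

With these numbers it reports 3 pieces, the last one holding 3677 lines.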

Finally, we check how many lines each part contains:


$ wc -l parts.*
  50000 parts.csv00
  50000 parts.csv01
   3677 parts.csv02
 103677 total


As you can see, the total number of lines is the same as in the source file, so no lines were lost. Because split -l counts whole lines, each part ends at a line break and no line is cut in the middle.
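A stricter check than comparing line counts is to glue the parts back together and compare the result with the source byte by byte. Here is a minimal sketch of that idea. It assumes GNU split (for the -d option) and uses a small made-up file, sample.csv, in a scratch directory instead of the real allsk.csv.

```shell
# Work in a scratch directory so nothing real is overwritten
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 10 numbered lines standing in for allsk.csv
seq 1 10 > sample.csv

# Same options as above, only with a smaller part size
split -l 4 -d sample.csv parts.csv

# Concatenate the parts in name order and compare with the source;
# cmp -s prints nothing and only sets the exit status
if cat parts.csv* | cmp -s - sample.csv; then
    result="identical"
else
    result="different"
fi
echo "reassembled file is $result"
```

The glob parts.csv* expands in sorted order (parts.csv00, parts.csv01, ...), which is why the parts come back in the right sequence.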


For help on more options see:


$ split --help
