Break up huge files into manageable chunks so that they can be uploaded and opened in Excel
With the recent news that Windows 10 will introduce a UNIX shell, users will have access to a command-line environment on Linux, Mac, and Windows for the first time. That means you can finally stop bugging the developers at your company to break up the enormous files that Microsoft Excel refuses to open. You don’t need to know anything about the Terminal to do this, although if you’d like to learn more, check out these tutorials:
Okay, here’s how it works (for this article, we’ll be working on a Mac, but you can use this method elsewhere).
- Open Terminal (Applications/Utilities/Terminal)
- Create a new folder on your desktop and save the file that needs to be split there. My folder is called ‘split’, and my CSV file, words.csv, contains a few copies of the English dictionary, or about 2.36 million rows.
- In Terminal, navigate to the folder you just created using the ‘cd’ command, which stands for ‘change directory’ (for example, cd ~/Desktop/split).
- Now, you’ll use the ‘split’ command to break the original file into smaller files. To do this, type the following, where 250000 is the maximum number of rows in each new file. An Excel worksheet holds at most 1,048,576 rows, so any number below that will produce files Excel can open.
split -l 250000 words.csv
You can see that the new files are way smaller and have been named in a series of three letters, beginning with ‘xaa.’ These are still CSV files – they just don’t have the ‘.csv’ file extension yet. You can add it manually, but if you have a lot of new files it’s easier to do it through the Terminal.
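If you'd like to try the command on something small before pointing it at a huge file, here is a minimal sketch meant to be saved and run as a script. The throwaway folder, the 10-row stand-in for words.csv, and the 4-row chunk size are all assumptions chosen just for illustration.

```shell
# Work in a throwaway folder so nothing on the desktop is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small stand-in for words.csv: 10 rows instead of 2.36 million.
seq 1 10 > words.csv

# Same command as in the article, with 4 rows per file instead of 250000.
split -l 4 words.csv

# List the folder, then count the rows in each piece that split created.
ls
wc -l x??
```

The last piece holds whatever rows are left over, so it is usually shorter than the others.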
- Make sure that you’re still in the ‘split’ folder on your desktop using the ‘pwd’ (print working directory) command. Next, paste the following – no need to worry about how it works.
for i in x??; do mv "$i" "$i.csv"; done
You should now see that all of the split files have the ‘.csv’ extension and will open nicely in Excel.
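For the curious, the rename step can be spelled out over several lines. This is a sketch rather than the only way to do it. It assumes the pieces carry split's default names (xaa, xab, and so on), and it reuses the small 10-row stand-in file, which is made up for illustration. The pattern x?? matches only those three-letter names, so the original words.csv is left alone.

```shell
# Set up a small example: a 10-row file split into 4-row pieces.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 10 > words.csv
split -l 4 words.csv

# x?? means "an x followed by exactly two characters": xaa, xab, xac, ...
for piece in x??; do
  # mv renames each piece by tacking .csv onto its existing name.
  mv "$piece" "$piece.csv"
done

# The folder now holds words.csv plus the renamed pieces.
ls
```

The quotes around "$piece" keep the command safe even if a file name contains spaces.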