How do I split a file into parts in Linux

This tutorial explains how to split files into parts on Linux by size or by content, along with a few other options. After reading this article, you will know how to split files using the split and csplit commands and how to merge the parts back together.

How to split files by size on Linux:

For the first example of this tutorial, I will use a 5 GB Windows ISO image called WIN10X64.ISO. To check the size of the file you want to split, you can use the du -h command followed by the file name.

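As you can see, the file size is 5 GB. To split it into 5 files of 1 GB each, you can use the split command followed by the -b flag and the size of each piece, for example split -b 1G WIN10X64.ISO. The G suffix sets the unit to gigabytes; it can be replaced with M for megabytes or K for kilobytes, and a number with no suffix is read as bytes (in GNU split these single-letter suffixes are powers of 1024).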
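The size-based split described above can be tried on a small stand-in file first. This is a minimal sketch, assuming GNU coreutils; sample.bin is a hypothetical 5 KiB file used in place of the 5 GB ISO, so the piece size is 1K rather than 1G.

```shell
# Work in a throwaway directory so the pieces do not clutter anything else.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the ISO: 5 KiB of random data.
head -c 5120 /dev/urandom > sample.bin

# Check the size of the file before splitting.
du -h sample.bin

# Split into 1 KiB pieces; for the real image this would be: split -b 1G WIN10X64.ISO
split -b 1K sample.bin

# List the result: sample.bin plus the pieces xaa ... xae.
ls
```

The same command works at any scale; only the number and suffix passed to -b change.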

As you can see, the ISO was divided into 5 files named xaa, xab, xac, xad and xae.

By default, the split command names the generated files as in the example above, where xaa is the first part, xab the second part, xac the third, and so on. You can change this by passing a prefix of your own as the last argument, for example Windows. (with the trailing dot), which leaves the default suffix as an extension.

As you can see, all files are called Windows.*, where the extension is the suffix given by the split command, which allows us to know the order of the files.
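A custom prefix can be sketched as follows, again on a hypothetical 5 KiB sample.bin and assuming GNU coreutils. The prefix Windows. matches the file names mentioned in the text; the trailing dot is part of the prefix.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical small stand-in for the ISO.
head -c 5120 /dev/urandom > sample.bin

# The last argument is the output prefix; split appends aa, ab, ac, ... to it.
split -b 1K sample.bin Windows.

# The pieces are now Windows.aa ... Windows.ae instead of xaa ... xae.
ls Windows.*
```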

The split command also has a --verbose option, which makes it print its progress as each piece is created.

The progress output reports each piece as it is written. You can also split files into pieces measured in megabytes by using the M suffix with -b; the file in the next example is an 85 MB file.
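Here is a scaled-down sketch of a verbose split, assuming GNU coreutils. The hypothetical sample.bin is 85 KiB and the piece size is 20K; for an 85 MB file, the K suffix would become M. The piece size of 20 is an assumption for illustration, not a value taken from the original example.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 85 KiB file standing in for the 85 MB one (85 * 1024 bytes).
head -c 87040 /dev/urandom > sample.bin

# --verbose prints one line per piece as it is created.
# The piece size 20K is an assumed value; use 20M for 20-megabyte pieces.
split --verbose -b 20K sample.bin

# Four full 20 KiB pieces and one 5 KiB remainder are expected.
ls -l xa?
```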

The split command includes additional features that are not explained in this tutorial. You can get more information about the split command at https://man7.org/linux/man-pages/man1/split.1.html.

How to split files by content on Linux using csplit:

In some cases, users may want to split files based on their content. For such situations, the split command explained above is not useful. The alternative to achieve this is the csplit command.

In this section of the tutorial, you will learn how to split a file whenever a specific regular expression is encountered. We will use a book and divide it into chapters.

The sample book has 4 chapters (they were shortened so that the chapter divisions are easy to see). Let’s say you want each chapter to be in a different file. For this, the regular expression that we will use is “Chapter”.

I know there are 4 chapters in this book, so we can specify the number of repetitions to avoid mistakes. Later examples explain how to split without knowing how many times the regular expression occurs. In this case we know that there are 4 chapters; therefore, the pattern must be matched once and then repeated 3 more times.

Run csplit followed by the file you want to split, the regular expression between forward slashes, and the number of repetitions between braces, in this case /Chapter/ followed by {3}.
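The csplit call described above can be sketched like this, assuming GNU csplit. The file book.txt and its contents are hypothetical; it starts with one blank line to mimic the empty space before the first chapter.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical four-chapter book with a blank line before Chapter 1.
printf '%s\n' '' \
  'Chapter 1' 'text of chapter one' \
  'Chapter 2' 'text of chapter two' \
  'Chapter 3' 'text of chapter three' \
  'Chapter 4' 'text of chapter four' > book.txt

# Split at the first line matching "Chapter", then repeat the pattern 3 more times.
# csplit prints the size in bytes of each piece it writes.
csplit book.txt /Chapter/ '{3}'

# Five pieces are expected: xx00 (text before Chapter 1) and xx01 ... xx04.
ls xx*
```

The braces are quoted only as a precaution, so the shell passes them to csplit unchanged.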

The output we see is the byte count for each piece of the file.

As you can see, 5 files were created; the empty space before Chapter 1 was also split off.

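Files are named in a similar way to the split command explained above, except that csplit uses the prefix xx followed by a two-digit number (xx00, xx01, and so on). Let’s see how they were divided.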

The first file, xx00, is empty: it holds the empty space before the first occurrence of the regular expression “Chapter”, which is where the file is first split.

The second piece shows only the first chapter correctly. The third piece shows chapter 2. The fourth piece shows chapter 3.

And the last piece shows chapter 4.

As explained earlier, the number of repetitions was specified to avoid an incorrect result. By default, if we don’t specify it, csplit will only cut the file once.

The following example shows the execution of the previous command without specifying the number of slices.

As you can see, only one split occurred, producing two files, because we didn’t specify the number of repetitions.
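The default single cut can be reproduced with the same hypothetical book.txt, assuming GNU csplit:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical four-chapter book with a blank line before Chapter 1.
printf '%s\n' '' \
  'Chapter 1' 'text of chapter one' \
  'Chapter 2' 'text of chapter two' \
  'Chapter 3' 'text of chapter three' \
  'Chapter 4' 'text of chapter four' > book.txt

# No repetition count: csplit cuts only at the first match.
csplit book.txt /Chapter/

# Two pieces are expected: xx00 (before Chapter 1) and xx01 (all four chapters).
ls xx*
```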

Also, if you ask for more repetitions than there are matches, for example 6 when the regular expression only occurs 4 times, csplit reports an error and leaves no output files behind.
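The failure case can be checked in the same way. This sketch assumes GNU csplit, which by default removes the pieces it has already written when an error occurs; book.txt is again a hypothetical sample.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical book in which "Chapter" occurs only 4 times.
printf '%s\n' '' \
  'Chapter 1' 'text of chapter one' \
  'Chapter 2' 'text of chapter two' \
  'Chapter 3' 'text of chapter three' \
  'Chapter 4' 'text of chapter four' > book.txt

# Ask for more repetitions than there are matches and record the exit status.
status=0
csplit book.txt /Chapter/ '{6}' || status=$?
echo "csplit exit status: $status"

# Only book.txt should remain; the partial xx files are cleaned up.
ls
```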

So what do you do when the content is too long and you don’t know how many times the regular expression occurs in it? In such a situation, use the wildcard: an asterisk between the braces, {*}, in place of the number.

The wildcard produces as many pieces as there are matches of the regular expression in the document, without requiring you to count them.
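A minimal sketch of the wildcard form, assuming GNU csplit and the same hypothetical book.txt. The braces and asterisk are quoted so the shell does not try to expand them.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical book; pretend the number of chapters is unknown.
printf '%s\n' '' \
  'Chapter 1' 'text of chapter one' \
  'Chapter 2' 'text of chapter two' \
  'Chapter 3' 'text of chapter three' \
  'Chapter 4' 'text of chapter four' > book.txt

# {*} repeats the pattern as many times as it matches.
csplit book.txt /Chapter/ '{*}'

# The result is the same as with an explicit count of 3: xx00 ... xx04.
ls xx*
```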

As you can see, the file was successfully split.

The csplit command includes additional features that are not explained in this tutorial. You can get more information about the csplit command at https://man7.org/linux/man-pages/man1/csplit.1.html.

How to merge or join files again:

Now you know how to split files based on size or content. The next step is to merge or join the pieces again, which is an easy task using the cat command.

If we read all the pieces of the file using cat and a wildcard, the shell expands the wildcard in alphabetical order of the file names, so cat prints the pieces in sequence.

As you can see, cat reads the files in the correct order. Merging the files consists of redirecting this output to a new file with the > operator, where CombinedFile is the name of the merged file in this example.
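The merge can be sketched end to end and verified against the original. This assumes GNU coreutils; sample.bin and the Windows. prefix are hypothetical stand-ins, and CombinedFile is the name used in the text.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical original file, split into pieces named Windows.aa ... Windows.ae.
head -c 5120 /dev/urandom > sample.bin
split -b 1K sample.bin Windows.

# The shell expands Windows.* in sorted order, so the pieces are joined in sequence.
cat Windows.* > CombinedFile

# cmp prints nothing and exits with status 0 when the two files are identical.
cmp sample.bin CombinedFile && echo "files are identical"
```

Comparing the result with cmp (or a checksum) is a cheap way to confirm that no piece is missing or out of order.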

As you can see, the file was successfully merged.

Conclusion:

As you can see, splitting files into parts on Linux is pretty easy, and you just need to know which is the right tool for your task. It is worthwhile for any Linux user to learn these commands and their advantages, for example, when sharing files over an unstable connection or through channels that limit file size. Both tools have many additional features that were not explained in this tutorial, which you can read about in their man pages.

I hope this tutorial explaining how to split a file into parts on Linux has been helpful. Follow this site for more Linux tips and tutorials.