Linux Shell - How To Remove Duplicate Text Lines

https://www.lautturi.com

To remove duplicate text lines from a file in a Linux shell, you can use the sort and uniq commands.

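The sort command sorts the lines of a file (lexicographically by default, according to the current locale), and the uniq command collapses each run of adjacent identical lines into a single line. Because uniq only compares neighboring lines, the input is sorted first so that identical lines end up next to each other.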
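To illustrate the "adjacent" part, here is a minimal sketch that runs uniq on its own, without sort. The input lines and the file name uniq_only.txt are made up for this illustration.

```shell
# Made-up input: the first two "a" lines are adjacent, the last "a" is not.
printf 'a\na\nb\na\n' | uniq > uniq_only.txt

# Show what uniq kept.
cat uniq_only.txt
```

The result still contains a twice (a, b, a), because the final a is separated from the first run by b.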

For example, to remove duplicate lines from the file input.txt and save the result to the file output.txt, you can use the following command:

sort input.txt | uniq > output.txt

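This sorts the lines of input.txt, reduces each group of identical lines to a single line, and writes the result to output.txt. Note that the output is in sorted order, not in the original order of the file. The same result can be obtained in one step with sort -u input.txt > output.txt.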
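As a small worked example of this pipeline, the sketch below first creates a sample input.txt. The file contents are hypothetical and only serve to show the effect.

```shell
# Create a hypothetical sample file whose duplicates are not adjacent.
printf 'banana\napple\nbanana\ncherry\napple\n' > input.txt

# Sorting puts identical lines next to each other so uniq can collapse them.
sort input.txt | uniq > output.txt

# Show the deduplicated, sorted result.
cat output.txt
```

With this sample, output.txt contains apple, banana and cherry, one per line.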

With the -u option, uniq prints only the lines that occur exactly once in its input. Lines that have duplicates are dropped entirely, rather than being reduced to a single copy. For example:

sort input.txt | uniq -u > output.txt

This writes to output.txt only those lines that appear exactly once in input.txt.
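The difference between plain uniq and uniq -u is easiest to see side by side. This is a sketch on a made-up sample file; the name deduplicated.txt is an assumption used only to keep the two results apart.

```shell
# Hypothetical sample: "apple" and "banana" appear twice, "cherry" once.
printf 'banana\napple\nbanana\ncherry\napple\n' > input.txt

# Plain uniq keeps one copy of every distinct line.
sort input.txt | uniq > deduplicated.txt

# uniq -u keeps only lines that occur exactly once,
# dropping every copy of a repeated line.
sort input.txt | uniq -u > output.txt

cat output.txt
```

Here deduplicated.txt holds apple, banana and cherry, while output.txt holds only cherry.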

It is important to note that uniq compares only adjacent lines, so on its own it will not remove identical lines that are separated by other lines. That is why the input is sorted first: sorting brings identical lines together. The cost is that the original line order is lost. If the order must be preserved, you need a different approach.
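One commonly used order-preserving alternative is an awk one-liner. This is not part of the sort and uniq method described above; it is a sketch of a different technique, and the array name seen and the sample data are arbitrary choices for illustration.

```shell
# Hypothetical sample file with non-adjacent duplicates.
printf 'banana\napple\nbanana\ncherry\napple\n' > input.txt

# seen[$0]++ evaluates to 0 (false) the first time a line is met and to a
# positive count afterwards, so !seen[$0]++ is true only for first occurrences.
# awk prints a line whenever the pattern is true.
awk '!seen[$0]++' input.txt > output.txt

cat output.txt
```

With this sample, output.txt contains banana, apple and cherry in their original order. Unlike sort, this keeps every distinct line in memory, which may matter for very large files.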

For more information about the sort and uniq commands and their options, see their manual pages (man sort and man uniq) on your Linux system.

Created Time:2017-10-30 10:17:38  Author:lautturi