To remove duplicate lines from a text file in a Linux shell, you can use the sort and uniq commands.
The sort command sorts the lines of a file lexicographically (according to the current locale), and the uniq command collapses each run of adjacent duplicate lines into a single copy.
For example, to remove duplicate lines from the file input.txt and save the result to the file output.txt, you can use the following command:
sort input.txt | uniq > output.txt
This sorts the lines of input.txt so that identical lines become adjacent, removes the duplicates, and saves the result to output.txt.
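As a minimal worked example of this pipeline, the sketch below builds a small sample file and runs the command on it. The file contents and the scratch directory are made up for illustration. It also shows sort -u, which for plain whole-line comparison should give the same result in a single step.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: "apple" appears three times, "banana" twice.
printf '%s\n' banana apple cherry apple banana apple > input.txt

# Sort so identical lines become adjacent, then collapse each run to one line.
sort input.txt | uniq > output.txt

# sort -u sorts and de-duplicates in one command.
sort -u input.txt > output_sort_u.txt

cat output.txt
# apple
# banana
# cherry
```

Note that the output is in sorted order, not in the order the lines first appeared in input.txt.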
You can use the -u option with the uniq command to print only the lines that occur exactly once. Any line that has a duplicate is dropped entirely, rather than being reduced to a single copy. For example:
sort input.txt | uniq -u > output.txt
This prints only the lines that appear exactly once in input.txt and saves them to output.txt.
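The difference between plain uniq and uniq -u is easiest to see side by side. The sketch below uses a made-up sample file. The uniq -d variant, which prints one copy of each repeated line, is included only for contrast.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: only "cherry" occurs exactly once.
printf '%s\n' banana apple cherry apple banana apple > input.txt

# Plain uniq keeps one copy of every distinct line: apple, banana, cherry.
sort input.txt | uniq > deduplicated.txt

# uniq -u keeps only the lines that were never repeated: cherry.
sort input.txt | uniq -u > output.txt

# uniq -d keeps one copy of each line that was repeated: apple, banana.
sort input.txt | uniq -d > repeated.txt

cat output.txt
# cherry
```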
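It is important to note that uniq compares only adjacent lines, so on its own it will not remove identical lines that are separated by other lines. Sorting first places identical lines next to each other, which is why the pipeline above removes all duplicates, but it also discards the original line order. To remove duplicates while keeping the original order, you may need a different approach.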
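To illustrate the adjacency behaviour, the sketch below runs uniq on an unsorted, made-up sample file. It then shows one possible alternative, an awk one-liner, that drops repeated lines without sorting. The array name seen is arbitrary, and awk is an assumption here, not something the commands above require.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical unsorted input: the two "banana" lines are not adjacent.
printf '%s\n' banana apple apple cherry banana > input.txt

# uniq alone merges only the adjacent "apple" pair; the second "banana" survives.
uniq input.txt > uniq_only.txt
# banana, apple, cherry, banana

# awk prints a line only the first time it is seen, so input order is kept.
awk '!seen[$0]++' input.txt > output.txt
# banana, apple, cherry
```

The awk approach holds every distinct line in memory, so for very large files the sort-based pipeline may be the more practical choice.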
For more information about the sort and uniq commands and their options, you can consult the manual pages (man sort, man uniq) or the documentation for your specific Linux system.