Master Linux Text Processing Commands with Our Guide

admin

Text processing in Linux refers to manipulating, analyzing, and managing textual data using the command-line tools and utilities available in the Linux operating system. It covers a wide range of tasks, such as searching, filtering, sorting, formatting, and extracting information from text files or streams of text.

Imagine effortlessly sifting through mountains of log files to find that elusive error message, extracting valuable data from a cluttered dataset, or formatting text neatly for presentation. All of this and more is achievable with the arsenal of text processing tools at your disposal. These commands are especially useful in scenarios such as debugging or handling large amounts of data.

In this tutorial, I will walk you through the concept of text processing in Linux, cover the most commonly used text processing commands, and highlight their benefits. You will learn how to use these commands to simplify debugging and other complex tasks.

Basic Text Editing Commands

When working with Linux, you deal with many files and a lot of text, and these commands help you stay productive as a developer. They let you view, edit, and format text efficiently. Here are some commonly used text editing commands:

cat: Displaying the Content of Files

cat (short for "concatenate") is a simple but powerful Linux command that is primarily used to display the content of text files in the terminal. It's often used for quick inspection of file contents.

Here's how to use cat:

$ cat filename


You can use the --help flag to list the options that cat supports:

$ cat --help
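Since the name comes from "concatenate", cat can also join several files. The sketch below is a minimal illustration, assuming two hypothetical files (first.txt and second.txt) created in a temporary directory; it also shows the -n option, which numbers the output lines.

```shell
# Work in a throwaway directory so no real files are touched.
workdir="$(mktemp -d)"

# Hypothetical sample files, created only for this illustration.
printf 'alpha\nbeta\n' > "$workdir/first.txt"
printf 'gamma\n' > "$workdir/second.txt"

# Passing several files prints them one after another (concatenation).
combined="$(cat "$workdir/first.txt" "$workdir/second.txt")"
echo "$combined"

# -n prefixes every output line with its line number.
numbered="$(cat -n "$workdir/first.txt")"
echo "$numbered"

rm -rf "$workdir"
```

The first echo prints alpha, beta, and gamma on separate lines; the second prints the two lines of first.txt with their line numbers in front.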


nano and vim: Basic Text Editors for Creating and Editing Text Files

nano and vim are two popular text editors in Linux. They are used for creating, viewing, and editing text files.

nano is a straightforward and user-friendly text editor. To create or edit a file, simply type:

$ nano filename.txt

After running the above command, the nano editor opens in your terminal.

Similarly, vim is a more advanced and powerful text editor, but it has a steeper learning curve. To create or open a file in Vim, run:

$ vim newfile

After the editor opens, press i to enter INSERT mode.

When inside nano or vim, you can edit the file, save changes, and exit. In nano, the basic keyboard shortcuts are displayed at the bottom of the terminal. In Vim, you need to switch between the editor's modes (insert, command, and visual) to edit and save files; for example, press Esc to leave INSERT mode and type :wq to save and exit.

echo: Printing Text to the Terminal

The echo command displays text or variables on the terminal screen. It is frequently used in shell scripts to give feedback to users or to present information. Let’s go through some examples below.

Print a simple message to the terminal:

$ echo "Hello, World"

Output:

Hello, World

Display the value of a variable:

$ myVar="This is my text."
$ echo "$myVar"

Output:

This is my text.
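One detail worth knowing when printing variables is that the type of quote matters. The sketch below is a small illustration, using a hypothetical variable named tool: double quotes let the shell expand the variable, while single quotes print the text literally.

```shell
# Hypothetical variable used only for this illustration.
tool="grep"

# Double quotes: the shell replaces $tool with its value.
double="$(echo "Searching with $tool")"

# Single quotes: the text is printed exactly as written.
single="$(echo 'Searching with $tool')"

echo "$double"
echo "$single"
```

The first line printed is Searching with grep, and the second is Searching with $tool.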

Pipes and Redirection in Linux

The Linux shell has two extremely useful features that let you combine commands and reroute their input and output: pipes and redirection. To use the shell efficiently, you need both.

Piping in Linux

Linking one command's output to another's input is known as "piping". This lets you perform complex tasks by chaining commands together. The vertical bar (|) is the pipe operator.

The following command lists the entries in the current directory and counts them:

$ ls | wc -l

The wc -l command receives the output of the ls command and counts the number of lines in its input. Because ls prints one entry per line when its output goes to a pipe, the result is the number of entries in the directory.

Example output:

3

Another example:

$ echo "Hello, World" | tr 'a-z' 'A-Z'

Output:

HELLO, WORLD

In the above command, the output of the first command is run through the tr command, which converts every lowercase letter to uppercase.
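Pipes are not limited to two commands. As a rough sketch, the pipeline below chains five of them to find the most frequent word in a list; the input words are made-up sample data, and the variable name top_tool is only for illustration.

```shell
# Sample input: one tool name per line, with repeats (made-up data).
# sort groups identical lines, uniq -c counts each group,
# sort -rn puts the highest count first, head keeps that line,
# and awk prints its second field (the name).
top_tool="$(printf '%s\n' grep sed awk grep sort grep sed \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1 \
  | awk '{print $2}')"

echo "$top_tool"
```

Here grep appears three times, more than any other word, so the pipeline prints grep.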

Redirection in Linux

Redirection is a way of changing a command's default input or output. This allows you to save the output of a command to a file or to read input from a file instead of the keyboard. The following operators are used for redirection.

The output of a command can be redirected to a file using the > operator:

$ curl -L https://github.com/kubernetes/kubernetes/blob/master/README.md > README.md

The above command downloads the README.md page of the Kubernetes repository and writes the response to a file named README.md on your local system. Because the URL points to GitHub's web view of the file, the saved content is the HTML page rather than the raw Markdown.

The >> operator appends a command's output to the end of an existing file, or creates the file if it does not exist yet.

Let’s say you have a local file called numbers.txt with content like this:

$ cat numbers.txt
one
two

Using the >> operator, you can append text to the end of this file:

$ echo "three" >> numbers.txt

Now check the contents of the file using the cat command:

$ cat numbers.txt
one
two
three

Note: If you use the > operator instead, the existing content of the file is replaced, and the file contains only “three”.
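The difference between the two operators can be checked side by side. The following is a minimal sketch that recreates numbers.txt in a temporary directory (an assumption made so that no real file is overwritten) and applies >> and then > to it.

```shell
# Recreate the sample file in a throwaway directory.
workdir="$(mktemp -d)"
printf 'one\ntwo\n' > "$workdir/numbers.txt"

# >> keeps the existing lines and adds the new one at the end.
echo "three" >> "$workdir/numbers.txt"
appended="$(cat "$workdir/numbers.txt")"

# > truncates the file first, so only the new line remains.
echo "three" > "$workdir/numbers.txt"
overwritten="$(cat "$workdir/numbers.txt")"

echo "after >>:"
echo "$appended"
echo "after >:"
echo "$overwritten"

rm -rf "$workdir"
```

After >> the file holds one, two, and three; after > it holds only three.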

The < operator makes a command read its input from a file instead of the keyboard.

Example:

$ wc -l < numbers.txt > lines.txt

The command above feeds numbers.txt to wc -l as standard input and saves the resulting line count in a file named lines.txt.
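One practical reason to prefer < with wc is that the command then never sees a file name, so it prints only the number. The sketch below compares both forms on a three-line sample file created in a temporary directory (an assumption for illustration); tr removes the padding spaces that some wc implementations add.

```shell
# Sample file created only for this illustration.
workdir="$(mktemp -d)"
printf 'one\ntwo\nthree\n' > "$workdir/numbers.txt"

# File passed as an argument: wc prints the count and the file name.
with_name="$(wc -l "$workdir/numbers.txt")"

# File passed on standard input: wc prints only the count.
from_stdin="$(wc -l < "$workdir/numbers.txt" | tr -d ' ')"

echo "$with_name"
echo "$from_stdin"

rm -rf "$workdir"
```

The second form is convenient in scripts, where the bare number can be stored in a variable or compared directly.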

The 2> operator redirects error messages (standard error) to a file of your choice. Suppose you list the Docker containers running on your machine with the docker ps command and it fails with an error; you can redirect the error message to a file.

Example:

$ docker ps 2> error.txt
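The same idea can be tried without Docker installed. The sketch below is an illustration under a stated assumption: it runs ls on a path that is assumed not to exist, sends normal output to out.txt and errors to error.txt (both hypothetical file names), and then checks where the message ended up.

```shell
workdir="$(mktemp -d)"

# The path below is hypothetical and assumed not to exist, so ls fails.
# > captures standard output, 2> captures standard error.
status=0
ls "$workdir/no-such-file" > "$workdir/out.txt" 2> "$workdir/error.txt" || status=$?

# Standard output should be empty; the error text should be in error.txt.
out_size="$(wc -c < "$workdir/out.txt" | tr -d ' ')"
if [ -s "$workdir/error.txt" ]; then
  err_captured=yes
else
  err_captured=no
fi

echo "exit status: $status"
echo "bytes on stdout: $out_size"
echo "error captured: $err_captured"

rm -rf "$workdir"
```

Nothing is printed by ls itself, because both of its output streams were redirected to files.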


Text processing commands in Linux are essential tools for working with text files and manipulating textual data. These commands provide a wide range of functionalities, including searching, editing, extracting, sorting, and transforming text.

Below are some of the commonly used text-processing commands:

The grep command

grep is a command-line utility for searching text files using regular expressions. It allows you to find lines that match a specific pattern or expression. Regular expressions provide a powerful way to specify complex search patterns, including character sequences, wildcards, and repetition rules.

Example:
To search for the tool name "Github" in a text file called tools, use:

$ grep "Github" tools

Note: You can use the grep -i option to perform a case-insensitive search.

Find all lines starting with the letter "G" in a file called "tools":

$ grep "^G" tools
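To make the behavior of these searches concrete, here is a small sketch. The contents of the tools file are made up for illustration (the original file is not shown), and -c, which counts matching lines, is a standard grep option.

```shell
# Hypothetical contents for the tools file.
workdir="$(mktemp -d)"
printf '%s\n' 'Github' 'gitlab' 'Docker' 'Grafana' > "$workdir/tools"

# Exact, case-sensitive match.
exact="$(grep "Github" "$workdir/tools")"

# -i ignores case, so both Github and gitlab match.
insensitive="$(grep -i "git" "$workdir/tools")"

# ^ anchors the pattern to the start of the line.
anchored="$(grep "^G" "$workdir/tools")"

# -c prints the number of matching lines instead of the lines.
count="$(grep -c "^G" "$workdir/tools")"

echo "$exact"
echo "$insensitive"
echo "$anchored"
echo "$count"

rm -rf "$workdir"
```

With this sample data, the anchored search matches Github and Grafana, so the count is 2.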


The sed command:

sed is a stream editor that enables non-interactive text manipulation. It allows you to modify, replace, or delete text patterns in files. It works by processing text line by line, applying a set of editing commands specified in a script or directly on the command line.

Let's consider this example, which searches for kubernetes and replaces it with k3s:

$ sed 's/kubernetes/k3s/g'

In the above command:

  • s means that we want to substitute text,
  • kubernetes is the word we want to replace, and k3s is the word we want to replace it with,
  • g means the substitution is applied globally, replacing every match on a line rather than only the first.

If you run the above command on the line below:

kubernetes kubernetes kubernetes kubernetes

you will get the following output:

k3s k3s k3s k3s 

Another example: to delete every line that contains the word "error" from the file logs.txt:

$ sed '/error/d' logs.txt

This command prints logs.txt with every line containing "error" removed. The file itself is not changed; sed only writes the result to the terminal.
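The effect of the g flag and of the d command is easiest to see on small inputs. The sketch below pipes made-up sample text into sed instead of reading a file, so nothing on disk is needed.

```shell
# With g, every match on the line is replaced.
replaced="$(echo "kubernetes kubernetes kubernetes" | sed 's/kubernetes/k3s/g')"

# Without g, only the first match on each line is replaced.
first_only="$(echo "kubernetes kubernetes kubernetes" | sed 's/kubernetes/k3s/')"

# /error/d removes whole lines that contain "error" (sample log lines).
filtered="$(printf '%s\n' 'info: started' 'error: disk full' 'info: done' \
  | sed '/error/d')"

echo "$replaced"
echo "$first_only"
echo "$filtered"
```

The first command prints k3s k3s k3s, the second prints k3s kubernetes kubernetes, and the third prints only the two info lines.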

The awk command:

awk is a versatile text processing tool for extracting and manipulating data. It processes text files line by line, allowing you lớn perform calculations, format output, and generate reports. It uses a pattern-action paradigm, where you specify patterns lớn match and actions lớn perform on matching lines.

For example, if you have a data.csv file and want to extract only part of its content, you can do it with the awk command as shown below.

$ cat data.csv

Output:

index,ean,stock,price
1,2010743556564,669,135
2,2668829157992,476,584
3,0429683856399,875,44
4,2548029150224,77,251
5,3442300385932,742,737

To get the stock field which is the third field:

$ awk -F , '{print $3}' data.csv

Output:

stock
669
476
875
77
742

awk reads the file as comma-separated values (CSV) because the -F flag in the command above sets the field separator to a comma. With {print $3}, awk goes through each line of the file and prints the third field.

Calculate the sum of the third column in a file called "data.csv":

$ awk -F , '{ sum += $3 } END { print sum }' data.csv

sum += $3 adds the value of the third field on each line to sum (the non-numeric header stock counts as 0), and END { print sum } prints the total to your terminal after the last line has been read.

Output:

2839
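The pattern-action form mentioned earlier can also filter rows. As a sketch, the commands below rebuild the same data.csv in a temporary directory, use NR > 1 to skip the header line, and list the index of every row whose stock is below 500; the threshold of 500 is an arbitrary value chosen for illustration.

```shell
# Rebuild the sample CSV shown above in a throwaway directory.
workdir="$(mktemp -d)"
printf '%s\n' \
  'index,ean,stock,price' \
  '1,2010743556564,669,135' \
  '2,2668829157992,476,584' \
  '3,0429683856399,875,44' \
  '4,2548029150224,77,251' \
  '5,3442300385932,742,737' > "$workdir/data.csv"

# Pattern: skip the header (NR > 1) and keep rows with stock under 500.
# Action: print the first field (the index) of each matching row.
low_stock="$(awk -F , 'NR > 1 && $3 < 500 { print $1 }' "$workdir/data.csv")"

# Same sum as before, but skipping the header explicitly.
total="$(awk -F , 'NR > 1 { sum += $3 } END { print sum }' "$workdir/data.csv")"

echo "$low_stock"
echo "$total"

rm -rf "$workdir"
```

Rows 2 and 4 have a stock below 500, so their indexes are printed, followed by the total of 2839.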

The sort command:

sort is used for sorting lines of text in a tệp tin. You can sort alphabetically, numerically, or based on custom criteria. It can handle various data types, including numbers, strings, and alphanumeric combinations.

Sort the file tools.txt alphabetically with the sort command below:

$ sort tools.txt


Sort the lines of a file called "numbers.txt" numerically:

$ sort -n numbers.txt

The two commands order lines differently: by default, sort compares lines character by character, so 10 comes before 9, while sort -n compares them by numeric value.
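A short sketch makes the contrast between the two sort modes visible. The four numbers are made-up sample input, and LC_ALL=C is set as an assumption to keep the default ordering the same on every system.

```shell
# Default sort compares text character by character.
lexical="$(printf '%s\n' 10 9 100 2 | LC_ALL=C sort)"

# -n compares the lines as numbers.
numeric="$(printf '%s\n' 10 9 100 2 | LC_ALL=C sort -n)"

echo "default order:"
echo "$lexical"
echo "numeric order:"
echo "$numeric"
```

The default order is 10, 100, 2, 9, while the numeric order is 2, 9, 10, 100.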

The tac command

The tac command in Linux prints the contents of a text file in reverse line order, from the last line to the first. "tac" is "cat" spelled backward, which reflects its purpose.

Here is how you can use the tac command.

Let’s say you want to run the tac command on file.txt:

$ cat file.txt
line 1
line 2
line 3
line 4

$ tac file.txt

Output:

line 4
line 3
line 2
line 1
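tac combines well with the pipes covered earlier. As a rough sketch, assuming tac from GNU coreutils is installed, the pipeline below reverses four sample lines and keeps only the first two of the result, which is a simple way to view the most recent entries of a log newest first.

```shell
# Reverse the sample lines, then keep the first two of the reversed output.
newest_first="$(printf '%s\n' 'line 1' 'line 2' 'line 3' 'line 4' \
  | tac \
  | head -n 2)"

echo "$newest_first"
```

This prints line 4 followed by line 3.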

Conclusion

This article described how to use various text-processing commands and how they can increase your productivity when working with a Unix-like system, with examples and use cases along the way.

There is always something new to learn about Linux. To continue learning, check out the following resources:

  • Linux kernel man pages
  • How to Set Environment Variables on a Linux Machine
  • Linux background and foreground process management
  • How to schedule a periodic task with cron
  • How to Schedule Future Processes in Linux Using at