Regular Expressions in Linux Part-1
Regular expressions (regexp) are one of the advanced concepts we need in order to write efficient shell scripts and to administer systems effectively.
For easier understanding, regular expressions can be divided into 3 types.
1)Basic Regular expressions (No option required for grep or sed)
2)Interval Regular expressions (Use option -E for grep and -r for sed)
3)Extended Regular expressions (Use option -E for grep and -r for sed)
What is a Regular expression?
A regular expression is a pattern that describes a set of strings. Commands use it to find the text in a given string that matches the pattern.
Which commands/programming languages support regular expressions?
vi, tr, rename, grep, sed, awk, perl, python etc.
Basic Regular Expressions
Basic regular expressions: This is the most basic set of regular expressions. It does not require any option to use, and it is also the oldest set.
^ –Caret symbol, matches the start of a line
$ –Matches the end of a line
* –0 or more occurrences of the previous character
. –Matches any single character
[] –Matches any one character from a set or range
[^char] –Negation, matches any one character that is not in the set
\<word\> –Matches a whole word
\ –Escape character
Let's start with some examples, so that we can understand regexp better.
^ Regular Expression
Example 1: Find all the regular files in a given directory
ls -l | grep '^-'
As you may know, the first character of each line in ls -l output is - for regular files and d for directories. The ^ symbol matches the start of a line, so ^- selects every line that starts with -, which means a regular file in Linux/Unix.
If we want to find all the directories in a folder, use grep '^d' along with ls -l as shown below
ls -l | grep ^d
How about character files and block files?
ls -l | grep ^c
ls -l | grep ^b
We can even find the commented lines in a file using the ^ operator, as in the example below
grep '^#' filename
How about finding lines in a file which start with 'abc'?
grep '^abc' filename
Many more examples are possible with the ^ operator.
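As a minimal sketch of the ^ examples above, the snippet below builds a small throwaway file and runs the same two patterns against it. The file name sample.conf and its contents are assumptions made up for this illustration.

```shell
# Hypothetical sample file, created only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' '# a comment' 'abc starts here' 'xyz abc' 'port=22 # trailing note' > "$tmpdir/sample.conf"

# Lines that begin with '#': the trailing comment on the last line does not count.
comments=$(grep '^#' "$tmpdir/sample.conf")

# Lines that begin with 'abc': 'xyz abc' is skipped because abc is not at the start.
abc_lines=$(grep '^abc' "$tmpdir/sample.conf")

echo "$comments"
echo "$abc_lines"
rm -rf "$tmpdir"
```

Quoting the pattern in single quotes keeps the shell from interpreting characters such as # or $ before grep sees them.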
$ Regular Expression
Example 2: Match all the files whose names end with sh
ls -l | grep 'sh$'
As $ indicates the end of the line, the above command will list all the files whose names end with sh.
How about finding lines in a file which end with dead?
grep 'dead$' filename
How about finding empty lines in a file?
grep '^$' filename
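The $ examples above can be tried on a scratch file. This is only a sketch: the file notes.txt and its lines are invented for the demonstration.

```shell
# Hypothetical sample file with two empty lines, created only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'the battery is dead' '' 'deadline tomorrow' '' 'last line' > "$tmpdir/notes.txt"

# 'dead$' needs dead at the very end of the line, so 'deadline tomorrow' is skipped.
ends_dead=$(grep 'dead$' "$tmpdir/notes.txt")

# '^$' is a line whose start is immediately followed by its end; -c counts such lines.
empty_count=$(grep -c '^$' "$tmpdir/notes.txt")

echo "$ends_dead"
echo "$empty_count"
rm -rf "$tmpdir"
```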
* Regular Expression
Example 3: Match all files which have twt, twet, tweet etc. in the file name.
ls -l | grep 'twe*t'
How about searching a file for the word apple when it has been misspelled as ale, aple, appple, apppple, apppppple etc.? To find all these patterns
grep 'ap*le' filename
Readers should observe that the above pattern matches even the word ale, as * means 0 or more occurrences of the previous character.
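A quick way to check the behaviour of * is to feed a few candidate words to grep on standard input. The word list here is an assumption chosen for the illustration.

```shell
# Words to test; 'angle' is included as a word that should NOT match.
words=$(printf '%s\n' ale aple apple appple angle)

# 'ap*le' means: a, then zero or more p, then le.
matches=$(printf '%s\n' "$words" | grep 'ap*le')
match_count=$(printf '%s\n' "$words" | grep -c 'ap*le')

echo "$matches"
echo "$match_count"
```

Note that grep matches anywhere in the line, so a word like maple would also be selected because it contains aple.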
. Regular Expression
Example 4: Find files which have any single character between t and t in the file name.
ls -l | grep 't.t'
Here . matches any single character, so the pattern matches tat, t3t, t.t, t&t and so on: any one character between the two t letters.
How about finding all the file names which start with a and end with x using regular expressions?
ls | grep '^a.*x$'
Here plain ls is used so that each line is only a file name; ^ and $ then pin a to the start and x to the end of the name. Without these anchors, 'a.*x' would match any line that merely contains an a followed somewhere later by an x.
Note: In the .* combination, . matches any character and * repeats it 0 or more times, so .* matches any number of characters.
Suppose you have files named:
awx
awex
aweex
awasdfx
a35dfetrx
etc. The pattern matches all of these names, since each one starts with a and ends with x.
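The difference between an unanchored 'a.*x' and one anchored with ^ and $ can be sketched as follows. The names max and box are extra assumptions added to the file names listed above, to show what each pattern lets through.

```shell
# Names from the list above, plus two hypothetical extras: max and box.
names=$(printf '%s\n' awx awex a35dfetrx max box)

# Unanchored: any line containing an a followed later by an x (max qualifies).
loose_count=$(printf '%s\n' "$names" | grep -c 'a.*x')

# Anchored: the line must start with a and end with x (max is rejected).
strict_count=$(printf '%s\n' "$names" | grep -c '^a.*x$')

echo "$loose_count"
echo "$strict_count"
```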
[] Square braces/Brackets Regular Expression
Example 5: Find all the files which contain a single digit between a and x in the file name
ls -l | grep 'a[0-9]x'
This will match file names such as
a0xsdf
asda1xsdfas
..
..
asdfdsara9xsdf
etc.
So wherever a is followed by exactly one digit and then x, the pattern matches.
Some more examples of the range operator:
[a-z] –Matches any single character from a to z.
[A-Z] –Matches any single character from A to Z.
[0-9] –Matches any single character from 0 to 9.
[a-zA-Z0-9] –Matches any single character from a to z, A to Z or 0 to 9.
[!@#$%^] –Matches any one of the characters !, @, #, $, % or ^.
You just have to decide which characters you want to match and put them inside the brackets.
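Here is a small sketch of the 'a[0-9]x' pattern against a handful of names. The names abx and a12x are assumptions added to show cases that do not match.

```shell
# Two names from the example above, plus two hypothetical non-matching names.
names=$(printf '%s\n' a0xsdf asda1xsdfas abx a12x)

# [0-9] stands for exactly one digit, so a12x (two digits) and abx (a letter) fail.
digit_matches=$(printf '%s\n' "$names" | grep 'a[0-9]x')
digit_count=$(printf '%s\n' "$names" | grep -c 'a[0-9]x')

echo "$digit_matches"
echo "$digit_count"
```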
[^char] Regular Expression
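Example 6: Match all the file names which contain at least one character other than a, b or c
ls | grep '[^abc]'
[^abc] matches any single character that is not a, b or c, so this lists every file name containing at least one such character. A name like bat is still listed because of the t. To list only the names that contain no a, b or c at all, invert the match instead: ls | grep -v '[abc]'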
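Because a negated set is easy to misread, the sketch below compares grep '[^abc]' with grep -v '[abc]' on the same input. The four names are made up for the comparison.

```shell
# Hypothetical file names used only for this comparison.
names=$(printf '%s\n' abc cab dog bat)

# '[^abc]': lines with at least one character outside the set (bat has a t).
has_other=$(printf '%s\n' "$names" | grep '[^abc]')

# -v '[abc]': lines that contain none of a, b or c.
without_abc=$(printf '%s\n' "$names" | grep -v '[abc]')

echo "$has_other"
echo "$without_abc"
```

The first command prints dog and bat, the second prints only dog.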
\<word\> Regular Expression
Example 7: Search for the word abc on its own, so that abcxyz or readabc do not appear in the output.
grep '\<abc\>' filename
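The whole-word match can be sketched as below, assuming GNU grep, where \< and \> mark the start and end of a word. The -w option is shown alongside as another way to ask for a whole-word match.

```shell
# Hypothetical input lines; only the first and last contain abc as a separate word.
lines=$(printf '%s\n' 'abc' 'abcxyz' 'readabc' 'say abc now')

# \< and \> are word-boundary anchors (assumes GNU grep).
word_count=$(printf '%s\n' "$lines" | grep -c '\<abc\>')

# -w asks grep to match whole words only.
w_count=$(printf '%s\n' "$lines" | grep -c -w 'abc')

echo "$word_count"
echo "$w_count"
```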
Escape Regular Expression
Example 8: Find lines which contain [, for instance in a file name. As [ is a special character, we have to escape it
grep "\[" filename
or
grep '[[]' filename
Note: Inside [], the character [ loses its special meaning. So if you want to find a special character, you can put it inside [] so that it is not treated as special.
Note: There is no need to use -E for these regular expressions with grep. We also have egrep, which is equal to "grep -E", and fgrep, which is equal to "grep -F" (fixed strings, no regular expressions). I suggest you concentrate on grep itself; don't reach for other commands if grep can resolve your issue.
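The ways of matching a literal [ can be compared side by side. This is a sketch with two invented file names; the -F form is added here as a further option, since it makes grep treat the whole pattern as a fixed string.

```shell
# Hypothetical names; only the first contains a literal [.
names=$(printf '%s\n' 'file[1].txt' 'file1.txt')

# Backslash escape.
escaped_count=$(printf '%s\n' "$names" | grep -c '\[')

# Bracket expression containing only [.
bracket_count=$(printf '%s\n' "$names" | grep -c '[[]')

# -F: the pattern is a fixed string, so nothing in it is special.
fixed_count=$(printf '%s\n' "$names" | grep -c -F '[')

echo "$escaped_count $bracket_count $fixed_count"
```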
Regular Expression in Linux Part- 2
This is our second part on Regular Expressions in Linux.
Interval Regular expressions
These are used to specify how many times a character or character set repeats.
Note: In order to use interval and extended regular expressions you have to use the -E option with the grep command and the -r option with the sed command.
{n} –Exactly n occurrences of the previous character
{n,m} –n to m occurrences of the previous character
{n,} –n or more occurrences of the previous character
Example 1: Find all the file names in which t is repeated 3 times consecutively.
ls -l | grep -E 't{3}'
The -E option tells grep to interpret the pattern as an extended regular expression.
Example 2: Find all the file names which contain the letter l 1 to 3 times consecutively.
ls -l | grep -E 'l{1,3}'
Example 3: Find all the file names which contain the letter k 5 or more times consecutively.
ls -l | grep -E 'k{5,}'
This is a bit tricky, so let me explain. We have given a range from 5 to infinity, by writing only a comma after the 5.
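The interval operators can be checked with a few made-up names, as in this sketch. One thing it shows is that t{3} also matches inside a longer run of t, because grep only needs three consecutive t somewhere in the line.

```shell
# Hypothetical names used only for this illustration.
names=$(printf '%s\n' tt ttt bottttle kkkk kkkkk)

# t{3}: three consecutive t; ttt matches, and so does bottttle (tttt contains ttt).
t3_count=$(printf '%s\n' "$names" | grep -c -E 't{3}')

# k{5,}: five or more consecutive k; kkkk (four) is rejected.
k5_matches=$(printf '%s\n' "$names" | grep -E 'k{5,}')

echo "$t3_count"
echo "$k5_matches"
```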
Extended regular expressions
These regular expressions extend the basic regular expression features.
Note: In order to use this set of regular expressions you have to use the -E option with the grep command and the -r option with the sed command.
+ –One or more occurrences of the previous character
| –Or: matches either the pattern on its left or the pattern on its right
? –0 or 1 occurrence of the previous character
() –Grouping of a character set
Example 1: Find all the files which contain one or more occurrences of the letter f.
ls -l | grep -E 'f+'
Example 2: Find all the files which contain a or b in the file name
ls -l | grep -E 'a|b'
Example 3: Find all the files which contain 0 or 1 occurrence of t in the file name
ls -l | grep -E 't?'
For example, I have the files below
test
best
see
do
My grep command will list all four files as output.
Note: The output contains all these files even though see and do do not contain t. This is because ? searches for 0 or 1 occurrence of the previous character. See and do contain 0 t's in their names, so grep finds these files too.
Example 4: Find all the files which contain ab in sequence
ls -l | grep -E '(ab)'
This will list all the files which contain a immediately followed by b in the file name.
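The point of the note above can be confirmed with a short sketch that uses the four file names from the example: a pattern that is allowed to match zero characters matches every line.

```shell
# The four names from the example above.
names=$(printf '%s\n' test best see do)

# 't?' may match zero t, so every line is selected.
optional_count=$(printf '%s\n' "$names" | grep -c -E 't?')

# A plain 't' requires at least one t, so only test and best are selected.
required_count=$(printf '%s\n' "$names" | grep -c 't')

echo "$optional_count"
echo "$required_count"
```

For this reason ? is normally useful only next to other characters, as in a pattern like 'ab?c'.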
Please stay tuned to our next article on grep command and how to use it.
Grep command with Regular expressions -Part-3
In this post we will see how to use extended regular expressions to make the grep command even more powerful than it is with basic regular expressions.
Extended regular expressions:
+ –Match one or more occurrences of the previous character.
| –Match either alternative.
? –Match 0 or 1 occurrence of the previous character.
() –Match a group of characters.
{number} –Match exactly that number of occurrences of a character.
{1,3} –Match a character repeated 1 to 3 times.
{5,} –Match a character repeated 5 or more times.
Note1: In order to use these extended regular expressions we have to use the -E option, which gives grep the capability to understand extended regular expressions.
Note2: egrep is nothing but grep -E, so try to avoid it if grep itself can do the work for you. Why learn a new command?
Examples:
Example 1: Search for words which contain one or more occurrences of 'b' between a and c.
grep -E 'ab+c' filename
Example 2: Search for words which contain zero or one occurrence of b between a and c
grep -E 'ab?c' filename
Example 3: Search for words which contain either da or be. Without grouping, | separates the whole patterns on its left and right.
grep -E 'da|be' filename
Example 4: Search for words which contain either a or b, but not both, between the d and e characters
grep -E 'd(a|b)e' filename
Example 5: Search for words which contain exactly 2 'b' between the a and c characters
grep -E 'ab{2}c' filename
Example 6: Search for words which contain 2 to 4 'b' between the a and c characters
grep -E 'ab{2,4}c' filename
Example 7: Search for words which contain 3 or more 'b' between the a and c characters
grep -E 'ab{3,}c' filename
Note: When we use {} with a comma we give a range. In this example we started the range but did not end it, which means the upper limit is infinity. Do not put a space after the comma.
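The examples above can be run against one small word list to see how the patterns differ. This is a sketch only: the word list and the helper name count_matches are assumptions made up for the illustration.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' ac abc abbc abbbc dae dbe da be)

# count_matches (hypothetical helper): number of lines matching the ERE in $1.
count_matches() {
    printf '%s\n' "$words" | grep -c -E "$1"
}

plus_count=$(count_matches 'ab+c')       # abc, abbc, abbbc
optional_count=$(count_matches 'ab?c')   # ac, abc
alt_count=$(count_matches 'da|be')       # dae, dbe, da, be
group_count=$(count_matches 'd(a|b)e')   # dae, dbe
exact_count=$(count_matches 'ab{2}c')    # abbc
open_count=$(count_matches 'ab{3,}c')    # abbbc

echo "$plus_count $optional_count $alt_count $group_count $exact_count $open_count"
```

Comparing 'da|be' with 'd(a|b)e' shows why grouping matters: the first matches four words, the second only the two in which a single a or b sits between d and e.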
* ************************ enjoy ***************