
Shell Scripting -- need some awk/sed help...

I'm converting some old C code to a ksh shell script and need to grab the numbers at the end of this line, from the last * to the tilde. Regular expressions have never really been something I'm good at lol, so I'm hoping someone here knows how.

 

 

G50*N*20160207*18885~

                             ^^^^^^^

 

The first part is always static up to the end of the date (20160207 in the line above). After that asterisk I need to grab the numbers that follow, which can be anywhere from 1 to 25 characters long, and then the line ends in a tilde.

 

"If a Lobster is a fish because it moves by jumping, then a kangaroo is a bird" - Admiral Paulo de Castro Moreira da Silva

"There is nothing more difficult than fixing something that isn't all the way broken yet." - Author Unknown

Spoiler

Intel Core i7-3960X @ 4.6 GHz - Asus P9X79WS/IPMI - 12GB DDR3-1600 quad-channel - EVGA GTX 1080ti SC - Fractal Design Define R5 - 500GB Crucial MX200 - NH-D15 - Logitech G710+ - Mionix Naos 7000 - Sennheiser PC350 w/Topping VX-1

Link to comment
https://linustechtips.com/topic/780143-shell-scripting-need-some-awksed-help/

Well, it really depends on what the data pattern looks like. If you can guarantee the number will always be at the end...

 


user@host:~$ echo "G50*N*20170505*18885~"| rev | cut -d'*' -f1 | rev | sed -e 's/\~//g'
18885
user@host:~$

 

This:

- prints the given data
- reverses it
- cuts out only the first column, using * as the delimiter
- reverses it back
- removes the tilde

 

You can also try awk:


user@host:~$ echo "G50*N*20170505*18885~"| awk -F'*' '{print $4}' | sed -e 's/~//g'
18885
user@host:~$

 

This is about the same, except for the awk part, which prints the 4th column of the data using * as the delimiter (as in unijab's post).
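
As a possible simplification, awk can also drop the tilde itself if both * and ~ are treated as field separators, so the extra sed step is not needed. This is only a sketch: the function name last_field is made up for illustration, and it assumes the number is always the 4th field and that the line ends in a tilde.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the thread):
# print the 4th field of a line, splitting on either '*' or '~'.
last_field() {
    # -F'[*~]' is a bracket expression, so both characters act as
    # delimiters and $4 contains only the digits.
    printf '%s\n' "$1" | awk -F'[*~]' '{ print $4 }'
}

last_field 'G50*N*20170505*18885~'
```

Because the separator is a bracket expression, the * does not need a backslash here.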
 


@tmacedo yes, the pattern will always be constant up to the point where you reach the numbers I want, after that 3rd asterisk.

 

These lines are in what is essentially a text file; I just need to do this only on lines that begin with 'G50' or 'BEG'.

 

 

But I see what you did there with reversing the string and cutting off the part that I need. Thank you! Makes more sense now.

 

Edit: simply grepping for lines with G50 or BEG at the beginning should work :)
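
A rough sketch of that grep idea, in case it helps anyone else. The function name extract_ids and the sample lines are made up for illustration, and it assumes BEG lines have the same four-field layout as the G50 line (the thread does not show one).

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read lines on stdin,
# keep only those starting with G50* or BEG*, and print the number
# that follows the 3rd asterisk.
extract_ids() {
    # grep -E: keep lines beginning with G50 or BEG followed by a literal '*'
    # awk:     print the 4th '*'-separated field
    # tr:      delete the trailing tilde
    grep -E '^(G50|BEG)\*' | awk -F'*' '{ print $4 }' | tr -d '~'
}

# Made-up sample input; only the first and third lines should match.
printf '%s\n' \
    'G50*N*20160207*18885~' \
    'XYZ*N*20160207*111~' \
    'BEG*N*20160207*42~' | extract_ids
```

In a real script the printf would be replaced by the input file, e.g. extract_ids < file.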


After some quick testing with awk: awk 'match($0, /\*([0-9]+)~/, res) { print res[1] }' file1 file2 ...

 

Where:

  • match($0, /\*([0-9]+)~/, res) matches the regular expression \*([0-9]+)~ (regular expressions are enclosed by // in awk, and the * is escaped so it matches a literal asterisk) against $0 (i.e., the current input line) and saves the result to res. The three-argument form of match() is a gawk extension, so this needs GNU awk.
  • { print res[1] } prints the first item of the (0-indexed) result array.  Since there were capturing parentheses in the regular expression, this is what was inside the parentheses.  (The 0th item in the array for this bit of code is the full match, as if there weren't capturing parentheses there).
  • file1 file2 ... are the files you need to find this pattern in.

For the line G50*N*20160207*18885~ , this should return just 18885.
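
If gawk is not available, roughly the same match can be sketched with plain sed, which should work with any POSIX sed. The function name extract_trailing_number is made up for illustration, and the sketch assumes the tilde is the very last character on the line.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print the digits between
# the last '*' and a trailing '~' for every line that has that shape.
# Reads the files given as arguments, or stdin if there are none.
extract_trailing_number() {
    # -n suppresses default output; the trailing 'p' prints only lines
    # where the substitution succeeded.
    # \*           a literal asterisk
    # \([0-9][0-9]*\)  one or more digits, captured as \1
    # ~$           the tilde at the end of the line
    sed -n 's/^.*\*\([0-9][0-9]*\)~$/\1/p' "$@"
}

printf '%s\n' 'G50*N*20160207*18885~' | extract_trailing_number
```

Lines that do not end in an asterisk, digits, and a tilde are simply skipped.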
