Regex

Go to solution Solved by leodaniel,

So, I am working on a script that gets shoes from websites. It's all going well, but I am trying to get the sizes that are in stock and I have hit a wall :(

I am getting data like this, but I only want the 8, 8.5 and so on. I just can't get it right :/

BLK/BLK/CYEL / 8 - $500.00
BLK/BLK/CYEL / 8.5 - $500.00
BLK/BLK/CYEL / 9 - $500.00
BLK/BLK/CYEL / 9.5 - $500.00
BLK/BLK/CYEL / 10 - $500.00
BLK/BLK/CYEL / 10.5 - $500.00
BLK/BLK/CYEL / 11 - $500.00
BLK/BLK/CYEL / 11.5 - $500.00

I have tried

\b[0-9]{1,2}\.?\b

but this also picks up the 00 at the end of the price, and it catches 10.5 as 10. and 5
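
A minimal Python sketch of why that attempt misfires (Python is an assumption here, the thread does not say which language the script uses, and the name `attempt` is made up for illustration). The spot between `.` and `5` also counts as a word boundary for `\b`, and nothing in the pattern ties the match to the size field:

```python
import re

# The pattern from the post, unchanged.
attempt = re.compile(r"\b[0-9]{1,2}\.?\b")

line = "BLK/BLK/CYEL / 10.5 - $500.00"
matches = attempt.findall(line)

# The size is split in two and the cents of the price sneak in.
print(matches)  # ['10.', '5', '00']
```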

                     ¸„»°'´¸„»°'´ Vorticalbox `'°«„¸`'°«„¸
`'°«„¸¸„»°'´¸„»°'´`'°«„¸Scientia Potentia est  ¸„»°'´`'°«„¸`'°«„¸¸„»°'´

Link to comment
https://linustechtips.com/topic/835916-regex/

I would recommend a regex with a capturing group, so you can match with the surrounding content but only capture the actual number. Something like

/ (\d{1,2}\.?5?) -

(match "/ ", then capture 1-2 digits, optionally ".", and optionally "5", then match but don't capture " -").
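
A rough sketch of the capturing-group idea in Python, assuming the script is Python (the thread does not say) and that each listing arrives as its own string. `SIZE_RE` and `extract_size` are hypothetical names used only for this example:

```python
import re
from typing import Optional

# "/ " and " -" anchor the match; only the parenthesised part is captured.
SIZE_RE = re.compile(r"/ (\d{1,2}\.?5?) -")

def extract_size(line: str) -> Optional[str]:
    """Return the captured size, or None when the line has no size field."""
    m = SIZE_RE.search(line)
    return m.group(1) if m else None

lines = [
    "BLK/BLK/CYEL / 8 - $500.00",
    "BLK/BLK/CYEL / 10.5 - $500.00",
]
sizes = [extract_size(line) for line in lines]
print(sizes)  # ['8', '10.5']
```

Because the price is never preceded by "/ ", it cannot match, which is the main advantage over a bare number pattern.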

HTTP/2 203

Hi,

It is going to be quite tricky to distinguish between multiple size formats. I would recommend splitting the string by spaces and just taking the 3rd item. It's not perfect, but a regex covering every size format would be massive.
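
As a sketch of the split approach in Python (an assumed language, and `size_by_position` is a made-up helper name): it only works while the size really is the 3rd space-separated item, as it is in the sample lines.

```python
def size_by_position(line: str, index: int = 2) -> str:
    """Split on whitespace and return the item at `index` (0-based, so 2 is the 3rd item)."""
    return line.split()[index]

line = "BLK/BLK/CYEL / 10.5 - $500.00"
# line.split() gives ['BLK/BLK/CYEL', '/', '10.5', '-', '$500.00']
print(size_by_position(line))  # 10.5
```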

Try, fail, learn, repeat...

Alternatively, you could use the following regex (I completely missed the spaces in the strings):

( \d+\.?\d? )

Remember to trim the spaces at the beginning and end.
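
A small Python sketch of this variant, with the dot escaped so it only matches a literal "." (Python and the name `PADDED_SIZE_RE` are assumptions for illustration). The capture includes the surrounding spaces, hence the `strip()`:

```python
import re

# Captures the number together with the space on each side of it.
PADDED_SIZE_RE = re.compile(r"( \d+\.?\d? )")

line = "BLK/BLK/CYEL / 10.5 - $500.00"
m = PADDED_SIZE_RE.search(line)
size = m.group(1).strip() if m else None
print(size)  # 10.5
```

Note that any other space-padded number earlier in the line would be picked up first, so this relies on the size being the only such number before the price.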

1 minute ago, zwirek2201 said:

It is going to be quite tricky to distinguish between multiple size formats. I would recommend splitting the string by spaces and just taking the 3rd item. It's not perfect, but a regex covering every size format would be massive.

The string is different every time, so the size may not be the 3rd item every time.

I would use the following regex:

\/\s(1?\d\.?\d?)\s-\s

It starts with / followed by a whitespace character.

It captures an optional leading 1 followed by any digit, then an optional . and an optional digit.

It ends with whitespace, -, whitespace.

https://repl.it/KUpk/5
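
For anyone who can't open the link, here is a minimal Python sketch of this regex run over the sample data from the first post, with the dot escaped so it only matches a literal ".". The language and the names `SIZE_RE` and `listing` are assumptions for illustration, not necessarily what the linked snippet contains:

```python
import re

# "/" + whitespace, then the captured size, then whitespace "-" whitespace.
SIZE_RE = re.compile(r"/\s(1?\d\.?\d?)\s-\s")

listing = """BLK/BLK/CYEL / 8 - $500.00
BLK/BLK/CYEL / 8.5 - $500.00
BLK/BLK/CYEL / 9 - $500.00
BLK/BLK/CYEL / 9.5 - $500.00
BLK/BLK/CYEL / 10 - $500.00
BLK/BLK/CYEL / 10.5 - $500.00
BLK/BLK/CYEL / 11 - $500.00
BLK/BLK/CYEL / 11.5 - $500.00"""

# With a single group, findall returns just the captured sizes.
sizes = SIZE_RE.findall(listing)
print(sizes)  # ['8', '8.5', '9', '9.5', '10', '10.5', '11', '11.5']
```

The slashes inside BLK/BLK/CYEL are skipped because they are not followed by whitespace, and the price is skipped because it is not followed by " - ".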

Business Management Student @ University St. Gallen (Switzerland)

HomeServer: i7 4930k - GTX 1070ti - ASUS Rampage IV Gene - 32Gb Ram

Laptop: MacBook Pro Retina 15" 2018

Operating Systems (Virtualised using VMware): Windows Pro 10, Cent OS 7

Occupation: Software Engineer
