
JavaScript Odd str.split behaviour, str.match works fine

Alef
Solved by Mira Yurizaki:

The parentheses in your regular expression create separate capture groups, and split adds the text captured by each group to its result array. If you get rid of them then it appears to work as intended.

/[0-9]{5}\s+-+\s+[A-Z]{3,4}\s+[0-9]{4}U/g

I was using https://regex101.com/ to verify this.
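To make the effect concrete, here is a minimal sketch on a shortened sample in the same shape as the data in the question. The variable names are made up for illustration. When the separator regex contains capture groups, split inserts each captured substring into the output array, which is why the pieces appear to break "in weird spots".

```javascript
// Shortened, hypothetical sample in the same shape as the original data.
const data = "74017 - SOFE 3850U,Lecture,T,74019 - SOFE 3850U,Laboratory,W";

// Pattern from the question: three capture groups.
const withGroups = /([0-9]{5})+\s+-+\s+([A-Z]{3,4})+\s+([0-9]{4})+U/g;
// Same pattern with the parentheses (and the redundant + after them) removed.
const withoutGroups = /[0-9]{5}\s+-+\s+[A-Z]{3,4}\s+[0-9]{4}U/g;

// The captured text of every group is spliced in between the real pieces.
const splitWithGroups = data.split(withGroups);
console.log(splitWithGroups);
// ["", "74017", "SOFE", "3850", ",Lecture,T,", "74019", "SOFE", "3850", ",Laboratory,W"]

// Without groups, only the text between the separators is returned.
const splitWithoutGroups = data.split(withoutGroups);
console.log(splitWithoutGroups);
// ["", ",Lecture,T,", ",Laboratory,W"]
```

The leading empty string is there because the data starts with a separator. If the grouping is still needed for some reason, non-capturing groups written as (?:...) should behave like the version without parentheses.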

I am trying to split a string on a matched regular expression, but it's not behaving as intended: it splits in weird spots. If I use the str.match function with the same regular expression, it finds all the things I want to split on, so I am kind of baffled as to what's going on.

 

I am trying to split using the following expression. It matches, for example, "74017 - SOFE 3850U", "74019 - SOFE 3850U", "74020 - SOFE 3850U", and so on.

/([0-9]{5})+\s+-+\s+([A-Z]{3,4})+\s+([0-9]{4})+U/g

 

 

 

Sample of the data I am attempting to split using the regular expression above:

74017 - SOFE 3850U,Lecture,11:10 am,12:30 pm,T,Lecture,11:10 am,12:30 pm,R,Lecture,74019 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,74020 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,12:40 pm,3:30 pm,W,Laboratory,74021 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,74283 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,12:40 pm,3:30 pm,T,Laboratory,74285 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,R,Laboratory,12:40 pm,3:30 pm,R,Laboratory,12:40 pm,3:30 pm,R,Laboratory,12:40 pm,3:30 pm,R,Laboratory,12:40 pm,3:30 pm,R,Laboratory,12:40 pm,3:30 pm,R,Laboratory,74022 - SOFE 3850U,Tutorial,2:10 pm,3:30 pm,M,Tutorial,74023 - SOFE 3850U,Tutorial,8:10 am,9:30 am,M,Tutorial
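For reference, the str.match behaviour described above can be reproduced with a small sketch on a shortened copy of the sample. The names sample, pattern and headers are only illustrative. With the g flag, match returns just the full matches, so the parentheses in the pattern have no visible effect on its output.

```javascript
// Shortened copy of the sample data (first two sections only).
const sample = "74017 - SOFE 3850U,Lecture,11:10 am,12:30 pm,T,Lecture," +
  "74019 - SOFE 3850U,Laboratory,12:40 pm,3:30 pm,W,Laboratory";

// The expression from the post, unchanged.
const pattern = /([0-9]{5})+\s+-+\s+([A-Z]{3,4})+\s+([0-9]{4})+U/g;

// With the g flag, match returns every full match and drops the groups.
const headers = sample.match(pattern);
console.log(headers);
// ["74017 - SOFE 3850U", "74019 - SOFE 3850U"]
```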

 

 

 

EDIT: Found a suboptimal solution by using str.replace first, then str.split. I was able to do "somestring".replace(regex, "^") and then split on "^". I still want to know why the split function isn't working as intended, though.
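A rough sketch of that workaround, on a shortened sample with illustrative variable names. It works even with the parenthesised pattern because replace with a plain string swaps out the whole match and ignores the groups. It does assume that "^" never appears in the data itself.

```javascript
// Shortened, hypothetical sample in the same shape as the original data.
const input = "74017 - SOFE 3850U,Lecture,T,74019 - SOFE 3850U,Laboratory,W";

// Pattern from the post, capture groups included.
const grouped = /([0-9]{5})+\s+-+\s+([A-Z]{3,4})+\s+([0-9]{4})+U/g;

// Step 1: replace every match with a sentinel character.
// Step 2: split on the sentinel, which is a plain string with no groups.
// Assumption: "^" does not occur anywhere in the input.
const viaReplace = input.replace(grouped, "^").split("^");
console.log(viaReplace);
// ["", ",Lecture,T,", ",Laboratory,W"]

// Splitting directly on the pattern without parentheses gives the same pieces.
const ungrouped = /[0-9]{5}\s+-+\s+[A-Z]{3,4}\s+[0-9]{4}U/g;
const direct = input.split(ungrouped);
console.log(direct);
// ["", ",Lecture,T,", ",Laboratory,W"]
```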

 


9 minutes ago, Mira Yurizaki said:

The parentheses in your regular expression create separate capture groups, and split adds the text captured by each group to its result array. If you get rid of them then it appears to work as intended.

/[0-9]{5}\s+-+\s+[A-Z]{3,4}\s+[0-9]{4}U/g

I was using https://regex101.com/ to verify this.

 

Wow, I was so close. I tried removing the "+" from the expression, thinking it might be the cause, but never tried removing the brackets. Thank you for your help.
