[Python] What would be the best way to remove everything but letters?

For a programming assignment I am supposed to write a function that removes everything from a string except letters. My current method has more than a few flaws, and I was wondering if there is a better way to do it. Here is the current function:

 

def remove_extras(string):
    return string.translate(None,' ?.!/;:=-)(*&^%$#@`~><\|}{,_+"'+"'")
 
Currently it just deletes every symbol in that list, but this is not 100% effective in testing.
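
To illustrate why a fixed list of symbols is fragile, here is a rough sketch of the same function in Python 3 syntax (an assumption, since the code above is Python 2; in Python 3, str.translate takes a table built with str.maketrans). The test input is made up for illustration. Anything missing from the list, such as digits, square brackets, or tabs, passes straight through:

```python
# Python 3 sketch of the blacklist approach (the original post uses Python 2).
# Same symbols as the original list; the backslash and single quote are escaped.
SYMBOLS = ' ?.!/;:=-)(*&^%$#@`~><\\|}{,_+"\''

def remove_extras(text):
    # The third argument to str.maketrans lists the characters to delete.
    return text.translate(str.maketrans('', '', SYMBOLS))

# Hypothetical input: the digit, brackets, and tab are not in SYMBOLS,
# so only the '!' is removed.
print(repr(remove_extras('a1[b]\tc!')))  # 'a1[b]\tc'
```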

You would probably be better off using a simple regex. Here:
 
 

import re

def only_alpha(text):
    regex = re.compile(r'[A-Za-z]+')
    ans = "".join(regex.findall(text))
    return ans

That should probably do it. If you have any confusion about the regex, just ask and I'll explain it to you.
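
As a side note, a different but equivalent technique is to delete the unwanted characters directly with re.sub and a negated character class, rather than collecting the matches with findall. This is only a sketch; the function name strip_non_letters is made up for illustration:

```python
import re

def strip_non_letters(text):
    # [^A-Za-z] matches any single character that is NOT an ASCII letter;
    # replacing each match with '' deletes it.
    return re.sub(r'[^A-Za-z]', '', text)

print(strip_non_letters('i_am doing**it ^&wrong 1234'))  # iamdoingitwrong
```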


    regex = re.compile(r'[A-Za-z]+')
    ans = "".join(regex.findall(text))

What are those 2 lines doing? I don't really know anything about regex.


The first line compiles a regular expression for later use.

 

I'll explain it piece by piece:

 

r' ' : a Python raw string, so that backslashes in the regex are not interpreted as string escape sequences

[ ] : a character class, which specifies the characters and/or ranges of characters to match

A-Z : all upper case letters, and a-z : all lower case letters

+ : a greedy quantifier that tells the regex engine to match 1 or more consecutive characters that fit the preceding criteria

 

Let's see an example:

 

string = 'i_am doing**it ^&wrong 1234'

 

We're using findall, so it will find all the parts of the string that match our regex and return them as a list:

 

list = ['i','am','doing','it','wrong']

 

Then we join everything in the list with the empty string and return "iamdoingitwrong".
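
The walkthrough above can be run as a short script. This is just a sketch of those steps; the variable names text, parts, and result are chosen here to avoid shadowing the built-in names string and list:

```python
import re

# Compile the pattern once: one or more consecutive ASCII letters.
regex = re.compile(r'[A-Za-z]+')

text = 'i_am doing**it ^&wrong 1234'

# findall returns every non-overlapping run of letters as a list.
parts = regex.findall(text)
print(parts)   # ['i', 'am', 'doing', 'it', 'wrong']

# Joining on the empty string glues the runs back together.
result = "".join(parts)
print(result)  # iamdoingitwrong
```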

 

It is a little complicated since you don't know much about regex yet, but it is a lot safer and shorter, and frankly it will become far easier once you learn regex.

 

You can learn more about regex here: http://docs.python.org/2/howto/regex.html


Or, if you prefer the not-so-regex way:

import string

someString = 'jkKALjd(@)(EJ)DJQ(@JD(Q*JD( NDM(*@QJD(Q*JD(*QJ(_@*J('
fixedString = ''.join([x for x in someString if x in (string.ascii_lowercase + string.ascii_uppercase)])
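
For the assignment, the same idea could be wrapped in a function. A minimal sketch, assuming only ASCII letters should be kept; the names LETTERS and keep_ascii_letters are made up here, and string.ascii_letters is simply the lowercase and uppercase constants combined:

```python
import string

# A set gives fast membership tests; ascii_letters is a-z followed by A-Z.
LETTERS = set(string.ascii_letters)

def keep_ascii_letters(text):
    # Keep each character only if it is an ASCII letter.
    return ''.join(ch for ch in text if ch in LETTERS)

print(keep_ascii_letters('jkKALjd(@)(EJ) 123'))  # jkKALjdEJ
```

Testing each character with str.isalpha() is another option, but in Python 3 it also accepts non-ASCII letters such as 'é', which may or may not be what the assignment wants.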


 

 

Actually, it's simply better.

