
Hey,

 

So I have a string of words and I need to lemmatize it.

 

I know that 

lmtzr.lemmatize("cats")

will output "cat"

 

but how do I get 

x = ("cats cats cats cat")
lmtzr.lemmatize(x)

to output "cat cat cat cat"?

As it is, it tries to lemmatize the whole string "cats cats cats cat" as a single word, and just gives back "cats cats cats cat".

 

I was trying to do a for loop, but it would not work...

for word in x:
	lmtzr.lemmatize(word)

Any idea?

CPU: Intel Core i7 8700k CPU Cooler: Corsair Hydro Series H100i Mobo:  Memory: G.Skill Ripjaws X 32GB 2133 Storage #1: 1TB 850 EVO SSD Storage #2: Western Digital Black 2TB Storage #3: Western Digital Green 4TB GPU: Gigabyte 980 Ti G1 Case: Mastercase5 PSU: EVGA 750 W G2 80+Gold Keyboard: Corsair K70 RGB Cherry MX Brown Mouse: Razer Deathadder Elite Monitor: LG 34UM94 Headset: Bose

Phone: Samsung Galaxy S9

Link to comment
https://linustechtips.com/topic/550655-lemmatizing-strings-in-python/

for word in x.split(" "):

 

this should divide the string at each space

then inside the loop you just need to append each output (plus a space) to a new string
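
A rough sketch of the split-then-append idea, in case it helps. `toy_lemmatize` is a made-up stand-in (it only strips a trailing "s") so the snippet runs without any library; in the thread's code, `lmtzr.lemmatize(word)` would go in its place.

```python
def toy_lemmatize(word):
    # Hypothetical stand-in for lmtzr.lemmatize: drops one trailing "s".
    return word[:-1] if word.endswith("s") else word

x = "cats cats cats cat"
result = ""
for word in x.split(" "):
    # Lemmatize each word and append it, plus a space, to the new string.
    result += toy_lemmatize(word) + " "
result = result.strip()  # drop the trailing space left by the last append
print(result)  # cat cat cat cat
```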

NEW PC build: Blank Heaven   minimalist white and black PC     Old S340 build log "White Heaven"        The "LIGHTCANON" flashlight build log        Project AntiRoll (prototype)        Custom speaker project

Spoiler

Ryzen 3950X | AMD Vega Frontier Edition | ASUS X570 Pro WS | Corsair Vengeance LPX 64GB | NZXT H500 | Seasonic Prime Fanless TX-700 | Custom loop | Coolermaster SK630 White | Logitech MX Master 2S | Samsung 980 Pro 1TB + 970 Pro 512GB | Samsung 58" 4k TV | Scarlett 2i4 | 2x AT2020

 


15 minutes ago, Enderman said:

for word in x.split(" "):

 

this should divide the string at each space

then inside the loop you just need to append each output (plus a space) to a new string

That did not work...
 

<code>

x = ("cats cats cats cat dogs dog")

for word in x.split(" "):
#    print (word)
    
lmtzr.lemmatize(word)

</code>

printed out 

'dog'


10 minutes ago, paps511 said:

That did not work...
 

<code>

x = ("cats cats cats cat dogs dog")

for word in x.split(" "):
#    print (word)
    
lmtzr.lemmatize(word)

</code>

printed out 


'dog'

hm

i don't have python on my pc so i can't test, sorry

if you run your original code with the

for word in x:
    print(word)

 

what does it print out?


18 minutes ago, Enderman said:

for word in x.split(" "):

 

this should divide the string at each space

then inside the loop you just need to append each output (plus a space) to a new string

c
a
t
s
 
c
a
t
s
 
c
a
t
s
 
c
a
t
 
d
o
g
s
 
d
o
g
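
For what it's worth, the output above is what Python does with any string: looping over a `str` directly yields one character at a time, spaces included, while `split` yields whole words. A small sketch of the difference (the variable names are just for illustration):

```python
x = "cats cat dogs dog"

chars = [piece for piece in x]          # iterating a str gives single characters
words = [piece for piece in x.split()]  # split() gives whitespace-separated words

print(chars[:5])  # ['c', 'a', 't', 's', ' ']
print(words)      # ['cats', 'cat', 'dogs', 'dog']
```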


4 minutes ago, paps511 said:

c
a
t
s
 
c
a
t
s
 
c
a
t
s
 
c
a
t
 
d
o
g
s
 
d
o
g

ok yeah you definitely need to use split

 

try this:

 

x = 'cats cats cats cat dogs dog'

for word in x.split(" "):
    # lemmatize() returns the result; it does not change word itself
    print(lmtzr.lemmatize(word))

 

make sure you don't forget the space between the quotes in " "


10 minutes ago, Enderman said:

ok yeah you definitely need to use split

 

try this:

 

x = 'cats cats cats cat dogs dog'

for word in x.split(" "):
    # lemmatize() returns the result; it does not change word itself
    print(lmtzr.lemmatize(word))

 

make sure you don't forget the space between the quotes in " "

Yep, that was part of it, I got it to work.

I needed another variable (lemmed), set to " "

then 

for i in contractionless.split():
    lemmed += (lmtzr.lemmatize(i)) + " "
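
One possible tidy-up of the loop above, as a sketch only: starting `lemmed` at " " and adding " " after every word leaves a space at both ends, and `" ".join(...)` avoids that. `lemmatize_text` and `toy_lemmatize` are hypothetical names; the toy function only strips a trailing "s" so the example runs on its own, and `lmtzr.lemmatize` would be passed in its place.

```python
def lemmatize_text(text, lemmatize):
    # split() with no argument splits on any run of whitespace,
    # so double spaces or tabs do not produce empty "words".
    return " ".join(lemmatize(word) for word in text.split())

def toy_lemmatize(word):
    # Hypothetical stand-in for lmtzr.lemmatize: drops one trailing "s".
    return word[:-1] if word.endswith("s") else word

contractionless = "cats cats  cats cat dogs dog"
lemmed = lemmatize_text(contractionless, toy_lemmatize)
print(lemmed)  # cat cat cat cat dog dog
```

With the thread's lemmatizer, the call would presumably look like `lemmatize_text(contractionless, lmtzr.lemmatize)`.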


1 minute ago, paps511 said:

Yep, that was part of it, I got it to work.

I needed another variable (lemmed), set to " "

then 

for i in contractionless.split():
    lemmed += (lmtzr.lemmatize(i)) + " "

yay :)
