
Could someone help me with this Python code?

Go to solution Solved by Sauron,
So I'm taking a free Python course on Kaggle, and one of the exercises was to print the index of the first occurrence of a word within a list of strings. For some reason, even though the word "car" is present in the list, I'm not getting the correct output. I just keep getting "absent".

Here's the code:

Quote

 

doc_list=['The Learn Python Challenge Casino', 'They bought a car, and a horse', 'Casinoville?']
keyword="car"

def word_search(doc_list, keyword):
    list1=[]
    for i, doc in enumerate(doc_list):
            words = doc.rstrip('.,').lower().split()
            if keyword.lower() in words:
               list1.append(i)
               return list1

 

 

https://linustechtips.com/topic/1400478-could-someone-help-me-with-this-python-code/

7 minutes ago, Sauron said:

you defined the function but never called it...

Oh, that seems to have fixed the compiler issue. Sorry, I've been spoiled by the Kaggle compiler. When it has you writing commands in a function body, it calls the function on its own to let you see the output.

With that being said, I am having this one issue which is that the word "car" isn't being detected even though it's within the list. Any advice on how I can fix that?


8 minutes ago, LebowskiBuschemi said:

With that being said, I am having this one issue which is that the word "car" isn't being detected even though it's within the list. Any advice on how I can fix that?

Try printing out the content of "words", it may clue you in to what's going wrong. I can already tell you that rstrip isn't doing what you think it's doing; it only works on trailing characters, not characters in the middle of the string. You either need to split the sentence and then remove trailing commas or remove the unwanted characters some other way.
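To see this concretely, here is a small sketch using the second sentence from the original post. The variable names `sentence` and `words_wrong` are only illustrative:

```python
sentence = "They bought a car, and a horse"

# rstrip only removes characters from the right end of the whole string,
# so the comma after "car" in the middle of the sentence is left untouched.
words_wrong = sentence.rstrip(".,").lower().split()
print(words_wrong)           # ['they', 'bought', 'a', 'car,', 'and', 'a', 'horse']
print("car" in words_wrong)  # False, because the token is 'car,' and not 'car'
```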

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*


24 minutes ago, Sauron said:

Try printing out the content of "words", it may clue you in to what's going wrong. I can already tell you that rstrip isn't doing what you think it's doing; it only works on trailing characters, not characters in the middle of the string. You either need to split the sentence and then remove trailing commas or remove the unwanted characters some other way.

I looked up the solution for the exercise and the person in the solution is using rstrip in this way:

Quote

tokens = doc.split()
words = [token.rstrip('.,').lower() for token in tokens]

Is there a reason why he didn't just use the split and rstrip functions in a single line and why is there a loop being used? Is the loop being used so that wherever a comma is indexed within the list, it's removed from that indexed location?

 


3 minutes ago, LebowskiBuschemi said:

Is there a reason why he didn't just use the split and rstrip functions in a single line and why is there a loop being used?

because rstrip operates on a single string while split returns a list. if you really want a one-liner you can pull it off like this:

words = [token.rstrip(".,").lower() for token in doc.split()]
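As a quick check, this is roughly what that one-liner produces for the sentence from the original post (a sketch; `doc` is assumed to hold a single sentence):

```python
doc = "They bought a car, and a horse"

# Split first, then strip trailing punctuation from each token separately.
words = [token.rstrip(".,").lower() for token in doc.split()]
print(words)           # ['they', 'bought', 'a', 'car', 'and', 'a', 'horse']
print("car" in words)  # True
```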

 


16 minutes ago, Sauron said:

because rstrip operates on a single string while split returns a list. if you really want a one-liner you can pull it off like this:

words = [token.rstrip(".,").lower() for token in doc.split()]

 

What about the loop? Are we using a second loop so we go over the list again to remove any commas and periods? Can't we just do that using the first loop instead of having to use a second one?


27 minutes ago, LebowskiBuschemi said:

What about the loop? Are we using a second loop so we go over the list again to remove any commas and periods?

yes, we're iterating through the list to apply the rstrip function to each individual string.

28 minutes ago, LebowskiBuschemi said:

Can't we just do that using the first loop instead of having to use a second one?

there is only one loop here...
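For comparison, the list comprehension is roughly equivalent to the explicit loop below. This is only a sketch, and `words_loop` is an illustrative name:

```python
doc = "They bought a car, and a horse"

# List comprehension form, as in the one-liner quoted earlier in the thread.
words = [token.rstrip(".,").lower() for token in doc.split()]

# Equivalent explicit loop: the same single pass over the tokens.
words_loop = []
for token in doc.split():
    words_loop.append(token.rstrip(".,").lower())

print(words == words_loop)  # True
```

Either way, this is one pass over the words of a single sentence. The loop over `doc_list` is a separate, outer loop over the sentences.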


4 minutes ago, Sauron said:

there is only one loop here...

I was referring to the first loop. The one with the enumeration command. Is there a possible way we could just use the first loop somehow to remove the commas and periods without having to use the second loop, which is the token one? Also, regarding the enumerate command, how come we use it in Python? Aren't lists indexed to begin with? Can't we somehow work with the indexes of the data within the list without having to use the command?

I'm sorry if the questions seem a bit absurd. I just want to get a good grip over the language itself. 


22 minutes ago, LebowskiBuschemi said:

I was referring to the first loop. The one with the enumeration command.

you need that one because you're iterating over a list of sentences, each of which is turned into a list of words with split.

23 minutes ago, LebowskiBuschemi said:

Also, regarding the enumerate command, how come we use it in Python? Aren't lists indexed to begin with?

Lists are indexed but they don't contain the index as data. When you use the for command like that you're getting a single element without the index. If you want you can use a less abstract syntax to skip the enumerate:

for i in range(len(doc_list)):
  words = [token.rstrip(".,").lower() for token in doc_list[i].split()]

but there is no way to determine the index of a list element from the element itself.
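Putting the pieces of this thread together, a minimal sketch of the whole function might look like the following. It assumes the goal is to collect the index of every sentence containing the keyword, and it moves the `return` out of the loop so that every sentence is checked; `matches` and `word_search_by_index` are illustrative names, not part of the exercise:

```python
def word_search(doc_list, keyword):
    """Return the indices of the sentences that contain keyword as a whole word."""
    matches = []
    for i, doc in enumerate(doc_list):
        words = [token.rstrip(".,").lower() for token in doc.split()]
        if keyword.lower() in words:
            matches.append(i)
    return matches  # returned after the loop so every sentence is checked


def word_search_by_index(doc_list, keyword):
    """Same logic, using range(len(...)) instead of enumerate."""
    matches = []
    for i in range(len(doc_list)):
        words = [token.rstrip(".,").lower() for token in doc_list[i].split()]
        if keyword.lower() in words:
            matches.append(i)
    return matches


doc_list = ['The Learn Python Challenge Casino',
            'They bought a car, and a horse',
            'Casinoville?']

# The function has to be called explicitly to produce any output.
print(word_search(doc_list, "car"))           # [1]
print(word_search_by_index(doc_list, "car"))  # [1]
```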


4 minutes ago, Sauron said:

you need that one because you're iterating over a list of sentences, each of which is turned into a list of words with split.

Lists are indexed but they don't contain the index as data. When you use the for command like that you're getting a single element without the index. If you want you can use a less abstract syntax to skip the enumerate:

for i in range(len(doc_list)):
  words = [token.rstrip(".,").lower() for token in doc_list[i].split()]

but there is no way to determine the index of a list element from the element itself.

Thank you so much for all the help you've given me. Have a good day!


3 minutes ago, LebowskiBuschemi said:

Thank you so much for all the help you've given me. Have a good day!

you can mark a post as "best answer" to close the thread 👍

