
Study Asks

Spirit`yL

Hello, guys. I know this page might not be about hardware, so please don't report me. Thanks a lot.

Well, recently I had to learn a third-party Python library, BeautifulSoup (does it really taste good? 😥), and now I have a project (actually it's homework) that requires me to count how many times each name appears in an HTML page.

Here is my approach: extract all the names into a txt file, then use split() and a dictionary to count them. But here's the thing: when I try to extract the names with this BS module into a txt file, only one name appears!! Where is my mistake? Please point it out to me if you are studying Python now, or if you work with Python or have mastered it.

Thanks a lot.

Attachments: QQ截图20200325112306.png, catch.txt, PythonWebCatchApplication1.py


You should write to the file inside the for loop. The way you're doing it, you're only saving whatever is left in the span variable after the loop finishes.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


Just now, igormp said:

You should write to the file inside the for loop. The way you're doing it, you're only saving whatever is left in the span variable after the loop finishes.

How should I do it? Can you help me correct my file? The link is below, thanks 🤣


2 minutes ago, Spirit`yL said:

How should I do it? Can you help me correct my file? The link is below, thanks 🤣

Doing something like that should do the trick:

f=open('catch.txt','a')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')

 



12 minutes ago, igormp said:

Doing something like that should do the trick:


f=open('catch.txt','a')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')

 

Got it, dude, I'll try it.


20 minutes ago, igormp said:

Doing something like that should do the trick:


f=open('catch.txt','a')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')

 

BTW, what's the meaning of mode 'a'?


Just now, Spirit`yL said:

import requests
r=requests.get('http://www.pythonscraping.com/pages/warandpeace.html')
info=r.text
import bs4
result=bs4.BeautifulSoup(info,'html.parser')
name=bs4.BeautifulSoup.find_all(result,'span',{'class':"green"})
f=open('catch.txt','w')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')
fin=f.read()
lis=split(fin,'\n')
d=[]
for word in lis:
    if lis not in word:
        d[lis]=1
    else:
        d[lis]+=1
print(d)

That is exactly what I wrote, but Visual Studio still says it's unreadable.

 


24 minutes ago, Spirit`yL said:

 

import requests
r=requests.get('http://www.pythonscraping.com/pages/warandpeace.html')
info=r.text
import bs4
result=bs4.BeautifulSoup(info,'html.parser')
name=bs4.BeautifulSoup.find_all(result,'span',{'class':"green"})
f=open('catch.txt','w')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')
f.close()
fin=open('catch.txt', 'r')
names=fin.read().split('\n')
d={}
for word in names:
    if d.get(word):
        d[word]+=1
    else:
        d[word]=1
print(d)
fin.close()

You can do something like this. Alternatively, you can use f.seek(0) to go back to the beginning of the file instead of closing and reopening it. Make sure you close the file after you're done reading it, though.

 

Also, you could use a Counter instead of writing to a file (or just count the names from the "name" array the way you are currently doing it).
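For reference, a minimal sketch of the Counter approach, using a small inline HTML snippet in place of the real War and Peace page (the same `<span class="green">` markup is assumed), so there's no need for the intermediate txt file at all:

```python
# Hypothetical sketch: counting names with collections.Counter instead of
# writing them to a file first. The inline HTML below stands in for the
# scraped page and assumes the same <span class="green"> markup.
from collections import Counter

import bs4

html = (
    '<span class="green">Anna Pavlovna</span>'
    '<span class="green">Prince Vasili</span>'
    '<span class="green">Anna Pavlovna</span>'
)

soup = bs4.BeautifulSoup(html, 'html.parser')
# Counter tallies each name as it comes out of find_all()
counts = Counter(span.get_text() for span in soup.find_all('span', {'class': 'green'}))
print(counts)
```

With the real page you would feed `r.text` to BeautifulSoup instead of the inline snippet; the counting line stays the same.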


3 hours ago, elpiop said:

import requests
r=requests.get('http://www.pythonscraping.com/pages/warandpeace.html')
info=r.text
import bs4
result=bs4.BeautifulSoup(info,'html.parser')
name=bs4.BeautifulSoup.find_all(result,'span',{'class':"green"})
f=open('catch.txt','w')
for span in name:
    print(span.string)
    f.write(span.string)
    f.write('\n')
f.close()
fin=open('catch.txt', 'r')
names=fin.read().split('\n')
d={}
for word in names:
    if d.get(word):
        d[word]+=1
    else:
        d[word]=1
print(d)
fin.close()

You can do something like this. Alternatively, you can use f.seek(0) to go back to the beginning of the file instead of closing and reopening it. Make sure you close the file after you're done reading it, though.

 

Also, you could use a Counter instead of writing to a file (or just count the names from the "name" array the way you are currently doing it).

Well, why do you call close() on the file you opened with the open command?

