
Scraping a Google Auth protected page

ShatteredPsycho

Hello everyone,

 

I am trying to scrape Google Postmaster Tools so that I can automate investigative tasks at work. So far I haven't been able to get past Google's two-step verification.

I am using Scrapy and will be trying Selenium next.

Does anyone have any idea on how to go about this?

 

So far, after injecting the password, I get:
 

[scrapy.core.engine] DEBUG: Crawled (200) <GET https://accounts.google.com/signin/rejected?hl=pt-PT&rrk=47&rhlk=js> (referer: https://accounts.google.com/signin/v1/lookup)

 

I was also looking into app passwords for less secure apps, but haven't had any luck there either.

 

Any tips appreciated.


Not familiar with Scrapy. If it gives you a way to view the HTTP request it's sending, I'd check the headers and compare them to what Chrome or Firefox sends. I suspect the User-Agent and Referer strings don't match what Google normally gets when a browser sends the request, so you're being denied.
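A minimal sketch of that comparison — the header values below are copied from a typical desktop Chrome request and are assumptions, not what Google actually requires; grab the real ones from your own browser's devtools Network tab:

```python
# Browser-like headers to compare against what Scrapy sends by default.
# These values mimic a 2019-era desktop Chrome request; replace them with
# the exact headers your browser shows in devtools (Network tab).
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/76.0.3809.100 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://accounts.google.com/signin/v1/lookup",
}

def missing_headers(sent_headers):
    """Return the browser headers a request is missing or sending differently."""
    return {
        name: value
        for name, value in BROWSER_HEADERS.items()
        if sent_headers.get(name) != value
    }

# In a Scrapy spider you would pass these per request, e.g.:
#   yield scrapy.Request(url, headers=BROWSER_HEADERS, callback=self.parse)
# or set them globally via the DEFAULT_REQUEST_HEADERS / USER_AGENT settings.
```

To see what Scrapy actually sent, you can log `response.request.headers` in the callback and diff it against the browser's copy with `missing_headers`.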

 

Additionally, how are you submitting the form? The easiest way to make sure it's done properly is to select the submit button and perform the click action on it; that way, any additional JavaScript that needs to run will run.
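In Selenium, that click-instead-of-form-submit approach might look like the sketch below. The selectors are guesses — Google's sign-in markup changes often, so check them against the live page:

```python
def submit_via_click(driver, password):
    """Fill the password field and click the submit button, so any
    JavaScript attached to the button still runs.

    `driver` is a selenium webdriver instance already on the password page.
    The NAME/ID selectors below are assumptions about Google's sign-in form.
    """
    # Imported here so the helper can be defined without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.find_element(By.NAME, "password").send_keys(password)
    # Click the button rather than calling form.submit(): clicking fires
    # any onclick handlers / JS validation the page relies on.
    driver.find_element(By.ID, "passwordNext").click()
```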

 

You might also try adding a delay of around 3-5 seconds between individual form submissions, so that it doesn't look like a bot is submitting them.


If you set up 2FA on your Google account and log in manually (triggering the 2FA once), you won't get the two-step verification prompt again. Then you can automate the login (using Selenium) with a simple button click, using Chrome's autofill for the credentials.
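A sketch of that idea: point Selenium's Chrome at the profile where you already completed the 2FA login, so the saved session carries over. The profile path is a placeholder — substitute your own Chrome user-data directory:

```python
def chrome_profile_args(profile_dir):
    """Build the Chrome arguments that reuse an existing logged-in profile.

    `profile_dir` is the user-data directory of a Chrome profile where
    you have already signed in (and cleared the 2FA prompt) manually.
    """
    return [
        f"--user-data-dir={profile_dir}",
        "--profile-directory=Default",
    ]

def make_driver(profile_dir):
    """Start Chrome on the saved profile (needs selenium + chromedriver)."""
    # Imported here so chrome_profile_args works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_profile_args(profile_dir):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```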


On 8/12/2019 at 3:43 PM, JacobFW said:

Not familiar with Scrapy. If it gives you a way to view the HTTP request it's sending, I'd check the headers and compare them to what Chrome or Firefox sends. I suspect the User-Agent and Referer strings don't match what Google normally gets when a browser sends the request, so you're being denied.

 

Additionally, how are you submitting the form? The easiest way to make sure it's done properly is to select the submit button and perform the click action on it; that way, any additional JavaScript that needs to run will run.

 

You might also try adding a delay of around 3-5 seconds between individual form submissions, so that it doesn't look like a bot is submitting them.

 

On 8/13/2019 at 2:26 PM, FlappyBoobs said:

If you set up 2FA on your Google account and log in manually (triggering the 2FA once), you won't get the two-step verification prompt again. Then you can automate the login (using Selenium) with a simple button click, using Chrome's autofill for the credentials.

The Scrapy approach wasn't getting results. I set up headless Selenium and managed to get it working. I scrape the data via XPath and it all works.

The only way I think Scrapy would work is if I carried all the headers between requests and managed the timing between them; at the moment it's not worth it, since the task I'm building this for only needs to run once a day.

Selenium is working wonders. I'm finishing up some details and might release it sometime next week. It might be useful for others.
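For anyone following along, a rough shape of what such a headless XPath scrape can look like. The URL and XPath are placeholders, not the actual Postmaster Tools selectors:

```python
POSTMASTER_URL = "https://postmaster.google.com/"
# Placeholder XPath — the real Postmaster Tools selectors would go here.
TABLE_ROWS_XPATH = "//table//tr"

def scrape_rows(url=POSTMASTER_URL, rows_xpath=TABLE_ROWS_XPATH):
    """Load the page in headless Chrome and return the text of each row."""
    # Imported here so the module loads without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return [row.text for row in driver.find_elements(By.XPATH, rows_xpath)]
    finally:
        driver.quit()
```

Run this against an already-authenticated profile (see the 2FA tip above) or it will land on the sign-in page instead of the data.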

