Help on postback webpages, with C++

I am making a program in C++ that needs to download HTML files and extract info from them. The program tries to work out my exact NYSEG bill using data from their website. Being a novice at web interaction, I managed to figure out how to download webpages with this code:

 

#include <iostream>
#include <fstream>
#include <string>
#include <tchar.h>
#include <urlmon.h>
#pragma comment(lib, "urlmon.lib")

using namespace std;

int main() {

	ifstream fin;
	string tempString;

	// Download the page to a local file; URLDownloadToFile returns an HRESULT.
	HRESULT hr = URLDownloadToFile(NULL, _T("http://www.nyseg.com/SuppliersAndPartners/pricingandtariffs/electricitytariffs/transitionchargestatements.html"), _T("nyseg.txt"), 0, NULL);

	if (FAILED(hr)) {
		cerr << "Download failed, HRESULT = 0x" << hex << hr << endl;
		return 1;
	}

	fin.open("nyseg.txt");
	if (!fin.is_open()) {
		cerr << "Could not open nyseg.txt" << endl;
		return 1;
	}

	// Print the first 20 lines of the downloaded HTML.
	for (int i = 0; i < 20 && getline(fin, tempString); i++) {
		cout << tempString << endl << endl;
	}
	fin.close();

	return 0;
}

 

So as long as I had the URL I could download pages and avoid actually interacting with them, until I noticed something I think is called a postback. This page, https://ebiz1.nyseg.com/cusweb/opcosupplyprice.aspx, is a tool for their prices, and when you submit the form the page updates with new information yet the URL doesn't change. I spent a few hours looking at the code and researching and I still don't understand it. It uses JavaScript in some way, and I don't know that language. I see this code in the submit button:
 

onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;btnPage1&quot;, &quot;&quot;, true, &quot;&quot;, &quot;&quot;, false, false))"

I simply want to know how to download the updated page so that I can extract the information I need.

-Thanks


You can use the Network tab in your browser's dev tools to see where and what data are sent.

 

If the site you're trying to crawl doesn't require you to log in, that's one problem less. But if it does, you need a cookie jar to persist the session cookies. I don't know the capabilities of urlmon, but I know the curl library is capable of this.

 

I cannot check the site; it is unreachable for me.


The URL doesn't change because the page uses an HTTP POST request to retrieve the information, as opposed to an HTTP GET request, which would look like this:

http://website.com/page.aspx?parameter=value&parameter2=value

You either have to perform the POST request manually, or interact with the page and do the postback event, then download the updated page.

