Hello guys,

Full disclaimer: I have close to zero knowledge of anything in this topic.

I am having trouble properly scripting an HTTP GET request. The API only returns 50 records at a time; to get results starting from the 51st record, we need to set a pagination variable "n", so n=51.

I need to get about 2000 records, so roughly 40 queries, where the last one will be something like n=1951. A bash script could increment the variable n up to a limit and append the results of each query. However, each new page will then begin with <?xml, which will break the XML file and make it unreadable in something like Excel.
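To make the arithmetic concrete, here is a minimal sketch of how the values of n could be generated, assuming the first page starts at n=1 and every page holds exactly 50 records. The names `page_size`, `total` and `page_offsets` are only illustrative.

```shell
#!/usr/bin/env bash
# Assumed values taken from the description above.
page_size=50   # records returned per request
total=2000     # approximate number of records wanted

# Print the starting record number (n) of every page, one per line.
page_offsets() {
  # seq FIRST INCREMENT LAST
  seq 1 "$page_size" "$total"
}

page_offsets | head -n 3   # first three values of n
page_offsets | tail -n 1   # value of n for the last request
page_offsets | wc -l       # number of requests needed
```

Under those assumptions the last value is 1951 and there are 40 requests in total, which matches the estimate above.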

My question is: is there a way to keep incrementing the variable n, but without adding the <?xml at the start of each page?

Thanks!


I can think of multiple solutions for this:

  • Save each page to a separate XML file
  • Run the output through a regex to remove all <?xml declarations except the first one (or just remove all of them and then add the first one back manually)
  • Start the concatenation of each new page after the end of its <?xml ... ?> declaration instead of from its first character
  • Extract the data you need from each page and create your XML file with the desired format within the script
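As a rough sketch of how the first two options could be combined in bash: save every page to its own file, strip the <?xml declaration from each one, and write a single declaration at the top of the merged file. The URL `https://api.example.com/records`, the wrapper element `<pages>` and the function names are all made up for illustration, so adjust them to the real API. Note that stripping the declarations alone still leaves several root elements in one file, which is not well-formed XML either, hence the wrapper element.

```shell
#!/usr/bin/env bash

# Hypothetical endpoint; replace with the real API URL.
BASE_URL="https://api.example.com/records"

# Remove a leading XML declaration (<?xml ... ?>) from the first line of stdin.
strip_xml_decl() {
  sed '1s/^<?xml[^>]*?>//'
}

# Merge the page files given as arguments into one document on stdout.
merge_pages() {
  local f
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
  printf '%s\n' '<pages>'    # hypothetical wrapper so there is one root element
  for f in "$@"; do
    strip_xml_decl < "$f"
  done
  printf '%s\n' '</pages>'
}

# Download every page into its own file, then merge them into all.xml.
# Defined but not called here; run it as: fetch_all ./pages
fetch_all() {
  local out_dir=$1 n
  mkdir -p "$out_dir" || return 1
  for n in $(seq 1 50 1951); do
    # Zero-padded file names keep the glob below in numeric order.
    curl -fsS "${BASE_URL}?n=${n}" \
      -o "${out_dir}/page_$(printf '%04d' "$n").xml" || return 1
  done
  merge_pages "$out_dir"/page_*.xml > "${out_dir}/all.xml"
}
```

Keeping the individual page files around also means a failed download can be retried without fetching everything again. If the API's XML declaration is not at the very start of the first line (for example because of a byte order mark), the `sed` pattern would need adjusting.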


