
Get HTML of a website using Ajax

Adonis4000

My end goal is to get the HTML of a website and modify it to suit my needs.

However, I can't for the life of me get the HTML.

Here is my code so far. Note that I am using jQuery.

 

index.html

<!DOCTYPE html>
<html lang="en">

<head>
    <title>Document</title>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.1/jquery.min.js"></script>
    <script src="EditHtml.js" type="text/javascript"></script>
</head>

<body>
    <p>Placeholder Text</p>
</body>

</html>

EditHtml.js

$(document).ready(function () {
    $.ajax({
        url: 'my-link',
        type: 'GET',
        success: function(data) { $(document).html(data); }
    })
        .done(function() {
            alert( "success" );
        })
        .fail(function() {
            alert( "error" );
        })
        .always(function() {
            alert( "complete" );
        });
});

The alerts are always "error" and then "complete". I tried multiple URLs and nothing worked.


I have to imagine you're probably coming across Cross Origin request errors.  Take a look at the network tab of the dev tools of the browser you have this running in.  You'll be able to see more detail about the request and why it was rejected.
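
If it helps, the `.fail()` callback also receives details about the failure, so a minimal sketch like the one below can log them instead of only alerting "error". The `describeFailure` helper is a hypothetical name, not part of jQuery. A status of 0 usually means the browser never handed the response to the script, which is what a blocked cross-origin request looks like.

```javascript
// Hypothetical helper (not part of jQuery): turns the arguments that
// jQuery passes to .fail() into a readable message.
function describeFailure(jqXHR, textStatus, errorThrown) {
    // status 0 means the script never got to see a response, which is
    // typical for requests blocked by the same-origin policy.
    if (jqXHR.status === 0) {
        return 'Request blocked or network error (' + textStatus + ')';
    }
    return 'HTTP ' + jqXHR.status + ' ' + (errorThrown || textStatus);
}

// Usage in the browser, with jQuery loaded ('my-link' as in the question):
// $.ajax({ url: 'my-link', type: 'GET' })
//     .fail(function (jqXHR, textStatus, errorThrown) {
//         console.error(describeFailure(jqXHR, textStatus, errorThrown));
//     });
```

The console and the network tab still give the actual reason; scripts are deliberately not told why a cross-origin request was blocked.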


19 hours ago, Vicarian said:

I have to imagine you're probably coming across Cross Origin request errors.  Take a look at the network tab of the dev tools of the browser you have this running in.  You'll be able to see more detail about the request and why it was rejected.

You are correct. Thanks for the tip.
The error was:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at "my-link". (Reason: CORS header ‘Access-Control-Allow-Origin’ missing). Status code: 200.
I tried changing my code to allow cross-origin requests, but I still can't get it to work.
 

$(document).ready(function () {
    $.ajax({
        url: 'my-link',
        type: 'GET',
        // dataType: "html",
        beforeSend: function(xhr) {
            xhr.setRequestHeader('Access-Control-Allow-Credentials', 'true');
            xhr.setRequestHeader('Access-Control-Allow-Headers', 'Content-Type');
            xhr.setRequestHeader('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE, OPTIONS');
            xhr.setRequestHeader('Access-Control-Allow-Origin', '*');
        },
        success: function(data) { $(document).html(data); }
        })
        .done(function() {
            alert( "success" );
        })
        .fail(function() {
            alert( "error" );
        })
        .always(function() {
            alert( "complete" );
        });
});

Now I also get this error:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at "my-link". (Reason: CORS request did not succeed). Status code: (null).

 

To my understanding, this is a security feature, and Access-Control-Allow-Origin would need to be set on the target website's server to allow my website, but I don't see why. I'm just trying to get the HTML that everyone has access to when they load the website.


6 hours ago, Adonis4000 said:

To my understanding, this is a security feature, and Access-Control-Allow-Origin would need to be set on the target website's server to allow my website, but I don't see why. I'm just trying to get the HTML that everyone has access to when they load the website.

Yes, it's a security feature. It is intended to protect against malicious code from third parties interfering with a website.

https://medium.com/bigcommerce-developer-blog/lets-talk-about-cors-84800c726919
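
To make that a bit more concrete: the `Access-Control-Allow-*` headers are response headers sent by the target server, and the browser checks them before letting a script read the response. Setting them on the request in `beforeSend` has no effect on that check, and custom request headers also make the browser send a preflight `OPTIONS` request first, which is one more step that can fail. A rough sketch of the server-side decision, where the allow-list, the origin in it, and the function name are all assumptions for illustration:

```javascript
// Hypothetical allow-list kept by the target server; the origin is a placeholder.
const allowedOrigins = ['https://my-site.example'];

// Returns the CORS response headers for the value of a request's Origin header.
// An empty object means the browser will not expose the response to the page.
function corsHeadersFor(requestOrigin) {
    if (allowedOrigins.includes(requestOrigin)) {
        return {
            'Access-Control-Allow-Origin': requestOrigin,
            // The response differs per origin, so caches must key on it.
            'Vary': 'Origin'
        };
    }
    return {};
}
```

Only whoever runs the target server can add this, which is why it can't be fixed from the page's own script.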



23 hours ago, Eigenvektor said:

Yes, it's a security feature. It is intended to protect against malicious code from third parties interfering with a website.

https://medium.com/bigcommerce-developer-blog/lets-talk-about-cors-84800c726919

function getHTML() {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function() {
    // DONE is reached on failure too, so check the status before using the body
    if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200) {
      var html = xhr.responseText;
      // Do something with the HTML here
    }
  };
  xhr.open('GET', 'https://example.com', true);
  xhr.send();
}

 

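async function getHTML() {
  const response = await fetch('https://example.com');
  // fetch only rejects on network errors, so check the HTTP status explicitly
  if (!response.ok) throw new Error('HTTP ' + response.status);
  const html = await response.text();
  // Do something with the HTML here
}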

 

Will this do it?
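
For what it's worth, in a browser both `XMLHttpRequest` and `fetch` go through the same cross-origin check as `$.ajax`, so they would be blocked the same way unless the target sends the CORS header. Outside the browser that check does not apply. Below is a minimal sketch assuming Node.js 18 or later (for the built-in `fetch`); `rewriteTitle` and `getModifiedHtml` are hypothetical names, and replacing the title is only a stand-in for whatever modification is actually needed.

```javascript
// Hypothetical example of a modification: replace the contents of <title>.
// A replacer function is used so '$' in newTitle is not treated specially.
function rewriteTitle(html, newTitle) {
    return html.replace(/<title>[\s\S]*?<\/title>/i, function () {
        return '<title>' + newTitle + '</title>';
    });
}

// Fetches a page server-side, where the browser's same-origin policy
// does not apply. Assumes Node.js 18+ for the global fetch.
async function getModifiedHtml(url) {
    const response = await fetch(url);
    if (!response.ok) {
        throw new Error('Request failed with status ' + response.status);
    }
    const html = await response.text();
    return rewriteTitle(html, 'Modified');
}

// Usage (run with: node script.js):
// getModifiedHtml('https://example.com').then(console.log).catch(console.error);
```

Serving the result from your own server then keeps everything same-origin for the page that displays it.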

