
c++ Encoding Gzip/Deflate?

Go to solution Solved by Mr_KoKa,

Hi.

I'm coding a web server in C++, reading up on request headers, and testing different browsers.

I am now building a class that stores and generally works with the request headers' info.

Everything seems pretty straightforward and has been working wonders, but I came across Accept-Encoding.

 

Now Firefox and Chrome both accept gzip and deflate.

 

I don't understand what I'm supposed to do about this; sending stuff straight through sockets seems to have been working so far.

I've found a lot on decoding, but I can't seem to be able to get any info on how to encode.

And what are these anyway? I've heard of utf-8 and stuff but this is completely unknown to me.

Should I encode anything? How would I do so?

 

Thank you.


The browser sends an Accept-Encoding header and the server responds with Content-Encoding. The headers themselves are never encoded, so you send them as you did before; then, instead of sending "stuff", you compress the body with zlib as either deflate or gzip and send the compressed data.

From what I have read, deflate is gzip without the header fields such as filename and time, but I also read that browsers have had trouble interpreting deflate (they do some fallback and whatnot), so it is better to use gzip even though it has more overhead.
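A rough sketch of how the server side of that negotiation might look. The function name choose_encoding is made up for illustration, it only looks at gzip and deflate, and it treats q-values simplistically (an entry with q=0 is skipped, everything else counts as acceptable). It prefers gzip whenever the client allows it, in line with the advice above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: returns "gzip", "deflate", or "" (send the body as-is).
// Simplification: only gzip and deflate are recognised, and a q-value matters
// only when it is zero (meaning "do not use this coding").
std::string choose_encoding(const std::string& accept_encoding) {
    bool gzip_ok = false;
    bool deflate_ok = false;

    std::istringstream list(accept_encoding);
    std::string item;
    while (std::getline(list, item, ',')) {
        // Strip all whitespace and lower-case the entry, e.g. " GZIP ; q=0.5" -> "gzip;q=0.5".
        item.erase(std::remove_if(item.begin(), item.end(),
                                  [](unsigned char c) { return std::isspace(c) != 0; }),
                   item.end());
        std::transform(item.begin(), item.end(), item.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

        // Split "name;q=value" into the coding name and an optional q-value.
        const std::string name = item.substr(0, item.find(';'));
        double q = 1.0;
        const std::string::size_type pos = item.find(";q=");
        if (pos != std::string::npos)
            q = std::strtod(item.c_str() + pos + 3, nullptr);
        if (q <= 0.0)
            continue;  // the client explicitly refuses this coding

        if (name == "gzip") gzip_ok = true;
        else if (name == "deflate") deflate_ok = true;
    }

    // Prefer gzip regardless of the order in which the client listed the codings.
    if (gzip_ok) return "gzip";
    if (deflate_ok) return "deflate";
    return "";
}
```

Whatever this returns would go into the Content-Encoding header; an empty result means the header is left out and the body is sent uncompressed.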

 

You would need to look at some examples of zlib usage. When you want to encode deflate, you use the deflateInit2 function with windowBits set to 15 (this yields the zlib-wrapped stream that HTTP calls deflate); for gzip, set it to 31 (15 + 16).

It (deflateInit2) is described under Advanced Functions here http://www.zlib.net/manual.html

 

I have never tried it, I just read about it. You can experiment a bit with the curl tool.

When you invoke

 curl -H "Accept-Encoding: plain" -i www.example.com

or omit the Accept-Encoding header entirely, you will get a response with no Content-Encoding header and a plain body.

HTTP/1.1 200 OK
Cache-Control: max-age=604800
Content-Type: text/html
Date: Sun, 18 Sep 2016 19:24:04 GMT
Etag: "359670651+ident"
Expires: Sun, 25 Sep 2016 19:24:04 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (iad/182A)
Vary: Accept-Encoding
X-Cache: HIT
x-ec-custom-error: 1
Content-Length: 1270

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 50px;
        background-color: #fff;
        border-radius: 1em;
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        body {
            background-color: #fff;
        }
        div {
            width: auto;
            margin: 0 auto;
            border-radius: 0;
            padding: 1em;
        }
    }
    </style>
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is established to be used for illustrative examples in documents. You may use this
    domain in examples without prior coordination or asking for permission.</p>
    <p><a href="http://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

 

If you do:

curl -H "Accept-Encoding: deflate, gzip" -i www.example.com

you will get Content-Encoding: gzip (browsers have no doubts about how to handle it, so it is picked even though deflate was listed first) and the content will be compressed.

HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html
Date: Sun, 18 Sep 2016 19:25:24 GMT
Etag: "359670651"
Expires: Sun, 25 Sep 2016 19:25:24 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (iad/182A)
Vary: Accept-Encoding
X-Cache: HIT
x-ec-custom-error: 1
Content-Length: 606

(606 bytes of gzip-compressed binary data, which the terminal prints as unreadable characters)

I noticed that Content-Length is the same for plain and compressed, so you probably want to measure the length of the content before you compress it.

 

To make curl request compressed data and then decompress it, do:

curl -H "Accept-Encoding: gzip, deflate" --compressed -i www.example.com

The Content-Encoding header will still be there, but the content will be decompressed by curl.

HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html
Date: Sun, 18 Sep 2016 19:27:18 GMT
Etag: "359670651"
Expires: Sun, 25 Sep 2016 19:27:18 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (iad/182A)
Vary: Accept-Encoding
X-Cache: HIT
x-ec-custom-error: 1
Content-Length: 606

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 50px;
        background-color: #fff;
        border-radius: 1em;
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        body {
            background-color: #fff;
        }
        div {
            width: auto;
            margin: 0 auto;
            border-radius: 0;
            padding: 1em;
        }
    }
    </style>
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is established to be used for illustrative examples in documents. You may use this
    domain in examples without prior coordination or asking for permission.</p>
    <p><a href="http://www.iana.org/domains/example">More information...</a></p>
</div>
</body>

 


I was wrong, I fooled myself: I was comparing the Content-Length after calling curl with -H "Accept-Encoding: gzip" and with -H "Accept-Encoding: gzip" --compressed. The replies to both have the same Content-Length because both are compressed (curl just decompresses one of them and not the other).

 

So it is even visible in the examples I pasted that the uncompressed and compressed content lengths differ (1270 vs. 606), so, of course, you need to count the length of the compressed body.
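To make that concrete, here is a small sketch of building the response head for a body that has already been compressed. The function name build_response_head and the fixed set of headers are assumptions for illustration only; the point is that Content-Length is taken from the bytes actually sent.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the status line and headers for a body that has
// ALREADY been encoded. `encoding` is "gzip", "deflate", or empty when the
// body is sent uncompressed.
std::string build_response_head(const std::string& encoded_body,
                                const std::string& encoding) {
    std::string head = "HTTP/1.1 200 OK\r\n";
    head += "Content-Type: text/html\r\n";
    if (!encoding.empty())
        head += "Content-Encoding: " + encoding + "\r\n";
    // Tell caches that the response depends on the request's Accept-Encoding.
    head += "Vary: Accept-Encoding\r\n";
    // The length is that of the bytes on the wire, i.e. the compressed body.
    head += "Content-Length: " + std::to_string(encoded_body.size()) + "\r\n";
    head += "\r\n";  // blank line separates the headers from the body
    return head;
}
```

The head is sent as plain text first, followed by the encoded body bytes unchanged.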

 

BTW, have you got it working?


I played around with C++ streams and gzip a while ago and made a GitHub project out of it. It includes a C++ std::streambuf that handles gzip on the fly (and some other streams handling OpenSSL digests, base64 and iconv). This is just a playground, so don't expect production-grade documentation, but it might help you nevertheless.

It was written under CentOS 7 / Fedora 24 using the Eclipse Neon IDE.

You can check it out here: https://github.com/tryptichon/Tool-IOStreams-for-CPP
