
Python, File to PNG Compressor.

Poet129
Go to solution Solved by Poet129,
2 minutes ago, WereCatf said:

No need to.

Well, I would like to thank everybody who participated in helping me make this compressor, helped me test it, and even showed me how it isn't better in most cases. Thanks again, I will definitely look for help here again in my future projects.

24bit so 2^24 colors = 16 777 216. All 2-pixel combinations of that (unordered pairs of distinct colors) come to 140 737 479 966 720. Each of these takes 6 bytes, so 844 424 879 800 320 bytes, or 768 TiB.

Header not included :)

 

I.e. you'll need to rent a supercomputer.

F@H
Desktop: i9-13900K, ASUS Z790-E, 64GB DDR5-6000 CL36, RTX3080, 2TB MP600 Pro XT, 2TB SX8200Pro, 2x16TB Ironwolf RAID0, Corsair HX1200, Antec Vortex 360 AIO, Thermaltake Versa H25 TG, Samsung 4K curved 49" TV, 23" secondary, Mountain Everest Max

Mobile SFF rig: i9-9900K, Noctua NH-L9i, Asrock Z390 Phantom ITX-AC, 32GB, GTX1070, 2x1TB SX8200Pro RAID0, 2x5TB 2.5" HDD RAID0, Athena 500W Flex (Noctua fan), Custom 4.7l 3D printed case

 

Asus Zenbook UM325UA, Ryzen 7 5700u, 16GB, 1TB, OLED

 

GPD Win 2


First you'll have to be clear about what you want.

 

What do you mean by 24 bit color? Do you mean the color depth of the image should be 24 bit, as in 8 bit per channel (RGB = 256 shades of red, 256 shades of green, 256 shades of blue), or what ...

 

Do you mean 24 bit per red, green and blue channel? That would make each pixel use 3 bytes per channel, so 9 bytes per pixel. Storing 2 pixels would use at least 18 bytes.

 

With 8 bit per channel (24 bit color depth), you can have 2^24 = 16,777,216 possible colors.

 

Combinations of 2 unique colors out of the 16m pool gives you 140,737,479,966,720 combinations (can calculate with https://stattrek.com/online-calculator/combinations-permutations.aspx ), so that would mean 6 bytes per pair of pixels times the above-mentioned number = 844,424,879,800,320 bytes.

 

 


9 minutes ago, mariushm said:

With 8 bit per channel (24 bit color depth), you can have 2^24 = 16,777,216 possible colors.

Combinations of 2 unique colors out of the 16m pool gives you 140,737,479,966,720 combinations (can calculate with https://stattrek.com/online-calculator/combinations-permutations.aspx ), so that would mean 6 bytes per pair of pixels times the above-mentioned number = 844,424,879,800,320 bytes.

 

32 minutes ago, Kilrah said:

I.e. you'll need to rent a supercomputer.

What about with one pixel, I know this seems useless but it isn't...


1 hour ago, Poet129 said:

 

What about with one pixel, I know this seems useless but it isn't...

Then you're down to a much more manageable 176MiB (8 byte header for PNG plus 3 bytes for the pixel), although the size on disk would be much larger.

 

But regardless of how much disk space this uses, such a library absolutely is useless, because it would take many orders of magnitude longer to load 3 bytes of data from disk than it would be to generate the pixel on the fly.
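
As a rough check on the 11-byte figure: a valid PNG also needs IHDR, IDAT and IEND chunks, each with its own length, type and CRC fields, so even a 1x1 image comes out larger. A minimal sketch using only the standard library (the helper names `png_chunk` and `one_pixel_png` are made up for illustration):

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte file signature


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, 4-byte type, data, 4-byte CRC."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def one_pixel_png(red: int, green: int, blue: int) -> bytes:
    """Return a 1x1, 8-bit-per-channel RGB PNG as bytes."""
    # IHDR: width, height, bit depth 8, color type 2 (RGB), then
    # compression, filter and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    # One scanline: filter-type byte 0 followed by the three samples.
    scanline = bytes((0, red, green, blue))
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(scanline))
        + png_chunk(b"IEND", b"")
    )


png = one_pixel_png(255, 0, 0)
print(len(png))  # noticeably more than 11 bytes
```

The signature, IHDR and IEND alone already account for 45 bytes, so 16,777,216 single-pixel PNGs would take several times the 176 MiB estimate, before any file-system overhead.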



Okay, but what about generating every single combination on the fly? Would it still be faster than having it written? All of them would be required for my program.


Can you give more details about what you want to do?

There may be some tricks or things you're not aware of, or you may be stuck thinking of only one approach, which is not optimal.

Also you're still not clear about it ... you still haven't answered exactly what you're talking about ... just answer how many bits per color component  (red, green, blue) you actually need.

 

And yeah, generating on the fly would almost always be faster, since RAM read bandwidth is in the 40+ GB/s range while reading from a hard drive or SSD is much slower, especially when reading tiny amounts like 2 pixels x 3 bytes per pixel = 6 bytes. Pretty much everything is optimized these days to work with multiples of 512 bytes or 4096 bytes or even bigger sizes, so reading and writing smaller chunks is wasteful.

 

 

 

 

 


4 hours ago, Poet129 said:

Okay but what about generating every single combination on the fly would it still be faster than having it written? Since all of them would be required for my program.

An RGB color (of any bit length), is essentially just one big number. While we view a 24 bit color as having three 8 bit components, it is really only a single 24 bit number.

Think about it like this: The hex color #34eb55 corresponds to the RGB tuple (0x34, 0xeb, 0x55). Written as one value, 0x34eb55, the color really is just a big number.

So the problem "producing all 24 bit colors" can be equated to the problem "produce all 24 bit numbers", which is overwhelmingly simple and relatively fast to do:

allColors = range(0, 16777215 + 1)

For 16M integers, this is generally much faster than reading a file on disk, and at least as fast as reading a file from memory, if not faster.
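
To illustrate the point in Python: `range` is lazy, so describing all 16,777,216 colors takes constant memory, and any color can be indexed or formatted without a lookup file. A small sketch (the helper name `to_hex` is just for illustration):

```python
all_colors = range(1 << 24)  # 0 .. 16777215, nothing is materialized yet


def to_hex(color: int) -> str:
    """Format a 24-bit color number as a #rrggbb string."""
    return f"#{color:06x}"


print(len(all_colors))                 # 16777216
print(to_hex(all_colors[0x34EB55]))    # #34eb55
print(to_hex(all_colors[-1]))          # #ffffff
```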

 

 

Alternatively:

7 hours ago, Poet129 said:

I'm using python and would like to generate a picture which contains every possible combination of 24 bit color with 2 pixels and save it to a file in whatever format is the smallest.

Your question is fairly vague. There are multiple common definitions of 24 bit color. Do you mean "24 bits per color" or "24 bits per channel"? Does "combination... with 2 pixels" mean "every possible color stored in 2 pixel files" or does it mean "a single file with every possible combination of 2 colors"?

These problems are very different. There are only 2^24 colors, while there are 2^48 ordered pairs of colors (about 1.4 × 10^14 unordered pairs of distinct colors). And that's assuming that the answer to the first question is "24 bits per color".

Regardless, there is a simpler way to store every possible color combination: The color "pure white" is defined as an equal mixture of all other colors (and therefore, all other color combinations, since all color combinations are themselves, colors). Thus, for a 24 bit per color color, all possible colors are represented in the color generated by:

allPossibleCombinations = 0
for i in range(0, 16777215 + 1):
    allPossibleCombinations |= i
# allPossibleCombinations is now 0xFFFFFF, i.e. pure white

 

ENCRYPTION IS NOT A CRIME


Just as an off-topic comment ... here's how I generated all those 16 million colors using PHP:

 

<?php

$h = fopen('picture.ppm', 'wb');	// open file in binary write mode
// write the ppm header: P6, width and height, max value 255 (8 bit per channel)
fwrite($h, "P6\n4096 4096\n255\n");
$buffer = '';
for ($z = 0; $z < 256; $z++) {
  for ($y = 0; $y < 256; $y++) {
    for ($x = 0; $x < 256; $x++) {
      $buffer .= chr($x) . chr($y) . chr($z);	// red, green, blue
    }
    // only write to disk a full picture line: 3 bytes per pixel x 4096 pixels = 12288 bytes
    // each pass of the "for x" loop adds 256 x 3 = 768 bytes, and 12288 = 16 x 768,
    // so it is enough to check the buffer size after the end of every "for x" loop
    if (strlen($buffer) == 12288) {
      fwrite($h, $buffer);	// write buffer
      $buffer = '';	// empty buffer
    }
  }
}
fclose($h);
?>

 

PPM is a stupid simple image format:  http://netpbm.sourceforge.net/doc/ppm.html

 

Several lines followed by binary data ... there can be optional lines (start with # character) before the binary data that are ignored but the minimum is the stuff below:

 

Line 1 : P6 (must be first line)

Line 2 : resolution_x [whitespace]  resolution_y

Line 3 : maximum color value, up to 65535 (16 bit per channel)

binary data (3 channels x 1-2 bytes per channel for each pixel)

 

I then converted the PPM image to PNG using Irfanview, a free image viewer / converter.... because there's a lot of bytes that repeat, the image compresses very well ... you get a 4096x4096 image compressed down to 128 KB.
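
Since the thread is about Python, here is a rough Python counterpart to the PHP script above. It is only a sketch: the function name `all_colors_ppm` and its `bits` parameter are assumptions added for illustration (the parameter makes it easy to try a tiny image first), and the pixel order mirrors the PHP loops with red changing fastest.

```python
def all_colors_ppm(bits: int = 8) -> bytes:
    """Return a binary PPM (P6) image that contains every color exactly once."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    levels = 1 << bits                     # shades per channel (256 for 8 bit)
    total = levels ** 3                    # number of distinct colors
    width = 1 << ((3 * bits + 1) // 2)     # 4096 for 8 bit per channel
    height = total // width
    header = f"P6\n{width} {height}\n{levels - 1}\n".encode("ascii")
    pixels = bytearray()
    for blue in range(levels):
        for green in range(levels):
            for red in range(levels):
                pixels.extend((red, green, blue))
    return header + bytes(pixels)


small = all_colors_ppm(2)   # 64 colors as an 8x8 image
print(small[:9])            # b'P6\n8 8\n3\n'
```

Calling `all_colors_ppm()` with the default 8 bits produces the 4096x4096 image (about 48 MiB held in memory); it can then be written out with `open("picture.ppm", "wb")`.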

 

And if you want to get all the possible combinations of two colors you'd do something like this :

 

$maxColors = 256*256*256;

for ($i=0;$i<$maxColors-1;$i++) {
	for ($j=$i+1;$j<$maxColors;$j++) {
		// i is color of first pixel 
		// j is color of second pixel 
		// since each color is an unsigned integer (24 bits)
		// you can shift number to right 16 or 8 bit to get the red and green
		// channels 
		$pixel_r = (($i >> 16) & 0xFF);
		$pixel_g = (($i >>  8)& 0xFF);
		$pixel_b = ($i & 0xFF);
		
	}
}

 

picture.png


I've figured it out; now I just need to know how to convert binary to a color pixel in a file in Python. Please help. (Each pixel should get its own file, btw.)


5 minutes ago, Poet129 said:

I've figured it out; now I just need to know how to convert binary to a color pixel in a file in Python. Please help. (Each pixel should get its own file, btw.)

The file system really isn't gonna like 16 million files in a single folder ... especially in Windows ... performance will really start to  degrade once you go above around 20k files in a single folder.

So you'll want to make subfolders, maybe 256 sub-folders ...

 

Storing a picture for every pixel is really weird, man. Can't you explain what you're trying to achieve? Most likely what you want can be solved in a smarter, less "brute-force"-ish way.

 

If you're really stubborn about it, I already showed you above how to make PPM files. Find the Python function which converts an unsigned number into a single byte and use that. In PHP it's chr(); in Python 3, chr() returns a text character rather than a byte, so use bytes([n]) instead ( https://en.wikibooks.org/wiki/Python_Programming/Text ). The PPM file would be "P6\n1 1\n255\nRGB", 14 bytes per pixel, so you'll end up with around 224 MB worth of pictures on your drive.

 

Pretty much as small as you can make uncompressed pictures, except raw files.   Once they're all generated,  you can use batch conversion feature in irfanview to convert them all to bmp or png or whatever,  but they're probably gonna end up bigger in size (at least the bmp header is bigger and bmp is also a format that stores in multiples of 4, so you end up with at least 12 bytes + bitmap header or 4 bytes + 3 bytes for color palette + header)
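
A minimal Python sketch of the one-file-per-pixel idea, assuming the 14-byte PPM layout described above and one subfolder per red value (256 subfolders). The function names `pixel_ppm`, `pixel_path` and `write_pixel` are hypothetical, chosen only for this example:

```python
from pathlib import Path


def pixel_ppm(color: int) -> bytes:
    """Return a 14-byte 1x1 binary PPM for a 24-bit color number."""
    if not 0 <= color < 1 << 24:
        raise ValueError("color must fit in 24 bits")
    # to_bytes(3, "big") yields the red, green and blue bytes in that order.
    return b"P6\n1 1\n255\n" + color.to_bytes(3, "big")


def pixel_path(root: Path, color: int) -> Path:
    """Spread the files over 256 subfolders named after the red byte."""
    return root / f"{color >> 16:02x}" / f"{color:06x}.ppm"


def write_pixel(root: Path, color: int) -> Path:
    """Write one single-pixel PPM file and return where it was stored."""
    path = pixel_path(root, color)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pixel_ppm(color))
    return path


print(pixel_ppm(0xFF0000))                   # b'P6\n1 1\n255\n\xff\x00\x00'
print(pixel_path(Path("pixels"), 0x34EB55))
```

With 256 subfolders each one still holds 65,536 files; adding a second level keyed on the green byte would bring that down to 256 files per folder.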



This may sound weird but I'm making a compressor. So would the above be able to take 111111110000000000000000 to a red pixel in a bmp or jpg whatever file format?


how do you plan to make a compressor when you don't even know how to deal with bytes and bits

 

you would convert that sequence of bits into 3 bytes ... 11111111 becomes 255; 00000000 becomes 0 .. so you store 255,0,0 in file.  use chr function to convert the byte to character with that ascii code (or some other function that converts the number to a single byte)

 

conversion from binary to decimal is easy ... the basic algorithm is this ...in php

 

$value =0;

$bits = [1,0,1,1,1,1,1,1];

for ($counter =0; $counter<count($bits);$counter++) {

  $value = $value * 2 + $bits[$counter];

}

// 10111111 => 191
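
The same conversion in Python, as a small sketch. `bits_to_int` follows the PHP loop above, and `bitstring_to_rgb` (a name invented for this example) applies it to the 24-character string from the question using Python's built-in `int(..., 2)`:

```python
def bits_to_int(bits):
    """Fold a list of 0/1 values, most significant bit first, into a number."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def bitstring_to_rgb(bitstring: str):
    """Turn a 24-character string of 0s and 1s into (red, green, blue)."""
    if len(bitstring) != 24 or set(bitstring) - {"0", "1"}:
        raise ValueError("expected exactly 24 characters, each 0 or 1")
    return tuple(int(bitstring, 2).to_bytes(3, "big"))


print(bits_to_int([1, 0, 1, 1, 1, 1, 1, 1]))            # 191
print(bitstring_to_rgb("111111110000000000000000"))     # (255, 0, 0)
```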

 

if you want feel free to pm me with your compression idea and i can think about it and maybe tell you if it wasn't done already before or if it has some flaws.

 

or if you don't care about it (saying it because there's always some people that think they're gonna make millions from something and refuse to tell their ideas) describe your compression technique.

 

 



Put in simple terms taking every combination of color and assigning that to one number 1 through 16,777,216 then converting that number to a set binary string with a length of 256 bytes. I already have the number to binary and vice versa done btw. I have run the math a few times and everything checks out at roughly half the file size of the OG file.


3 minutes ago, Poet129 said:

Put in simple terms taking every combination of color and assigning that to one number 1 through 16,777,216 then converting that number to a set binary string with a length of 256 bytes

Your plan actually multiplies the size of the image by 85 1/3 times.

You are saying that you plan to compress a 3 byte number into a 256 byte string.


2 minutes ago, straight_stewie said:

Your plan actually multiplies the size of the image by 85 1/3 times.

You are saying that you plan to compress a 3 byte number into a 256 byte string.

That's the decompression method.


Just now, Poet129 said:

That's the decompression method.

So your compression method takes 256 character strings and compresses them into a 16 bit color?


1 minute ago, straight_stewie said:

So your compression method takes 256 character strings and compresses them into a 16 bit color?

24 bit color but yes, each color is one combination of 256 bytes.


Just now, Poet129 said:

24 bit color but yes, each color is one combination of 256 bytes.

So is this really a substitution cipher?


1 minute ago, straight_stewie said:

So is this really a substitution cipher?

To some extent.


Just now, Poet129 said:

To some extent.

Ok. So just in case you are actually doing this as a compression method, you will have to remap the colors for every new file. This isn't really hard at all; it really just requires passing around the dictionary of translations. There are better ways to do this, but that's one way to think about it.

In case you are really trying to do some steganography, there are much better (and much more secure) methods out there that don't rely on transporting a dictionary of translations around. One such method is to shift bits of your input into the last bit of each color.
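
A minimal sketch of that last idea, assuming "the last bit of each color" means the least significant bit of each 8-bit sample in a raw pixel buffer. The function names `embed_lsb` and `extract_lsb` are invented here, and the sketch offers no secrecy on its own:

```python
def embed_lsb(samples: bytes, payload: bytes) -> bytes:
    """Hide payload bits in the lowest bit of successive color samples."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("not enough samples to hold the payload")
    out = bytearray(samples)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract_lsb(samples: bytes, length: int) -> bytes:
    """Read back `length` payload bytes from the lowest bits."""
    if length * 8 > len(samples):
        raise ValueError("not enough samples for that many bytes")
    out = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for sample in samples[start:start + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(256))          # stand-in for raw RGB sample data
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))       # b'hi'
```

Each sample changes by at most 1, which is why the altered image looks the same to the eye.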



I thank you for your advice, but I have already tested and more or less perfected the binary to number converter, so I know I will need a similar setup for the color. My original issue was that I was unsure how to generate a "color dictionary".


My guess is that you have probably tested your idea only with some specific test files, with a lot of redundancy.

In the real world, you won't often find repeating sequences as long as 256 characters / bytes, which is what would make it worth putting such a sequence in a dictionary and giving it a number which you can then convert into a pixel (24 bits, 3 bytes).
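
A quick counting check makes the same point, assuming the scheme maps each 256-byte block to exactly one 24-bit color as described in this thread:

```python
BLOCK_BYTES = 256   # length of each block, per the description in the thread
CODE_BITS = 24      # one RGB pixel

possible_blocks = 256 ** BLOCK_BYTES   # every distinct 256-byte string
possible_codes = 2 ** CODE_BITS        # every distinct 24-bit color

# Only this many of the possible blocks can ever be given a color.
print(possible_codes)                      # 16777216
print(possible_blocks.bit_length() - 1)    # 2048, i.e. 2**2048 possible blocks

# A dictionary holding one 256-byte block per color is itself large.
dictionary_bytes = possible_codes * BLOCK_BYTES
print(dictionary_bytes / 2 ** 30)          # 4.0 (GiB)
```

Any input block that is not among the 16,777,216 dictionary entries has no color to map to, so such a scheme can only round-trip files that are built from those entries.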

 

A lot of compressors will only search for sequences up to around 16-32 bytes long. For example, PalmDOC compression (which is a simplified LZ77 compression algorithm) is designed for match lengths of up to 10 bytes, because in general text (books) there's going to be a limited number of repeating sequences longer than 10 bytes.

 

Converting colors to numbers and the other way around is quite simple ...

0 ... 2^24-1  :    red * 65536 + green * 256 + blue    or (red << 16) + (green << 8) + blue

You've converted 3 x 8 bits into a single 24 bit value.

 

To convert the 24 bit number back to three 8 bit numbers you can use division, the modulus operator, or binary operators ... just search Python books for that. (In Python 3, use // for the divisions below so the result stays an integer.)

For example, blue is the remainder of that number divided by 256.  nr % 256 => blue

Subtract that value from the number and divide by 256 and you're left with a 16 bit number that contains red and green.  nr = (nr - blue) / 256

green is the remainder of that number divided by 256.  nr % 256 => green

red is the number divided by 256  ... (nr - green) / 256 => red
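
In Python, the two directions described above could look like this. It is a sketch that follows the division and remainder steps literally; the function names `rgb_to_int` and `int_to_rgb` are just for illustration:

```python
def rgb_to_int(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit channel values into one 24-bit number."""
    return (red << 16) | (green << 8) | blue


def int_to_rgb(number: int):
    """Unpack a 24-bit number using remainders and integer division."""
    blue = number % 256
    number = (number - blue) // 256   # now a 16-bit number: red and green
    green = number % 256
    red = (number - green) // 256
    return red, green, blue


print(rgb_to_int(255, 0, 0))     # 16711680
print(int_to_rgb(0x34EB55))      # (52, 235, 85)
```

Note the parentheses around each shift: `+` binds more tightly than `<<` in Python, so `red << 16 + green << 8 + blue` would not compute the intended value.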

 

 



I actually "compressed" a copy of the dictionary file, so zero redundancy.


50 minutes ago, Poet129 said:

Put in simple terms taking every combination of color and assigning that to one number 1 through 16,777,216 then converting that number to a set binary string with a length of 256 bytes.

So the whole "color" thing doesn't actually matter at all, you're just mapping things to a 24bit number. 

