6 Replies - 987 Views - Last Post: 21 September 2012 - 07:08 PM

#1 byrandomby1

  • D.I.C Head

Reputation: 2
  • Posts: 124
  • Joined: 08-March 11

Compress and compare gif images

Posted 21 September 2012 - 08:14 AM

I need to compress (reduce size) a gif image to as small as possible.
I will only use the compressed file to compare it with another gif, so it need not retain its format or be decompressible; it just has to be a unique identifier of the image.

For example,
I have a huge list of compressed gif images.
I can then take an uncompressed gif image, compress it, and compare its bytes with the list of compressed ones to find whether there is a match (a duplicate).
(I could just store and compare the uncompressed images, but that takes too much disk space.)

What I've tried:
- Compressing the bytes using GZipStream - doesn't seem to reduce the file size
- Converting to jpg with lower quality - reduces the size from ~2kb to ~1kb (not enough), even with very low quality

How can I do this?

This post has been edited by byrandomby1: 21 September 2012 - 08:22 AM



Replies To: Compress and compare gif images

#2 modi123_1

  • Suitor #2

Reputation: 9265
  • Posts: 34,755
  • Joined: 12-June 08

Re: Compress and compare gif images

Posted 21 September 2012 - 08:20 AM

Odd thought - not all compression algorithms are the same, right? That means unless you compress your normal gif the same way the compressed one was compressed, the byte information will be off anyway.

#3 byrandomby1

Re: Compress and compare gif images

Posted 21 September 2012 - 08:22 AM

modi123_1, on 21 September 2012 - 08:20 AM, said:

Odd thought - not all compression algorithms are the same, right? That means unless you compress your normal gif the same way the compressed one was compressed, the byte information will be off anyway.


I would be using the same compression method to compare. Why wouldn't I?

#4 RudiVisser

  • .. does not guess solutions

Reputation: 1003
  • Posts: 3,562
  • Joined: 05-June 09

Re: Compress and compare gif images

Posted 21 September 2012 - 08:23 AM

You say it doesn't need to be decompressed again, but you're also saying you want to do a byte-by-byte comparison?

Why not just use a hash? An MD5 hash would be more than sufficient for comparing contents, provided the two GIFs were initially encoded the same way.

The problem is that a comparison based *just* on the contents of the file has limits: you couldn't tell what the actual difference is, only that there is one. And if you compress further than GIFs are already compressed (which they are; GIF images aren't raw image data), you could compress the same image twice using slightly different compression levels (despite using the same method!) and get entirely different results.
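
To illustrate that last point, here is a minimal Python sketch. Python and zlib are assumptions chosen for the example and are not what the thread used, and the sample bytes are a hypothetical stand-in for a real GIF file. It compresses the same bytes at two different levels and checks that the compressed output differs even though both decompress to the same original.

```python
import hashlib
import zlib

# Hypothetical stand-in for the bytes of a GIF file.
data = b"GIF89a" + bytes(range(256)) * 8

fast = zlib.compress(data, 1)  # fastest compression level
best = zlib.compress(data, 9)  # highest compression level

# Same input and same method, but a different level:
# the compressed byte streams are not identical.
print("compressed bytes equal:", fast == best)

# Both streams still decompress to the identical original bytes.
print("round trip equal:", zlib.decompress(fast) == zlib.decompress(best) == data)

# A hash of the original bytes does not depend on any compression setting.
print("md5 of original:", hashlib.md5(data).hexdigest())
```

So comparing compressed bytes is only safe when every file went through exactly the same compressor with the same settings, while a hash of the original bytes avoids that dependency.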

This post has been edited by RudiVisser: 21 September 2012 - 08:24 AM


#5 modi123_1

Re: Compress and compare gif images

Posted 21 September 2012 - 08:23 AM

... because you are asking for a compression algorithm. If you are asking for an algorithm, then it is sensible to assume you don't know which algorithm the compressed gifs were compressed with.


Quote

I need to compress (reduce size) a gif image to as small as possible.


#6 Skydiver

  • Code herder

Reputation: 3589
  • Posts: 11,159
  • Joined: 05-May 12

Re: Compress and compare gif images

Posted 21 September 2012 - 09:11 AM

Short answer: Don't compress. Use MD5 hashes (or some other hash function).

Long answer:
Do you want to compare the two GIF files or the images contained within the two files?

If you want to compare the two GIF files, just treat them as any old file and compare them directly. Why incur the computation and I/O overhead of compressing the two files and then comparing the compressed results? If the argument is that it is faster to compare two 1MB files than two 4MB files, consider the time it took to compress those two 4MB files down to 1MB.

If you want to compare the images contained within the two files, a quick first step is to read the image dimensions from each file. That takes only around 8 bytes from each file. If the dimensions differ, then you obviously have two different images.
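
A rough Python sketch of that dimension check follows (Python is an assumption for the example, and the function names are hypothetical). In the GIF format, the 6-byte signature ("GIF87a" or "GIF89a") is followed by the logical screen width and height, each stored as an unsigned 16-bit little-endian integer, so the first 10 bytes are enough. Note that this is the logical screen size, which individual frames inside the file can be smaller than.

```python
import struct
from typing import Tuple


def gif_dimensions(header: bytes) -> Tuple[int, int]:
    """Return (width, height) parsed from the first 10 bytes of a GIF file."""
    if len(header) < 10 or header[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    # Two unsigned 16-bit little-endian integers follow the signature.
    width, height = struct.unpack("<HH", header[6:10])
    return width, height


def read_gif_dimensions(path: str) -> Tuple[int, int]:
    """Read only the first 10 bytes of the file at `path` and parse them."""
    with open(path, "rb") as f:
        return gif_dimensions(f.read(10))
```

If `read_gif_dimensions` returns different values for two files, the images differ and no further comparison is needed; equal dimensions say nothing either way.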

The text in the spoiler applies if you want to make some assumptions about how the data was encoded.
Spoiler


This leads to the only true way to see whether two images are the same: actually render the data into virtual canvases, and then do a pixel-by-pixel comparison between the canvases. The pixel-by-pixel comparison should be fine if you only ever need to compare two images.

If on the other hand, you need to compare multiple files or multiple images, then it may make sense to build a database of MD5 hashes (or whatever other hash algorithm you prefer). The hash values you get from either the file bits or the pixel data will be a much more compact way to compare files or images.
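
One way such a hash database could look is sketched below in Python (an assumption for the example; `DuplicateIndex` and `image_key` are hypothetical names, and an in-memory dict stands in for a real database). Each image is reduced to a 16-byte MD5 digest, so only the digest has to be stored and compared, not the image itself.

```python
import hashlib


def image_key(data: bytes) -> bytes:
    """Return the 16-byte MD5 digest used as a compact identifier."""
    return hashlib.md5(data).digest()


class DuplicateIndex:
    """Maps each digest to the name of the first item seen with that content."""

    def __init__(self):
        self._seen = {}

    def add(self, name: str, data: bytes):
        """Record `data`; return the name of an earlier duplicate, or None."""
        key = image_key(data)
        if key in self._seen:
            return self._seen[key]
        self._seen[key] = name
        return None


# Usage with hypothetical sample bytes standing in for file contents
# (or rendered pixel data, if the images rather than the files matter).
index = DuplicateIndex()
first = index.add("a.gif", b"GIF89a-first-image")
second = index.add("b.gif", b"GIF89a-second-image")
repeat = index.add("c.gif", b"GIF89a-first-image")
print(first, second, repeat)
```

Equal digests mean equal content only with very high probability, and MD5 is not safe against deliberately crafted collisions. If that matters, another hash such as SHA-256 can be swapped in via `hashlib.sha256`.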

#7 byrandomby1

Re: Compress and compare gif images

Posted 21 September 2012 - 07:08 PM

MD5 hash seems to do the trick.
Thanks for the help.
