Author Topic: Compression Algorithm  (Read 4154 times)

Offline Brazilian Fan

  • Sr. Member
  • ****
  • Posts: 302
Compression Algorithm
« on: March 09, 2007, 03:36:33 PM »
I was searching through the net, and a page about compression algorithms came up (I don't remember what it was, and my web browser history/cookies were deleted accidentally). One type of algorithm is being researched, and it promises to compress a whole HD (about 80 GB) into a file of 200 bytes very fast.

So, what do you folks think?

Offline Hooman

  • Administrator
  • Hero Member
  • *****
  • Posts: 4954
Compression Algorithm
« Reply #1 on: March 09, 2007, 04:06:00 PM »
I'd say it's a load of BS. Either that, or they're talking about some theoretical nonsense that can never be implemented in practice. Like saying those 200 bytes represent the HDD uniquely (say, a serial number) and a time, with precision high enough to represent the state at any given moment, say the time it takes to write a sector. Then you could claim that you've uniquely identified the drive and the point in time of its existence, and thus somehow uniquely identified the contents of the drive.

You could also go on to claim that the universe is completely deterministic, so you could simulate the universe up to the point where you have calculated the exact contents of the drive. Mind you, a lot of people reject the idea of a completely deterministic universe. Plus, to do such a simulation, you'd need to simulate and remember the state of every particle in the universe, using at most every particle in the universe to represent that state. (And the machinery to execute the algorithm?) I hope you can see how impractical this is. Even if it's super quick to write down the HDD serial number and a time stamp into 200 bytes, those 200 bytes are essentially useless for getting the contents of the drive back.



Edit: I should also point out that 200 bytes can only represent 2^(200*8) possible states. A 200 GB HDD can have 2^(200GB*8) possible states. There is clearly no way to represent all possible states in only 200 bytes; you can only represent a tiny fraction of them. You don't get something for nothing. If you decrease the length of the representation of one state, you must increase the representation of one or more other states. That is how most compression algorithms work: they try to give the more common bit patterns a smaller representation at the expense of the less common ones getting a longer bit representation.
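
A minimal sketch of that counting (pigeonhole) argument, in Python and with toy 16-bit inputs instead of whole drives, just to make the point concrete:

Code:
# There are more possible n-bit inputs than there are strictly shorter
# bit strings, so no lossless scheme can shrink every input.

def count_inputs(n_bits: int) -> int:
    """Number of distinct messages of exactly n_bits bits."""
    return 2 ** n_bits

def count_shorter_outputs(n_bits: int) -> int:
    """Number of distinct bit strings of length 0 through n_bits - 1."""
    return sum(2 ** k for k in range(n_bits))  # equals 2**n_bits - 1

n = 16
print(count_inputs(n))           # 65536 possible 16-bit inputs
print(count_shorter_outputs(n))  # only 65535 strictly shorter outputs
# Always one output short: at least two inputs would have to share a
# compressed form, so decompression could not be reliable for both.

The same counting works for 200 GB versus 200 bytes, just with numbers far too large to print comfortably.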

Also note that the first idea (which I did actually read in a book somewhere) basically assumes you have access to a representation of the whole universe. You can probably guess that's well over 200 GB of information they're assuming they just happen to have for free. Not exactly a very valid assumption.
 
« Last Edit: March 09, 2007, 04:13:58 PM by Hooman »

Offline Brazilian Fan

  • Sr. Member
  • ****
  • Posts: 302
Compression Algorithm
« Reply #2 on: March 09, 2007, 04:34:43 PM »
Hey, I didn't say that it was possible in any way  :P. I was just posting about something I saw, and I wanted to get your answers about what someone else put on the net  :heh:

Ok, forget about 'real' limitations. Let's say this thing is right and it's possible to do that. What would be the consequences? (i.e. patents, internet providers, other compression algorithms, download managers, etc.)

Offline dm-horus

  • Banned
  • Hero Member
  • *****
  • Posts: 1042
Compression Algorithm
« Reply #3 on: March 09, 2007, 11:19:44 PM »
unless it's a quantum thing where 1 bit can represent a series of quantum bits, but we are still far from that.

http://en.wikipedia.org/wiki/Quantum_computing

(especially the part about bits vs. qubits)

Hooman, what you explained above about exchanging states for compression factor is actually a lot like my breaking-the-speed-of-light argument in one of the last weapon idea threads by combine. If you want greater velocity (in order to break the speed of light), you also get an increase in the energy requirement. While the speed of light is not infinite, it requires infinite energy to reach. Considering that light speed requires infinite energy, faster than light might as well be considered impossible in our universe.

(There may be other ways to reach FTL outside our universe but thats for another thread)

Offline White Claw

  • Hero Member
  • *****
  • Posts: 854
Compression Algorithm
« Reply #4 on: March 09, 2007, 11:46:40 PM »
Quote
I was just posting about something I saw, and I wanted to get your answers about what someone else put on the net
That's exactly what you got!

Like Horus and Hooman said, it's not possible with "standard" bits. But if each bit were an electron and you could control all of its subparticles (or some other nonsensical idea, at least from today's viewpoint), then you could store all the information in a much smaller set. But is it really compression then? Or just different storage media?

But like you said, let's assume it is somehow possible. It would cause a boom just like MP3s.

Of course, I could always compress a hard drive very fast if it contained all zeros. And I could use 200 bits to store the number of zeros on that hard drive. I claim the patent on that compression scheme...  :D
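
For what it's worth, a toy version of that scheme in Python (purely illustrative, and it only works when the drive contents really are all zeros):

Code:
def compress_all_zero(data: bytes) -> bytes:
    """Store only the count of zero bytes; refuse anything else."""
    if any(data):
        raise ValueError("this very fast compressor only handles all-zero data")
    return len(data).to_bytes(25, "big")  # 25 bytes = 200 bits, as above

def decompress_all_zero(blob: bytes) -> bytes:
    """Write the zeros back out."""
    return b"\x00" * int.from_bytes(blob, "big")

drive = b"\x00" * 1000
assert decompress_all_zero(compress_all_zero(drive)) == drive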

Offline Hooman

  • Administrator
  • Hero Member
  • *****
  • Posts: 4954
Compression Algorithm
« Reply #5 on: March 10, 2007, 01:18:16 AM »
Quote
unless it's a quantum thing where 1 bit can represent a series of quantum bits, but we are still far from that.

Isn't that a bit backwards? Qubits are taken from a continuous space, whereas bits are taken from a discrete space. Thus you could claim that a single qubit can represent an infinite number of (classical) bits. But then, upon measurement, you can only really get 1 of them back. That's a bit of a problem, since you can't clone quantum information, and you destroy it when you measure it. Then there is the idea of quantum teleportation (of quantum states/information), where you can transmit an unknown single copy of a qubit using 2 classical bits and a shared EPR pair. You use up the EPR pair in the process, and destroy your copy of the qubit, but the two classical bits are enough for someone else to recreate the original qubit. The shared EPR pair is generally assumed to have been generated beforehand, with each person getting one of the qubits from it.
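
A minimal state-vector sketch of that teleportation protocol (Python with numpy, purely illustrative; it just simulates the math classically): Alice's two measured bits plus Bob's corrections recover the original qubit.

Code:
import numpy as np

# Single-qubit gates and projectors
I = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
P0 = np.array([[1, 0], [0, 0]])
P1 = np.array([[0, 0], [0, 1]])

def kron(*ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

# CNOT with control q0 and target q1, acting on 3 qubits (q2 untouched)
CNOT01 = kron(P0, I, I) + kron(P1, X, I)

# Random qubit state for Alice to teleport
rng = np.random.default_rng(0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

# Shared EPR pair (|00> + |11>)/sqrt(2) on qubits q1 (Alice) and q2 (Bob)
epr = np.array([1, 0, 0, 1]) / np.sqrt(2)

state = np.kron(psi, epr)       # joint state of q0 q1 q2
state = CNOT01 @ state          # Alice: CNOT from q0 onto q1
state = kron(H, I, I) @ state   # Alice: Hadamard on q0

# Alice measures q0 and q1; sample one of the four outcomes
state = state.reshape(2, 2, 2)              # indices: q0, q1, q2
probs = np.sum(np.abs(state) ** 2, axis=2)  # outcome probabilities
outcome = rng.choice(4, p=probs.flatten())
m0, m1 = divmod(outcome, 2)                 # the 2 classical bits sent to Bob

# Bob's qubit after the measurement, then his X/Z corrections
bob = state[m0, m1] / np.sqrt(probs[m0, m1])
if m1:
    bob = X @ bob
if m0:
    bob = Z @ bob

# Up to a global phase, Bob now holds the original state
print(abs(np.vdot(bob, psi)))   # prints 1.0 (within rounding)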

But all that aside, the no-cloning and the measurement destroying the qubit would seem to imply you don't get any storage space improvements from quantum computers. Their real power is being able to set qubits in a superposition of classical states during a computation, and thus perform the computation for an exponential number of inputs. They're basically powerful computationally because of this extra interaction between qubits. But that also seems to be the problem with storing quantum information. As you may have heard, it's very difficult to build a quantum computer. Decoherence is a big problem. Any interaction with its environment causes the qubit to be destroyed. It's hard to store a quantum state for any length of time, and from what I hear, if you can do it for about a second, you're doing pretty well. That would likely be quite the problem for a HDD-like device.


And I'm of the view that a bit is simply a bit. It's a unit of information, mathematically defined to have 1 of 2 values. There is no way to map an arbitrary collection of 200GB*8 bits down to 200*8 bits, and then back again reliably so that any given input could be uniquely reconstructed. Trying to store the same amount of "information" in a smaller space is simply a change in storage media.


Quote
What would be the consequences?
I think this question is better if you drop the "what if it's possible" part and replace it with: what if storage space suddenly got 1 billion times larger? That is, 200 GB can now be stored just as easily as 200 bytes can be.

If that were the case, I doubt there would be any real radical change or shift in society. Sure, some people would claim it's a big thing, and it sort of would be. But do we really need that much harddrive space? Take the space you have now, and multiply it by 10 or 100. That should be more than enough to store pretty much anything you can think of for quite some time to come. The big drain on space, for home computers at least, is media. If you have a lot of pictures, music, and movies, then you'll need lots of space. But even the first two don't seem to account for enough to warrant upgrading from current harddrives in most cases. And if they do, there's probably a big chance they're breaking the law.

So, I figure if we got that much more storage space, software/movie/music pirates would pirate more. There'd be increased demand for internet bandwidth, maybe more lag. It might be somewhat temporary, since an increase in pirating would probably cause a crackdown from law enforcement or the people pushing them to act. If bandwidth usage suddenly went up a lot, ISPs would probably also crack down to save their bottom line. Or you might claim that bandwidth is already largely saturated for people who do that much pirating, so increased storage space wouldn't really affect their behavior. Although, with extra space, it would probably lead more people into the temptation to pirate stuff.

It'd probably be more useful for businesses to have more storage space. They often have legitimate reasons for having lots of storage. But then they can also usually afford to buy many harddrives to get the space they need. So I don't know if it'd be a big difference here. Maybe just a little more convenient. Perhaps some savings for them.
 

Offline Brazilian Fan

  • Sr. Member
  • ****
  • Posts: 302
Compression Algorithm
« Reply #6 on: March 10, 2007, 06:13:29 AM »
It's not the same thing to make the data 100x smaller and to have a HDD 100x bigger.

Say I could download a whole game from the net using a 56K modem in 2 minutes. No need for download managers or a fast internet connection. That would be a real revolution.
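
For scale, the rough arithmetic behind that example (Python, assuming an idealised 56 kbit/s line with no protocol overhead):

Code:
modem_bps = 56_000                      # 56 kbit/s download rate
seconds = 2 * 60
bytes_in_2_min = modem_bps * seconds // 8
print(bytes_in_2_min)                   # 840000 bytes, roughly 0.8 MB
# So a whole game in 2 minutes over 56K only works if it really does
# compress down to well under a megabyte.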

Offline BlackBox

  • Administrator
  • Hero Member
  • *****
  • Posts: 3093
Compression Algorithm
« Reply #7 on: March 10, 2007, 09:08:39 AM »
Well as far as the argument about storage space becoming a billion times larger, and causing pirating to increase, think about this:

I would say that the capacity of storage devices increases just as we need more and more storage space. Computer software continues to get more and more complex every year.

For example, I have an XT clone in my closet from 1987 with a 20 MB MFM hard drive. This 20 MB was more than enough storage space back in those days. A typical game rarely took more than a megabyte of storage space (or even 500 KB, for that matter). Although we now look at a 20 MB hard drive and think about how small it is, back then it was a luxury and was more than enough space for most people.

Nowadays we look at modern operating systems and games. Windows Vista uses nearly 7 GB of space for a base install; DOS 3.0, on the other hand, fit on a 360 KB floppy disk. Look at games: Supreme Commander requires around 8 GB of storage space. In contrast, Zork 3 took less than 1 MB of storage space.

And storage demands are going to keep going up. Look at Blu-ray and HD-DVD. It seems strange, but games written for the consoles (Xbox 360 and PS3) are actually requiring discs that can hold such vast amounts of data. (HD-DVD has a capacity of 15 GB per layer; Blu-ray a capacity of 25 GB per layer.) Disc capacities keep going up. Holographic Versatile Disc (HVD) can be expected in the next couple of years, with a maximum capacity of 3.9 TB (though the capacity will probably only be around 200-300 GB when initially released).

I think it might be cheaper (in terms of R&D costs / time) to produce better storage media than to concentrate on compression algorithms that can store more information in a smaller space. And for all the reasons listed previously, compression algorithms have their limits.

Of course, you could always argue that programmers have just gotten lazy and software in general is becoming bloated (good programming technique is less valued because modern computers are fast enough to run the unoptimized code).

But anyway, whether it will increase piracy is hard to estimate. If pieces of software start requiring 80 gigabytes of storage space each, then the "buying power" people have with storage devices is no better or worse (the proportion of the size of each piece of data you need to store to the size of the entire storage medium remains the same). All this while consumer internet connection bandwidth is not increasing that much. (Face it, downloading 80 GB of data, even with a 10 Mbps connection to the internet, is not that fast.)
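
Rough arithmetic behind that last remark (Python, assuming an idealised 10 Mbit/s line with no overhead):

Code:
size_bits = 80 * 10**9 * 8      # 80 GB expressed in bits
rate_bps = 10 * 10**6           # 10 Mbit/s
hours = size_bits / rate_bps / 3600
print(round(hours, 1))          # about 17.8 hours for a single download
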
« Last Edit: March 10, 2007, 09:13:30 AM by op2hacker »

Offline Hooman

  • Administrator
  • Hero Member
  • *****
  • Posts: 4954
Compression Algorithm
« Reply #8 on: March 12, 2007, 09:50:17 PM »
Heh, yeah. That reminds me of the Linux install disks that you can buy. There wouldn't really be a market for any of those if downloading wasn't a pain in enough cases to warrant it. Well, other than buying one as a way of "donating" to a charity.


I guess one way the movie producers could stop movie pirating is to make the movies so high definition that it'd be impossible to store that much data on media used by a home computer. I'd imagine pirating has picked up since DVD drives and larger harddrives have become more widespread. But if a single movie took 15 GB to 25 GB to store, then I'd certainly find it very hard to convince myself it was worth having a copy on my computer. Downloading is even worse, since ISPs tend to call and complain or threaten to cut off your service if you download too much. Even if people still did it, the larger size would limit the rate at which things would be pirated, at least until home computer technology caught up.

On the other hand, they could also just choose to record/pirate the stuff at a lower bitrate. But then they would probably run into a lot of purists who want whatever is out there at the highest quality possible. It should at least split things and reduce the number of available sources for the format pirates want it in.


Of course, I've never quite understood why they didn't just split the technology. Why do DVDs for movies have to be in a format that is readable by a home computer? Wouldn't they be easier to protect if they were only viewable using read-only DVD players hooked up to a TV? Sure, mass storage for a computer is nice, but I don't see why mass storage on a computer needs to be compatible with whatever the movie industry is using. Besides, a home computer isn't exactly the best movie-watching platform.
 

Offline Mez

  • Hero Member
  • *****
  • Posts: 648
Compression Algorithm
« Reply #9 on: March 13, 2007, 06:27:27 AM »
Well, because sometimes you don't have a dedicated TV, and instead you have a PC that you want to be able to do everything on!

You have a few categories:

Students - small rooms, have a PC or laptop and just want to watch a movie (i.e. me)

Yuppie - has lots of money, knows what he/she is doing with computers, and makes a real nice home entertainment system with a huge hi-def screen. Waiting for / using the online movie download services

Amateur/pro editor/recorder - needs to be able to read/edit/produce movies for public (or private) consumption - and please don't reply with any porno jokes!

Sometimes the main TV is being used, e.g. by the wife, kids, or other students in the flat!

Offline Eddy-B

  • Hero Member
  • *****
  • Posts: 1186
    • http://www.eddy-b.com
Compression Algorithm
« Reply #10 on: March 16, 2007, 05:35:31 PM »
Compress 80 GB to just 200 bytes? That's not too hard; I can give you the command for that:  FORMAT C:
Rule #1:  Eddy is always right
Rule #2: If you think he's wrong, see rule #1
--------------------

Outpost : Renegades - Eddy-B.com - Electronics Pit

Offline White Claw

  • Hero Member
  • *****
  • Posts: 854
Compression Algorithm
« Reply #11 on: March 17, 2007, 08:05:25 AM »
HEY! That was my idea. (Make it all zeros...)

Offline Hooman

  • Administrator
  • Hero Member
  • *****
  • Posts: 4954
Compression Algorithm
« Reply #12 on: March 17, 2007, 05:17:59 PM »
I feel strangely offended that you would even call that compression.  :o

The point of compression seems to be that you can later restore the original data. I fail to see how format achieves that goal. (And I'm not looking for a smartass response about how the drive may have originally contained 0's that are being restored!)

Offline White Claw

  • Hero Member
  • *****
  • Posts: 854
Compression Algorithm
« Reply #13 on: March 17, 2007, 06:46:13 PM »
I think that, considering the whole idea is implausible, it seems to lend itself to smartass responses.

Edit: I just went to look up fractal compression. Even that is theorized only up to about 60:1. So you'd only get about 11 KB out of 200 bytes. (And from the sounds of it, fractal encoding is only good for images, and even at that, seems impractical.)
« Last Edit: March 17, 2007, 07:02:35 PM by White Claw »

Offline Hooman

  • Administrator
  • Hero Member
  • *****
  • Posts: 4954
Compression Algorithm
« Reply #14 on: March 18, 2007, 01:39:55 AM »
Yes, you do have a point there.