From: Roedy Green on 27 Jan 2010 13:38

On Tue, 26 Jan 2010 22:17:57 -0800, "Mike Schilling" <mscottschilling(a)hotmail.com> wrote, quoted or indirectly quoted someone who said :

>I'm confused. How is "+" a weird character that can't be stored as
>ASCII?

+ is an odd character for filenames. It usually means concatenation. Perhaps Phil Katz originally used some simple compression on ASCII filenames. It has been a long time since I studied the file format. Remember that PkZip started out with the DOS 8.3 all case-insensitive file system.

The way to answer these questions:

1. read spec at PkZip.com
2. read docs at WinZip.com
3. create some sample zip files and look at them with a hex editor.
4. compress and fluff some sample files and compare attributes/timestamps.

See
http://mindprod.com/jgloss/zip.html
http://mindprod.com/jgloss/pkzip.html
http://mindprod.com/jgloss/winzip.html
http://mindprod.com/jgloss/hex.html

--
Roedy Green Canadian Mind Products
http://mindprod.com
Computers are useless. They can only give you answers.
~ Pablo Picasso (born: 1881-10-25 died: 1973-04-08 at age: 91)
From: Tom Anderson on 27 Jan 2010 13:39

On Tue, 26 Jan 2010, Arne Vajhøj wrote:

> On 26-01-2010 05:04, Roedy Green wrote:
>> On Mon, 25 Jan 2010 23:41:35 +0100, Erik<et57(a)hotmail.com> wrote,
>> quoted or indirectly quoted someone who said :
>>> file last modified on (0x00003c39 0x0000b52a): 2010-01-25
>>
>> One problem with ZIP format that bedevils me is that when you put a
>> file into a zip, then restore it, the timestamp can be out by up to 2
>> seconds! The restored file looks like a DIFFERENT version of the file.
>
> The format only has 5 bits for seconds.
>
> No surprise that it can be off.
>
>> Further the timestamps are in local timezone rather than GMT, and the
>> timezone is not recorded. Arrgh. I have been bugging the Winzip and
>> the Truezip people to fix this.
>>
>> Vendors are reluctant, I think, primarily because an upward compatible
>> solution would make files fatter. Archivers compete ferociously.
>
> The ZIP format is a well-defined format (defined in APPNOTE).
>
> Picking a new time format would make it not zip.
>
> And would make it unreadable by all other zip tools out there.

There is an 'extra field' in the file header record. It's structured into tag-length-value chunks which can hold arbitrary extra metadata. Tag 0x5455 is not formally standardised, but is one of the listed "third party mappings commonly used", and is described as "extended timestamp". You will note that taken as a two-character ASCII string, 0x5455 is "UT". It seems to be defined and quasi-standardised by InfoZIP; see this file from InfoZIP hosted by your new favourite microchip manufacturer:

http://www.opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld

Which explains that it can contain any combination of modification, access, and creation times, described by a bitfield, and that:

  The time values are in standard Unix signed-long format, indicating the
  number of seconds since 1 January 1970 00:00:00. The times are relative
  to Coordinated Universal Time (UTC), also sometimes referred to as
  Greenwich Mean Time (GMT).

Although looking at the InfoZIP source code, there seems to be a lot of special-casing which suggests to me that not all tools follow those rules to the letter.

There are also a variety of more formally standardised OS-specific metainfo blocks, which can contain timestamps. A polyglot tool which could read all these could provide better timestamps on extracted files even in the absence of a 0x5455 header.

tom

--
I never meant to say that the Conservatives are generally stupid. I meant to say that stupid people are generally Conservative. I believe that is so obviously and universally admitted a principle that I hardly think any gentleman will deny it. -- John Stuart Mill
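The tag-length-value walk described in the post above can be sketched in Java. This is only an illustration, not InfoZIP's code: the class and method names are made up, and the chunk layout (little-endian 16-bit tag, 16-bit length, then for tag 0x5455 a flags byte whose bit 0 marks a 4-byte modification time) is an assumption based on a reading of the extra.fld notes. As the post warns, real tools may deviate from it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;
import java.util.Optional;

/** Hypothetical helper: finds the "UT" extended timestamp in a ZIP extra field. */
final class ExtendedTimestamp {
    static final int TAG_UT = 0x5455; // "UT" when read as two ASCII characters

    /**
     * Walks the tag-length-value chunks and returns the modification time
     * from the first 0x5455 chunk that carries one, if any.
     */
    static Optional<Instant> modificationTime(byte[] extra) {
        if (extra == null) {
            return Optional.empty();
        }
        // Assumed: all multi-byte values in the extra field are little-endian.
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int tag = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break; // truncated chunk: stop rather than read past the end
            }
            if (tag == TAG_UT && len >= 5) {
                int flags = buf.get(buf.position()) & 0xFF;
                if ((flags & 0x01) != 0) { // assumed: bit 0 = modification time present
                    int seconds = buf.getInt(buf.position() + 1); // signed 32-bit Unix time, UTC
                    return Optional.of(Instant.ofEpochSecond(seconds));
                }
            }
            buf.position(buf.position() + len); // skip to the next chunk
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // One unrelated 2-byte chunk, then a UT chunk holding a modification time.
        byte[] extra = {
            0x01, 0x00, 0x02, 0x00, 0x00, 0x00,
            0x55, 0x54, 0x05, 0x00, 0x01, 0x00, 0x3B, 0x3D, 0x4B
        };
        System.out.println(modificationTime(extra));
    }
}
```

In a real program the byte array would typically come from java.util.zip.ZipEntry.getExtra(), which returns null when an entry has no extra field.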
From: Arne Vajhøj on 27 Jan 2010 22:56

On 27-01-2010 13:39, Tom Anderson wrote:
> On Tue, 26 Jan 2010, Arne Vajhøj wrote:
>
>> On 26-01-2010 05:04, Roedy Green wrote:
>>> On Mon, 25 Jan 2010 23:41:35 +0100, Erik<et57(a)hotmail.com> wrote,
>>> quoted or indirectly quoted someone who said :
>>>> file last modified on (0x00003c39 0x0000b52a): 2010-01-25
>>>
>>> One problem with ZIP format that bedevils me is that when you put a
>>> file into a zip, then restore it, the timestamp can be out by up to 2
>>> seconds! The restored file looks like a DIFFERENT version of the file.
>>
>> The format only has 5 bits for seconds.
>>
>> No surprise that it can be off.
>>
>>> Further the timestamps are in local timezone rather than GMT, and the
>>> timezone is not recorded. Arrgh. I have been bugging the Winzip and
>>> the Truezip people to fix this.
>>>
>>> Vendors are reluctant, I think, primarily because an upward compatible
>>> solution would make files fatter. Archivers compete ferociously.
>>
>> The ZIP format is a well-defined format (defined in APPNOTE).
>>
>> Picking a new time format would make it not zip.
>>
>> And would make it unreadable by all other zip tools out there.
>
> There is an 'extra field' in the file header record. It's structured
> into tag-length-value chunks which can hold arbitrary extra metadata.
> Tag 0x5455 is not formally standardised, but is one of the listed "third
> party mappings commonly used", and is described as "extended timestamp".
> You will note that taken as a two-character ASCII string, 0x5455 is
> "UT". It seems to be defined and quasi-standardised by InfoZIP; see this
> file from InfoZIP hosted by your new favourite microchip manufacturer:
>
> http://www.opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
>
> Which explains that it can contain any combination of modification,
> access, and creation times, described by a bitfield, and that:
>
>   The time values are in standard Unix signed-long format, indicating the
>   number of seconds since 1 January 1970 00:00:00. The times are relative
>   to Coordinated Universal Time (UTC), also sometimes referred to as
>   Greenwich Mean Time (GMT).
>
> Although looking at the InfoZIP source code, there seems to be a lot of
> special-casing which suggests to me that not all tools follow those
> rules to the letter.
>
> There are also a variety of more formally standardised OS-specific
> metainfo blocks, which can contain timestamps. A polyglot tool which
> could read all these could provide better timestamps on extracted files
> even in the absence of a 0x5455 header.

You are correct.

An extension would not break anything.

And if implementations could actually start agreeing on using it, then it could become very useful.

Arne
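The "5 bits for seconds" point in the quoted exchange can be made concrete with a small Java sketch. It assumes the two hex words in the quoted WinZIP output, 0x3c39 and 0xb52a, are the classic MS-DOS date and time words (7 bits year since 1980, 4 bits month, 5 bits day; 5 bits hour, 6 bits minute, 5 bits seconds divided by two). The class and method names are invented for the example.

```java
import java.time.LocalDateTime;

/** Hypothetical helper illustrating the 16-bit MS-DOS date and time words used by ZIP. */
final class DosTime {

    /** Decodes a DOS date word and time word into a local (zone-less) date-time. */
    static LocalDateTime decode(int dosDate, int dosTime) {
        int year = 1980 + ((dosDate >> 9) & 0x7F); // 7 bits, years since 1980
        int month = (dosDate >> 5) & 0x0F;         // 4 bits
        int day = dosDate & 0x1F;                  // 5 bits
        int hour = (dosTime >> 11) & 0x1F;         // 5 bits
        int minute = (dosTime >> 5) & 0x3F;        // 6 bits
        int second = (dosTime & 0x1F) * 2;         // 5 bits, stored as seconds / 2
        // LocalDateTime.of throws DateTimeException for out-of-range fields,
        // which malformed archives can contain.
        return LocalDateTime.of(year, month, day, hour, minute, second);
    }

    /** Encodes a time of day; odd seconds are truncated to the even second below. */
    static int encodeTime(int hour, int minute, int second) {
        return (hour << 11) | (minute << 5) | (second / 2);
    }

    public static void main(String[] args) {
        // The date and time words quoted earlier in the thread.
        System.out.println(decode(0x3c39, 0xb52a)); // prints 2010-01-25T22:41:20

        // Round trip: 22:41:35 cannot be represented exactly.
        int word = encodeTime(22, 41, 35);
        System.out.println(decode(0x3c39, word).getSecond()); // prints 34
    }
}
```

The decoded value has no time zone attached, which is the second complaint in the thread: the archive does not record whether 22:41:20 is local time or GMT.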
From: Roedy Green on 28 Jan 2010 18:20

On Wed, 27 Jan 2010 18:39:09 +0000, Tom Anderson <twic(a)urchin.earth.li> wrote, quoted or indirectly quoted someone who said :

> The time values are in standard Unix signed-long format, indicating the
> number of seconds since 1 January 1970 00:00:00. The times are relative
> to Coordinated Universal Time (UTC), also sometimes referred to as
> Greenwich Mean Time (GMT).

Finally, some progress.

The thing that is so funny about these problems is that any one solution is trivial. The difficulty is introducing it in a way that does not trip up other users of the files, and persuading people to converge on a common solution. The precise details of how it works are almost irrelevant, since only a very few programmers ever have to deal with it. Everyone else will deal with it via a simple API.

The other problem is trying to persuade some vendor to pioneer the feature. Vendors are reluctant to do so, even if they see the need, because soon after a slightly different consensus scheme may be introduced, leaving them with an incompatible legacy.

I hope someone does a thesis on these sorts of problems, researching the politics involved and how successful consensuses are reached quickly. Maybe the game theorists could explain the behaviours.

--
Roedy Green Canadian Mind Products
http://mindprod.com
From: Erik on 30 Jan 2010 07:35

WinZIP itself provides this info on any zip file opened.
File->Properties->details

On Wed, 27 Jan 2010 00:58:49 -0800, Roedy Green <see_website(a)mindprod.com.invalid> wrote:

>On Mon, 25 Jan 2010 23:41:35 +0100, Erik <et57(a)hotmail.com> wrote,
>quoted or indirectly quoted someone who said :
>
>>Some additional info from WinZIP about the Java-generated zip file:
>
>I looked all over their site but could not find that info. Did you get
>it in email?