Talk:ZIP (file format)

From Wikipedia, the free encyclopedia

Section "Implementation" contains incorrect information regarding the Windows zip feature

The text "For example, encryption is not supported in Windows 10 Home edition" references an article about NTFS's EFS (Encrypting File System) encryption, which has nothing to do with the encryption used in zip files. This text should be changed to contain correct information. --Sonic The Hedgehog LNK1123 (talk) 03:57, 6 April 2021 (UTC)

Deflate details

When using zlib for deflate compression, it looks like one needs to use "raw deflate", with no zlib header or trailer, by passing a negative windowBits value to deflateInit2(). Otherwise unzip, for example, fails with the complaint "invalid compressed data to inflate".

Also, the 2-byte 'compression method' fields need to be 8, not 0x800. -- Preceding unsigned comment added by 82.68.48.14 (talk) 23:58, 29 April 2021 (UTC)
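To make the two points above concrete, here is a minimal sketch in Python rather than C. It assumes Python's zlib binding, where a negative wbits argument to zlib.compressobj() plays the role of the negative windowBits in deflateInit2(). The helper name make_single_file_zip is hypothetical, and the sketch ignores ZIP64, extra fields and comments. The method value 8 is packed little-endian, so it appears in the file as the bytes 08 00.

```python
import io
import struct
import zipfile
import zlib


def make_single_file_zip(name: bytes, data: bytes) -> bytes:
    """Build a one-entry ZIP archive by hand (hypothetical helper, no ZIP64)."""
    # Negative wbits (-15) requests a raw deflate stream:
    # no 2-byte zlib header and no Adler-32 trailer.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    payload = compressor.compress(data) + compressor.flush()
    crc = zlib.crc32(data)
    method = 8                     # deflate; stored on disk as bytes 08 00
    dos_time, dos_date = 0, 0x21   # 1980-01-01 00:00:00

    # Local file header (30 bytes) followed by the file name.
    local = struct.pack(
        "<IHHHHHIIIHH",
        0x04034B50, 20, 0, method, dos_time, dos_date,
        crc, len(payload), len(data), len(name), 0,
    ) + name

    # Central directory header (46 bytes) followed by the file name.
    # The last field is the offset of the local header, here 0.
    central = struct.pack(
        "<IHHHHHHIIIHHHHHII",
        0x02014B50, 20, 20, 0, method, dos_time, dos_date,
        crc, len(payload), len(data), len(name), 0, 0, 0, 0, 0, 0,
    ) + name

    # End of central directory record (22 bytes, empty comment).
    cd_offset = len(local) + len(payload)
    eocd = struct.pack(
        "<IHHHHIIH",
        0x06054B50, 0, 0, 1, 1, len(central), cd_offset, 0,
    )
    return local + payload + central + eocd


if __name__ == "__main__":
    original = b"hello " * 100
    blob = make_single_file_zip(b"hello.txt", original)
    # Python's own reader accepts the hand-built archive.
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        print(zf.read("hello.txt") == original)
```

If the payload were produced with the default zlib wrapper instead, the extra header bytes would sit at the start of the compressed data, which is consistent with the "invalid compressed data to inflate" failure described above.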

Are unreferenced/zombie files allowed?

The article says:

Because ZIP files may be appended to, only files specified in the central directory at the end of the file are valid. Scanning a ZIP file for local file headers is invalid (except in the case of corrupted archives), as the central directory may declare that some files have been deleted and other files have been updated.

However, the specification (6.3.9) says:

4.3.2 Each file placed into a ZIP file MUST be preceded by a "local file header" record for that file. Each "local file header" MUST be accompanied by a corresponding "central directory header" record within the central directory section of the ZIP file.

Also, "4.3.6 Overall .ZIP file format" and "4.3.8 File Data" do not mention that arbitrary meaningless bytes may be stored between files.

I checked that 7-zip does not support unreferenced files. If I take a zip file and rewrite the central directory, removing some files from it, 7-zip still shows those files as available. If I additionally replace the signatures of their local file headers with garbage, 7-zip cannot open the resulting file. Discussion on 7-zip forum.

I think this claim should either be removed or be given a citation. -- Preceding unsigned comment added by Stgatilov (talk · contribs) 04:36, 11 May 2021 (UTC)

If you scan from the beginning for local file headers, then there's also the problem of a ZIP file contained uncompressed within another ZIP file (which is legal). AnonMoos (talk) 11:15, 11 May 2021 (UTC)
Unfortunately, unreferenced files can occur, depending on how the zip application changes and updates files. An application can write a new file signature (local file header) at the location where the central directory starts, and then write the central directory back out after it. The only thing that needs updating in the central directory is the one entry for that file, so that it points to the new file signature. After the central directory we write the end of directory signature, which must hold the updated position at which the central directory starts.
If we read the zip the proper way, bottom up, we find the end of directory signature first, and it tells us where the central directory starts. Each central directory entry has the file path and the location of the file signature. Each file signature also contains the file path. If we instead read from the start of the zip downward, we would see two file signatures with the same file name and path. The central directory points to only one of them: the one that was added at the old start of the central directory before the central directory was written back out. Two values are adjusted in the process: the entry's file signature location, and the start position of the central directory. Damian Recoskie (talk) 21:37, 1 March 2023 (UTC)
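The difference between the two reading directions discussed in this thread can be sketched in Python. This is only an illustration under simplifying assumptions: no ZIP64, a single disk, and an end of directory record located with a plain backwards search, which could be fooled by a signature inside an archive comment. The function names are hypothetical. The demo archive is the nested, uncompressed ZIP case mentioned above.

```python
import io
import struct
import zipfile


def read_central_directory(blob: bytes):
    """Read bottom up: return (name, local header offset) per entry."""
    eocd_pos = blob.rfind(b"PK\x05\x06")
    if eocd_pos < 0:
        raise ValueError("end of central directory record not found")
    eocd = struct.unpack_from("<IHHHHIIH", blob, eocd_pos)
    total_entries, cd_offset = eocd[4], eocd[6]

    entries = []
    pos = cd_offset
    for _ in range(total_entries):
        fields = struct.unpack_from("<IHHHHHHIIIHHHHHII", blob, pos)
        if fields[0] != 0x02014B50:
            raise ValueError("bad central directory signature")
        flags = fields[3]
        name_len, extra_len, comment_len = fields[10:13]
        local_offset = fields[16]
        # Flag bit 11 (0x800) marks UTF-8 names; otherwise assume CP437.
        encoding = "utf-8" if flags & 0x800 else "cp437"
        name = blob[pos + 46:pos + 46 + name_len].decode(encoding)
        entries.append((name, local_offset))
        pos += 46 + name_len + extra_len + comment_len
    return entries


def scan_local_headers(blob: bytes):
    """Read top down: offsets of every local file header signature."""
    offsets = []
    pos = blob.find(b"PK\x03\x04")
    while pos >= 0:
        offsets.append(pos)
        pos = blob.find(b"PK\x03\x04", pos + 1)
    return offsets


def make_nested_archive() -> bytes:
    """A ZIP stored uncompressed inside another ZIP (demo data only)."""
    stamp = (1980, 1, 1, 0, 0, 0)
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        inner.writestr(zipfile.ZipInfo("inner.txt", stamp), b"hi")
    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        # ZipInfo defaults to ZIP_STORED, so the inner bytes appear verbatim.
        outer.writestr(zipfile.ZipInfo("inner.zip", stamp), inner_buf.getvalue())
    return outer_buf.getvalue()


if __name__ == "__main__":
    blob = make_nested_archive()
    print("central directory:", read_central_directory(blob))
    print("signature scan:   ", scan_local_headers(blob))
```

The central directory lists a single entry, while the top-down scan also reports the local file header that belongs to the inner archive.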

How unreferenced files happen.

First of all, the explanation should be simplified so that everyone can understand it, rather than using phrases like "appending files to the end of the zip".

The fast way of reading a zip is bottom up. Once the end of directory signature is found, it tells you where in the file the central directory starts. The central directory is a listing of the file paths in the zip, each followed by the location at which the file signature for that file can be found.

As a bit of a note, each file signature contains the file path of the file as well. We can write a new file signature for a file at the location where the central directory starts. We can then write the central directory after this new file signature and update the file signature location of just that one entry.

This means you end up with two file signatures in the zip for the same file, of which the one we added contains the changes. So in reality we can add files at the start of the central directory and then write the central directory back out, with the entry for the changed file pointing to the added file signature. It also means we end up with a file signature that is not referenced from the central directory. 24.150.140.42 (talk) 21:04, 1 March 2023 (UTC)
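The update procedure described above can be sketched in Python as follows. This is a hand-rolled illustration, not the behaviour of any particular zip tool. It assumes a one-entry archive, stored (uncompressed) data, no ZIP64 and no archive comment, and all helper names are hypothetical.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG, CENTRAL_SIG, EOCD_SIG = 0x04034B50, 0x02014B50, 0x06054B50
DOS_DATE = 0x21  # 1980-01-01


def local_record(name: bytes, data: bytes) -> bytes:
    """Local file header plus stored (method 0) file data."""
    header = struct.pack(
        "<IHHHHHIIIHH", LOCAL_SIG, 20, 0, 0, 0, DOS_DATE,
        zlib.crc32(data), len(data), len(data), len(name), 0,
    )
    return header + name + data


def central_record(name: bytes, data: bytes, local_offset: int) -> bytes:
    """Central directory entry pointing at one local file header."""
    header = struct.pack(
        "<IHHHHHHIIIHHHHHII", CENTRAL_SIG, 20, 20, 0, 0, 0, DOS_DATE,
        zlib.crc32(data), len(data), len(data), len(name),
        0, 0, 0, 0, 0, local_offset,
    )
    return header + name


def eocd_record(count: int, cd_size: int, cd_offset: int) -> bytes:
    """End of central directory record with an empty comment."""
    return struct.pack("<IHHHHIIH", EOCD_SIG, 0, 0, count, count,
                       cd_size, cd_offset, 0)


def build(name: bytes, data: bytes) -> bytes:
    body = local_record(name, data)
    directory = central_record(name, data, 0)
    return body + directory + eocd_record(1, len(directory), len(body))


def update_entry(archive: bytes, name: bytes, new_data: bytes) -> bytes:
    """Replace the single entry without touching its old local record."""
    # Assumes no archive comment, so the EOCD is the last 22 bytes.
    fields = struct.unpack_from("<IHHHHIIH", archive, len(archive) - 22)
    old_cd_offset = fields[6]
    kept = archive[:old_cd_offset]           # old file signature stays put
    added = local_record(name, new_data)     # written where the CD used to start
    directory = central_record(name, new_data, old_cd_offset)
    new_cd_offset = old_cd_offset + len(added)
    return kept + added + directory + eocd_record(1, len(directory), new_cd_offset)


if __name__ == "__main__":
    updated = update_entry(build(b"a.txt", b"old"), b"a.txt", b"new!")
    with zipfile.ZipFile(io.BytesIO(updated)) as zf:
        print(zf.namelist(), zf.read("a.txt"))
    print("local file header signatures:", updated.count(b"PK\x03\x04"))
```

A reader that goes through the central directory sees only the updated copy, while the first local record is still physically present in the file and is no longer referenced.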