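hey. you know how every code forge in existence offers you two types of downloads: tar and zip?
i wonder if you can make that into one file that is both tar and zip.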
-
- zip has its trailer at the end of the file, while tar has headers in between archive members, so that works out
- gzip is DEFLATE with a header, while zip supports DEFLATE, so this also works out
- DEFLATE is almost closed under concatenation (there is a "this block is the last one" flag), but not quite
if there was a way to make each archive member's data its own DEFLATE stream, and each tar header its own DEFLATE stream too, and then prepend a gzip header and append a zip trailer, it could all work!
oh, i think this can be made to work!
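
A minimal sketch of the DEFLATE part of this idea, assuming Python's `zlib` module. Each piece (a tar header or a member's data) is compressed on its own as raw DEFLATE and flushed with `Z_FULL_FLUSH`, which pads the output to a byte boundary and leaves the "this block is the last one" flag unset. The pieces can then be concatenated and closed with a single tiny final block. The helper names and sample data are made up for illustration, and the gzip header and zip trailer are not built here.

```python
import zlib

def deflate_piece(data: bytes) -> bytes:
    """Compress data as raw DEFLATE blocks with no 'last block' flag set.

    Z_FULL_FLUSH ends the output on a byte boundary with an empty stored
    block, so another piece can follow it directly.
    """
    c = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15)
    return c.compress(data) + c.flush(zlib.Z_FULL_FLUSH)

def deflate_terminator() -> bytes:
    """An empty raw DEFLATE stream: one block carrying the 'last block' flag."""
    c = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15)
    return c.flush(zlib.Z_FINISH)

# Hypothetical pieces standing in for tar headers and member contents.
pieces = [
    b"<tar header for a.txt>",
    b"contents of a.txt\n" * 50,
    b"<tar header for b.txt>",
    b"contents of b.txt\n" * 50,
]
compressed = [deflate_piece(p) for p in pieces]

# Concatenating the independently compressed pieces gives one valid stream.
stream = b"".join(compressed) + deflate_terminator()
assert zlib.decompress(stream, wbits=-15) == b"".join(pieces)

# Any single piece plus the terminator also decodes on its own.
assert zlib.decompress(compressed[1] + deflate_terminator(), wbits=-15) == pieces[1]
```

The "not quite" is visible here: a piece only becomes a complete stream once the terminator follows it, so whatever plays the role of a zip member's data has to account for that final block somehow.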

-
hey. you know how every code forge in existence offers you two types of downloads: tar and zip?
i wonder if you can make that into one file that is both tar and zip.
@whitequark probably, but the benefit of tar.gz is that it can compress across files.
-
hey. you know how every code forge in existence offers you two types of downloads: tar and zip?
i wonder if you can make that into one file that is both tar and zip.
@whitequark There's also a whole genre of polyglot hacks: https://github.com/corkami/docs/blob/master/AbusingFileFormats/README.md.
-
- zip has its trailer at the end of the file, while tar has headers in between archive members, so that works out
- gzip is DEFLATE with a header, while zip supports DEFLATE, so this also works out
- DEFLATE is almost closed under concatenation (there is a "this block is the last one" flag), but not quite
if there was a way to make each archive member's data its own DEFLATE stream, and each tar header its own DEFLATE stream too, and then prepend a gzip header and append a zip trailer, it could all work!
@whitequark I think zip also has headers between members? this reminds me of one .zip file I had encountered which had different contents if you scanned it forwards from the beginning (python zip module) vs if you went backwards from the trailer (normal programs)
-
@whitequark probably, but the benefit of tar.gz is that it can compress across files.
@leah @whitequark For a download you can send Content-Encoding: gzip instead of Content-Type: application/gzip (or as well as, it won’t make a difference if the whole stream is already compressed)
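
A rough sketch of what that could look like on the server side, assuming the payload is a plain tar and the compression is declared as a transfer detail rather than as the file type. The function name and the `application/x-tar` media type are assumptions for illustration, not something from the thread.

```python
import gzip

def tarball_response(tar_bytes: bytes) -> tuple[dict[str, str], bytes]:
    """Build headers and body for a tar download compressed in transit."""
    body = gzip.compress(tar_bytes)
    headers = {
        # The resource itself is described as a tar archive...
        "Content-Type": "application/x-tar",
        # ...and gzip is declared as an encoding applied on top of it.
        "Content-Encoding": "gzip",
        "Content-Length": str(len(body)),
    }
    return headers, body

headers, body = tarball_response(b"pretend this is a tar archive\n" * 100)
assert gzip.decompress(body) == b"pretend this is a tar archive\n" * 100
```

A client that honors `Content-Encoding` would decode the body before saving it, so it ends up with the uncompressed tar.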
-
@whitequark probably, but the benefit of tar.gz is that it can compress across files.
@leah to me the benefit of tar.gz is that it can represent the x bit
-
@whitequark There's also a whole genre of polyglot hacks: https://github.com/corkami/docs/blob/master/AbusingFileFormats/README.md.
@pervognsen yeah i know
-
@whitequark I think zip also has headers between members? this reminds me of one .zip file I had encountered which had different contents if you scanned it forwards from the beginning (python zip module) vs if you went backwards from the trailer (normal programs)
@grawity iirc the zip file headers can appear in a somewhat random order
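
To make the two views of a zip concrete, here is a small sketch using Python's `zipfile` and `struct`: it builds a two-member archive in memory, then locates the end-of-central-directory record by searching backwards from the end, and separately checks that a local file header sits at the very start. The member names are placeholders, and the parsing assumes a small archive with no zip64 records and no archive comment.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", "alpha\n")
    zf.writestr("b.txt", "beta\n")
raw = buf.getvalue()

# Backwards view: the end-of-central-directory record is the trailer.
eocd_pos = raw.rfind(b"PK\x05\x06")
(_sig, _disk, _cd_disk, _n_here, n_total,
 cd_size, cd_offset, _comment_len) = struct.unpack(
    "<4sHHHHIIH", raw[eocd_pos:eocd_pos + 22])

# The trailer points at the central directory, which lists every member
# together with the offset of its local header.
assert raw[cd_offset:cd_offset + 4] == b"PK\x01\x02"

# Forwards view: each member is also preceded by its own local header.
assert raw[:4] == b"PK\x03\x04"

with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    offsets = {info.filename: info.header_offset for info in zf.infolist()}
print(n_total, cd_offset, offsets)
```

Because the central directory stores offsets, nothing forces the local headers to appear in the same order as the directory entries, or forces every local header in the file to be listed at all. A forward scanner and a trailer-first reader can therefore disagree about the contents.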
-
@whitequark zip because Windows?
(Nowadays Windows 11 can unzip more than .zip anyway so that's slowly getting "solved")
Also yeah, that kind of trick could be really fun but ultimately they would still be .zips
Not just Windows. The fact that you can extract individual files from a zip without full extraction has a few advantages in some use cases. Including using the file as backing store for a read-only filesystem.
-
Not just Windows. The fact that you can extract individual files from a zip without full extraction has a few advantages in some use cases. Including using the file as backing store for a read-only filesystem.
@david_chisnall @Ronflaix this is in fact my motivating example: I have a tarball that's 3 TB long, and I don't have a spare 3 TB (or spare 3.5 hours) every time I need one file from it
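
A small sketch of that random-access property, assuming Python's `zipfile`: a wrapper around the archive counts how many bytes are actually read while pulling out one member. `CountingBytesIO`, the member names, and the payload are made up for illustration.

```python
import io
import zipfile

class CountingBytesIO(io.BytesIO):
    """In-memory file that records how many bytes were read from it."""

    def __init__(self, data: bytes):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk

# Ten stored (uncompressed) members of 100 KiB each.
payload = bytes(range(256)) * 400
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    for i in range(10):
        zf.writestr(f"member{i}.bin", payload)
raw = buf.getvalue()

# Extract a single member; zipfile seeks to the trailer, then to the member.
src = CountingBytesIO(raw)
with zipfile.ZipFile(src) as zf:
    one = zf.read("member7.bin")

assert one == payload
print(f"read {src.bytes_read} of {len(raw)} bytes")
```

Getting the same member out of a plain tar generally means walking the headers from the start, and out of a tar.gz it means decompressing everything before it.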
-
oh, i think this can be made to work!

@whitequark ... I like the way you're thinking.
-
