From: Florian Diesch on
Eli the Bearded <*@eli.users.panix.com> writes:

> PDFs, SWFs, ROM images, non-standard container formats. I'd like some
> tool that can scan those for well-formed (continuous) bits of media.
>
> The strings(1) command can do that for finding text although it's not
> as intelligent as I'd like it to be. (Why do wide characters only get
> found with a special command line argument?)
>
> Are there tools to do this for JPEGs, GIFs, MP3s, AVIs? I know there
> are tools to find JPEGs in disk images (e.g., for recovery of photos from
> camera media) but I suspect those rely on what's left of the filesystem
> information, so as to be able to reconstruct discontinuous files.

http://foremost.sourceforge.net/
http://jbj.rapanden.dk/magicrescue/


Florian
--
<http://www.florian-diesch.de/software/shell-scripts/>
From: Darren Salt on
I demand that Eli the Bearded may or may not have written...

[snip]
> $ jpegtran -copy all foo.jpeg+extra > foo.jpeg

> That is useful for removing the trailing extra bytes from a JPEG without
> also losing comments, Exif, etc, or doing a lossy recompression that you
> might get by opening and resaving the file. It doesn't cover the case of
> GIF or PNG or AVI or anything else.

For PNG, /\x89PNG\r\n\x1A\n/ marks the start (it is the 8-byte file
signature). You can parse the content fairly easily; each chunk (following
the signature) has a length word (32-bit, big-endian, counting only the data,
not the chunk header or the CRC) and a name (4 bytes), then <length> bytes of
data, then a 32-bit CRC computed over the name and the data. The first
chunk's length word immediately follows the signature, and the last chunk has
length=0 and name="IEND".
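
To make that concrete, here is a rough Python sketch of the chunk walk applied
to an arbitrary blob (a ROM image, a PDF, whatever). The function names
find_pngs and _walk_chunks are made up for this example, and it assumes the
whole blob fits in memory. It looks for the 8-byte start marker, then follows
length/name/data/CRC records until it reaches a zero-length IEND, rejecting
any candidate whose CRC does not match.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _walk_chunks(blob, offset):
    """Return the offset just past IEND's CRC, or None if a chunk is malformed."""
    # Every chunk needs at least 12 bytes: length (4) + name (4) + CRC (4).
    while offset + 12 <= len(blob):
        length, name = struct.unpack_from(">I4s", blob, offset)
        data_end = offset + 8 + length
        if data_end + 4 > len(blob):
            return None  # chunk claims more data than the blob holds
        (stored_crc,) = struct.unpack_from(">I", blob, data_end)
        # The CRC covers the 4-byte name plus the data, not the length word.
        if zlib.crc32(blob[offset + 4:data_end]) != stored_crc:
            return None
        offset = data_end + 4
        if name == b"IEND":
            return offset if length == 0 else None
    return None


def find_pngs(blob):
    """Yield (start, end) byte offsets of each well-formed PNG found in blob."""
    pos = blob.find(PNG_SIGNATURE)
    while pos != -1:
        end = _walk_chunks(blob, pos + len(PNG_SIGNATURE))
        if end is not None:
            yield pos, end
            pos = blob.find(PNG_SIGNATURE, end)
        else:
            # False positive or damaged file: resume the search one byte on.
            pos = blob.find(PNG_SIGNATURE, pos + 1)
```

Something like blob[start:end] for each pair returned by find_pngs(blob) then
gives you the image without any trailing bytes. Note that this only checks
the chunk framing; it does not check that the chunks make a decodable image.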

--
| Darren Salt | linux at youmustbejoking | nr. Ashington, | Doon
| using Debian GNU/Linux | or ds ,demon,co,uk | Northumberland | Army
| + http://www.xine-project.org/

You will be aided greatly by a person whom you thought to be unimportant.