Agent Extraction · Sift

Read everything hidden inside a file.

Sift reads the hidden story inside a file: the camera and GPS behind a photo, the author and structure behind a PDF, the text and images locked inside a document. It does so in a fraction of the time it takes to launch a traditional tool. It is a clean-room rewrite in Rust that does the work of two entire legacy toolchains and slots natively into five programming languages.

2-in-1 replaces ExifTool and Poppler

20 file formats, JPEG to RAW to PDF

5 native language SDKs


Why Sift

Reading a file meant juggling old command-line tools.

The tools for reading file metadata and pulling text or images out of PDFs are a patchwork of aging command-line programs. You install each one separately, launch a new process for every file, and then parse its brittle text output back into data. They are platform-specific, slow at scale, and cannot be embedded in an application.
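To make that cost concrete, the legacy pattern looks roughly like the sketch below: one pdfinfo process per file, followed by hand-parsing of its "Key: value" text. The helper names parse_pdfinfo_output and legacy_pdf_info, and the sample text, are illustrative assumptions rather than part of any tool's API.

```python
import subprocess


def parse_pdfinfo_output(text):
    """Parse 'Key:   value' lines into a dict; every value stays an untyped string."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as timestamps survive intact.
        key, sep, value = line.partition(":")
        if sep:  # skip lines that carry no 'Key:' prefix
            info[key.strip()] = value.strip()
    return info


def legacy_pdf_info(path):
    """Launch one pdfinfo process for one file (requires Poppler to be installed)."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo_output(result.stdout)


# Illustrative sample in the shape pdfinfo prints, not captured from a real run.
sample = "Title:          Quarterly Report\nPages:          12\nEncrypted:      no\n"
info = parse_pdfinfo_output(sample)
print(info["Pages"])
```

Note that the page count comes back as the string "12" rather than a number, and any change in the tool's output layout would break the parser without warning.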

Sift collapses all of that into one fast, native library any program can call directly, with no external binaries to ship or shell out to.

Capabilities

What it can do.

Camera and photo metadata

Pulls EXIF, IPTC, XMP, and color-profile data (camera, exposure, GPS, captions, and copyright) from JPEG, TIFF, PNG, WebP, HEIF, AVIF, and more.

Proprietary camera maker notes

Decodes the vendor-specific data Canon, Nikon, Sony, Fujifilm, and Apple bury in their files, with a built-in lens-identification database.

RAW camera files

Reads ten RAW formats, from Canon and Nikon to Sony, Adobe DNG, and Fujifilm.

PDF text with layout intelligence

Reconstructs reading order, columns, words, and lines, handling multi-column pages, right-to-left scripts, and vertical CJK text, with correct Unicode mapping. Replaces pdftotext.

Images out of PDFs

Extracts embedded images including inline images and masks, with deduplication across pages. Replaces pdfimages.
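As a rough illustration of what deduplication across pages can mean, the sketch below keys each image payload by a content hash and records the pages it appears on. This is only one plausible approach; how Sift deduplicates is not described here, and the function name dedupe_images and its input shape are hypothetical.

```python
import hashlib


def dedupe_images(images_by_page):
    """Return unique image payloads in first-seen order, with the pages each appears on.

    images_by_page is an iterable of (page_number, [image_bytes, ...]) pairs.
    """
    seen = {}  # content hash -> {"data": bytes, "pages": [page numbers]}
    for page, payloads in images_by_page:
        for data in payloads:
            digest = hashlib.sha256(data).hexdigest()
            entry = seen.setdefault(digest, {"data": data, "pages": []})
            if page not in entry["pages"]:
                entry["pages"].append(page)
    return list(seen.values())


# Placeholder byte strings stand in for real image data.
logo = b"logo-bytes"
chart = b"chart-bytes"
unique = dedupe_images([(1, [logo, chart]), (2, [logo]), (3, [logo])])
print(len(unique))
```

Here a logo repeated on three pages is emitted once, with its page list preserved for reference.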

Full document info

Title, author, page count and sizes, version, encryption, tagged and linearized status, and PDF/A and PDF/X conformance. Replaces pdfinfo.

Password-protected PDFs

Authenticates user or owner passwords across every standard encryption scheme and reports the permissions the document allows.

Forms, annotations, and structure

Extracts interactive form fields, annotation types, and the tagged structure tree of headings, tables, and alt-text used for accessibility.

Built different

Why it is not just another parser.

Two toolchains, one library

Sift does in a single embeddable library what used to need ExifTool plus three separate Poppler programs, with nothing to install or shell out to.

Clean-room from the specifications

Written from the published standards, not ported from the restrictively licensed originals, so it is freely embeddable in commercial products, unlike the tools it replaces.

Memory-safe by construction

Built in Rust, eliminating the buffer-overflow bugs that have produced real security holes in image and PDF parsers. The format parsers contain no unsafe code.

Engineered to skip the heavy lifting

Instead of a full rendering engine, Sift reads only what is needed for text and metadata, using memory-mapped, zero-copy reads so only the bytes it touches are ever loaded.
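The idea of touching only the bytes you need can be sketched with Python's standard mmap module. The example below reads a PDF's version from its header and the startxref offset from its tail without reading the body of the file. It illustrates the general technique only, not Sift's internals, and the function name peek_pdf and the 1024-byte tail window are assumptions.

```python
import mmap
import re


def peek_pdf(path):
    """Return (version, startxref_offset), touching only the head and tail of the file.

    Either element is None when it cannot be found. Empty files cannot be
    memory-mapped, so mmap raises ValueError for them.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The header lives in the first few bytes, e.g. b"%PDF-1.7".
            header = re.match(rb"%PDF-(\d\.\d)", mm[:16])
            version = header.group(1).decode("ascii") if header else None

            # The cross-reference pointer sits near the end of the file.
            tail_start = max(0, len(mm) - 1024)
            pos = mm.rfind(b"startxref", tail_start)
            offset = None
            if pos != -1:
                found = re.match(rb"startxref\s+(\d+)", mm[pos:pos + 64])
                if found:
                    offset = int(found.group(1))
            return version, offset
```

Because the file is mapped rather than read, the operating system pages in only the regions that the slices and the tail search actually touch, however large the file is.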

Built for hostile input

Recovers gracefully from truncated files, corrupt cross-reference tables, and circular references, and is continuously fuzz-tested across every format.
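Circular references are a good example of hostile input: object A points to B, which points back to A, and a naive resolver loops forever. A minimal sketch of one defensive approach follows, tracking visited objects and capping chain depth. The data model (a dict of object ids, with ("ref", id) tuples standing in for indirect references) and the name resolve are hypothetical and are not Sift's API.

```python
def resolve(objects, obj_id, max_depth=64):
    """Follow ('ref', id) chains and return the final value.

    Returns None if the chain is circular, dangling, or longer than max_depth.
    """
    visited = set()
    current = obj_id
    while len(visited) < max_depth:
        if current in visited or current not in objects:
            return None  # cycle detected, or reference to a missing object
        visited.add(current)
        value = objects[current]
        if isinstance(value, tuple) and len(value) == 2 and value[0] == "ref":
            current = value[1]  # keep following the chain
        else:
            return value
    return None  # chain exceeded the depth cap


objects = {
    1: ("ref", 2),
    2: ("ref", 3),
    3: "page tree",
    4: ("ref", 5),  # 4 -> 5 -> 4 is a cycle
    5: ("ref", 4),
    6: ("ref", 99),  # dangling reference
}
print(resolve(objects, 1))
```

In this sketch, resolving object 1 yields "page tree", while the cyclic object 4 and the dangling object 6 both return None instead of hanging or raising.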

Native in five ecosystems

The same core ships as idiomatic packages for Rust, .NET, Java, Python, and Node.js, plus a C interface, not thin wrappers around a CLI.

By the numbers

One library, deep coverage.

20 file formats parsed

EXIF/IPTC/XMP/ICC metadata standards

10 RAW camera formats

Zero-copy memory-mapped reads

Agent-first

Understand any file the moment it arrives.

When an agent receives an uploaded file, it needs to know what the file is and what is inside before it can act, and it needs that answer in microseconds, not after spinning up a subprocess. Sift gives an agent an instant, structured read of any image or PDF: the metadata, the text, the embedded images, the location, the document structure, all from one in-process call with no external tools to orchestrate, returning typed data it can reason over directly.

Identify and triage any uploaded image or PDF the instant it arrives.
Pull a PDF's full text and reading order straight into a model's context.
Surface provenance signals (camera, GPS, dates, embedded scripts, encryption) for trust decisions.
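The first triage step, working out what an uploaded file is, can be sketched from well-known file signatures alone. The function below is a hypothetical stand-in that uses only the Python standard library. It is not Sift's API, and it covers just a handful of the formats listed above.

```python
def sniff(data):
    """Guess a file type from its leading bytes; returns None when nothing matches."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"%PDF-"):
        return "pdf"
    # TIFF starts with a byte-order mark: little-endian "II" or big-endian "MM".
    if data.startswith((b"II*\x00", b"MM\x00*")):
        return "tiff"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


print(sniff(b"%PDF-1.7\n"))
```

Several RAW formats are built on the TIFF container, so a check this shallow would report them as "tiff". Telling them apart requires reading further into the file, which is where a real parser takes over.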

Put Agent Extraction to work.

See HQ running in your own Slack or Teams, on the operating system we built for agents.
