case study

Converting Hex Data to Binary for Image and PDF Processing

Client Background

A government agency needed a reliable solution to convert hexadecimal-encoded data back into binary format for further processing. The data, originating from a legacy document storage system, contained mixed content, including images and PDFs, which required proper identification and extraction.

The Problem

Challenges Faced

  • The client had large volumes of hex-encoded data representing documents and images but lacked a streamlined way to decode and process it.
  • Some files were corrupted or incomplete, requiring additional validation steps.
  • The system needed to distinguish between images and PDFs automatically to ensure proper handling.
  • Manual conversion was not feasible given the volume of data and the processing speed required.

The Solution

Our Approach & Process

To automate the conversion and processing, we developed a custom Python-based solution that:

  1. Decoded the hex-encoded data back into binary format using Python's built-in bytes.fromhex().
  2. Identified file types through custom magic number detection, analyzing byte patterns for PNG, JPEG, and TIFF formats.
  3. Processed images using Pillow (the maintained fork of PIL, the Python Imaging Library) for opening, handling, and saving images, with BytesIO for in-memory operations.
  4. Reconstructed PDFs using PyMuPDF (fitz module) for PDF validation, handling document resources, and ensuring integrity.
  5. Logged errors and validation results, providing the client with a clear audit trail of processed files.
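
As an illustration of steps 1 and 2, the following is a minimal sketch rather than the production code. The function names (decode_hex, detect_type) and the SIGNATURES table are hypothetical; the byte signatures themselves are the standard ones for each format. The PDF signature is included on the assumption that PDFs were detected the same way as the image formats.

```python
from typing import Optional

# Standard file signatures ("magic numbers") found at the start of each format.
# The table name and layout are illustrative, not taken from the project.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"%PDF-", "pdf"),     # assumption: PDFs identified by signature too
]


def decode_hex(hex_text: str) -> bytes:
    """Convert a hex string to raw bytes; raises ValueError on malformed input."""
    return bytes.fromhex(hex_text)


def detect_type(data: bytes) -> Optional[str]:
    """Return a format label if the data starts with a known signature, else None."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return None
```

A caller would decode each record first and then route it by the returned label, for example passing "png", "jpeg", and "tiff" payloads to the image step and "pdf" payloads to the PDF step.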

Technologies used:

  • Python's built-in bytes.fromhex() for hex decoding.
  • Custom magic number detection for accurate file type identification.
  • Pillow (PIL) for image processing with in-memory handling via BytesIO.
  • PyMuPDF (fitz) for PDF validation and reconstruction.
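
The validation and logging step described earlier could look roughly like the sketch below. It uses only the standard library, and the logger name, the record_id parameter, and the safe_decode function are hypothetical names chosen for illustration. It shows one way to turn corrupted or incomplete hex input into an audit-trail entry instead of a crash.

```python
import logging
from typing import Optional

# Hypothetical logger name; a real deployment would configure handlers and a log file.
logger = logging.getLogger("hex_audit")


def safe_decode(record_id: str, hex_text: str) -> Optional[bytes]:
    """Decode one hex record, logging the outcome; return None if it is unusable."""
    try:
        data = bytes.fromhex(hex_text)
    except ValueError as exc:
        # Raised for non-hex characters or an odd number of hex digits.
        logger.error("record %s: invalid hex (%s)", record_id, exc)
        return None
    if not data:
        logger.warning("record %s: empty payload", record_id)
        return None
    logger.info("record %s: decoded %d bytes", record_id, len(data))
    return data
```

Returning None for bad records lets a batch run continue past damaged input while the log keeps a per-record account of what was skipped and why.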

The Results

Impact & Benefits

  • 100% successful hex-to-binary conversion, restoring original documents.
  • Automated file classification, eliminating the need for manual sorting.
  • Reduced processing time by 85%, enabling faster document retrieval.
  • Error handling & logging, ensuring all data integrity issues were flagged and resolved.
