File Carving

What is File Carving?

Put simply, file carving is the process of recovering files from raw storage data without using the file system’s metadata.

When the file system structure is missing, corrupted, or wiped, file carving steps in as a content‑based method that looks for identifiable patterns inside the data itself.

This method is widely used in:

  • Digital forensics – to prove that specific files once existed on a device, even if they were deleted.
  • Data recovery – to restore content from formatted, damaged, or partially overwritten drives, SD cards, and USB sticks.

Because file carving does not rely on the file system’s record of where each file was stored, it can sometimes recover files long after their directory entries or allocation records have disappeared.

How Does File Carving Work?

File carving works by examining the raw bytes on a storage device and identifying what each region of data represents – whether it’s part of a text file, an executable, a PNG image, an MP3 audio file, or something else. Since no metadata is available, the carving tool must recognize files based on their internal structure, not on any file system records.

The simplest and most widely used technique relies on file signatures, also known as magic numbers. These are distinctive sequences of bytes found at the beginning (and sometimes the end) of specific file types. They act like a label that tells the software, “a file of this type starts here.”

Here are some examples:

  • JPEG: FF D8 FF
  • PNG: 89 50 4E 47
  • Java class file: CA FE BA BE
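As a rough illustration of signature matching, the sketch below searches a byte buffer for the three headers listed above. The dictionary layout, the function name `find_signatures`, and the sample buffer are assumptions made for this example, not part of any particular tool.

```python
# Header signatures taken from the list above. The names used here are
# illustrative assumptions.
SIGNATURES = {
    "jpeg": bytes.fromhex("FFD8FF"),
    "png": bytes.fromhex("89504E47"),
    "java_class": bytes.fromhex("CAFEBABE"),
}


def find_signatures(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, file_type) for every header signature found in data."""
    hits = []
    for file_type, magic in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, file_type))
            start = data.find(magic, start + 1)
    return sorted(hits)


# A made-up buffer: padding, a PNG header, filler text, then a JPEG header.
sample = b"\x00" * 4 + bytes.fromhex("89504E47") + b"junk" + bytes.fromhex("FFD8FFE0")
print(find_signatures(sample))  # [(4, 'png'), (12, 'jpeg')]
```

A hit only says that a file of that type may start at the offset. The same bytes can occur by chance inside unrelated data, so each hit still has to be checked.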

For formats that also include a footer, carving becomes even easier because data recovery software can scan from the header to the exact endpoint of the file. When only a header exists, the software must estimate where the file ends based on structure and size rules for that format.
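A minimal sketch of header-to-footer carving for JPEG, assuming the file is stored contiguously. It uses the header FF D8 FF and the JPEG end-of-image marker FF D9. The function name `carve_jpegs` and the `max_size` cap are assumptions for this example.

```python
JPEG_HEADER = bytes.fromhex("FFD8FF")
JPEG_FOOTER = bytes.fromhex("FFD9")


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024) -> list[bytes]:
    """Carve contiguous JPEG candidates, each from a header to the next footer.

    max_size is an assumed upper bound that stops the search for a footer
    from running across the whole image when a file is truncated.
    """
    carved = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), pos + max_size)
        if end != -1:
            end += len(JPEG_FOOTER)
            carved.append(data[pos:end])
            pos = data.find(JPEG_HEADER, end)
        else:
            # No footer within range: skip this header and keep scanning.
            pos = data.find(JPEG_HEADER, pos + 1)
    return carved


# Made-up data: one complete header/footer pair and one header with no footer.
data = (b"xx" + bytes.fromhex("FFD8FFE0") + b"abc" + bytes.fromhex("FFD9")
        + b"yy" + bytes.fromhex("FFD8FF"))
print([c.hex() for c in carve_jpegs(data)])  # ['ffd8ffe0616263ffd9']
```

Stopping at the first FF D9 is a simplification. A JPEG with an embedded thumbnail contains an earlier end marker, so a real carver would need to parse the segment structure or validate the result.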

Why File System Structure Matters

To understand how carving behaves, it helps to know how file systems place data on a disk. Most file systems (FAT, NTFS, UFS, and others) allocate space in fixed‑size units called clusters (or blocks), each made up of one or more disk sectors. In a typical FAT32 layout, clusters might be 4 KB each:

  • A file smaller than 4 KB fits entirely in one cluster.
  • Larger files span multiple clusters.
  • Only one file can occupy a cluster at a time.
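The arithmetic behind these points can be sketched in a few lines, assuming the 4 KB cluster size from the example. The function names are made up for illustration. The unused tail of the last cluster is often called slack space.

```python
CLUSTER_SIZE = 4096  # 4 KB, as in the FAT32 example above


def clusters_needed(file_size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Number of whole clusters required to hold file_size bytes."""
    return -(-file_size // cluster_size)  # ceiling division


def slack_bytes(file_size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Unused bytes in the file's last cluster."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


print(clusters_needed(3000), slack_bytes(3000))    # 1 1096
print(clusters_needed(10000), slack_bytes(10000))  # 3 2288
```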

If a file grows over time or the disk becomes cluttered, the file system may place its clusters in different physical locations. This is called fragmentation. Instead of a file occupying clusters 100-120 in one clean sequence, it might be stored in three or four separate groups scattered across the device.

Fragmentation is common for large files, and it creates real challenges for file carving. When a file’s data is not stored in one continuous block, the software must figure out which clusters belong together (even though the file system metadata that originally tracked this is gone).

File Carving Breakdown (Step-by-Step)

Here’s a quick look at how file carving actually works behind the scenes:

1. Raw Disk Scan

The tool reads the entire storage device sector by sector. It ignores whether the file system says the space is “used” or “free” – everything gets checked.
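A minimal sketch of such a scan over a raw disk image, assuming 512-byte sectors (many modern drives use 4096). The helper name `iter_sectors` is hypothetical, and the example reads from an in-memory stand-in rather than a real device.

```python
import io
from typing import BinaryIO, Iterator

SECTOR_SIZE = 512  # assumed sector size


def iter_sectors(image: BinaryIO,
                 sector_size: int = SECTOR_SIZE) -> Iterator[tuple[int, bytes]]:
    """Yield (byte_offset, sector_bytes) for every sector in a raw image."""
    offset = 0
    while True:
        sector = image.read(sector_size)
        if not sector:
            break
        yield offset, sector
        offset += len(sector)


# In-memory stand-in for a disk image: two full sectors and a partial one.
image = io.BytesIO(b"A" * 512 + b"B" * 512 + b"C" * 100)
for offset, sector in iter_sectors(image):
    print(offset, len(sector))
```

With a real image file, the same function could be given a file opened in binary mode, for example `open(path, "rb")`.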

2. File Signature Detection

As the scan runs, it looks for known file headers, unique byte patterns that signal the start of a specific file type (like JPEG or PDF). Some formats also include footers, which mark the end of the file and help with precise extraction.

3. Block Collection

Once a header is found, the tool begins pulling data from that point onward. If a footer is found, it stops there. If not, it estimates the end based on the file type’s structure or default length rules.

4. Validation

To avoid false positives, the software checks whether the carved data actually makes sense. For example, can a JPEG viewer open it? Does the file structure look intact?
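One way to run a structural check is sketched below for PNG. A PNG file is an 8-byte signature followed by chunks, each carrying a length, a type, a body, and a CRC-32, and the last chunk has the type IEND. Walking the chunks both validates the candidate and finds where the file ends. The function names are assumptions, and the check covers chunk framing only, not whether the image would actually decode.

```python
import struct
import zlib
from typing import Optional

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def png_length(data: bytes) -> Optional[int]:
    """Return the length of a structurally valid PNG at the start of data.

    Returns None if the signature, a chunk boundary, or a chunk CRC is wrong.
    """
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):  # 4 length + 4 type + 4 CRC at minimum
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        body_end = pos + 8 + length
        if body_end + 4 > len(data):
            return None  # chunk runs past the available data
        (stored_crc,) = struct.unpack(">I", data[body_end:body_end + 4])
        if zlib.crc32(data[pos + 4:body_end]) != stored_crc:
            return None  # CRC covers the chunk type and body
        pos = body_end + 4
        if chunk_type == b"IEND":
            return pos
    return None


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build a correctly framed chunk (used only for the example below)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


# A framing-only test file: valid chunks, but not a decodable image.
tiny = PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13)) + make_chunk(b"IEND", b"")
print(png_length(tiny + b"unrelated trailing bytes"))  # 45

corrupted = bytearray(tiny)
corrupted[20] ^= 0xFF  # flip one byte inside the IHDR body
print(png_length(bytes(corrupted)))  # None
```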

5. Fragment Handling (If Needed)

If the file is split across multiple areas (i.e. fragmented), the tool tries to reassemble the scattered parts using rules, heuristics, or statistical models. This is where more advanced carving logic kicks in.

6. Export

Each recovered file is saved with a generic name (like file001.jpg) since the original names, folders, and timestamps are gone.

Limitations of File Carving

File carving, like everything in the data recovery space, is not without its downsides.

Here are the key limitations to keep in mind:

  • No metadata – you won’t get original file names, folder structure, or timestamps.
  • Fragmentation is a challenge – if a file is broken into pieces across the disk, reconstruction becomes difficult. Most tools struggle or fail with fragmented files, especially videos and large documents.
  • False positives – some byte patterns that look like file signatures might appear by coincidence in other data, which leads the tool to recover junk files that don’t actually work.
  • Limited file format support – carvers work best with well-known formats like JPEG, PNG, or PDF. Rare or proprietary formats may not be detected or properly reconstructed (depending on the software).
  • Encrypted or compressed data is hard to carve – files stored inside encrypted containers, archives, or virtual disk formats are harder to detect and extract unless the tool supports deep parsing.
  • High resource usage – deep scans can be slow and CPU-intensive, especially on large drives. Not ideal for older systems or time-sensitive cases.

Carving Schemes

Over time, researchers and developers have introduced smarter file carving methods to improve accuracy. Below are some of the most widely recognized carving schemes used in modern tools and forensic workflows:

Bifragment Gap Carving (BGC)

This technique focuses on reassembling files that have been split into two main pieces. Here’s how it works:

  • The carver identifies potential start and end fragments of a file.
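  • It assumes that a run of unrelated blocks (the “gap”) sits somewhere between them, then tries different gap positions and sizes, each time removing the gap and testing whether the remaining data forms a valid file.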
  • If the combination produces a readable file (e.g., a playable video or viewable image), it’s marked as a successful carve.

BGC speeds up recovery for files broken in two, which is common with certain storage patterns.
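The search can be sketched as a brute-force loop over candidate gaps. This is a simplified illustration, not the implementation of any specific tool: the function names are hypothetical, the blocks are assumed to start with the header block and end with the footer block, and a zlib stream stands in for a real file format because Python can validate it with the standard library.

```python
import zlib
from typing import Callable, Optional


def bifragment_gap_carve(blocks: list[bytes],
                         validate: Callable[[bytes], bool]) -> Optional[bytes]:
    """Try removing every contiguous run of inner blocks until one validates.

    blocks[0] is assumed to hold the header and blocks[-1] the footer.
    """
    whole = b"".join(blocks)
    if validate(whole):
        return whole  # not fragmented after all
    n = len(blocks)
    for gap_len in range(1, n - 1):
        for gap_start in range(1, n - gap_len):
            candidate = b"".join(blocks[:gap_start] + blocks[gap_start + gap_len:])
            if validate(candidate):
                return candidate
    return None


def is_valid_zlib(candidate: bytes) -> bool:
    """Validator for the toy format: the data must decompress cleanly."""
    try:
        zlib.decompress(candidate)
        return True
    except zlib.error:
        return False


# Build a "file", split it into 64-byte blocks, and wedge two foreign blocks
# into the middle to simulate a file stored as two fragments.
payload = bytes(range(256)) * 8
stream = zlib.compress(payload)
chunks = [stream[i:i + 64] for i in range(0, len(stream), 64)]
foreign = [b"\xAA" * 64, b"\x55" * 64]
blocks = chunks[:2] + foreign + chunks[2:]

recovered = bifragment_gap_carve(blocks, is_valid_zlib)
print(recovered == stream)
```

The number of candidates grows roughly with the square of the number of blocks between header and footer, so the approach depends on a fast validator.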

SmartCarving

SmartCarving goes further by handling files split into more than just two pieces. It relies on heuristics based on how file systems behave, and typically includes:

  • Preprocessing – decompressing or decoding blocks if needed.
  • Collation – grouping blocks by suspected file type (JPEGs, DOCs, etc.).
  • Reassembly – using rules and context to piece together scattered blocks into valid files.

This approach is useful when fragmentation is high or when the file structure is partially damaged. It was developed in academic settings and later adapted into commercial and forensic tools.
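To give a feel for the collation step, the sketch below sorts blocks into coarse groups using two simple statistics: the share of printable bytes and the Shannon entropy of the block. This is only a rough stand-in. The thresholds, category names, and function names are assumptions for illustration, and real SmartCarving implementations use more sophisticated classifiers.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def classify_block(block: bytes) -> str:
    """Assign a block to a coarse group (thresholds are assumed, not tuned)."""
    printable = sum(1 for b in block if 32 <= b < 127 or b in (9, 10, 13))
    if block and printable / len(block) > 0.95:
        return "text"
    if shannon_entropy(block) > 7.0:
        return "compressed-or-encrypted"
    return "other"


print(classify_block(b"hello world\n" * 40))  # text
print(classify_block(bytes(range(256)) * 2))  # compressed-or-encrypted
print(classify_block(b"\x00" * 512))          # other
```

Compressed formats such as JPEG tend to land in the high-entropy group along with encrypted data, so a practical classifier needs additional format-specific cues to tell them apart.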

Memory Dump Carving

File carving also works on RAM dumps. Carving memory snapshots helps investigators recover:

  • Recently opened images and documents
  • Chat logs, browser history, and form data
  • Encryption keys for locked drives (if they were active in memory).
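As a small illustration of the browser-history case, the sketch below pulls ASCII URLs out of a raw memory dump with a regular expression. The pattern, the function name `carve_urls`, and the sample bytes are assumptions for this example. Strings stored as UTF-16, which is common in Windows processes, would need a separate pattern.

```python
import re

# Matches http(s) URLs made of characters that are legal in a URL.
URL_PATTERN = re.compile(rb"https?://[A-Za-z0-9._~:/?#@!$&'()*+,;=%-]+")


def carve_urls(dump: bytes) -> list[str]:
    """Return the distinct ASCII URLs found in dump, in order of appearance."""
    seen = []
    for match in URL_PATTERN.finditer(dump):
        url = match.group().decode("ascii")
        if url not in seen:
            seen.append(url)
    return seen


# Made-up dump fragment: URLs surrounded by binary noise, one repeated.
dump = (b"\x00\x13https://example.com/inbox\x00\xff\xfe"
        b"http://example.org/a?b=1\x00https://example.com/inbox\x07")
print(carve_urls(dump))
```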

These carving schemes aren’t mutually exclusive – many modern recovery tools use combinations of them, and switch methods based on file type, fragmentation level, and available data. The goal is always the same: to increase the chances of recovering usable files even in messy, incomplete, or damaged environments.

FAQ

What is the difference between data recovery and file carving?

Data recovery is a broad field that includes any method used to regain access to lost or inaccessible data. File carving is one specific technique within that field. It focuses on reconstructing files without metadata, purely from their raw content. In short:

  • Data recovery = the whole toolbox.
  • File carving = one tool in that box, used when metadata is gone or unreliable.

What is an example of file carving?

A common example of file carving is recovering deleted photos from a formatted SD card. Let’s say you accidentally format your camera’s memory card. All your pictures seem gone and nothing shows up. But in many cases, the actual image data is still there in raw form. The file system just stopped tracking it. A file carving tool scans the entire SD card, looking for JPEG file signatures (specific byte patterns like FF D8 FF at the beginning and FF D9 at the end). When it finds one, it pulls the data in between and reconstructs the image, even though the original name and folder are lost.

What software supports file carving?

Many apps include file carving capabilities as part of their feature set. Some of the best data recovery tools that support file carving include:

  • Disk Drill – User‑friendly, strong at carving photos, videos, and documents.
  • R‑Studio – Professional‑grade tool with advanced carving and file system support.
  • PhotoRec – Free, open‑source utility that excels at carving hundreds of file types.
  • Recuva – A simple tool with a deep scan that uses carving techniques.