So, how often do counsel botch an e-filing by leaving in information that was supposed to be redacted?

Tim Lee has done a study of redaction failures using a nice-sized subset of the federal PACER database. More specifically, he looked at the documents in the “RECAP” database, which grew as volunteers contributed copies of the PACER filings they had downloaded.

Tim wrote a program that analyzed each PDF file, looking for the tell-tale hand-drawn rectangles that are a hallmark of poor redaction.
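For the technically curious, here is a rough idea of what "looking for rectangles" can mean inside a PDF. This is a minimal Python sketch, not Tim's actual code. It assumes you already have a page's content stream as decoded text (real PDFs usually compress these streams, so a PDF library would be needed first), and it only catches the simplest pattern: the PDF `re` operator that draws a rectangle, immediately followed by a fill operator. The function name and the sample stream are made up for illustration.

```python
import re

# A PDF number such as 70, -3.5 or .25
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"

# "x y w h re" appends a rectangle to the current path; f, F, f*, B and B*
# are fill operators. This sketch only matches a rectangle that is filled
# right away, and it does not check the fill color.
RECT_FILL = re.compile(
    rf"(?<!\S)({NUM})\s+({NUM})\s+({NUM})\s+({NUM})\s+re\s+(?:f\*?|F|B\*?)(?=\s|$)"
)

def find_filled_rectangles(content_stream: str):
    """Return (x, y, width, height) for each rectangle that is drawn and then filled."""
    return [
        tuple(float(g) for g in m.groups())
        for m in RECT_FILL.finditer(content_stream)
    ]

# Hypothetical decoded content stream: a line of text, then a filled box.
sample = "BT /F1 12 Tf 72 700 Td (Account number here) Tj ET\n0 g\n70 695 160 16 re f\n"
print(find_filled_rectangles(sample))
```

A hit from a scan like this is only a candidate. Plenty of legitimate documents draw filled rectangles for table shading, rules, and letterhead, which is why a further narrowing step is needed.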

In a sample of 1.8 million PACER documents, he found about 2000 documents with these rectangles. He narrowed that set to documents where the rectangles sat on top of text and, after checking the best candidates by hand, found 194 with failed redactions. Most of those (“about 130”) were from commercial litigation. In addition to the redaction mistakes caught by this program, there were about 1700 other redaction failures that had been caught before the documents were donated to RECAP. (( Why so many? A large number of the RECAP documents had been donated by Carl Malamud, who spent some time trying to remove the sensitive information. ))

Adding the 194 to those roughly 1700 gives about 1900 failures in 1.8 million documents, an overall ratio of roughly 1 redaction failure per 1000 filings. That seems pretty low to me. I am curious how many of those 1.8 million documents were scanned from paper rather than generated as native PDFs. Native PDFs can be more challenging to redact, and the newer federal rules require them.
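The narrowing step described above, keeping only the rectangles that sit on top of text, comes down to a bounding-box overlap test. Here is a minimal Python sketch of that idea. It is an illustration rather than Tim's implementation: the `Box` type, the function names, and the coordinates are all hypothetical, and it assumes some other tool has already extracted the position of each rectangle and each run of text on the page.

```python
from typing import NamedTuple

class Box(NamedTuple):
    """An axis-aligned box in page coordinates (width and height assumed positive)."""
    x: float
    y: float
    width: float
    height: float

def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share any area."""
    return (
        a.x < b.x + b.width and b.x < a.x + a.width
        and a.y < b.y + b.height and b.y < a.y + a.height
    )

def suspicious_rectangles(rectangles, text_boxes):
    """Keep only the rectangles that cover at least part of some text."""
    return [r for r in rectangles if any(overlaps(r, t) for t in text_boxes)]

# Hypothetical page: one box drawn over a line of text, one thin rule elsewhere.
rects = [Box(70, 695, 160, 16), Box(70, 100, 400, 2)]
text = [Box(72, 700, 120, 10)]
print(suspicious_rectangles(rects, text))
```

Even a rectangle that overlaps text is not proof of a failed redaction, which is presumably why the best candidates still had to be checked by hand.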

How courts can avoid this problem going forward

Tim has graciously donated this code to the public domain. As things stand, it requires a little technical savvy. (( If you see the word “perl” and think of a dromedary, then you should have no problems. Otherwise, you might want to wait for someone to add an interface on top of these raw scripts. )) But it’s available to any court officials who might want to fold it into their e-filing systems or to anyone else who wants to build a more user-friendly interface.

How you can redact properly

Redaction is covered in the blog’s resources page about how to make e-briefs that satisfy the Texas rules. There is a deeper discussion about strategies for redaction in the document called “Workflow for E-Briefs,” beginning at page 15 of the PDF.