Preview in macOS Big Sur is irreversibly destroying PDFs – again

This image has three components: On the left is an OCR’ed PDF from my ScanSnap iX500. I have selected most of the text, and on the right side you can see two copy-and-paste results. The upper half shows the result directly after scanning, right after the ABBYY FineReader that comes bundled with the iX500 did its magic. The lower half shows the result after I modified that same PDF in Preview (I removed a blank page) and saved it.

Hard to believe, but that’s not the first time Apple messed this up. Sure, even Apple can’t account for all use cases when changing complex stuff like internal PDF handling. But:

  • The iX500 is an insanely popular and common scanner
  • I don’t know any OCR software that is more popular than ABBYY FineReader
  • macOS used to be the absolute best-in-class OS for dealing with PDFs by a long shot

I wish Apple were still charging for OS updates, so I could at least ask for a refund.¹ This is such a nasty bug – if you don’t already know to expect it, you will only find out months or possibly years later. I almost missed it this time, because the damage does not show up right after modifying and saving the file. You have to completely close the file and reopen it; only then will you realize that it has been destroyed.
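Since the damage only shows up after reopening, one way to catch it is to compare the text copied out of the original scan with the text copied out of the re-saved, reopened file. Here is a minimal sketch of that comparison, assuming you already have both pasted texts as strings. The function names and the 0.9 threshold are my own picks for illustration, not anything Preview or ABBYY provide.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse all runs of whitespace so line-break differences do not count."""
    return " ".join(text.split())


def text_similarity(before: str, after: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two pasted texts."""
    a, b = normalize(before), normalize(after)
    if not a and not b:
        # Two empty text layers are trivially identical.
        return 1.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def looks_damaged(before: str, after: str, threshold: float = 0.9) -> bool:
    """Flag the re-saved file if its text drifted too far from the original.

    The threshold is an arbitrary assumption; tune it for your own scans.
    """
    return text_similarity(before, after) < threshold
```

Paste the text from the untouched scan into `before` and the text from the reopened file into `after`. If `looks_damaged(before, after)` returns `True`, keep the original and do not archive the saved copy.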

  1. Yes, I blame only Apple for this. I’ll repeat what I told Philipp (noted Apple apologist!) when we argued about this last week after I discovered the problem: ABBYY says they don’t support Big Sur yet, and that’s fine. But Apple didn’t tell me that I can’t upgrade to Big Sur if I use ABBYY. I’d be a lot less angry if there were a changelog or release notes from Apple saying there is a known problem with OCR’ed PDFs in Preview. Their software is broken, so they need to tell me. I don’t care if it only worked before because they had workarounds for super shitty PDFs that ABBYY possibly produces, I just need my OS to keep working for me. This bug could hit me even if I didn’t own a scanner at all: someone sends me a PDF that I then unknowingly break before archiving it. That’s the part I’m mad about.

Manuel was annoyed on December 16, 2020 at 19:58