From: Chris Curvey on 9 Mar 2010 17:05

Has anyone ever tried to find the pixel (or point) location of text in a PDF using Python? I've been using the pyPdf library for other things, and it seems to me that if I can find the bounding box for the text, I should be able to calculate its location.

What I want to do is take a PDF of one of our vendor invoices and blur everything in it except the block that's related to a single customer. So if I have an invoice that looks like:

    Alfred Annoying    123 Elm St      Somewhere, NJ     $100
    Barbie Bonehead    456 Pine St     Elsewhere, NJ     $125
    Charlie Clueless   789 Beech St.   Everywhere, NJ    $150

I want to show Barbie just her section of the invoice (with the header intact, so that she can tell it's a real invoice) but with Alfred's and Charlie's information blurred out.

I was going to convert the PDF to a JPG or PNG and do the blurring with ImageMagick/PythonMagick. But that requires me to know the pixel locations of the regions that I want blurred and the ones I want left alone.

I'm also open to other ideas if I'm going about this the hard way...
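
For the point-to-pixel step described above, a minimal sketch of the arithmetic might look like the following. It assumes the bounding box is already known in PDF points (1/72 inch, origin at the bottom-left of the page), that the page is unrotated with its MediaBox starting at (0, 0), and that the page is rendered at a uniform DPI. The function name, the sample box, and the DPI value are hypothetical and not part of pyPdf or ImageMagick.

```python
def pdf_box_to_pixels(box, page_height_pt, dpi=150):
    """Map a PDF-space box to an image-space box.

    box: (x0, y0, x1, y1) in PDF points, origin at the bottom-left.
    page_height_pt: page height in points (e.g. 792 for US Letter).
    Returns (left, top, right, bottom) in pixels, origin at the top-left,
    which is the convention most raster image tools use.
    """
    scale = dpi / 72.0  # pixels per PDF point
    x0, y0, x1, y1 = box
    left = round(min(x0, x1) * scale)
    right = round(max(x0, x1) * scale)
    # The y axis is flipped: PDF y grows upward, image y grows downward.
    top = round((page_height_pt - max(y0, y1)) * scale)
    bottom = round((page_height_pt - min(y0, y1)) * scale)
    return (left, top, right, bottom)


# Hypothetical example: a block on a US Letter page rendered at 144 DPI.
barbie_box_px = pdf_box_to_pixels((72, 648, 540, 720), page_height_pt=792, dpi=144)
```

The resulting pixel box is the region to leave untouched; everything outside it (apart from the header) would be the region to blur in the rendered image.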