Extract a bordered, skewed rectangle from an image



We have a scanned document on which a label has been attached. The label has 
been designed to have a border that makes it easy to determine the correct 
orientation and area of the label. The label portion of the scanned image 
needs to be extracted and deskewed as an image. The contents of the label 
will change, but the border won't.

I originally posted this on RentACoder as a project, but I am not getting 
a lot of responses. It might be that I requested it be done in Python, it's 
too hard, or I am too stingy. You can see the project here:
http://www.RentACoder.com/RentACoder/misc/BidRequests/ShowBidRequest.asp?lngBidRequestId=1402446

It may not be feasible to do this project without the use of an image 
processing engine such as openCV. There is a routine in openCV called 
cvMinAreaRect2() that may do the job of returning a matching rectangle that 
is inclined. There is a Python to openCV interface available. So I think all 
the pieces are there, but this is out of my league as I have had very little 
experience with image processing.

I am wondering whether there are any people here that have experience with 
openCV and Python. If so, could you either give me some pointers on how to 
approach this, or if you feel so inclined, bid on the project. There are 2 
problems:
1. How do I get openCV to ignore the contents of the label and just focus on 
   the border?
2. How to do this through Python into openCV? I am a newbie to Python, not 
   strong in Maths and ignorant of the usage of openCV.

Thanks.


Reply Paul 5/7/2010 1:03:55 AM

"Paul Hemans" <darwin@nowhere.com> writes:

> I am wondering whether there are any people here that have experience with 
> openCV and Python. If so, could you either give me some pointers on how to 
> approach this, or if you feel so inclined, bid on the project. There are 2 
> problems:

Can't offer actual services, but I've done image tracking and object
identification in Python with OpenCV so can suggest some approaches.

You might also try the OpenCV mailing list, though it sometimes
varies wildly in terms of S/N ratio.

And for OpenCV specifically, I definitely recommend the book "Learning
OpenCV" from O'Reilly.  It's really hard to grasp the concepts and
applications of the raw OpenCV calls from the API documentation, and I
found the book (albeit not cheap) helped me out tremendously and was
well worth it.

I'll flip the two questions since the second is quicker to answer.

> How to do this through Python into openCV? I am a newbie to Python, not 
> strong in Maths and ignorant of the usage of openCV.

After trying a few wrappers, the bulk of my experience is with the
ctypes-opencv wrapper and OpenCV 1.x (either 1.0 or 1.1pre).  Things
change a lot with the recent 2.x (which needs C++ wrappers), and I'm
not sure the various wrappers are as stable yet.  So if you don't have
a hard requirement for 2.x, I might suggest at least starting with 1.x
and ctypes-opencv, which is very robust, though I'm a little biased as
I've contributed code to the wrapper.

> How do I get openCV to ignore the contents of the label and just focus on 
> the border?

There's likely no single answer, since multiple mechanisms for
identifying features in an image exist, and you can also derive
additional heuristics based on your own knowledge of the domain space
(your own images).  Without knowing exactly how the border is designed
to make it easy to detect, it's hard to say anything definitive.

But in broad strokes, you'll often:

  1. Normalize the image in some way.  This can be to adjust for
     brightness from various scans to make later processing more
     consistent, or to switch spaces (to make color matching more
     effective) or even to remove color altogether if it just
     complicates matters.  You may also mask off entire portions of the
     image if you have information that says they can't possibly be
     part of what you are looking for.
  2. Attempt to remove noise.  Even when portions of an image look
     like a solid color, at the pixel level there can be many different
     variations in pixel values.  Operations such as blurring or
     smoothing help to average out those values and simplify matching
     entire regions.
  3. Attempt to identify the regions or features of interest.  Here's
     where a ton of algorithms may apply depending on your needs, but
     the simplest form to start with is basic color matching.  For edge
     detection (such as of your label), convolutions (such as gradient
     detection) might also be ideal.
  4. Process identified regions to attempt to clean them up, if
     possible weakening regions likely to be extraneous, and
     strengthening those more likely to be correct.  Morphology
     operations are one class of processing likely to help here.
  5. Select among features (if more than one) to identify the best
     match, using any knowledge you may have that can be used to
     rank them (e.g., size, position in image, etc...)

My own processing is ball tracking in motion video, so I have some
additional data in terms of adjacent frames that helps me remove
static background information and minimize the regions under
consideration for step 3, but a single image probably won't have
that.  But given that you have scanned documents, there may be other
simplifying rules you can use, like eliminating anything too white or
too black (depending on label color).

My own flow works like:

1. Normalize each frame

   1. Blur the frame (cvSmooth with CV_BLUR, 5x5 matrix).  This
      smooths out the pixel values, improving the color conversion.
   2. Balance brightness (in RGB space).  I ended up just offsetting
      the image by a fixed (x,x,x) value to maximize the RGB values.
      Found it worked better doing it in RGB before Lab conversion.
   3. Convert the image to the "Lab" color space.  I used Lab because
      the conversion process was fastest, but when frame rate isn't
      critical, HLS is likely better since hue/saturation are
      completely separate from lightness which makes for easier color
      matching.

2. Identify uninteresting regions in the current frame

   This may not apply to you, but here is where I mask out static
   information from prior background frames, based on difference
   calculations with the current frame, or very dark areas that I
   knew couldn't include what I was interested in.

   In your case, for example, if you know the label is going to show
   up fairly saturated (say it's a solid red or something), you could
   probably eliminate everything that is below a certain saturation
   level.  Or if they are black and white documents, but the label has
   a color, it might be very easy to filter out everything but the
   label.

   If you're lucky, some simple heuristics applied here might have the
   net effect of masking the majority of your document image away,
   leaving primarily the label.

3. Color matching

   1. Mask off regions of the image not falling within a specific Lab
      pixel range, sufficient to encompass my object under a variety of
      lighting/camera conditions.  I typically use cvInRangeS to set
      the mask bits for pixels within the range.
   2. Perform a morphology pass - cvMorphologyEx against the mask as
      CV_MOP_CLOSE.  A close is a dilation followed by an erosion:
      the dilation combines nearby features with each other, and the
      erosion then shrinks the result back to roughly its original
      outline.  The net effect is to strengthen larger matched areas
      (and help them become contiguous).  Removing tiny features
      (likely unnecessary matches) takes the reverse order, an
      erosion followed by a dilation, which is CV_MOP_OPEN.

   Note in my case I was looking for a relatively solid color ball (it
   had gaps since it was a whiffle ball), so if, for example, your
   label is alternating colors, or dashed lines or something like that
   it might not work as well.  There are more complicated algorithms
   that can match more elaborate patterns, sometimes with initial
   training on target images.

4. Object selection

   1. Locate all top level contours of any remaining solid areas
      in the mask (cvFindContours).  This will identify connected
      areas in the mask, so in your case, ideally one of the located
      contours would be the label edge.  This does assume that your
      feature identification in the prior step is likely to create
      contiguous areas.  Even just a few pixels of gaps will net a
      non-closed contour which is harder to work with, though the
      morphology operation will sometimes close those gaps.
   2. Evaluate "best" contour when multiple choices exist.  Very small
      areas are eliminated, and remaining areas are evaluated for
      average Lab value distance from a target point (somewhat
      arbitrarily chosen at this point to represent the "ideal" ball).
      The nearest (in color distance) contour is picked, except in the
      case of two "close" contours where the further contour can win
      if it is at least 4x (arbitrarily chosen) as large.  In your
      case, for example, any contours located within the label itself
      would necessarily be smaller than the label, so you could
      probably just pick largest.  Also, when calling cvFindContours
      you can prevent it from finding "interior" contours.
   3. Compute and return a minimum bounding circle (center, radius)
      for the selected contour.  In your case, you'd likely just use
      the contour itself - you can use the contour (with 'n' line
      segments) as is, or convert into an approximate polygon.

The nice thing about Python with OpenCV is the interactive
experimentation you can do right in the interpreter.  Open a highgui
window, load in your image and then experiment.  After performing
various processes, just quickly show the new image in the existing or
a new window.  You can keep several windows up to date as you run a
test image through several transforms, and see the result of each one.

Hope this at least gives you some thoughts as to how to proceed.

-- David
Reply David 5/7/2010 9:41:16 PM


Thanks David, that is a 'tonne' of information. I am going to have a play 
with it. Masking out the contents of the label and finding the label border 
within the scanned document is probably the place to start. 
Looks like there is going to be a learning curve here.

Thanks again for your help; you really put a lot of effort into this.


Reply Paul 5/9/2010 12:03:29 PM
