tips
tips [2015/12/02 06:52] (current) – created sbw
This page records tips for improving the scanned files. It is probably not of interest otherwise.

===== Removing speckles from scans =====

After manually cleaning speckles out of too many scans, I eventually found a useful technique that automatically removes many speckles without affecting the text too badly.

It works on the principle that the speckles are invariably smaller than the text. First, we shrink both until the speckles disappear. Then the "seed" that is left of the text is grown back to its original shape.
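
The shrink-then-regrow idea can be sketched on a tiny black-and-white grid. This is only a toy illustration in plain Python, not what ImageMagick does internally: the function names, the 3x3 neighbourhood, and the sample grid are all assumptions made for the example. Ink pixels are 1, paper is 0, and regrowth is only allowed where the original page had ink.

```python
def _neighbourhood(img, y, x):
    """Values of the 3x3 block around (y, x); off-page pixels count as paper (0)."""
    h, w = len(img), len(img[0])
    return [
        img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]


def erode(img):
    """Shrink ink: a pixel stays ink only if its whole 3x3 block is ink."""
    return [[int(all(_neighbourhood(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """Grow ink: a pixel becomes ink if anything in its 3x3 block is ink."""
    return [[int(any(_neighbourhood(img, y, x))) for x in range(len(img[0]))]
            for y in range(len(img))]


def despeckle(img, shrink=1, grow=40):
    """Shrink until small blobs vanish, then grow the seed back inside the original ink."""
    seed = img
    for _ in range(shrink):
        seed = erode(seed)
    for _ in range(grow):
        # Grow one step, but never beyond pixels that were ink in the original.
        grown = [[d & o for d, o in zip(d_row, o_row)]
                 for d_row, o_row in zip(dilate(seed), img)]
        if grown == seed:  # nothing left to recover
            break
        seed = grown
    return seed


# Hypothetical sample: a 3x3 "letter" on the left and a one-pixel speckle on the right.
rows = [
    "..........",
    ".###....#.",
    ".###......",
    ".###......",
    "..........",
]
page = [[1 if ch == "#" else 0 for ch in row] for row in rows]
cleaned = despeckle(page, shrink=1, grow=40)

for row in cleaned:
    print("".join("#" if v else "." for v in row))
```

In this sketch a speckle that touches the text would be grown back along with it, because regrowth is limited only by the original ink.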

It uses the image processing software [[http://www.imagemagick.org/script/index.php|ImageMagick]]. The instructions are based on the information in this [[http://www.imagemagick.org/discourse-server/viewtopic.php?t=18707|thread]].

  convert "e:/source.tif" -write MPR:source -morphology close rectangle:3x4 -clip-mask MPR:source -morphology erode:40 square +clip-mask "e:/output.png"

The key items to play with are the size of the initial rectangle (3x4) for the close operation and the number of iterations of erode (40). The rectangle effectively determines the largest speckle that will be removed. However, making it much larger also tends to remove portions of the text. The erode iterations build the image back up. For letters with thin branches (such as e or g), smaller numbers seem to leave chunks missing from the branches.
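
Since the two values need tuning per scan, it may help to generate several variants of the command and compare the results by eye. The sketch below only builds and prints command lines; it does not run ImageMagick. The function name, the candidate values, and the output naming scheme are hypothetical, and reading 3x4 as width by height is my assumption. The options themselves are copied from the command above.

```python
import shlex
from itertools import product


def build_despeckle_command(source, output, rect_w=3, rect_h=4, erode_iterations=40):
    """Return the convert argument list with the two tunable values filled in."""
    return [
        "convert", source,
        "-write", "MPR:source",
        "-morphology", "close", f"rectangle:{rect_w}x{rect_h}",
        "-clip-mask", "MPR:source",
        "-morphology", f"erode:{erode_iterations}", "square",
        "+clip-mask",
        output,
    ]


# Hypothetical candidate values; pick ones that suit your scans.
rect_sizes = [(2, 3), (3, 4)]
erode_counts = [20, 40]

commands = []
for (w, h), n in product(rect_sizes, erode_counts):
    out_name = f"e:/output_{w}x{h}_erode{n}.png"  # hypothetical naming scheme
    commands.append(build_despeckle_command("e:/source.tif", out_name, w, h, n))

for cmd in commands:
    # Printed for review only; nothing is executed here.
    print(shlex.join(cmd))
```

If ImageMagick is installed and ''convert'' is on the path, each list could be passed to ''subprocess.run(cmd, check=True)'' instead of being printed.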

This algorithm could possibly be improved by using multiple iterations of the first (close) step to reduce the speckles. I haven't tested this.
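
For anyone who wants to try it, one way to write that variant is to give the close step an iteration count in the same ''method:count'' form the command already uses for ''erode:40''. This is an untested sketch: the helper names are hypothetical, and whether ''close:2'' actually improves the result is exactly the open question. The code only assembles the command line and does not run it.

```python
def close_step(iterations=1, kernel="rectangle:3x4"):
    """Arguments for the initial close step, optionally with an iteration count."""
    method = "close" if iterations == 1 else f"close:{iterations}"
    return ["-morphology", method, kernel]


def despeckle_args(close_iterations=1):
    """Full argument list; everything except the close step matches the original command."""
    return (
        ["convert", "e:/source.tif", "-write", "MPR:source"]
        + close_step(close_iterations)
        + ["-clip-mask", "MPR:source",
           "-morphology", "erode:40", "square",
           "+clip-mask", "e:/output.png"]
    )


variant = despeckle_args(close_iterations=2)  # untested idea from the note above
print(" ".join(variant))
```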
tips.txt · Last modified: 2015/12/02 06:52 by sbw
