We designed an algorithm that automatically extracts relevant elements, such as text and images, from archive documents. We also built a website that lets non-experts run extractions on any IIIF document.
docExtractor: An off-the-shelf historical document element extraction
Tom Monnier and Mathieu Aubry, International Conference on Frontiers of Handwriting Recognition (ICFHR) 2020
PDF, webpage, code, demo