
2018-04-24 ALTO Board Meeting Minutes

Agenda

  1. Review action items and other updates. [Art leads discussion]
  2. Review International Image Interoperability Framework (IIIF) OCR correction proposal and response to request for ALTO Board member to attend monthly Newspaper IIIF meetings. [Art leads discussion]
  3. Description of Google’s lattice structure for text representation and possible implications for ALTO. [Ashok leads discussion]
  4. Open discussion of new use cases for ALTO, such as handwriting and supporting OCR in video. [All]
  5. Other business. [All]

View all open action items here.

Attending members

Participating observers

Minutes

Board action items were reviewed. Jo updated the README on GitHub to reflect that version 4.0 is the latest release.

In response to the IIIF request, Ashok and Joachim will both attend the monthly IIIF Newspaper meetings (Glen Robson, IIIF Technical Coordinator, has confirmed that any Board member is welcome to attend). OCR correction was discussed and different approaches were described. The IIIF initiative is promising.

Reeve gave some background on Google’s approach to handwriting (the agenda was taken out of order so that he could leave for another commitment). Using a recognition engine built for printed text is showing promise for handwritten documents. There is increasing interest within the digital library community in supporting handwriting recognition.

Ashok and Jake gave an introduction to Google’s lattice structure for OCR. Discussion followed on how support for a lattice model might fit into ALTO. ALTO attempts to give the most definitive description possible of the content of a digital object. Would adding the depth of processing supported in a lattice approach add too much complexity? Christian’s experience is that ALTO files are already getting sizeable. On the other hand, ALTO has already moved to support glyph variants, so there is precedent for recording probabilities. Jo noted that there is scope to use additional syntax outside of ALTO for specific cases. Ashok will provide some examples to help the Board better understand the modeling and its applicability to ALTO.

The meeting adjourned after 2 hours of interesting discussion. Thanks to all involved.