Agenda for December 02, 2015 ALTO board teleconference.

  1. Review action items [Frederick leads discussion]
  2. Review proposed changes to ALTO Board Membership. See items in italics and marked [PROPOSED] on the ALTO Board Membership page. [Frederick leads discussion]
  3. Discuss new board member recruited by Jean Philippe. [Jean Philippe leads discussion]
  4. Decide when the schema changes necessary for Shape change request (issue 22) will be published. Will we combine it with some other (simple) issue? [Evelien leads (brief) discussion.]
  5. Summarize the face-to-face meeting discussion of the fragment identifier API for ALTO change request (issue 33). [Jean Philippe leads discussion.]
  6. Summarize the face-to-face meeting discussion and continue to discuss OCR confidence value change request (issue 23). [Jukka and Nate lead discussion.]
  7. Continue discussion of Glyphs change request (issue 26). (See item 10 in minutes from London face-to-face meeting) [Joachim leads discussion.]

View all open action items here.

Attending members

Minutes

The ALTO board welcomes Clemens Neudecker as an ALTO board member.

About new ALTO board members: Jean Philippe reported that he has contacted Stefan Pletschacher at the University of Salford about becoming an ALTO board member. He explained that Stefan participated in the development of PAGE XML for the IMPACT project. Both Jean Philippe and Clemens have worked with Stefan. Everyone agreed that Stefan would be a very qualified board member. Evelien reported that she will speak to the Google quality group next week and ask if someone at Google is interested in becoming an ALTO board member. Jukka has not yet been able to talk to anyone at the National Archives of Sweden about ALTO board membership. Joachim (not present) was to have contacted a freelancer about ALTO board membership. Evelien knows the person Joachim had in mind -- he works at the KB. She will contact him about ALTO board membership. (Note: As these minutes are written, Stefan Pletschacher has submitted his resume and expressed interest in ALTO board membership. Raju Buddharaju submitted his resume, too, in an unsolicited expression of interest in ALTO board membership.)

Frederick mentions the ALTO board membership criteria which were proposed and discussed at the face-to-face meeting in North Carolina. There is not a quorum of members (2/3 of the membership, as proposed by 4.4). Jukka asks if 2.8 means 4 consecutive meetings or 4 meetings total during the year. Frederick says his intention in writing it was 4 meetings total. The board agrees that 3.1 should be amended to say that 1 ALTO board member must be from the Library of Congress. All attending board members agree that the proposed additions to the ALTO board membership criteria are otherwise OK as written. Frederick will email the board members not attending and ask for their votes by email. (Note: As these minutes are written, Joachim, Nate, and Brian -- board members who were not part of the Dec 2 teleconference -- have agreed to the proposed changes.)

Evelien reports that we decided that inheritance was not a problem with respect to the Shape change request (issue 22). Since all board members agree that issue 22 is fine (voting is done), issue 22 is now open for public comment. It will remain open for public comment until the next board teleconference. See issue 22 comments.

The board discussed how to publicize the schema change. Frederick suggests publicizing schema changes via the ALTO mailing list hosted by the Library of Congress. Evelien suggests notifying SUCCEED about the schema change. She will write a notice about the schema change, publish it via SUCCEED, and send it to Clemens. Clemens will forward the notice to the Europeana newspapers mailing list as well as to the IMPACT Competence Centre. If there are no critical public comments, schema version 3.1 will be released following the next board teleconference.

Jukka suggested that the flow of an issue through the state changes be publicly documented somewhere. Jukka will make this happen...

Clemens will negotiate with Joachim about becoming champion for issue 27. Jukka also suggested that issue 27 and issue 13 should be combined or, at the very least, discussed together.

Jean Philippe summarized the discussions about a fragment identifier API for ALTO (Alto Fragment Identifier Framework = AFIF) that we had at the Chapel Hill face-to-face meeting. The idea is based on the International Image Interoperability Framework (IIIF) and is meant to help digital libraries implement bookmark functionality. The AFIF would not impact the schema. It would give access to OCR text or to a portion of an image. It is a specification that could be implemented by any digital library using ALTO. Jean Philippe gave an example of the AFIF with a note / marginalia from Gallica which, because the fragment identifier is used, can be referenced from outside Gallica. Jean Philippe said that not only users but also applications share digital library data, hence the usefulness of an AFIF. He suggests that it might be useful to contact IIIF and ask their opinion about the usefulness of such an AFIF. Frederick remarked that an AFIF would amount to a guarantee by the ALTO board that the ALTO schema will not change so much that it would break the API. Clemens said that the ALTO schema can itself be considered the AFIF; he cited an example of work that he is doing in his own library. Jean Philippe said that IIIF syntax is very simple and that the AFIF would similarly be simple. Evelien said that the AFIF would amount to a guarantee that identifiers in ALTO files are permanent. Clemens remarked that IIIF permits a reference to an image or a part of an image, which is not possible without IIIF. He further said that this is already possible with ALTO because ALTO data already has internal "markers" or identifiers which can be referenced. Clemens also explained that a mechanism similar to the AFIF already exists for PAGE XML (developed at the University of Salford by Stefan Pletschacher et al.). That mechanism for PAGE XML already includes ALTO XML. Jean Philippe said that the idea for the AFIF was inspired by IIIF and by EPUB.
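
To illustrate Clemens's remark that ALTO data already has internal identifiers which can be referenced, here is a minimal Python sketch. It is not the AFIF syntax, which has not been specified. The "file#ID" reference form, the sample data, and the function name resolve_fragment are assumptions made for illustration only, and the ALTO namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical ALTO fragment (namespace omitted); IDs and coordinates are invented.
SAMPLE = """<alto>
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="50">
          <TextLine ID="TL1">
            <String ID="S1" CONTENT="marginal" HPOS="100" VPOS="200" WIDTH="120" HEIGHT="40"/>
            <String ID="S2" CONTENT="note" HPOS="230" VPOS="200" WIDTH="60" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

BOX_ATTRIBUTES = ("HPOS", "VPOS", "WIDTH", "HEIGHT")


def resolve_fragment(alto_xml, reference):
    """Resolve an assumed 'file#ID' reference to (text, bounding box).

    Returns None when no element carries the requested ID. The box is None
    when the element lacks any of the four coordinate attributes.
    """
    _, _, fragment_id = reference.partition("#")
    root = ET.fromstring(alto_xml)
    for element in root.iter():
        if element.get("ID") != fragment_id:
            continue
        # Collect the OCR text of every String at or below the element.
        words = [s.get("CONTENT", "") for s in element.iter("String")]
        raw = [element.get(name) for name in BOX_ATTRIBUTES]
        box = None if None in raw else tuple(float(v) for v in raw)
        return " ".join(words), box
    return None


if __name__ == "__main__":
    print(resolve_fragment(SAMPLE, "page.xml#TB1"))
```

Such a lookup only stays valid as long as the identifiers in the ALTO file do not change, which is the permanence guarantee Evelien referred to.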

Jukka summarized the Chapel Hill face-to-face meeting discussion of OCR confidence. OCR software packages today do not, and perhaps cannot, use a uniform method of calculating OCR confidence. Therefore, the focus should be on documenting how OCR confidence is calculated. Clemens remarked that the way different OCR software packages calculate OCR confidence is an internal matter for the creator of the OCR software. Clemens recommended that there be a common way to express confidence for page, word, and character in the ALTO schema regardless of the OCR software. Further, both Jukka and Clemens recommended that the ALTO board propose a common way of documenting the confidence calculation. Frederick suggested that issue 23 be combined with issue 27. He also remarked that harmonizing OCR confidence expressions (for example, requiring that page, word, and character confidences all be expressed as floats, or in some other common form) will break backward compatibility.
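
As a rough illustration of what a consumer of word-level confidence values does, the following Python sketch reads the WC attribute from String elements and averages it. The sample data and function names are invented for illustration, the ALTO namespace is omitted, and treating WC as a float is an assumption of this sketch. As discussed above, values produced by different OCR packages are not necessarily comparable.

```python
import xml.etree.ElementTree as ET

# Hypothetical ALTO fragment (namespace omitted); the WC values are invented.
SAMPLE = """<alto>
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine ID="TL1">
            <String ID="S1" CONTENT="ALTO" WC="0.9"/>
            <String ID="S2" CONTENT="board" WC="0.5"/>
            <String ID="S3" CONTENT="minutes"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def word_confidences(alto_xml):
    """Return (word, confidence) pairs; confidence is None when WC is absent."""
    root = ET.fromstring(alto_xml)
    pairs = []
    for string in root.iter("String"):
        wc = string.get("WC")
        pairs.append((string.get("CONTENT"), float(wc) if wc is not None else None))
    return pairs


def mean_confidence(pairs):
    """Average the available confidences, ignoring words without a WC value."""
    values = [wc for _, wc in pairs if wc is not None]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    pairs = word_confidences(SAMPLE)
    print(pairs)
    print(mean_confidence(pairs))
```

Words without a WC value are skipped rather than counted as zero, since a missing value says nothing about recognition quality.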