Document transcription and annotation

I am new to CA and am running some tests to see whether my institution could adopt it. Because of some peculiarities of our holdings, we don't currently use any archival description software, but we plan to adopt a package to catalogue several personal archives.

My question is not about standards, since I saw that CA is very flexible in that respect. I would like to know whether a tool for transcribing documents already exists in the back-end (Providence), or whether one could be implemented. A typical user story would be the following:
1. a cataloguer describes a textual item (a letter, a book, etc.),
2. uploads a digital version of it (the digitized letter, book, etc.),
3. then provides a transcription of it directly in Providence, using a tool that shows the page to transcribe on one side and a blank area for the transcription on the other (TEI support would be great).

I am thinking of something like the ProofreadPage extension for MediaWiki, but without the collaborative environment.

Is there any chance of realizing this in CA?

Thanks for any answer!


  • There is a new interface for crowd-sourced transcription in the public front-end component (called "Pawtucket"). It's being tested now. There is no back-end transcription interface currently, beyond plain-old text fields accessible in each media record. Perhaps we'll add that in the future, but there are no concrete plans at the moment.

  • Hello Seth,
    many thanks for your answer, this is very interesting news! I'd be very happy to help with the testing, if I may.

    It makes a lot of sense to implement transcription in the front-end if the goal is a crowd-sourced tool. Could you give me more information about this new interface? For example: is it customizable (perhaps from the back-end)? Is it TEI compliant, and does it support other kinds of annotation schemas? How does it interact with the database: do transcriptions become 'data' of the records, or are they more like 'comments'? Are they indexed and searchable?

    Thanks again for your help,

  • There is one site testing this interface currently. It's not public yet. When it becomes public I'll be happy to share it with you.

    Transcriptions are not TEI compliant per se; they're just text, but there's no reason they couldn't be. We currently don't support markup in the text, per our sponsor's requirements, but that would be easy to add as well. Whether we support Markdown, Wikitext, TEI or some subset of HTML is a point for discussion.

    The transcriptions are attached to individual representations on the object and appear in the context of the representation. They are stored separately from other catalogued content but are searchable.
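
To illustrate the point that a plain-text transcription could be made TEI compliant after the fact, here is a minimal sketch in Python. It is not part of CollectiveAccess and does not use any CollectiveAccess API: the function name `plain_text_to_tei` and the placeholder header text are assumptions made purely for illustration. It wraps text exported from a transcription field in a bare TEI skeleton, treating blank lines as paragraph breaks.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def _tei(tag: str) -> str:
    """Return the namespace-qualified name for a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def plain_text_to_tei(title: str, text: str) -> str:
    """Wrap a plain-text transcription in a minimal TEI document.

    Hypothetical helper: blank lines in `text` are treated as
    paragraph breaks; no inline markup is inferred.
    """
    # Serialize TEI as the default namespace (no prefix).
    ET.register_namespace("", TEI_NS)

    root = ET.Element(_tei("TEI"))

    # A TEI header needs a title, a publication statement and a source description.
    header = ET.SubElement(root, _tei("teiHeader"))
    file_desc = ET.SubElement(header, _tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, _tei("titleStmt"))
    ET.SubElement(title_stmt, _tei("title")).text = title
    pub_stmt = ET.SubElement(file_desc, _tei("publicationStmt"))
    ET.SubElement(pub_stmt, _tei("p")).text = "Unpublished transcription."
    source_desc = ET.SubElement(file_desc, _tei("sourceDesc"))
    ET.SubElement(source_desc, _tei("p")).text = "Transcribed from a digitized original."

    # The transcription itself goes into <text><body> as one <p> per paragraph.
    body = ET.SubElement(ET.SubElement(root, _tei("text")), _tei("body"))
    for block in text.split("\n\n"):
        normalized = " ".join(block.split())
        if normalized:
            ET.SubElement(body, _tei("p")).text = normalized

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(plain_text_to_tei("Letter", "Dear Sir,\n\nYours truly."))
```

A conversion like this only adds structure around the text; richer TEI encoding (line breaks, deletions, named entities) would still depend on what markup the transcription interface allows.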

