Efficiently Digitizing Image Data

Friday, May 6, 2011
Like "translation" and "design" work, digitizing is a popular line of work for telecommuting. It means converting image data to text data while working from home, and is also called a data input service or data entry service.

There are many types of undigitized data, such as government court papers, family registers, maps and more. Today's workflow assumes a contractor receives the image data and returns the text data.

* Although optical character recognition (OCR) technology has made much progress, it's still hard to beat human hands. Workflows will most likely be designed for one of three cases: (a) all work done by humans, (b) collaboration between humans and OCR, or (c) most work done by OCR and checked by humans. The workflow below follows "(a) all work done by humans."

1. Register Image Data, 2. Accept & Schedule, 3. Text Data, 4. Check
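The four tasks above can be sketched as a simple ordered sequence. This is only an illustration in Python, not how any particular workflow product stores its steps: the names `Step` and `next_step` are hypothetical, and the sketch assumes the tasks run strictly in the listed order, with no loop back modeled after a failed check.

```python
from enum import Enum


class Step(Enum):
    """The four tasks of the sample workflow, in the order listed above."""
    REGISTER_IMAGE_DATA = 1
    ACCEPT_AND_SCHEDULE = 2
    TEXT_DATA = 3
    CHECK = 4


def next_step(step):
    """Return the task that follows `step`, or None after the last task.

    Assumption: the flow is strictly linear (hypothetical simplification).
    """
    members = list(Step)  # Enum members iterate in definition order
    position = members.index(step)
    if position + 1 < len(members):
        return members[position + 1]
    return None
```

For example, `next_step(Step.REGISTER_IMAGE_DATA)` gives `Step.ACCEPT_AND_SCHEDULE`, and `next_step(Step.CHECK)` gives `None` because Check is the final task.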

[Screenshot: Efficiently Digitizing Image Data, "2. Accept & Schedule" screen]

<Process Data Items>
  • Title (Summary of digitization job, e.g. "Dec 1990 Osaka district verdict")
<<Image data info>>
  • Original image created on (date)
  • Original image stored at (string)
  • Original image (file)
  • Notes (string)
<<Text data info>>
  • Deadline (date)
  • Digitized text (string)
  • Digitized text doc (file)
  • Correspondence (discussion)
  • Check (OK / NG)
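
For readers who think in code, the data items listed above could be modeled roughly as follows. This is a minimal sketch in Python under several assumptions: the class name `DigitizationJob`, the field names, and the helper methods are all hypothetical, files are represented simply as path strings, and the deadline is assumed to be filled in when the worker accepts the job.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DigitizationJob:
    """One digitization job, mirroring the process data items listed above."""
    title: str  # e.g. "Dec 1990 Osaka district verdict"

    # Image data info
    image_created_on: Optional[date] = None
    image_stored_at: str = ""
    image_file: Optional[str] = None  # assumption: a file path stands in for the file
    notes: str = ""

    # Text data info
    deadline: Optional[date] = None
    digitized_text: str = ""
    digitized_text_doc: Optional[str] = None  # assumption: file path again
    correspondence: List[str] = field(default_factory=list)
    check: Optional[str] = None  # "OK" or "NG" once checked

    def accept(self, deadline: date) -> None:
        """Record the schedule the worker commits to (hypothetical helper)."""
        self.deadline = deadline

    def record_check(self, result: str) -> None:
        """Store the checker's verdict; only "OK" or "NG" is allowed."""
        if result not in ("OK", "NG"):
            raise ValueError("check must be 'OK' or 'NG'")
        self.check = result
```

A job would then be created with just a title, for example `DigitizationJob(title="Dec 1990 Osaka district verdict")`, and the remaining fields filled in as each task is completed.
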
In the workflow sample above, the worker reviews the image data registered in task 1 (Register Image Data) and, on deciding to accept the job, replies with a workable schedule (task 2, Accept & Schedule).
The workflow below lets the checker send additional points of caution, which is useful for sharing know-how.