Implementing Mobile Document Capture with IBM Datacap Software

Book Description

Organizations face many challenges in managing the ever-increasing volume of documents that they need to conduct their business. IBM® content management and imaging solutions can capture, store, manage, integrate, and deliver various forms of content throughout an enterprise. These tools can help reduce costs associated with content management and help organizations deliver improved customer service. The advanced document capture capabilities are provided through IBM Datacap software.

This IBM Redbooks® publication focuses on Datacap components, system architecture, functions, and capabilities. It explains how Datacap works, how to design a document image capture solution, and how to implement the solution using Datacap Developer Tools, such as Datacap FastDoc (Admin). FastDoc is the development tool that designers use to create rules and rule sets, configure a document hierarchy and task profiles, and set up a verification panel for image verification.

A loan application example explains the advanced technologies of IBM Datacap Version 9. This scenario shows how to develop a versatile capture solution that is able to handle both structured and unstructured documents. Information about high availability, scalability, performance, backup and recovery options, preferable practices, and suggestions for designing and implementing an imaging solution is also included.

This book is intended for IT architects and professionals who are responsible for creating, improving, designing, and implementing document imaging solutions for their organizations.

Table of Contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. Preface
    1. The team who wrote this book
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  4. Part 1 Production imaging
  5. Chapter 1. Production imaging overview
    1. 1.1 The business document problem and approach to its solution
      1. 1.1.1 Paper everywhere
      2. 1.1.2 Business challenges posed by paper
      3. 1.1.3 Business challenges posed by electronic documents
      4. 1.1.4 Solving the problem with production imaging
    2. 1.2 Introduction to Production Imaging Edition
      1. 1.2.1 Components of the Production Imaging Edition offering
      2. 1.2.2 Taskmaster base and add-on components
      3. 1.2.3 The production imaging process
      4. 1.2.4 Focus on Taskmaster
    3. 1.3 Examples of applications
      1. 1.3.1 Cross industry: Automated forms processing
      2. 1.3.2 Cross industry: Distributed capture
      3. 1.3.3 Cross industry: General business documents processing
      4. 1.3.4 Cross industry: Accounts payable
      5. 1.3.5 Cross industry: Surveys
      6. 1.3.6 Government: Tax return processing
      7. 1.3.7 Healthcare and insurance: Medical claims
      8. 1.3.8 Banking and finance: Loan applications
      9. 1.3.9 Transportation and logistics: Shipping documents
    4. 1.4 Conclusion
  6. Chapter 2. System architecture
    1. 2.1 Architecture overview of Production Imaging Edition
    2. 2.2 Components of the Taskmaster system
    3. 2.3 Components of FileNet Content Manager
    4. 2.4 Overall system architecture
    5. 2.5 Deployment of Production Imaging Edition
      1. 2.5.1 Centralized deployment
      2. 2.5.2 Distributed deployment
      3. 2.5.3 Taskmaster Web deployment
    6. 2.6 Conclusion
  7. Chapter 3. Production imaging functionality
    1. 3.1 Functionality highlights of Taskmaster
    2. 3.2 Taskmaster process
      1. 3.2.1 Batch preparation for scanning
      2. 3.2.2 The Scan task
      3. 3.2.3 Background processing task
      4. 3.2.4 The Verify task
      5. 3.2.5 The Export task
    3. 3.3 Taskmaster GUI
      1. 3.3.1 Productive GUI
      2. 3.3.2 Image snippets
      3. 3.3.3 Color-coded recognition confidence
      4. 3.3.4 Click’n’Key capability
    4. 3.4 Taskmaster clients
      1. 3.4.1 Taskmaster Client (thick client)
      2. 3.4.2 Taskmaster Web client
    5. 3.5 Taskmaster background processes
      1. 3.5.1 Rule processing
      2. 3.5.2 Job, task, and task profile
      3. 3.5.3 Rule set
      4. 3.5.4 Rule
      5. 3.5.5 Processing of the document hierarchy at run time
      6. 3.5.6 Function and action
    6. 3.6 Taskmaster action libraries
      1. 3.6.1 Image cleanup and enhancements
      2. 3.6.2 Barcode recognition
      3. 3.6.3 Optical Character Recognition
      4. 3.6.4 Intelligent Character Recognition
      5. 3.6.5 Optical Mark Recognition
      6. 3.6.6 Classification
      7. 3.6.7 Fingerprinting
      8. 3.6.8 Content-based identification with IBM Classification Module
      9. 3.6.9 Language support
      10. 3.6.10 Imprinting and redaction
      11. 3.6.11 Locating text
      12. 3.6.12 Validations
      13. 3.6.13 Exports
    7. 3.7 Principles and tools of the Taskmaster configuration
      1. 3.7.1 Datacap Studio
      2. 3.7.2 Flex Capture and Flex Manager
      3. 3.7.3 Taskmaster Application Manager
      4. 3.7.4 RV2 report viewer
      5. 3.7.5 NENU
    8. 3.8 FileNet Content Manager for production imaging
      1. 3.8.1 Workflow management tools
    9. 3.9 Advanced production imaging viewing
      1. 3.9.1 PDF viewing and annotating
      2. 3.9.2 Universal viewing and annotating
      3. 3.9.3 Document streaming
      4. 3.9.4 Permanent redaction
    10. 3.10 Bulk Import Tool
    11. 3.11 Conclusion
  8. Part 2 Solution implementation
  9. Chapter 4. Solution example
    1. 4.1 Scenario background
    2. 4.2 Current claim approval process
    3. 4.3 New claim approval process
    4. 4.4 Summary of benefits
  10. Chapter 5. Designing a production imaging system
    1. 5.1 Design goal of the production imaging system
    2. 5.2 Capture system design
      1. 5.2.1 Document hierarchy
      2. 5.2.2 Capture processing tasks
      3. 5.2.3 Capture workflow
      4. 5.2.4 Capture design considerations
      5. 5.2.5 Discovering the capture process
    3. 5.3 Requirements gathering
      1. 5.3.1 Requirements for current capture or document processing environment
      2. 5.3.2 Processing location requirements
      3. 5.3.3 Document type requirements
      4. 5.3.4 Captured data requirements
      5. 5.3.5 Verification requirements
      6. 5.3.6 Export requirements
      7. 5.3.7 Volume and timing requirements
      8. 5.3.8 Administration requirements
    4. 5.4 Designing the capture for the auto claims scenario
      1. 5.4.1 Document hierarchy
      2. 5.4.2 Capture processing tasks
  11. Chapter 6. Implementing the capture solution
    1. 6.1 Configuring the Datacap application
      1. 6.1.1 Creating a blank application
      2. 6.1.2 Setting up the document, pages, and fields
      3. 6.1.3 Setting up the physical scan device
      4. 6.1.4 Creating a module in Taskmaster
      5. 6.1.5 Creating a job within the Taskmaster Client
      6. 6.1.6 Setting up the iScan task
      7. 6.1.7 Setting up the PageID task
      8. 6.1.8 Setting up the Rulerunner task
      9. 6.1.9 Testing the progress
    2. 6.2 Zones and fingerprints
      1. 6.2.1 Setting up the zones
      2. 6.2.2 Setting up optical mark fields
      3. 6.2.3 Setting up an ICR field
      4. 6.2.4 Removing the Clean rule set
      5. 6.2.5 Creating a rule set to read fields
      6. 6.2.6 Creating a rule set to validate fields
      7. 6.2.7 Validating captured field values against the database
      8. 6.2.8 Setting up routing
      9. 6.2.9 Testing scan validation and routing
      10. 6.2.10 Setting up the verification panel
      11. 6.2.11 Setting up the Verify job
      12. 6.2.12 Setting up the export to the repository
    3. 6.3 Task profiles overview
  12. Chapter 7. Adding a document type to an existing application
    1. 7.1 Adding VScan to a rule set
    2. 7.2 Adding a document with pages and fields
    3. 7.3 Setting the PageID logic
    4. 7.4 Adding the Image Enhance feature to the pages
    5. 7.5 Configuring the CreateDocs rule set for the pages
    6. 7.6 Adding a full page OCR rule set to the pages
    7. 7.7 Setting the fingerprint for the Estimate_Invoice document
    8. 7.8 Obtaining field values in the Claim_Pg document by using the Locate() rule set
      1. 7.8.1 Doc Title field
      2. 7.8.2 Pol_Number field
      3. 7.8.3 Claim_Number field
      4. 7.8.4 Vendor Number field
      5. 7.8.5 Vendor Name field
      6. 7.8.6 Ref ID field
      7. 7.8.7 Ref Date field
      8. 7.8.8 Ref Total field
    9. 7.9 Looking up vendor information from a database by using the Lookup() rule set
      1. 7.9.1 Looking up a vendor name
      2. 7.9.2 Looking up a vendor number
    10. 7.10 Validating data
      1. 7.10.1 Page Level field
      2. 7.10.2 Doc Title field
      3. 7.10.3 Pol_Number field
      4. 7.10.4 Vendor Number field
      5. 7.10.5 Vendor Name field
      6. 7.10.6 Ref ID field
      7. 7.10.7 Ref Date field
    11. 7.11 Routing scanned pages
    12. 7.12 Verify task with Batch Pilot
      1. 7.12.1 Creating a Verify panel
      2. 7.12.2 Determining which pages to display
    13. 7.13 Export task
  13. Chapter 8. Implementing the business process component
    1. 8.1 FileNet Business Process Manager Tools
      1. 8.1.1 Steps
      2. 8.1.2 Routes
      3. 8.1.3 Maps
      4. 8.1.4 Content events
    2. 8.2 Running the Auto Claims process with the FileNet BPM Tools
    3. 8.3 How the Auto Claim process works
    4. 8.4 Business Process Manager step configuration
      1. 8.4.1 DBExecute step
      2. 8.4.2 Conditional step
      3. 8.4.3 Activity step
  14. Chapter 9. Best practices and recommendations
    1. 9.1 Basic form design and capture
      1. 9.1.1 Scanned document verification
      2. 9.1.2 Measuring scan and capture process improvement
    2. 9.2 Best practices for application development
      1. 9.2.1 Testing an application
      2. 9.2.2 Capturing data
      3. 9.2.3 Smart parameters
      4. 9.2.4 Projects
      5. 9.2.5 Actions
      6. 9.2.6 Scripting
      7. 9.2.7 OMR field configuration
    3. 9.3 Production Imaging Edition implementation principles
  15. Part 3 Advanced technologies
  16. Chapter 10. Dynamic technologies
    1. 10.1 Introduction to dynamic technologies
    2. 10.2 PageID actions and techniques in dynamic applications
      1. 10.2.1 Page identification by barcode separator
      2. 10.2.2 Electronic input of documents
    3. 10.3 FlexID
    4. 10.4 DNA technology
      1. 10.4.1 Fingerprint matching in dynamic applications
      2. 10.4.2 Using the TemplateID
      3. 10.4.3 The offset
    5. 10.5 Sticky fingerprints
    6. 10.6 Managed recognition
    7. 10.7 CCO Merging
    8. 10.8 FPXML
    9. 10.9 Line item detection
      1. 10.9.1 Storing repeating structures
      2. 10.9.2 Zoning the detail structure
      3. 10.9.3 Capturing the detail structure
      4. 10.9.4 Filtering line items
    10. 10.10 Enhanced error messaging
    11. 10.11 Data localization actions
      1. 10.11.1 Actions that affect rules execution
      2. 10.11.2 Actions that affect the captured data
    12. 10.12 Intellocate
    13. 10.13 Flex technology
    14. 10.14 Conclusion
  17. Chapter 11. Technical walkthrough Accounts Payable Capture
    1. 11.1 Introduction to Accounts Payable Capture
    2. 11.2 How IBM Taskmaster Accounts Payable Capture works
    3. 11.3 Jobs available in the workflow
    4. 11.4 When each task profile gets executed
    5. 11.5 A walkthrough of the task profiles
      1. 11.5.1 The VScan Task Profile
      2. 11.5.2 The Batch Profiler Task Profile
      3. 11.5.3 The Verification process
      4. 11.5.4 The Export Task Profile
  18. Part 4 Application and performance
  19. Chapter 12. System scalability, availability, backup, and recovery
    1. 12.1 System scalability, performance, and availability
      1. 12.1.1 Typical Datacap installation
      2. 12.1.2 Scaling Rulerunner vertically (scale up)
      3. 12.1.3 Scaling Rulerunner horizontally (scale out)
      4. 12.1.4 Scaling Rulerunner horizontally and vertically
      5. 12.1.5 Taskmaster Server scaling and redundancy
      6. 12.1.6 Scaling both Taskmaster Server and Rulerunner
      7. 12.1.7 Taskmaster Web scaling and redundancy
      8. 12.1.8 Scaling and redundancy for thick clients
      9. 12.1.9 Load balancing of tasks
      10. 12.1.10 Scaling databases
      11. 12.1.11 Network share drive
      12. 12.1.12 Scaling across geographies
    2. 12.2 Rulerunner
      1. 12.2.1 Single-threaded Rulerunner
      2. 12.2.2 Multithread Rulerunner
      3. 12.2.3 Load balancing Rulerunner
      4. 12.2.4 Race conditions
      5. 12.2.5 Running multithreading in VMware
      6. 12.2.6 Fingerprint Service
    3. 12.3 Configuring Rulerunner
      1. 12.3.1 The Datacap.xml and <project>.app files
      2. 12.3.2 Installing a single Rulerunner server
      3. 12.3.3 Installing an additional Rulerunner server
      4. 12.3.4 Configuring priorities and queuing
      5. 12.3.5 Installing Fingerprint Service
      6. 12.3.6 Using the Fingerprint Service
    4. 12.4 Adding additional Taskmaster Servers
      1. 12.4.1 Adding a failover Taskmaster Server
      2. 12.4.2 Load sharing between Taskmaster Servers
      3. 12.4.3 Overview of Rulerunner Server and Taskmaster Server
      4. 12.4.4 Scaling a thick client
      5. 12.4.5 Scaling a Taskmaster Web client
    5. 12.5 Backup and restore
      1. 12.5.1 Backing up and restoring Rulerunner machines
      2. 12.5.2 Backing up and restoring the Taskmaster Server
      3. 12.5.3 Backing up the database server
      4. 12.5.4 Backing up and restoring the Fingerprint server
      5. 12.5.5 Backing up and restoring the IIS web server
      6. 12.5.6 Backing up the file share
  20. Chapter 13. Installation, migration, and application reuse
    1. 13.1 Installing the Datacap Taskmaster Capture software
      1. 13.1.1 Installing IBM Taskmaster Web client
      2. 13.1.2 Installing Taskmaster thick client
      3. 13.1.3 Installing the Taskmaster Server
    2. 13.2 Migrating application from development to another environment
    3. 13.3 Reusing applications
      1. 13.3.1 Creating an application based on an existing one
      2. 13.3.2 Enhancing a new application with existing design elements
  21. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. Help from IBM
  22. Back cover