Managing Multimedia and Unstructured Data in the Oracle Database

Book Description

A revolutionary approach to understanding, managing and delivering digital objects, assets, and all types of data

  • Full of illustrations, diagrams, and tips, with clear step-by-step instructions and real-world examples

  • Get up to speed on all the aspects of this new technology

  • Learn how to work with rich multimedia and control it

In Detail

    Multimedia is the new digital frontier. Managers, software architects, administrators, and developers need to fully comprehend this exciting new technology, as its widespread use and acceptance can no longer be ignored.

    "Managing Multimedia and Unstructured Data in the Oracle Database" will give you a complete understanding of how to manage all data, especially multimedia. You will learn all the latest terminology, how to set up a database, how to load digital objects and search on them, and even how to sell them. Whether you are a manager or a database administrator, this book will give you the knowledge you need to take control of this rapidly growing, industry-changing technology, which is transforming our lives.

    Starting with the basic principles of unstructured data and detailing the concepts behind multimedia warehouses and digital asset management systems, this book describes how to load this data, search against it, display it intelligently, and deliver it to customers and users. Learn how all these concepts work within the Oracle 11g R2 database environment and how to tune the database effectively to manage this data.

    Begin learning about this new field and use it to give your business a competitive edge, or to take a leadership role in this new computing genre.

    Table of Contents

    1. Managing Multimedia and Unstructured Data in the Oracle Database
      1. Table of Contents
      2. Managing Multimedia and Unstructured Data in the Oracle Database
      3. Credits
      4. About the Author
      5. Acknowledgement
      6. About the Reviewers
      7. www.PacktPub.com
        1. Support files, eBooks, discount offers and more
          1. Why Subscribe?
          2. Free Access for Packt account holders
          3. Instant Updates on New Packt Books
      8. Preface
        1. What this book covers
        2. Who this book is for
        3. Conventions
        4. Chapter References
        5. Reader feedback
        6. Customer support
          1. Errata
          2. Piracy
          3. Questions
      9. 1. What is Unstructured Data?
        1. Digital data
          1. Metadata
        2. Defining unstructured data
          1. Terminology
            1. Image
            2. Digital file
            3. Digital image
            4. Digital object
            5. Digital content
            6. Digital asset
            7. Digital material
            8. Digital library
          2. Analyzing the digital object
          3. Digital object types
          4. Core types
          5. Subtypes
            1. Picture
            2. Audio
            3. Model
          6. Creating new base types
            1. Document
            2. Video
            3. Multimedia (Rich Media)
            4. Data
            5. Simulation
            6. Genealogy
          7. Virtual digital object
          8. Digital object delivery
          9. Manipulating digital objects
            1. Conversion
            2. Transformation
            3. Extraction
            4. Compression
            5. Image comparison
            6. Badly compressed
            7. Thumbnail
            8. Transposition
            9. Searching
            10. Product group
            11. Location
        3. Defining multimedia in the Oracle database
          1. Photograph
          2. Video
          3. Audio
          4. Document
          5. Text
          6. Artifact
          7. Additional multimedia types
          8. Composite types
          9. Container
          10. ZIP files
          11. Metadata
          12. The NULL case
        4. Why store unstructured data in a database?
          1. Manageability
          2. Security
          3. Backup/recovery
          4. Integration
          5. Extensibility
          6. Flexibility
          7. Features
        5. Why not store the multimedia in the filesystem?
        6. Why use Oracle multimedia and not a blob?
          1. Addressing the concerns
          2. Performance
          3. Database size
          4. Complexity
        7. Summary
        8. Exercises
        9. Unstructured data conversion table
      10. 2. Understanding Digital Objects
        1. Definitions
          1. Raw format
          2. Compression
          3. Lossy data compression
          4. Lossless data compression
          5. Codec
          6. Container
        2. Understanding each image type
          1. Photo
            1. Icon
            2. Color space
              1. Color calibration
              2. The RGB color model family
              3. Viewing colors
              4. Printing using the CMYK colorspace
              5. Other color spaces
            3. Little endian and big endian
            4. Digital image storage formats
              1. Raster graphics formats
              2. Raw
            5. Vector graphics
          2. Audio
            1. Bit rate
            2. Encoding
            3. Channels
          3. Video
            1. Frame
            2. Frame resolution
              1. Frame aspect ratio
              2. Frame rate
              3. Progressive scan versus interlaced
              4. Codecs/containers
              5. Issues when converting
          4. Documents
            1. Terminology
              1. PDF
              2. DOC/DOCX
              3. ODT
              4. TXT
            2. Transformation
        3. Digital object composition
          1. The starting base – NULL object
          2. The original image
          3. Indexed digital object
            1. Pyramid index
            2. Derivatives
            3. Masters
            4. Components
            5. Version hierarchies
            6. Relationships
          4. Unstructured data business cases
            1. Sporting club
            2. Charity
            3. Neighborhood watch
            4. News
            5. Food
            6. Government
        4. Summary
        5. Exercises
      11. 3. The Multimedia Warehouse
        1. Comparing
          1. The data warehouse
            1. Data consistency
              1. Logical Data Consistency
            2. Dilapidated warehouse
            3. Security
            4. Performance
            5. Information overload
          2. Types of multimedia warehouses
            1. Traditional
            2. Image bank
            3. Data mart
            4. Public
            5. eSales
            6. Intelligence (security/defence)
        2. Structures
          1. Collections
          2. Groups
          3. Categories
          4. Lightbox
          5. Relationships
          6. Thesaurus
          7. Taxonomy
        3. Metadata standards
          1. Digital images
            1. IPTC
            2. EXIF
            3. XMP
          2. Audio
            1. ID3
          3. Relational
            1. CDWA Lite
            2. The Dublin Core® metadata Initiative
            3. Darwin Core
            4. Media Art Notation System
        4. Image tagging
          1. Crowdsourcing
          2. Gaming techniques
        5. Data types
          1. Text
          2. Date
            1. Interval
          3. Time
          4. Season
          5. Circa
          6. Boolean
          7. Number
            1. Metric and imperial
          8. Accession number
          9. Name
          10. Address
          11. Filename
          12. Spatial co-ordinate
        6. Summary
        7. Exercises
      12. 4. Searching the Multimedia Warehouse
        1. Multilingual data
          1. Storing
          2. Diacritic
          3. Multiple languages
          4. Translating
        2. Security
        3. Searching
          1. Indexing performance
          2. Metadata based
          3. Image structure
          4. Electronic commerce
          5. False positives
            1. Stop words
          6. The living search
          7. Data mining
            1. Big O notation
          8. Representing the results
            1. Interface
            2. Visualize the results
            3. Tag cloud
            4. Infinite zoom
            5. Complex social network
            6. Tree map
            7. Lightbox
            8. VRML and SVG
            9. Synchronized Multimedia Integration Language (SMIL)
            10. HTML 5
            11. Adobe Flash
            12. Voice XML
            13. Other devices
              1. Braille devices
              2. Audio
          9. Search features
            1. Summary groups
            2. Workarea
            3. Non discriminatory search
            4. Result notification
            5. Restrict the results
            6. Control the output
            7. Audit search
          10. Designing a search language
            1. Search context
            2. Set theory primer
            3. Order of precedence
            4. Specialized query terms
              1. Spelling mistakes
              2. Sounds like
            5. Stem search
            6. Ranking
            7. Mandatory and other terms
            8. Word frequency
              1. The trouble with documents
            9. Autosuggest
            10. Search engine scalability
          11. Federated search
          12. Fuzzy searching
          13. Collaboration search
        4. Summary
        5. Exercises
      13. 5. Loading Techniques
        1. Loading methods
        2. Finding the images
          1. Pull method
            1. Vertical parallelism
            2. Horizontal parallelism
          2. Push method
          3. Cartridge method
        3. Loading method
          1. Metadata matches to digital object
          2. Digital object matches to metadata
          3. Mixed digital object and metadata
          4. Digital object no metadata
            1. Many masters
            2. Derivatives
        4. Matching existing data to images
          1. Filename encoding
        5. Data cleansing
        6. Loading decisions
          1. Types of loading
            1. Batch
            2. Hot folder
            3. Integration API
            4. Manual
        7. Loading step-by-step
          1. Error handling
            1. Logical errors
            2. Loading via a workflow
        8. Summary
        9. Exercises
      14. 6. Delivery Techniques
        1. Securing an image
          1. Protection from theft
            1. Is it really theft?
            2. Modification
            3. Disruption
            4. Copying
            5. Theft
            6. Forgery
            7. Destruction
            8. Plagiarism
            9. Illegal access
            10. Replace
            11. Accidental
            12. Harvesting
            13. Other
          2. Protection methods
            1. Visible
            2. Preventive
            3. Bookmarking
            4. Reactive
            5. Auditable
            6. Self destruction
            7. Accept
            8. Legal proof
          3. A look at different business situations
            1. Copyright
            2. Greeting card
            3. Music
        2. Electronic commerce
          1. Not all browsers are the same
          2. IP address country tracking
          3. Order lifecycle
          4. Payment methods
          5. A comprehensive audit trail
          6. Locking down the price
          7. Post processing issue
          8. What are you buying?
            1. Price books
              1. Pricing options
          9. Understanding the business rules
            1. Tax rule
            2. Download rule
            3. Pricing rule
            4. User fees rule (pricing calculator)
            5. Postage rule
              1. Mixed orders
              2. Split orders
              3. Combining items
              4. Free postage
              5. Pick up
              6. Negotiated
              7. Delayed
              8. Monitoring
            6. Payment rule
            7. Customer information rule
            8. Customer trigger rule
            9. Discount rule
            10. Refund rule
            11. Ticketing rule
            12. Integrated stock management
            13. Post-purchase workflow
        3. Summary
        4. Exercises
      15. 7. Techniques for Creating a Multimedia Database
        1. Tier architecture
          1. Traditional no tier
          2. Two tier
          3. Three tier
          4. Virtualized architecture
          5. Mobile applications architecture
        2. Basic database configuration concepts
          1. ASM—Automated Storage Management
          2. Block size
            1. UNIFORM extent size and AUTOALLOCATE
            2. Locally managed tablespace UNIFORM extent size
            3. Temporary tablespace
            4. UNDO tablespace
            5. SYSTEM tablespace
            6. Redo logs
            7. Analysis
        3. Oracle Securefile architecture
          1. Enabling storage in row
          2. CHUNK
          3. Logging
          4. Cache
          5. Managing duplicate images
          6. Retention
          7. Lob compression
          8. Encryption
          9. Read-only tablespace
        4. Where does Oracle Multimedia fit in?
        5. Understanding the ORDSYS data types
          1. Creating a table
          2. How to query?
          3. Multimedia methods
        6. Creating a schema
        7. Oracle HTTP servers
        8. Configuring the Oracle embedded gateway
        9. Configuring Apache
          1. Basic diagnostics
            1. Windows
            2. Unix
        10. HTTPD.CONF file
          1. Virtual hosts
          2. Apache rewrites
        11. External locations and security
          1. Oracle directory
          2. Granting access to a directory
          3. UTL_FILE
            1. UTL_TCP
          4. Java
        12. Discussing Raid, SSD, SANs, and NAS
          1. Solid State Disk
            1. Raid 0: stripe across both disks
            2. Raid 1: mirror
            3. Raid 0+1: stripe then mirror
            4. Raid 1+0: mirrors then stripe
            5. Raid 5: parity check
            6. Raid 6: double parity check
          2. NAS
          3. SAN
        13. Setting up Oracle XE to run Oracle Multimedia
        14. Summary
        15. Exercises
      16. 8. Tuning
        1. Introduction to tuning
        2. Tuning methodologies
          1. Reactive versus proactive (for the novice administrator)
          2. What is the role of the DBA?
            1. History
        3. Tuning trend
        4. Scalability
          1. Scalability is bidirectional
          2. Database breakpoints
            1. Locking
            2. CPU limits
            3. Memory limits
            4. Hardware limits
            5. Database limits
            6. Database management
            7. Backup/recovery
          3. Multimedia scalability
            1. Dimension 1 – loading a large number of multimedia files
            2. Dimension 2 – storing a large number of multimedia files
            3. Dimension 3 – loading a very large multimedia file
            4. Dimension 4 – retrieving a large number of multimedia files
            5. Dimension 5 – database management
          4. General considerations
            1. Loading in parallel
            2. Insert/delete performance
            3. Extreme scalability
        5. Object-oriented development
          1. PC mentality
          2. The three tier – ignore the database mentality
          3. Our application should be able to run against any database
        6. Basic tuning operations
          1. Network
            1. HTTPS
            2. VPN
            3. Efficiency in sending
            4. XML and web services
            5. Back to three tier and scalability
          2. Memory
          3. CPU
          4. I/O
          5. Parallelism
            1. Image loading
            2. Horizontal versus vertical parallelism
          6. Locking
          7. Database parameters
            1. plsql_code_type
            2. optimizer_mode
            3. Hints
          8. Backups
          9. Oracle partitioning
            1. Manual partitioning
          10. Indexing
            1. Photo
            2. Video
            3. Audio
            4. Documents
          11. Scalability using Oracle XE
            1. Breaking the rules with XE
            2. VM vSphere
              1. Scenario 1 - Separate install
              2. Scenario 2 - Replicated, high throughput
              3. Scenario 3 - Image server
        7. Summary
        8. Exercises
      17. 9. Understanding the Limitations of Oracle Products
        1. The basic requirements
          1. Acting as more than a filesystem
            1. Full backup/recovery
            2. Long term archival
            3. Data distribution and network balancing
            4. High speed and scalable image loading and processing
            5. Storage scalability to petabytes of data
            6. Flexible image delivery
            7. Security, auditing, and protection from user error (versioning)
            8. Supporting for most image types
            9. Litmus test
        2. A comparison
        3. Oracle products
          1. Development
            1. SQL Developer (v3.1)
            2. SQL*Plus
            3. PL/SQL
            4. Supplied packages
            5. PL/SQL Web Toolkit
            6. SQL
            7. Java
            8. XML
            9. Edition-Based Redefinition
            10. Apex (Oracle Application Express)
          2. Storage
            1. Tablespaces and datafiles
            2. Storage parameters
            3. Partitioning
            4. ASM
            5. DBFS Filesystem
          3. Monitoring
            1. Enterprise Manager
            2. Resource management
          4. Database
            1. Data types
            2. Advanced compression
            3. OLAP
            4. Indexes
            5. Embedded gateway
            6. Data dictionary
            7. Heterogeneous gateway
          5. Tuning
            1. Automatic memory management
            2. Optimizer
            3. Networking
          6. Backup/Recovery
            1. Total recall (flashback)
            2. Redo logs and archives
            3. Data guard
            4. RMAN
            5. Utilities
            6. Streams
            7. Advanced replication
          7. Options
            1. Multimedia
            2. Spatial
            3. Text
            4. Semantics
            5. Warehouse
            6. Data Mining
          8. Security
            1. Encryption
            2. Data vault
            3. Oracle label security
          9. High availability
            1. RAC
            2. Exadata
            3. ZFS
        4. Summary
      18. 10. Working with the Operating System
        1. Why shell out?
        2. Unload and load digital objects
        3. How to shell out
          1. Java
          2. Scheduler
          3. Advanced queueing or pipes
          4. UTL_TCP
        4. Challenges when shelling out
          1. Synchronous or asynchronous?
          2. Hidden Ctrl + M characters on Unix
          3. Capturing output
          4. Parameters
          5. Dynamic shell scripts
          6. Windows program on processing, calls an actual window?
          7. Filesystem limitations
        5. Windows
          1. Powershell versus DOS
          2. LUN
          3. The variety of versions
            1. The Windows Services interface
            2. Windows 2012 and Windows 8
            3. Windows 2008 R2 and Windows 7
            4. Windows 2008 and Windows Vista
            5. Windows 2003
            6. Windows XP
            7. Windows 2000
        6. Unix
          1. How Unix differs from Windows
          2. The variety of versions
            1. Linux
            2. Ubuntu Linux
            3. Solaris
            4. IBM AIX
            5. HP-UX
        7. Summary
        8. Exercises
      19. A. The Circa Data Type
        1. Railroad diagram
        2. EBNF Syntax
      20. B. Multimedia Case Studies
        1. Museum A
        2. Department B
        3. Museum C
        4. Museum D
        5. Whole of government E
        6. Department F
        7. Museum H
        8. Photo laboratory G
      21. C. Proactive Database Tuning
        1. The environment and the DBA
          1. Ensuring optimal performance
        2. Cyclic maintenance
        3. Database review
        4. Forecasting
          1. Securing the database
        5. Data recovery
      22. D. Chapter References
        1. Chapter 1
        2. Chapter 2
        3. Chapter 3
        4. Chapter 4
        5. Chapter 8
        6. Chapter 9
        7. Chapter 10
      23. Index