Managing Multimedia and Unstructured Data in the Oracle Database

Book Description

A revolutionary approach to understanding, managing, and delivering digital objects, assets, and all types of data

  • Full of illustrations, diagrams, and tips with clear step-by-step instructions and real-time examples
  • Get up to speed on all the aspects of this new technology
  • Learn how to work with rich multimedia and control it

In Detail

Multimedia is the new digital frontier. Managers, software architects, administrators, and developers need to fully comprehend this technology, because its widespread use and acceptance can no longer be ignored.

"Managing Multimedia and Unstructured Data in the Oracle Database" will give you a complete understanding of how to manage all data, especially multimedia. You will learn the latest terminology, how to set up a database, load digital objects, search on them, and even how to sell them. Whether you are a manager or a database administrator, this book will give you the knowledge you need to take control of this rapidly growing, industry-changing technology, which is transforming our lives.

Starting with the basic principles of unstructured data and detailing the concepts behind multimedia warehouses and digital asset management systems, this book will describe how to load this data, search against it, display it intelligently, and deliver it to customers and users. Learn how all these concepts work within the Oracle 11g R2 database environment and how to tune the database effectively to manage it.

Begin learning about this new field and use it to give your business a competitive edge, or to position yourself for a leadership role in this emerging area of computing.

Table of Contents

  1. Managing Multimedia and Unstructured Data in the Oracle Database
    1. Table of Contents
    2. Managing Multimedia and Unstructured Data in the Oracle Database
    3. Credits
    4. About the Author
    5. Acknowledgement
    6. About the Reviewers
    7. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
        3. Instant Updates on New Packt Books
    8. Preface
      1. What this book covers
      2. Who this book is for
      3. Conventions
      4. Chapter References
      5. Reader feedback
      6. Customer support
        1. Errata
        2. Piracy
        3. Questions
    9. 1. What is Unstructured Data?
      1. Digital data
        1. Metadata
      2. Defining unstructured data
        1. Terminology
          1. Image
          2. Digital file
          3. Digital image
          4. Digital object
          5. Digital content
          6. Digital asset
          7. Digital material
          8. Digital library
        2. Analyzing the digital object
        3. Digital object types
        4. Core types
        5. Subtypes
          1. Picture
          2. Audio
          3. Model
        6. Creating new base types
          1. Document
          2. Video
          3. Multimedia (Rich Media)
          4. Data
          5. Simulation
          6. Genealogy
        7. Virtual digital object
        8. Digital object delivery
        9. Manipulating digital objects
          1. Conversion
          2. Transformation
          3. Extraction
          4. Compression
          5. Image comparison
          6. Badly compressed
          7. Thumbnail
          8. Transposition
          9. Searching
          10. Product group
          11. Location
      3. Defining multimedia in the Oracle database
        1. Photograph
        2. Video
        3. Audio
        4. Document
        5. Text
        6. Artifact
        7. Additional multimedia types
        8. Composite types
        9. Container
        10. ZIP files
        11. Metadata
        12. The NULL case
      4. Why store unstructured data in a database?
        1. Manageability
        2. Security
        3. Backup/recovery
        4. Integration
        5. Extensibility
        6. Flexibility
        7. Features
      5. Why not store the multimedia in the filesystem?
      6. Why use Oracle multimedia and not a blob?
        1. Addressing the concerns
        2. Performance
        3. Database size
        4. Complexity
      7. Summary
      8. Exercises
      9. Unstructured data conversion table
    10. 2. Understanding Digital Objects
      1. Definitions
        1. Raw format
        2. Compression
        3. Lossy data compression
        4. Lossless data compression
        5. Codec
        6. Container
      2. Understanding each image type
        1. Photo
          1. Icon
          2. Color space
            1. Color calibration
            2. The RGB color model family
            3. Viewing colors
            4. Printing using the CMYK colorspace
            5. Other color spaces
          3. Little endian and big endian
          4. Digital image storage formats
            1. Raster graphics formats
            2. Raw
          5. Vector graphics
        2. Audio
          1. Bit rate
          2. Encoding
          3. Channels
        3. Video
          1. Frame
          2. Frame resolution
            1. Frame aspect ratio
            2. Frame rate
            3. Progressive scan versus interlaced
            4. Codecs/containers
            5. Issues when converting
        4. Documents
          1. Terminology
            1. PDF
            2. DOC/DOCX
            3. ODT
            4. TXT
          2. Transformation
      3. Digital object composition
        1. The starting base – NULL object
        2. The original image
        3. Indexed digital object
          1. Pyramid index
          2. Derivatives
          3. Masters
          4. Components
          5. Version hierarchies
          6. Relationships
        4. Unstructured data business cases
          1. Sporting club
          2. Charity
          3. Neighborhood watch
          4. News
          5. Food
          6. Government
      4. Summary
      5. Exercises
    11. 3. The Multimedia Warehouse
      1. Comparing
        1. The data warehouse
          1. Data consistency
            1. Logical Data Consistency
          2. Dilapidated warehouse
          3. Security
          4. Performance
          5. Information overload
        2. Types of multimedia warehouses
          1. Traditional
          2. Image bank
          3. Data mart
          4. Public
          5. eSales
          6. Intelligence (security/defence)
      2. Structures
        1. Collections
        2. Groups
        3. Categories
        4. Lightbox
        5. Relationships
        6. Thesaurus
        7. Taxonomy
      3. Metadata standards
        1. Digital images
          1. IPTC
          2. EXIF
          3. XMP
        2. Audio
          1. ID3
        3. Relational
          1. CDWA Lite
          2. The Dublin Core® metadata Initiative
          3. Darwin Core
          4. Media Art Notation System
      4. Image tagging
        1. Crowdsourcing
        2. Gaming techniques
      5. Data types
        1. Text
        2. Date
          1. Interval
        3. Time
        4. Season
        5. Circa
        6. Boolean
        7. Number
          1. Metric and imperial
        8. Accession number
        9. Name
        10. Address
        11. Filename
        12. Spatial co-ordinate
      6. Summary
      7. Exercises
    12. 4. Searching the Multimedia Warehouse
      1. Multilingual data
        1. Storing
        2. Diacritic
        3. Multiple languages
        4. Translating
      2. Security
      3. Searching
        1. Indexing performance
        2. Metadata based
        3. Image structure
        4. Electronic commerce
        5. False positives
          1. Stop words
        6. The living search
        7. Data mining
          1. Big O notation
        8. Representing the results
          1. Interface
          2. Visualize the results
          3. Tag cloud
          4. Infinite zoom
          5. Complex social network
          6. Tree map
          7. Lightbox
          8. VRML and SVG
          9. Synchronized Multimedia Integration Language (SMIL)
          10. HTML 5
          11. Adobe Flash
          12. Voice XML
          13. Other devices
            1. Braille devices
            2. Audio
        9. Search features
          1. Summary groups
          2. Workarea
          3. Non discriminatory search
          4. Result notification
          5. Restrict the results
          6. Control the output
          7. Audit search
        10. Designing a search language
          1. Search context
          2. Set theory primer
          3. Order of precedence
          4. Specialized query terms
            1. Spelling mistakes
            2. Sounds like
          5. Stem search
          6. Ranking
          7. Mandatory and other terms
          8. Word frequency
            1. The trouble with documents
          9. Autosuggest
          10. Search engine scalability
        11. Federated search
        12. Fuzzy searching
        13. Collaboration search
      4. Summary
      5. Exercises
    13. 5. Loading Techniques
      1. Loading methods
      2. Finding the images
        1. Pull method
          1. Vertical parallelism
          2. Horizontal parallelism
        2. Push method
        3. Cartridge method
      3. Loading method
        1. Metadata matches to digital object
        2. Digital object matches to metadata
        3. Mixed digital object and metadata
        4. Digital object no metadata
          1. Many masters
          2. Derivatives
      4. Matching existing data to images
        1. Filename encoding
      5. Data cleansing
      6. Loading decisions
        1. Types of loading
          1. Batch
          2. Hot folder
          3. Integration API
          4. Manual
      7. Loading step-by-step
        1. Error handling
          1. Logical errors
          2. Loading via a workflow
      8. Summary
      9. Exercises
    14. 6. Delivery Techniques
      1. Securing an image
        1. Protection from theft
          1. Is it really theft?
          2. Modification
          3. Disruption
          4. Copying
          5. Theft
          6. Forgery
          7. Destruction
          8. Plagiarism
          9. Illegal access
          10. Replace
          11. Accidental
          12. Harvesting
          13. Other
        2. Protection methods
          1. Visible
          2. Preventive
          3. Bookmarking
          4. Reactive
          5. Auditable
          6. Self destruction
          7. Accept
          8. Legal proof
        3. A look at different business situations
          1. Copyright
          2. Greeting card
          3. Music
      2. Electronic commerce
        1. Not all browsers are the same
        2. IP address country tracking
        3. Order lifecycle
        4. Payment methods
        5. A comprehensive audit trail
        6. Locking down the price
        7. Post processing issue
        8. What are you buying?
          1. Price books
            1. Pricing options
        9. Understanding the business rules
          1. Tax rule
          2. Download rule
          3. Pricing rule
          4. User fees rule (pricing calculator)
          5. Postage rule
            1. Mixed orders
            2. Split orders
            3. Combining items
            4. Free postage
            5. Pick up
            6. Negotiated
            7. Delayed
            8. Monitoring
          6. Payment rule
          7. Customer information rule
          8. Customer trigger rule
          9. Discount rule
          10. Refund rule
          11. Ticketing rule
          12. Integrated stock management
          13. Post-purchase workflow
      3. Summary
      4. Exercises
    15. 7. Techniques for Creating a Multimedia Database
      1. Tier architecture
        1. Traditional no tier
        2. Two tier
        3. Three tier
        4. Virtualized architecture
        5. Mobile applications architecture
      2. Basic database configuration concepts
        1. ASM: Automatic Storage Management
        2. Block size
          1. UNIFORM extent size and AUTOALLOCATE
          2. Locally managed tablespace UNIFORM extent size
          3. Temporary tablespace
          4. UNDO tablespace
          5. SYSTEM tablespace
          6. Redo logs
          7. Analysis
      3. Oracle Securefile architecture
        1. Enabling storage in row
        2. CHUNK
        3. Logging
        4. Cache
        5. Managing duplicate images
        6. Retention
        7. Lob compression
        8. Encryption
        9. Read-only tablespace
      4. Where does Oracle Multimedia fit in?
      5. Understanding the ORDSYS data types
        1. Creating a table
        2. How to query?
        3. Multimedia methods
      6. Creating a schema
      7. Oracle HTTP servers
      8. Configuring the Oracle embedded gateway
      9. Configuring Apache
        1. Basic diagnostics
          1. Windows
          2. Unix
      10. HTTPD.CONF file
        1. Virtual hosts
        2. Apache rewrites
      11. External locations and security
        1. Oracle directory
        2. Granting access to a directory
        3. UTL_FILE
          1. UTL_TCP
        4. Java
      12. Discussing Raid, SSD, SANs, and NAS
        1. Solid State Disk
          1. Raid 0: stripe across both disks
          2. Raid 1: mirror
          3. Raid 0+1: stripe then mirror
          4. Raid 1+0: mirrors then stripe
          5. Raid 5: parity check
          6. Raid 6: double parity check
        2. NAS
        3. SAN
      13. Setting up Oracle XE to run Oracle Multimedia
      14. Summary
      15. Exercises
    16. 8. Tuning
      1. Introduction to tuning
      2. Tuning methodologies
        1. Reactive versus proactive (for the novice administrator)
        2. What is the role of the DBA?
          1. History
      3. Tuning trend
      4. Scalability
        1. Scalability is bidirectional
        2. Database breakpoints
          1. Locking
          2. CPU limits
          3. Memory limits
          4. Hardware limits
          5. Database limits
          6. Database management
          7. Backup/recovery
        3. Multimedia scalability
          1. Dimension 1 – loading a large number of multimedia files
          2. Dimension 2 – storing a large number of multimedia files
          3. Dimension 3 – loading a very large multimedia file
          4. Dimension 4 – retrieving a large number of multimedia files
          5. Dimension 5 – database management
        4. General considerations
          1. Loading in parallel
          2. Insert/delete performance
          3. Extreme scalability
      5. Object-oriented development
        1. PC mentality
        2. The three tier – ignore the database mentality
        3. Our application should be able to run against any database
      6. Basic tuning operations
        1. Network
          1. HTTPS
          2. VPN
          3. Efficiency in sending
          4. XML and web services
          5. Back to three tier and scalability
        2. Memory
        3. CPU
        4. I/O
        5. Parallelism
          1. Image loading
          2. Horizontal versus vertical parallelism
        6. Locking
        7. Database parameters
          1. plsql_code_type
          2. optimizer_mode
          3. Hints
        8. Backups
        9. Oracle partitioning
          1. Manual partitioning
        10. Indexing
          1. Photo
          2. Video
          3. Audio
          4. Documents
        11. Scalability using Oracle XE
          1. Breaking the rules with XE
          2. VM vSphere
            1. Scenario 1 - Separate install
            2. Scenario 2 - Replicated, high throughput
            3. Scenario 3 - Image server
      7. Summary
      8. Exercises
    17. 9. Understanding the Limitations of Oracle Products
      1. The basic requirements
        1. Acting as more than a filesystem
          1. Full backup/recovery
          2. Long term archival
          3. Data distribution and network balancing
          4. High speed and scalable image loading and processing
          5. Storage scalability to petabytes of data
          6. Flexible image delivery
          7. Security, auditing, and protection from user error (versioning)
          8. Support for most image types
          9. Litmus test
      2. A comparison
      3. Oracle products
        1. Development
          1. SQL Developer (v3.1)
          2. SQL*Plus
          3. PL/SQL
          4. Supplied packages
          5. PL/SQL Web Toolkit
          6. SQL
          7. Java
          8. XML
          9. Edition-Based Redefinition
          10. Apex (Oracle Application Express)
        2. Storage
          1. Tablespaces and datafiles
          2. Storage parameters
          3. Partitioning
          4. ASM
          5. DBFS Filesystem
        3. Monitoring
          1. Enterprise Manager
          2. Resource management
        4. Database
          1. Data types
          2. Advanced compression
          3. OLAP
          4. Indexes
          5. Embedded gateway
          6. Data dictionary
          7. Heterogeneous gateway
        5. Tuning
          1. Automatic memory management
          2. Optimizer
          3. Networking
        6. Backup/Recovery
          1. Total recall (flashback)
          2. Redo logs and archives
          3. Data guard
          4. RMAN
          5. Utilities
          6. Streams
          7. Advanced replication
        7. Options
          1. Multimedia
          2. Spatial
          3. Text
          4. Semantics
          5. Warehouse
          6. Data Mining
        8. Security
          1. Encryption
          2. Data vault
          3. Oracle label security
        9. High availability
          1. RAC
          2. Exadata
          3. ZFS
      4. Summary
    18. 10. Working with the Operating System
      1. Why shell out?
      2. Unload and load digital objects
      3. How to shell out
        1. Java
        2. Scheduler
        3. Advanced queueing or pipes
        4. UTL_TCP
      4. Challenges when shelling out
        1. Synchronous or asynchronous?
        2. Hidden Ctrl + M characters on Unix
        3. Capturing output
        4. Parameters
        5. Dynamic shell scripts
        6. Windows program on processing, calls an actual window?
        7. Filesystem limitations
      5. Windows
        1. Powershell versus DOS
        2. LUN
        3. The variety of versions
          1. The Windows Services interface
          2. Windows 2012 and Windows 8
          3. Windows 2008 R2 and Windows 7
          4. Windows 2008 and Windows Vista
          5. Windows 2003
          6. Windows XP
          7. Windows 2000
      6. Unix
        1. How Unix differs from Windows
        2. The variety of versions
          1. Linux
          2. Ubuntu Linux
          3. Solaris
          4. IBM AIX
          5. HP-UX
      7. Summary
      8. Exercises
    19. A. The Circa Data Type
      1. Railroad diagram
      2. EBNF Syntax
    20. B. Multimedia Case Studies
      1. Museum A
      2. Department B
      3. Museum C
      4. Museum D
      5. Whole of government E
      6. Department F
      7. Museum H
      8. Photo laboratory G
    21. C. Proactive Database Tuning
      1. The environment and the DBA
        1. Ensuring optimal performance
      2. Cyclic maintenance
      3. Database review
      4. Forecasting
        1. Securing the database
      5. Data recovery
    22. D. Chapter References
      1. Chapter 1
      2. Chapter 2
      3. Chapter 3
      4. Chapter 4
      5. Chapter 8
      6. Chapter 9
      7. Chapter 10
    23. Index