Mastering Data Modeling: A User-Driven Approach

Book Description

Data modeling is one of the most critical phases in the database application development process, but also the phase most likely to fail. A master data modeler must come into any organization, understand its data requirements, and skillfully model the data for applications that most effectively serve organizational needs.

Mastering Data Modeling is a complete guide to becoming a successful data modeler. Featuring a requirements-driven approach, this book clearly explains fundamental concepts, introduces a user-oriented data modeling notation, and describes a rigorous, step-by-step process for collecting, modeling, and documenting the kinds of data that users need.

Assuming no prior knowledge, Mastering Data Modeling sets forth several fundamental problems of data modeling, such as reconciling the software developer's demand for rigor with the users' equally valid need to speak their own (sometimes vague) natural language. In addition, it describes the good habits that help you respond to these fundamental problems. With these good habits in mind, the book describes the Logical Data Structure (LDS) notation and the process of controlled evolution by which you can create low-cost, user-approved data models that resist premature obsolescence. Also included is an encyclopedic analysis of all data shapes that you will encounter. Most notably, the book describes The Flow, a loosely scripted process by which you and the users gradually but continuously improve an LDS until it faithfully represents the information needs. Essential implementation and technology issues are also covered.

You will learn about such vital topics as:

  • The fundamental problems of data modeling

  • The good habits that help a data modeler be effective and economical

  • LDS notation, which encourages these good habits

  • How to read an LDS aloud--in declarative English sentences

  • How to write a well-formed (syntactically correct) LDS

  • How to get users to name the parts of an LDS with words from their own business vocabulary

  • How to visualize data for an LDS

  • A catalog of LDS shapes that recur throughout all data models

  • The Flow--the template for your conversations with users

  • How to document an LDS for users, data modelers, and technologists

  • How to map an LDS to a relational schema

  • How LDS differs from other notations and why

"Story interludes" appear throughout the book, illustrating real-world successes of the LDS notation and controlled evolution process. Numerous exercises help you master critical skills. In addition, two detailed, annotated sample conversations with users show you the process of controlled evolution in action.

    Table of Contents

    1. Foreword
    2. Preface
    3. Chapter 1. Introduction
      1. Logical Data Structures and Physical Data Storage
        1. Logical: Thinking About WHAT Without HOW
        2. Data: Thinking About Data Without Processing
        3. Structure: Articulating Types Without Instances
        4. Example
        5. Extending the Example LDS: Shifting the Burden
      2. Summary
      3. Exercises
    4. Chapter 2. Good Habits
      1. Employ the Users’ Language and Vocabulary
      2. Be Rigorous
      3. Don’t Rely on the Opinion of a Single Expert; Ask Several
      4. Ask First About Data, Not About Processing
      5. Master the Shapes of Data
      6. Use a Notation That Helps You Realize These Good Habits
      7. Summary
      8. Exercises
    5. Chapter 3. Reading an LDS with Sentences
      1. Sentences About What Users Can Remember
        1. Sentences About Words Inside Boxes
        2. Sentences About (Unlabeled) Box-to-Box Lines
        3. Chicken Feet
        4. Similarities Between Sentences
        5. Sentences About Labeled Box-to-Box Lines
        6. Labels and Similarities Between Sentences
        7. Similarities Between Sentences Revisited
      2. Sentences About Differentiating Things from Each Other
        1. One-Bar Boxes
        2. Multiple-Bar Boxes
        3. Outside-the-Box Bars
        4. A Shorthand
        5. A Common Error You Should Avoid
      3. Sentences You Should Not Say
      4. Some Complete Examples
      5. Summary
      6. Exercises
    6. Chapter 4. Vocabulary of LDS
      1. Vocabulary Overview
        1. Entity and Entity Instance
        2. Attribute and Attribute Value
        3. Relationship and Relationship Instance
        4. Link and Link Value
        5. Maximum Degree
        6. Descriptor and Descriptor Value
        7. Identifier and Identifier Value
        8. Identifying Descriptor
        9. Degree-One Descriptor
        10. Degree-Many Descriptor
        11. One-Many Relationship
        12. Many-Many Relationship
        13. One-One Relationship
        14. To-be Relationship
        15. Not-to-be Relationship
        16. Described Entity and Describing Entity
        17. Tiebreaker
      2. A Bit More About Entities, Attributes, and Relationships
        1. Entities versus Attributes
        2. Entities versus Relationships
      3. LDS Reading Rules Revisited
      4. Responsibility for Speaking Well
      5. Summary (and a Chance to Check Your Progress)
        1. Using a Plural Noun for the Described Entity
        2. Overlooking Links
        3. Overlooking the Conventional Role of Identifying Descriptors
        4. Confusing Identifiers with Identifying Descriptors
        5. Overlooking the Similarity Between an Identifying Attribute and an Identifying Link
        6. Neglecting to Read Both Links of a Relationship
      6. Exercises
    7. Chapter 5. Visualizing Allowed and Disallowed Instances
      1. Show the Data and Say Something About It
      2. Plan Your Notes by Considering Elemental Parts of the LDS
        1. Some Notes Apply Simply Because the Diagram Includes an Attribute
        2. Some Notes Apply Because the Diagram Includes a Degree-One Link
        3. Some Notes Apply Because the Diagram Includes a Degree-Many Link
        4. Some Notes Apply Because the Diagram Includes a Single-Descriptor Identifier
        5. Some Notes Apply Because the Diagram Includes a Multiple-Descriptor Identifier
        6. If an Identifier Has Three or More Descriptors, the Notes Can Be More Elaborate Than the Notes for a Two-Descriptor Identifier
      3. As You Visualize Data, Don’t Lose Sight of the Goal
        1. More Data Is Not Necessarily Better
        2. Less Data Is Not Necessarily Better
        3. Don’t Limit Yourself to a Single Way of Visualizing the Data
      4. Exercises
    8. Chapter 6. A Conversation with Users About Creatures and Skills
      1. Summary
      2. Exercises
    9. Story Interlude
    10. Chapter 7. Introduction to Mastering Shapes
      1. Definition of Shape
      2. Mastering Shapes
      3. Reading a Shape Aloud in Several Ways
      4. Visualizing Sample Data in Several Formats
      5. Discussing and Illustrating Noteworthy Disallowed Data
      6. Finding and Focusing on Shapes Within a Large LDS
      7. Recognizing the Differences Between Shapes That Are Similar but Not Identical
      8. Recognizing the Similarity Between Seemingly Dissimilar Shapes
      9. Distinguishing Between Legitimate Shapes and Syntactically Invalid LDS Fragments
      10. Knowing How Shapes Are Likely to Evolve
      11. Asking Questions That Help Users Choose Between Two Similar Shapes
      12. Knowing When to Ask Questions of Users
      13. Knowing When and How to Modify the LDS to Make a Shape Evolve
      14. Understanding the Relative Frequency of the Various Shapes
      15. Referring to Each Fundamental Shape by Its Name
      16. Exercises
    11. Chapter 8. One-Entity, No-Relationship Shapes
      1. Shape: Common Independent Entity
      2. Shape: Lonely-Attribute Independent Entity
      3. Shape: Aggregate Independent Entity
      4. Shape: Dependent Entity
      5. Shape: Intersection Entity
      6. Shape: Subordinate Entity
      7. Shape: One-Many Collection Entity
      8. Shape: Many-Many Collection Entity
      9. Unnamed Possibilities
      10. Exercises
    12. Chapter 9. One-Attribute Shapes
      1. Scale
      2. Shape: Nominal-Scale Attributes
      3. Shape: Numeric-Scale Attributes
      4. Shape: Ordinal-Scale Attributes
      5. Shape: Boolean-Scale Attributes
      6. Scale and Datatype
      7. Scale and Attribute Names
      8. Fine Distinctions of Scale
      9. Scale and Abstract Datatypes
      10. Summary of How Scale Restricts an Attribute
      11. Exercises
    13. Chapter 10. Two-Entity Shapes
      1. Two Entities, One Relationship
      2. One-Many Shapes
        1. Shape: Plain One-Many Relationship
        2. Shape: One-Many Relationship Making a Dependent or Intersection Entity
        3. Shape: One-Many Relationship Making a Collection Entity
      3. One-One Shapes
        1. Shape: Plain One-One Relationship
        2. Shape: One-One Relationship Making a Subordinate Entity
        3. Shape: To-be One-One Relationship
        4. Shape: Not-to-be One-One Relationship
        5. Shape: To-be One-One Relationship Making a Subordinate Entity
        6. Shape: Plain To-be One-One Relationship
        7. Shape: Plain Not-to-be One-One Relationship
        8. Shape: Not-to-be One-One Relationship Making a Subordinate Entity
        9. Thinking About One-One Relationship Shapes
      4. Many-Many Shapes
        1. Shape: Plain Many-Many Relationship
        2. Shape: Many-Many Relationship Making a Collection Entity
      5. Two Entities, Two Relationships
        1. Shape: Two One-One Relationships
      6. One-One and One-Many Relationship
        1. Shape: Not-to-be One-One Relationship and One-Many Relationship
        2. Shape: To-be Relationship and One-Many Relationship
      7. Two One-Many Relationships
        1. Shape: Two Same-Direction One-Many Relationships
        2. Shape: Two Opposite-Direction One-Many Relationships
        3. Shape: Many-Many Relationship Plus Some Other Relationships
      8. Two Entities, n Relationships
      9. Exercises
    14. Chapter 11. Shapes with More Than Two Entities
      1. Shape: Chicken Feet In
      2. Shape: Chicken Feet Out
      3. Shape: Chicken Feet Across
      4. Shape: Subordinates Out
      5. Shape: Subordinates Across
      6. Shape: Multiple Plain To-be Relationships
      7. Shape: Multiple To-be Relationships
      8. Shape: Multiple Short Paths
      9. Exercises
    15. Chapter 12. Shapes with Reflexive Relationships
      1. Shape: One-One Reflexive Relationship
      2. Sequence Data and Cyclic Sequence Data
      3. Ordered Pairs
      4. Shape: One-Many Reflexive Relationship
      5. Shape: Many-Many Reflexive Relationship
      6. Exercises
    16. Story Interlude
    17. Chapter 13. LDS Syntax Rules
      1. Within Any LDS, Each Entity, Attribute, Relationship, and Link Has an Official Name That Is Unique
      2. No Reflexive Relationship Is a To-be Relationship
      3. Between Any Pair of Entities, There Is at Most One To-be Relationship
      4. Each Entity Has at Least One Identifier
      5. An Entity Can Have Several Identifiers
      6. No Identifier Can Be a Strict Subset of Another
      7. The LDS Cannot Contain Any Cycles of Identification Dependency
      8. No Link of a Reflexive Relationship Can Contribute to an Identifier
      9. Both Links of a Relationship Cannot Contribute to Identifiers
      10. A Single-Descriptor Identifier Cannot Include the Degree-One Link of a One-Many Relationship
      11. A Multiple-Descriptor Identifier Cannot Include a Link of a One-One Relationship
      12. A Multiple-Descriptor Identifier Cannot Include the Degree-Many Link of a One-Many Relationship
      13. A Relationship Has Either Two Labels or Zero Labels
      14. All One-One Relationships Have Labels
      15. All Reflexive Relationships Have Labels
      16. Between Any Pair of Entities, There Is at Most One Unlabeled Relationship
      17. Valid Relationships
      18. Exercises
    18. Chapter 14. Getting the Names Right
      1. Entity Names
        1. Too-Exclusive and Too-Inclusive Entity Names
        2. Too-Coarse Entity Names
        3. Too-Fine Entity Names
        4. Completely Inaccurate Names
        5. Overloaded Names
      2. Working with Users to Get the Entity Names Right
        1. Expect to Work Hard on Naming Entities
        2. Be Willing to Work Hard Because It’s Worth It
        3. Manage the Difficulty: Choose the Right Moments to Work Hard
        4. Manage the Difficulty: Use the Expertise at Your Disposal
        5. Manage the Difficulty: Teach the Users a Helpful Basic Principle
        6. Example: Checking a Name That You Suspect Is Too Coarse
      3. Naming Attributes
      4. Naming Relationships and Links
      5. Exercises
    19. Chapter 15. Official Names
      1. Official Names Can Be Awkward
        1. Coping with Awkwardness in Official Names of Links
        2. Coping with Awkwardness in Official Names of Relationships
      2. A Few Notes About Official Names and To-be Relationships
      3. Exercise
    20. Chapter 16. Labeling Links
      1. Exercises
    21. Chapter 17. Documenting an LDS
      1. The Audience
      2. Front Matter
      3. Entity Documentation
      4. Attribute Documentation
      5. Link Documentation
      6. Relationship Documentation
      7. Fragment Documentation
      8. Constraint Documentation
      9. Issues List
      10. Supplemental Material for Secondary Audiences
        1. Software Professionals
        2. Business Designers
      11. Exercise
    22. Story Interlude
    23. Chapter 18. Script for Controlled Evolution: The Flow
      1. Script for The Flow
      2. Discussing a Not-to-be Relationship
        1. Flow Investigation: Discover Relationship
        2. Listening to Users
        3. Discovering Hidden Relationships
        4. Explicitly Asking for Relationships
        5. Securing an Answer to the Question
        6. Possible Answers to the Question
      3. Flow Stage: Not-to-be Relationship
      4. Flow Investigation: Seek a Chicken Foot
        1. Securing an Answer to the Question
        2. Possible Answers to the Question
      5. Flow Investigation: Seek a One-Many Relationship
        1. Securing an Answer to the Question
        2. Possible Answers to the Question
      6. Flow Investigation: Seek a Many-Many Relationship
        1. Securing an Answer to the Question
        2. Possible Answers to the Question
      7. Flow Stage: One-One, Not-to-be Relationship
      8. Flow Stage: One-Many Relationship
      9. Flow Stages: Initial Many-Many Relationship and New Intersection Entity
      10. Developing a Chicken-Feet-In Shape
        1. Preview of Working on a New Intersection Entity
      11. Flow Investigation: Seek Descriptors for Intersection Entity
      12. Flow Investigation: Seek Tiebreaker
      13. Flow Investigation: Consider Overidentification
      14. Flow Investigation: Seek Independent Entity
      15. Discussing a To-be Relationship
      16. Flow Investigation: Consider Synonymy
        1. Securing an Answer to the Question
        2. Possible Answers to the Question
      17. Flow Investigation: Consider Subordination
      18. Continuing the Discussion
      19. Flow Continuation: Seek Other Relationships
      20. Flow Continuation: Seek Further Evolution for a One-Many Relationship
      21. Flow Continuation: Seek Further Evolution for the Chicken-Feet-Across Shape
      22. Flow Continuation: Seek Further Evolution for the Chicken-Feet-In Shape
      23. Exercises
    24. Chapter 19. Local, Anytime Steps of Controlled Evolution
      1. Discovering Entities
        1. Keep Grounded
        2. Notice Unnamed Data
      2. Fixing Identifiers
        1. Underidentification
        2. Overidentification
        3. General Misidentification
        4. Extraneous Identifiers
      3. Seeking Descriptors
        1. Asking the Question
        2. Possible Answers to the Question
        3. Giving the Users Extra Help
      4. Promoting Attributes
        1. Promoting a Plural Attribute
        2. Promoting a Singular Attribute
      5. Relocating Misplaced Descriptors
        1. Factors for Characterizing Descriptor Misplacement
        2. Ways to Detect Descriptor Misplacement
      6. Exercises
    25. Chapter 20. Global, Anytime Steps of Controlled Evolution
      1. Redrawing the Diagram
      2. Altering the Overall Style of an LDS
      3. Changing the Level of Abstraction
        1. Combining Multiple Short Paths
        2. Collapsing a Taxonomy
        3. Collapsing Subordinates Out
        4. Guidelines for Increasing or Decreasing Abstractness
        5. When You Can and Cannot Use Abstract Shapes
      4. Exercises
    26. Chapter 21. Conversations About Dairy Farming
      1. Meeting with Users from the General Offices
      2. Meeting with Veterinary Epidemiologists
      3. Meeting with Economic Analysts
      4. Exercises
    27. Story Interlude
    28. Chapter 22. Constraints
      1. Constraint Definition Requires a Stabilized Data Model
      2. Many Candidate Constraints Turn Out to Be False
      3. Many Constraints Subject a Data Model to Premature Obsolescence
      4. Worthy Constraints
      5. Constraints and Shifting the Burden
      6. Summary and Final Thoughts
      7. Exercise
    29. Chapter 23. LDS for LDS
      1. The Meta-LDS
        1. Entities and Attributes
        2. Entities Have Descriptors
        3. Relationships and Links
        4. Identifiers
      2. Discussion
        1. Anchoring Your Understanding with Instances
        2. Anticipating How the Meta-LDS Might Evolve
        3. Constraints on the Meta-LDS
      3. Summary
      4. Exercises
    30. Chapter 24. Decisions: Designing a Data-Modeling Notation
      1. Overall Decisions
        1. What Is the Purpose?
        2. What Concepts Are Modeled?
        3. What Are the Names of the Modeled Concepts?
        4. Should a Model Include Behavior?
        5. What Graphical Notations Should Be Used?
      2. Decisions About Entities
        1. Should an Entity Name Characterize One Instance or Many?
        2. Should There Be Different Notations for Different Kinds of Entity?
        3. Should Each Entity Have an Identifier?
      3. Decisions About Identifiers
        1. Should All Identifiers Be Arbitrary?
        2. How Should Identifiers Be Annotated?
        3. Can Identifiers Include Links?
        4. Can an Entity Have Multiple Identifiers?
        5. Must an Entity Have Multiple Identifiers?
        6. How Are Multiple Identifiers Annotated?
      4. Decisions About Attributes
        1. Is There a Difference Between Entities and Attributes?
        2. Do Attributes Belong on the Diagram?
        3. Do Data Types Belong on the Diagram?
        4. Do Scales Belong on the Diagram?
        5. Are Plural Attributes Allowed?
        6. Are Foreign Key Attributes on the Model?
        7. Are Type-Level Attributes Allowed?
      5. Decisions About Relationships
        1. Are All Relationships Binary?
        2. Is There a Difference Between Relationships and Entities?
        3. Can a Relationship Have Attributes?
        4. Can a Relationship Have Links?
        5. Can a Relationship Have an ID? Must a Relationship Have an ID?
        6. Is There a Difference Between Relationships and Entities? Revisited
        7. Do Relationships Have Names?
        8. Should There Be Different Notations for Different Kinds of Relationships?
        9. Should Relationship Names Be Verbs?
        10. What Does a Relationship Look Like on the Diagram?
      6. Decisions About Links
        1. Do Links Have Names?
        2. Are Link Names on the Diagram?
        3. How Should a Link Label Be Annotated?
        4. Which Links Get Labeled?
        5. What Does Maximum Degree Look Like?
      7. Decisions About Descriptors
        1. Should Entities and Descriptors Share a Namespace?
      8. Decisions About Constraints
        1. Should the Diagram Capture Minimum Degree?
        2. Should the Diagram Capture Intrainstance, Intraattribute Constraints?
        3. Should the Diagram Capture Intrainstance, Interattribute Constraints?
        4. Should the Diagram Capture Intrainstance, Intralink Constraints?
        5. Should the Diagram Capture Intrainstance, Interdescriptor Constraints?
        6. Should the Diagram Capture Triangle Relationships?
        7. Should the Diagram Capture Interinstance, Intradescriptor Constraints?
        8. Should the Diagram Capture Interinstance, Interdescriptor, Intraentity Constraints?
        9. Should the Diagram Capture Interinstance, Interentity Constraints?
      9. Summary and Final Thoughts
      10. Exercises
    31. Chapter 25. LDS and the Relational Model
      1. Relational Databases
      2. Mapping an LDS to a Relational Schema
      3. LDS and Normal Forms
        1. First Normal Form
        2. Second Normal Form
        3. Third Normal Form
        4. Fourth Normal Form
        5. Fifth Normal Form
      4. Summary
      5. Exercises
    32. Chapter 26. Cookbook: Recipes for Data Modelers
      1. Set Recipes
      2. Graph Recipes
      3. Matrix Recipes
      4. Taxonomy and Near Taxonomy Recipes
      5. Exercises
    33. Story Interlude
    34. Appendix: Exercises for Mastery
    35. Index