Designing Voice User Interfaces

Book Description

Voice user interfaces (VUIs) are becoming all the rage. But how do you build one that people can actually converse with? Whether you’re designing a mobile app, a toy, or a device such as a home assistant, this practical book guides you through basic VUI design principles, helps you choose the right speech recognition engine, and shows you how to measure and improve your VUI’s performance.

Author Cathy Pearl also takes product managers, UX designers, and VUI designers into advanced design topics that will help make your VUI not just functional, but great.

Table of Contents

  1. Dedication
  2. Special Upgrade Offer
  3. Praise for Designing Voice User Interfaces
  4. Preface
    1. Why Write This Book?
    2. The Chinese Room and the Turing Test
    3. Who Should Read This Book
    4. How This Book Is Organized
    5. O’Reilly Safari
    6. How to Contact Us
    7. Acknowledgments
  5. 1. Introduction
    1. A Brief History of VUIs
      1. The Second Era of VUIs
      2. Why Voice User Interfaces?
    2. Conversational User Interfaces
      1. An Interview with Alexa
    3. What Is a VUI Designer?
    4. Chatbots
    5. Conclusion
  6. 2. Basic Voice User Interface Design Principles
    1. Designing for Mobile Devices Versus IVR Systems
    2. Conversational Design
    3. Setting User Expectations
    4. Design Tools
      1. Sample Dialogs
      2. Visual Mock-Ups
      3. Flow
      4. Prototyping Tools
    5. Confirmations
      1. Method 1: Three-Tiered Confidence
      2. Method 2: Implicit Confirmation
      3. Method 3: Nonspeech Confirmation
      4. Method 4: Generic Confirmation
      5. Method 5: Visual Confirmation
    6. Command-and-Control Versus Conversational
      1. Command-and-Control
      2. Conversational
    7. Conversational Markers
    8. Error Handling
      1. No Speech Detected
      2. Speech Detected but Nothing Recognized
      3. Recognized but Not Handled
      4. Recognized but Incorrectly
      5. Escalating Error
    9. Don’t Blame the User
    10. Novice and Expert Users
    11. Keeping Track of Context
    12. Help and Other Universals
    13. Latency
    14. Disambiguation
    15. Design Documentation
      1. Prompts
      2. Grammars/Key Phrases
    16. Accessibility
      1. Interaction Should Be Time-Efficient
      2. Keep It Short
      3. Talk Faster!
      4. Interrupt Me at Any Time
      5. Provide Context
      6. Where Am I?
      7. Text-to-Speech Personalization
    17. Conclusion
  7. 3. Personas, Avatars, Actors, and Video Games
    1. Personas
    2. Should My VUI Be Seen?
    3. Using an Avatar: What Not to Do
    4. Using an Avatar (or Recorded Video): What to Do
      1. Storytelling
      2. Teamwork
      3. Video Games
    5. When Should I Use Video in My VUI?
    6. Visual VUI—Best Practices
      1. Should My Users See Themselves?
      2. What About the GUI?
      3. Handling Errors
      4. Turn Taking and Barge-In
      5. Maintaining Engagement and the Illusion of Awareness
    7. Visual (Non-Avatar) Feedback
    8. Choosing a Voice
    9. Pros of an Avatar
    10. The Downsides of an Avatar
      1. The Uncanny Valley
    11. Conclusion
  8. 4. Speech Recognition Technology
    1. Choosing an Engine
    2. Barge-In
      1. Timeouts
        1. End-of-speech timeout
        2. No speech timeout
        3. Too much speech
    3. N-Best Lists
    4. The Challenges of Speech Recognition
      1. Noise
      2. Multiple Speakers
      3. Children
      4. Names, Spelling, and Alphanumeric
    5. Data Privacy
    6. Conclusion
  9. 5. Advanced Voice User Interface Design
    1. Branching Based on Voice Input
      1. Constrained Responses
      2. Open Speech
      3. Categorization of Input
      4. Wildcards and Logical Expressions
    2. Disambiguation
      1. Not Enough Information
      2. More Than One Piece of Information When Only One Is Expected
    3. Handling Negation
    4. Capturing Intent and Objects
    5. Dialog Management
    6. Don’t Leave Your User Hanging
    7. Should the VUI Display What It Recognized?
    8. Sentiment Analysis and Emotion Detection
    9. Text-to-Speech Versus Recorded Speech
    10. Speaker Verification
    11. “Wake” Words
    12. Context
    13. Advanced Multimodal
    14. Bootstrapping Datasets
      1. Website data
      2. Call center data
      3. Data collection
    15. Advanced NLU
    16. Conclusion
  10. 6. User Testing for Voice User Interfaces
    1. Special VUI Considerations
    2. Background Research on Users and Use Cases
      1. Don’t Reinvent the Wheel
    3. Designing a Study with Real Users
      1. Task Definition
      2. Choosing Participants
      3. Questions to Ask
        1. Open responses (to be asked verbally)
      4. Things to Look For
    4. Early-Stage Testing
      1. Sample Dialogs
      2. Mock-ups
      3. Wizard of Oz Testing
      4. Difference Between WOz and Usability Testing
    5. Usability Testing
      1. Remote Testing
        1. Moderated versus unmoderated
        2. Video recording
        3. Services for remote testing
      2. Lab Testing
      3. Guerrilla Testing
    6. Performance Measures
    7. Next Steps
    8. Testing VUIs in Cars, Devices, and Robots
      1. Cars
      2. Devices and Robots
    9. Conclusion
  11. 7. Your Voice User Interface Is Finished! Now What?
    1. Prerelease Testing
      1. Dialog Traversal Testing
      2. Recognition Testing
      3. Load Testing
    2. Measuring Performance
      1. Task Completion Rates
      2. Dropout Rate
      3. Other Items to Track
        1. Amount of time in the VUI
        2. Barge-in
        3. Speech versus GUI
        4. High no-speech timeouts, no matches
        5. Navigation
        6. Latency
        7. Whole call recording
    3. Logging
    4. Transcription
    5. Release Phases
      1. Pilot
    6. Surveys
    7. Analysis
      1. Confidence Thresholds
      2. End-of-Speech Timeouts
      3. Interim Results versus Final Results
      4. Custom Dictionaries
      5. Prompts
    8. Tools
      1. Regression Testing
    9. Conclusion
  12. 8. Voice-Enabled Devices and Cars
    1. Devices
      1. Home Assistants
      2. Watches/Bands/Earbuds
      3. Other Devices
    2. Cars and Autonomous Vehicles
      1. Challenges of Designing VUI for the Car
      2. Designing for in the Car
      3. Distracted Driving
      4. Device Shifting
      5. Interaction Mode
      6. Conclusions on Cars
    3. Conclusion
  13. A. Epilogue
  14. B. Products Mentioned in This Book
    1. Mobile Phone Assistants
    2. Home Assistants
    3. Toys/Other
    4. Apps
    5. Video Games
    6. Watches / Bands
    7. Cars
  15. C. About the Author
  16. Index
  17. About the Author
  18. Colophon
  19. Special Upgrade Offer
  20. Copyright