JUNOS High Availability

Book Description

Whether your network is a complex carrier network or just a few machines supporting a small enterprise, JUNOS High Availability will help you build reliable and resilient networks that include Juniper Networks devices. With this book's advice on software upgrades, scalability, remote network monitoring and management, high-availability protocols such as VRRP, and more, you can push your network uptime to five, six, or even seven nines (99.99999% of the time). Rather than focus on "greenfield" designs, the authors explain how to intelligently modify multi-vendor networks. You'll learn to adapt new devices to existing protocols and platforms, and deploy continuous systems even when reporting scheduled downtime. JUNOS High Availability will help you save time and money.

  • Manage network equipment with Best Common Practices

  • Enhance scalability by adjusting network designs and protocols

  • Combine the IGP and BGP networks of two merging companies

  • Perform network audits

  • Identify JUNOScripting techniques to maintain high availability

  • Secure network equipment against breaches, and contain DoS attacks

  • Automate network configuration through specific strategies and tools

This book is a core part of the Juniper Networks Technical Library™.

Table of Contents

  1. JUNOS High Availability
  2. Preface
    1. What Is High Availability?
    2. How to Use This Book
    3. What’s in This Book?
      1. Part I
      2. Part II
      3. Part III
      4. Part IV
    4. Conventions Used in This Book
    5. Using Code Examples
    6. Safari® Books Online
    7. Comments and Questions
    8. Acknowledgments
  3. I. JUNOS HA Concepts
    1. 1. High Availability Network Design Considerations
      1. Why Mention Cost in a Technical Book?
      2. A Simple Enterprise Network
      3. Redundancy and the Layered Model
        1. Redundant Site Architectures
        2. Redundant Component Architectures
        3. Combined Component and Site-Redundant Architectures
        4. Redundant System Architectures
        5. Combined System- and Site-Redundant Architectures
        6. Combined System- and Component-Redundant Architectures
        7. Combined System-, Component-, and Site-Redundant Architectures
      4. What Does It All Mean?
    2. 2. Hardware High Availability
      1. Divide and Conquer
        1. The Brains: The Routing Engine
          1. RE comparison
            1. M Series
            2. MX Series
            3. T Series
            4. EX Series
            5. SRX Series
            6. J Series
        2. The Brawn: The Packet Forwarding Engine
          1. Hardware components
          2. Model comparison
            1. M Series
            2. MX Series
            3. T Series
            4. EX Series
            5. SRX Series
            6. J Series
      2. Packet Flows
        1. M Series
        2. MX Series
        3. T Series
        4. EX Series
        5. SRX Series
        6. J Series
      3. Redundancy and Resiliency
        1. M Series
        2. MX Series
        3. T Series
        4. J Series
        5. SRX Series
        6. EX Series
    3. 3. Software High Availability
      1. Software Architecture
        1. Stable Foundations
        2. Modular Design
          1. Daemons
      2. One OS to Rule Them
        1. Single OS
          1. Forks and trains
          2. No reeducation through labor
        2. One Release Architecture
      3. Automation of Operations
        1. Configuration Management
        2. Application Programming Interfaces
        3. Scripting
          1. Commit scripts
          2. Operation scripts
          3. Event policy scripts
    4. 4. Control Plane High Availability
      1. Under the Hood of the Routing Engine
        1. Routing Update Process
          1. Step 1: Verify that the RE and PFEs are up
          2. Step 2: Verify that the socket is built
          3. Step 3: Verify that there is a valid TNP communication
          4. Step 4: Verify that BGP adjacencies are established
          5. Step 5: Verify that BGP updates are being received
          6. Step 6: Verify that route updates are processed correctly
          7. Step 7: Verify that the correct next hop is being selected
          8. Step 8: Verify that the correct copy of the route is being selected for kernel update
          9. Step 9: Verify that the correct copy of the route is being sent to the forwarding plane
          10. Step 10: Verify that the correct copy of the route is being installed into the forwarding plane on the PFE complex
      2. Graceful Routing Engine Switchover
        1. Implementation and Configuration
          1. Configuration examples
          2. Troubleshooting GRES
      3. Graceful Restart
        1. Graceful Restart in OSPF
          1. Configuration
          2. Immunizing against topology change
        2. Graceful Restart in IS-IS
          1. Configuration
        3. Graceful Restart in BGP
          1. Restarting the node
          2. Peers
          3. Configuration
      4. MPLS Support for Graceful Restart
        1. Graceful Restart in RSVP
          1. Configuration
        2. Graceful Restart in LDP
          1. Configuration
        3. Graceful Restart in MPLS-Based VPNs
          1. Configuration
        4. Graceful Restart in Multicast Protocols, PIM, and MSDP
      5. Non-Stop Active Routing
        1. Implementation Details and Configs
      6. Non-Stop Bridging
        1. Implementation Details and Configurations
      7. Choosing Your High Availability Control Plane Solution
    5. 5. Virtualization for High Availability
      1. Virtual Chassis in the Switching Control Plane
        1. VC Roles
        2. IDs for VCs
        3. Priorities and the Election Process
          1. How to rig an election
        4. Basic VC Setup and Configuration
        5. Eliminating Loops Within the VC
        6. Highly Available Designs for VCs
          1. Manipulating a split VC
          2. Server resilience with VCs
      2. Control System Chassis
        1. Requirements and Implementation
        2. Consolidation Example and Configuration
        3. Taking Consolidation to the Next Level: Scalable Route Reflection
  4. II. JUNOS HA Techniques
    1. 6. JUNOS Pre-Upgrade Procedures
      1. JUNOS Package Overview
        1. Software Package Naming Conventions
        2. When to Upgrade JUNOS in a High Availability Environment
        3. The Right Target Release for a High Availability Environment
        4. High Availability Upgrade Strategy
          1. Conduct a lab trial
          2. Choose the device to upgrade
          3. Ensure router steady state
            1. Save the working configuration
            2. System-archive a copy of the working configuration
          4. Establish a quarantine period
      2. Pre-Upgrade Verifications
        1. Filesystems and Logs
        2. Checklist
      3. Moving Services Away from a Router
        1. Interface Configuration
        2. Switching Ownership of a VRRP Virtual IP
        3. IGP Traffic Control Tweaks
          1. OSPF and the overload bit
          2. Moving the designated router
          3. The overload bit and IS-IS
          4. Moving the DIS
        4. Label-Switched Paths
          1. RSVP-signaled LSPs
    2. 7. Painless Software Upgrades
      1. Snapshots
      2. Software Upgrades with Unified ISSU
        1. How It Works
        2. Implementation Details
          1. Configuration dependencies
            1. GRES configuration
            2. NSR configuration
      3. Software Upgrades Without Unified ISSU
        1. Loading a JUNOS Image
      4. Snapshots Redux
      5. Image Upgrade Tweaks and Options
      6. J Series Considerations
        1. Cleanup
        2. Backup Images
        3. Rescue Configuration
    3. 8. JUNOS Post-Upgrade Verifications
      1. Post-Upgrade Verification
        1. Device State
          1. Verify chassis hardware
          2. Check for alarms
          3. Verify interfaces
          4. Verify memory
        2. Network State (Routes, Peering Relationships, and Databases)
          1. Verify routing
          2. Routing table consistency
        3. State of Existing Services
        4. Filesystems and Logs
          1. Install logfiles
          2. Messages file
          3. Syslog settings
        5. Removal of Configuration Workarounds
      2. Fallback Procedures
      3. Applicability
    4. 9. Monitoring for High Availability
      1. I Love Logs
        1. Syslog Overview
          1. Facilities
          2. Severity
          3. Header and MSG parts
        2. Syslog Planning
          1. Pitfalls
        3. Implementing Syslog
          1. Sample configuration
          2. Monitoring syslog
      2. Simple Network Management Protocol
        1. SNMP Overview
          1. Notification categories
          2. RMON alarms
          3. Health monitoring
        2. SNMP Planning
        3. Implementing SNMP
          1. SNMPv3
          2. RMON
          3. Health monitoring
          4. Pitfalls
      3. Traffic Monitoring
        1. Traffic Monitoring Overview
        2. Traffic Monitoring Planning
        3. Implementing Traffic Monitoring
          1. Packet sampling
          2. Port mirroring
          3. Counters
        4. Route Monitoring
          1. Route Views
          2. Cyclops
          3. BGPlayer
          4. Pitfalls
    5. 10. Management Interfaces
      1. A GUI for Junior Techs
        1. Using J-Web
        2. J-Web for High Availability
      2. Mid-Level Techs and the CLI
        1. Event Policy Planning
          1. Sample event policy configuration
        2. Event Policies for High Availability
      3. Deep Magic for Advanced Techs
        1. JUNOS APIs
          1. XSLT
          2. SLAX
        2. Automation Scripts
          1. Operation scripts
          2. Event scripts
        3. Working with Scripts
          1. Planning scripts
          2. Loading and calling scripts
          3. Refreshing scripts
    6. 11. Management Tools
      1. JUNOScope
        1. Overview
        2. JUNOScope and High Availability
          1. Looking Glass
          2. Configuration Manager
          3. Inventory Management System
          4. Software Manager
        3. Using JUNOScope
          1. JUNOScope installation
      2. Juniper AIS
        1. Overview
        2. AIS for High Availability
          1. Installation
          2. AIS planning
      3. Partner Tools
        1. Open IP Service Development Platform (OSDP)
        2. Partner Solution Development Platform (PSDP)
    7. 12. Managing Intradomain Routing Table Growth
      1. Address Allocation
        1. Interface Addressing
          1. JUNOS interface addressing syntax
        2. Infrastructure Routes
        3. Customer Routes
          1. Virtual Router Redundancy Protocol
        4. Network Virtualization and Service Overlays
          1. Routing instances
          2. Logical routers
          3. Enable VLAN tagging in the primary logical router
          4. Configuring the service overlay
      2. Address Aggregation
        1. What Is Aggregation?
          1. Practical aggregation for a large domain
          2. Is there a risk?
        2. Use of the Private Address Space
          1. Private addressing and internal services
          2. Private addressing and customer services
          3. Private addressing, NAT, and MIP
        3. Use of Public Address Space
        4. Static Routes
          1. When to configure static routes
        5. Using Protocol Tweaks to Control Routing Table Size
          1. IS-IS areas and levels
          2. OSPF areas
    8. 13. Managing an Interdomain Routing Table
      1. Enterprise Size and Effective Management
        1. Small to Medium-Size Enterprise Perspective
        2. Large Enterprises and Service Providers
        3. AS Number
      2. Border Gateway Protocol (BGP)
        1. EBGP Loop Prevention
        2. IBGP Loop Prevention
          1. IBGP full-mesh requirements
          2. Implications of full mesh for high availability
          3. Alternatives to full mesh
        3. Route Reflection
          1. Route reflection basics
          2. High availability design considerations for route reflection
          3. Turning it on
          4. Route reflectors and policy configuration
            1. Route reflection and next-hop self: What not to do
          5. What is wrong with this picture?
            1. Be terrific; be specific
        4. Confederation
          1. Confederation syntax
          2. Implications of confederation for high availability
          3. Configuration for redundancy
          4. How does multihop affect my routing table?
        5. Common High Availability Routing Policies
          1. Local address filters
          2. Prefix-length enforcement
          3. Default routes: To block or not to block?
          4. Route damping
          5. A “damp” policy
          6. Implications of damping
        6. BGP Tweak: Prefix Limit
          1. Implications of route and prefix limits
  5. III. Network Availability
    1. 14. Fast High Availability Protocols
      1. Protocols for Optical Networks
        1. Ethernet Operations, Administration, and Maintenance (OAM)
          1. IEEE 802.1ah and 802.1ag
        2. SONET/SDH Automatic Protection Switching
      2. Rapid Spanning Tree Protocol
      3. Interior Gateway Protocols
      4. Bidirectional Forwarding Detection
        1. Setting the Interval for BFD Control Packets
      5. Virtual Router Redundancy Protocol
      6. MPLS Path Protection
        1. Fast Reroute
        2. Node and Link Protection
    2. 15. Transitioning Routing and Switching to a Multivendor Environment
      1. Industry Standards
      2. Multivendor Architecture for High Availability
        1. Two Sensible Approaches
          1. Layered approach to multivendor networks
            1. CDA model
            2. PE-CE model
          2. Site-based approach to multivendor networks
        2. Multivendor As a Transition State
          1. Layered transitions
          2. Site-based transitions
      3. Routing Protocol Interoperability
        1. Interface Connectivity
        2. OSPF Adjacencies Between Cisco and Juniper Equipment
          1. OSPF authentication keys
        3. IBGP Peering
        4. EBGP Peering
          1. The BGP next hop issue
          2. The other issue
          3. Success
    3. 16. Transitioning MPLS to a Multivendor Environment
      1. Multivendor Reality Check
        1. Cost Concerns
      2. MPLS Signaling for High Availability
        1. A Simple Multivendor Topology
        2. RSVP Signaling
          1. Traffic engineering
          2. Juniper–Cisco RSVP
          3. Router r5 configuration
        3. LDP Signaling
          1. A few LDP implementation differences
      3. MPLS Transition Case Studies
        1. Case Study 1: Transitioning Provider Devices
          1. Phase 1: P router transition
          2. Phase 2: P router transition
          3. Phase 3: P router transition
          4. Final state: P router transition
        2. Case Study 2: Transitioning Provider Edge Devices
          1. Phase 1: PE router transition
          2. Phase 2: PE router transition
          3. Phase 3: PE router transition
          4. Phase 4: PE router transition
          5. Final state: PE router transition
    4. 17. Monitoring Multivendor Networks
      1. Are You In or Out?
        1. In-Band Management
        2. Out-of-Band Management
          1. OoB and fxp0
          2. Configuration groups for high availability
      2. SNMP Configuration
        1. JUNOS SNMP Configuration
        2. IOS SNMP Configuration
        3. SNMP and MRTG
      3. Syslog Configuration
        1. Syslog in JUNOS
        2. Syslog in IOS
        3. Syslog and Kiwi
      4. Configuration Management
      5. Configuration for AAA
        1. TACACS+
          1. JUNOS authentication
          2. IOS authentication
          3. JUNOS locally defined accounts and authorization
          4. IOS authorization
          5. JUNOS accounting (activity tracking)
          6. IOS accounting (activity tracking)
      6. JUNOS GUI Support
      7. What IS Normal?
    5. 18. Network Scalability
      1. Hardware Capacity
        1. Device Resources to Monitor
          1. Control plane capacity best practices
          2. Data plane specifications
      2. Network Scalability by Design
        1. Scaling BGP for High Availability
          1. Route reflectors and clusters
          2. What’s the point?
        2. MPLS for Network Scalability and High Availability
          1. Basic LSP configuration syntax
          2. Secondary LSPs
          3. Hot standby
            1. Fast reroute
            2. Link and node-link protection
        3. Traffic Engineering Case Study
    6. 19. Choosing, Migrating, and Merging Interior Gateway Protocols
      1. Choosing Between IS-IS and OSPF
        1. OSPF
          1. Advantages
          2. Disadvantages
          3. High availability features for OSPF in JUNOS Software
            1. Link and node failure detection
            2. Authenticating packets
            3. Designated routers
            4. Graceful Restart
            5. Non-Stop Active Routing
            6. Overload
            7. Prefix limits
            8. Bidirectional Forwarding Detection
        2. IS-IS
          1. Advantages
          2. Disadvantages
          3. High availability features for IS-IS in JUNOS Software
            1. Link and node failure detection
            2. Authenticating packets
            3. Graceful Restart
            4. Non-Stop Active Routing
            5. Overload
            6. Prefix limits
            7. Bidirectional Forwarding Detection
        3. Which Protocol Is “Better”?
          1. A final thought
      2. Migrating from One IGP to Another
        1. Migrating from OSPF to IS-IS
          1. Step 1: Plan for the migration
          2. Step 2: Add IS-IS to the network
          3. Step 3: Make IS-IS the “preferred” IGP
          4. Step 4: Verify the success of the migration
          5. Step 5: Remove OSPF from the network
        2. Migrating from IS-IS to OSPF
          1. Step 1: Plan for the migration
          2. Step 2: Add OSPF to the network
          3. Step 3: Make OSPF the “preferred” IGP
          4. Step 4: Verify the success of the migration
          5. Step 5: Remove IS-IS from the network
      3. Merging Networks Using a Common IGP
        1. Considerations
          1. Area design
          2. Matching configuration parameters
          3. Tunneling
        2. Other Options for Merging IGPs
          1. BGP
          2. Routing instances
    7. 20. Merging BGP Autonomous Systems
      1. Planning the Merge
        1. Architecture
          1. Making the choice
          2. Pitfalls
          3. External peering
          4. Route reflector 1
          5. Route reflector 2
          6. Oscillation commences
        2. Outcomes
        3. BGP Migration Features in JUNOS
          1. Graceful Restart
          2. Non-Stop Active Routing
          3. Full mesh made easy (well, easier)
          4. Zen and the art of AS numbers
          5. Sometimes loopy is OK
      2. Merging Our ASs Off
        1. Merge with Full Mesh
          1. IBGP
          2. Bring in the EBGP peer
        2. Merge with Route Reflectors
          1. Cluster 1
          2. Cluster 2
        3. Merge with Confederations
      3. Monitoring the Merge
        1. Neighbor Peering
          1. Persistent route oscillation
    8. 21. Making Configuration Audits Painless
      1. Why Audit Configurations?
        1. Knowledge Is Power
        2. JUNOS: Configuration Auditing Made Easy
      2. Configuration Auditing 101
        1. Organizing the Audit
          1. Configuration modules
          2. Functional network areas
          3. Organization involvement
      3. Auditing Configurations
        1. Baseline Configurations
          1. Saving a baseline
          2. Baseline configuration with JUNOS groups
          3. Baseline configuration with commit scripts
        2. Manually Auditing Configurations
          1. Manual auditing through the GUI
          2. Manual auditing through the CLI
        3. Automating Configuration Audits
          1. Event policies
          2. JUNOScope
          3. Advanced Insight Solution
      4. Performing and Updating Audits
        1. Auditing Intervals
        2. Analyzing Updates
        3. Auditing Changes
    9. 22. Securing Your Network Equipment Against Security Breaches
      1. Authentication Methods
        1. Local Password Authentication
        2. RADIUS and TACACS+ Authentication
        3. Authentication Order
      2. Hardening the Device
        1. Use a Strong Password, and Encrypt It
        2. Disable Unused Access Methods
        3. Control Physical Access to the Device
        4. Control Network Access to the Device
        5. Control and Authenticate Protocol Traffic
        6. Define Access Policies
      3. Firewall Filters
        1. Firewall Filter Syntax
          1. Match conditions
          2. Actions
          3. Evaluating filters
          4. Implicit discard
        2. Applying Firewall Filters
        3. Using Firewall Filters to Protect the Network
          1. Spoof prevention
          2. Securing a web/FTP server
          3. The options are endless
        4. Using Firewall Filters to Protect the Routing Engine
        5. Stateful Firewalls
    10. 23. Monitoring and Containing DoS Attacks in Your Network
      1. Attack Detection
        1. Using Filtering to Detect Ping Attacks
        2. Using Filtering to Detect TCP SYN Attacks
      2. Taking Action When a DoS Attack Occurs
        1. Using Filtering to Block DoS Attacks
          1. Filter some, filter all
        2. Request Help from Your Upstream Provider
      3. Attack Prevention
        1. Eliminate Unused Services
        2. Enable Reverse Path Forwarding
        3. Use Firewall Filters
        4. Use Rate Limiting
        5. Deploy Products Specifically to Address DoS Attacks
      4. Gathering Evidence
        1. Firewall Logs and Counters
        2. Port Mirroring
        3. Sampling
        4. cflowd
    11. 24. Goals of Configuration Automation
      1. CLI Configuration Automation
        1. Hierarchical Configuration
        2. Protections for Manual Configuration
          1. User access
          2. Exclusive configuration
          3. Private configuration
        3. Transaction-Based Provisioning
          1. Standard commits
          2. Commit with scripts
            1. Persistent changes
            2. Transient changes
          3. Script processing
        4. Archives and Rollback
          1. Configuration stores
      2. Automating Remote Configuration
    12. 25. Automated Configuration Strategies
      1. Configuration Change Types
        1. Deployment
          1. Network equipment
          2. Services
        2. Infrastructure
        3. Ad Hoc Changes
          1. Workarounds
          2. One-off configurations
      2. Automation Strategies
        1. Global Strategies
        2. Deployment
          1. Hardware deployment
            1. Interfaces
            2. Routing engines
          2. Service deployment
        3. Infrastructure
          1. Interfaces
          2. Routing
        4. Ad Hoc Changes
          1. Workarounds
            1. JUNOS issues
            2. External device issues
          2. One-off workarounds
  6. IV. Appendixes
    1. A. System Test Plan
      1. Physical Inspection and Power On
      2. Check General System Status
        1. Check for Any Active Alarms
        2. Save the System Hardware Configuration for Future Reference
        3. Check Voltages and Temperatures
        4. Check the Status of the Individual Components
      3. Check Routing Engine and Storage Media
        1. Check Routing Engine Status
        2. Check Storage Media on Each Routing Engine
      4. Test Optical Interfaces
        1. Configure a Private IP Address and Run Ping Tests
          1. Run a loopback test on SONET/SDH interfaces
          2. Run a loopback test on Fast Ethernet and Gigabit Ethernet interfaces
      5. Failover and Redundancy Tests
        1. Routing Engine Redundancy
        2. SFM Redundancy (M40e Platform Only)
      6. Final Burn-In Check
        1. Power Down the Router
        2. Power On the Router/Burn-In Test
        3. Final Checks and Power Down
    2. B. Configuration Audit
      1. Audit Responsibilities
      2. Audit Response Key
      3. Audit Checklist
      4. Audit Interval
    3. C. High Availability Configuration Statements
      1. Routing Engine and Switching Control Board
        1. cfeb
        2. description
        3. failover on-disk-failure
        4. failover on-loss-of-keepalives
        5. failover other-routing-engine
        6. feb (Creating a Redundancy Group)
        7. feb (Assigning a FEB to a Redundancy Group)
        8. keepalive-time
        9. no-auto-failover
        10. redundancy
        11. redundancy-group
        12. routing-engine
        13. sfm
        14. ssb
      2. Graceful Routing Engine Switchover
        1. graceful-switchover
      3. Nonstop Bridging Statements
        1. nonstop-bridging
      4. Nonstop Active Routing
        1. commit synchronize
        2. nonstop-routing
        3. traceoptions
      5. Graceful Restart
        1. disable
        2. graceful-restart
        3. helper-disable
        4. maximum-helper-recovery-time
        5. maximum-helper-restart-time
        6. maximum-neighbor-reconnect-time
        7. maximum-neighbor-recovery-time
        8. no-strict-lsa-checking
        9. notify-duration
        10. reconnect-time
        11. recovery-time
        12. restart-duration
        13. restart-time
        14. stale-routes-time
        15. traceoptions
      6. VRRP
        1. accept-data
        2. advertise-interval
        3. authentication-key
        4. authentication-type
        5. bandwidth-threshold
        6. fast-interval
        7. hold-time
        8. inet6-advertise-interval
        9. interface
        10. preempt
        11. priority
        12. priority-cost
        13. priority-hold-time
        14. route
        15. startup-silent-period
        16. traceoptions
        17. track
        18. virtual-address
        19. virtual-inet6-address
        20. virtual-link-local-address
        21. vrrp-group
        22. vrrp-inet6-group
      7. Unified In-Service Software Upgrade (ISSU)
        1. no-issu-timer-negotiation
        2. traceoptions
  7. Index
  8. About the Authors
  9. Colophon
  10. Copyright