Reversing: Secrets of Reverse Engineering

Book Description

Beginning with a basic primer on reverse engineering (including computer internals, operating systems, and assembly language) and then discussing the various applications of reverse engineering, this book provides readers with practical, in-depth techniques for software reverse engineering. The book is broken into two parts: the first deals with security-related reverse engineering, and the second explores the more practical aspects of reverse engineering. In addition, the author explains how to reverse engineer a third-party software library to improve interfacing and how to reverse engineer a competitor's software to build a better product.

* The first popular book to show how software reverse engineering can help defend against security threats, speed up development, and unlock the secrets of competitive products

* Helps developers plug security holes by demonstrating how hackers exploit reverse engineering techniques to crack copy-protection schemes and identify software targets for viruses and other malware

* Offers a primer on advanced reverse engineering, delving into "disassembly" (code-level reverse engineering) and explaining how to decipher assembly language

Table of Contents

  1. Copyright
  2. Credits
  3. Foreword
  4. Acknowledgments
  5. Introduction
    1. Reverse Engineering and Low-Level Software
    2. How This Book Is Organized
    3. Who Should Read This Book
    4. Tools and Platforms
    5. What's on the Web Site
    6. Where to Go from Here?
  6. I. Reversing 101
    1. 1. Foundations
      1. 1.1. What Is Reverse Engineering?
      2. 1.2. Software Reverse Engineering: Reversing
      3. 1.3. Reversing Applications
        1. 1.3.1. Security-Related Reversing
          1. 1.3.1.1. Malicious Software
          2. 1.3.1.2. Reversing Cryptographic Algorithms
          3. 1.3.1.3. Digital Rights Management
          4. 1.3.1.4. Auditing Program Binaries
        2. 1.3.2. Reversing in Software Development
          1. 1.3.2.1. Achieving Interoperability with Proprietary Software
          2. 1.3.2.2. Developing Competing Software
          3. 1.3.2.3. Evaluating Software Quality and Robustness
      4. 1.4. Low-Level Software
        1. 1.4.1. Assembly Language
        2. 1.4.2. Compilers
        3. 1.4.3. Virtual Machines and Bytecodes
        4. 1.4.4. Operating Systems
      5. 1.5. The Reversing Process
        1. 1.5.1. System-Level Reversing
        2. 1.5.2. Code-Level Reversing
      6. 1.6. The Tools
        1. 1.6.1. System-Monitoring Tools
        2. 1.6.2. Disassemblers
        3. 1.6.3. Debuggers
        4. 1.6.4. Decompilers
      7. 1.7. Is Reversing Legal?
        1. 1.7.1. Interoperability
        2. 1.7.2. Competition
        3. 1.7.3. Copyright Law
        4. 1.7.4. Trade Secrets and Patents
        5. 1.7.5. The Digital Millennium Copyright Act
        6. 1.7.6. DMCA Cases
        7. 1.7.7. License Agreement Considerations
      8. 1.8. Code Samples & Tools
      9. 1.9. Conclusion
    2. 2. Low-Level Software
      1. 2.1. High-Level Perspectives
        1. 2.1.1. Program Structure
          1. 2.1.1.1. Modules
          2. 2.1.1.2. Common Code Constructs
        2. 2.1.2. Data Management
          1. 2.1.2.1. Variables
          2. 2.1.2.2. User-Defined Data Structures
          3. 2.1.2.3. Lists
        3. 2.1.3. Control Flow
        4. 2.1.4. High-Level Languages
          1. 2.1.4.1. C
          2. 2.1.4.2. C++
          3. 2.1.4.3. Java
          4. 2.1.4.4. C#
      2. 2.2. Low-Level Perspectives
        1. 2.2.1. Low-Level Data Management
          1. 2.2.1.1. Registers
          2. 2.2.1.2. The Stack
          3. 2.2.1.3. Heaps
          4. 2.2.1.4. Executable Data Sections
        2. 2.2.2. Control Flow
      3. 2.3. Assembly Language 101
        1. 2.3.1. Registers
        2. 2.3.2. Flags
        3. 2.3.3. Instruction Format
        4. 2.3.4. Basic Instructions
          1. 2.3.4.1. Moving Data
          2. 2.3.4.2. Arithmetic
          3. 2.3.4.3. Comparing Operands
          4. 2.3.4.4. Conditional Branches
          5. 2.3.4.5. Function Calls
        5. 2.3.5. Examples
      4. 2.4. A Primer on Compilers and Compilation
        1. 2.4.1. Defining a Compiler
        2. 2.4.2. Compiler Architecture
          1. 2.4.2.1. Front End
          2. 2.4.2.2. Intermediate Representations
          3. 2.4.2.3. Optimizer
            1. 2.4.2.3.1. Code Structure
            2. 2.4.2.3.2. Redundancy Elimination
          4. 2.4.2.4. Back End
        3. 2.4.3. Listing Files
        4. 2.4.4. Specific Compilers
      5. 2.5. Execution Environments
        1. 2.5.1. Software Execution Environments (Virtual Machines)
          1. 2.5.1.1. Bytecodes
          2. 2.5.1.2. Interpreters
          3. 2.5.1.3. Just-in-Time Compilers
          4. 2.5.1.4. Reversing Strategies
        2. 2.5.2. Hardware Execution Environments in Modern Processors
          1. 2.5.2.1. Intel NetBurst
          2. 2.5.2.2. μops (Micro-Ops)
          3. 2.5.2.3. Pipelines
          4. 2.5.2.4. Branch Prediction
      6. 2.6. Conclusion
    3. 3. Windows Fundamentals
      1. 3.1. Components and Basic Architecture
        1. 3.1.1. Brief History
        2. 3.1.2. Features
        3. 3.1.3. Supported Hardware
      2. 3.2. Memory Management
        1. 3.2.1. Virtual Memory and Paging
          1. 3.2.1.1. Paging
          2. 3.2.1.2. Page Faults
        2. 3.2.2. Working Sets
        3. 3.2.3. Kernel Memory and User Memory
        4. 3.2.4. The Kernel Memory Space
        5. 3.2.5. Section Objects
        6. 3.2.6. VAD Trees
        7. 3.2.7. User-Mode Allocations
        8. 3.2.8. Memory Management APIs
      3. 3.3. Objects and Handles
        1. 3.3.1. Named Objects
      4. 3.4. Processes and Threads
        1. 3.4.1. Processes
        2. 3.4.2. Threads
        3. 3.4.3. Context Switching
        4. 3.4.4. Synchronization Objects
        5. 3.4.5. Process Initialization Sequence
      5. 3.5. Application Programming Interfaces
        1. 3.5.1. The Win32 API
        2. 3.5.2. The Native API
        3. 3.5.3. System Calling Mechanism
      6. 3.6. Executable Formats
        1. 3.6.1. Basic Concepts
        2. 3.6.2. Image Sections
        3. 3.6.3. Section Alignment
        4. 3.6.4. Dynamically Linked Libraries
        5. 3.6.5. Headers
        6. 3.6.6. Imports and Exports
        7. 3.6.7. Directories
      7. 3.7. Input and Output
        1. 3.7.1. The I/O System
        2. 3.7.2. The Win32 Subsystem
          1. 3.7.2.1. Object Management
      8. 3.8. Structured Exception Handling
      9. 3.9. Conclusion
    4. 4. Reversing Tools
      1. 4.1. Different Reversing Approaches
        1. 4.1.1. Offline Code Analysis (Dead-Listing)
        2. 4.1.2. Live Code Analysis
        3. 4.1.3. Disassemblers
          1. 4.1.3.1. IDA Pro
        4. 4.1.4. ILDasm
      2. 4.2. Debuggers
        1. 4.2.1. User-Mode Debuggers
          1. 4.2.1.1. OllyDbg
          2. 4.2.1.2. User Debugging in WinDbg
          3. 4.2.1.3. IDA Pro
          4. 4.2.1.4. PEBrowse Professional Interactive
        2. 4.2.2. Kernel-Mode Debuggers
          1. 4.2.2.1. Kernel Debugging in WinDbg
          2. 4.2.2.2. Numega SoftICE
          3. 4.2.2.3. Kernel Debugging on Virtual Machines
      3. 4.3. Decompilers
      4. 4.4. System-Monitoring Tools
      5. 4.5. Patching Tools
        1. 4.5.1. Hex Workshop
      6. 4.6. Miscellaneous Reversing Tools
        1. 4.6.1. Executable-Dumping Tools
          1. 4.6.1.1. DUMPBIN
          2. 4.6.1.2. PEView
          3. 4.6.1.3. PEBrowse Professional
      7. 4.7. Conclusion
  7. II. Applied Reversing
    1. 5. Beyond the Documentation
      1. 5.1. Reversing and Interoperability
      2. 5.2. Laying the Ground Rules
      3. 5.3. Locating Undocumented APIs
        1. 5.3.1. What Are We Looking For?
      4. 5.4. Case Study: The Generic Table API in NTDLL.DLL
        1. 5.4.1. RtlInitializeGenericTable
        2. 5.4.2. RtlNumberGenericTableElements
        3. 5.4.3. RtlIsGenericTableEmpty
        4. 5.4.4. RtlGetElementGenericTable
          1. 5.4.4.1. Setup and Initialization
          2. 5.4.4.2. Logic and Structure
          3. 5.4.4.3. Search Loop 1
          4. 5.4.4.4. Search Loop 2
          5. 5.4.4.5. Search Loop 3
          6. 5.4.4.6. Search Loop 4
          7. 5.4.4.7. Reconstructing the Source Code
        5. 5.4.5. RtlInsertElementGenericTable
          1. 5.4.5.1. RtlLocateNodeGenericTable
            1. 5.4.5.1.1. The Callback
            2. 5.4.5.1.2. High-Level Theories
            3. 5.4.5.1.3. Callback Parameters
            4. 5.4.5.1.4. Summarizing the Findings
          2. 5.4.5.2. RtlRealInsertElementWorker
            1. 5.4.5.2.1. Linking the Element
            2. 5.4.5.2.2. Copying the Element
            3. 5.4.5.2.3. Splaying the Table
          3. 5.4.5.3. Splay Trees
        6. 5.4.6. RtlLookupElementGenericTable
        7. 5.4.7. RtlDeleteElementGenericTable
        8. 5.4.8. Putting the Pieces Together
      5. 5.5. Conclusion
    2. 6. Deciphering File Formats
      1. 6.1. Cryptex
      2. 6.2. Using Cryptex
      3. 6.3. Reversing Cryptex
      4. 6.4. The Password Verification Process
        1. 6.4.1. Catching the "Bad Password" Message
        2. 6.4.2. The Password Transformation Algorithm
        3. 6.4.3. Hashing the Password
      5. 6.5. The Directory Layout
        1. 6.5.1. Analyzing the Directory Processing Code
        2. 6.5.2. Analyzing a File Entry
      6. 6.6. Dumping the Directory Layout
      7. 6.7. The File Extraction Process
      8. 6.8. Scanning the File List
        1. 6.8.1. Decrypting the File
        2. 6.8.2. The Floating-Point Sequence
        3. 6.8.3. The Decryption Loop
        4. 6.8.4. Verifying the Hash Value
      9. 6.9. The Big Picture
      10. 6.10. Digging Deeper
      11. 6.11. Conclusion
    3. 7. Auditing Program Binaries
      1. 7.1. Defining the Problem
      2. 7.2. Vulnerabilities
        1. 7.2.1. Stack Overflows
          1. 7.2.1.1. A Simple Stack Vulnerability
          2. 7.2.1.2. Intrinsic Implementations
          3. 7.2.1.3. Stack Checking
          4. 7.2.1.4. Nonexecutable Memory
        2. 7.2.2. Heap Overflows
        3. 7.2.3. String Filters
        4. 7.2.4. Integer Overflows
          1. 7.2.4.1. Arithmetic Operations on User-Supplied Integers
        5. 7.2.5. Type Conversion Errors
      3. 7.3. Case-Study: The IIS Indexing Service Vulnerability
        1. 7.3.1. CVariableSet::AddExtensionControlBlock
        2. 7.3.2. DecodeURLEscapes
      4. 7.4. Conclusion
    4. 8. Reversing Malware
      1. 8.1. Types of Malware
        1. 8.1.1. Viruses
        2. 8.1.2. Worms
        3. 8.1.3. Trojan Horses
        4. 8.1.4. Backdoors
        5. 8.1.5. Mobile Code
        6. 8.1.6. Adware/Spyware
      2. 8.2. Sticky Software
      3. 8.3. Future Malware
        1. 8.3.1. Information-Stealing Worms
        2. 8.3.2. BIOS/Firmware Malware
      4. 8.4. Uses of Malware
      5. 8.5. Malware Vulnerability
      6. 8.6. Polymorphism
      7. 8.7. Metamorphism
      8. 8.8. Establishing a Secure Environment
      9. 8.9. The HackArmy Backdoor
        1. 8.9.1. Unpacking the Executable
        2. 8.9.2. Initial Impressions
        3. 8.9.3. The Initial Installation
        4. 8.9.4. Initializing Communications
        5. 8.9.5. Connecting to the Server
        6. 8.9.6. Joining the Channel
        7. 8.9.7. Communicating with the Backdoor
        8. 8.9.8. Running SOCKS4 Servers
        9. 8.9.9. Clearing the Crime Scene
      10. 8.10. The HackArmy Backdoor: A Command Reference
      11. 8.11. Conclusion
  8. III. Cracking
    1. 9. Piracy and Copy Protection
      1. 9.1. Copyrights in the New World
      2. 9.2. The Social Aspect
      3. 9.3. Software Piracy
        1. 9.3.1. Defining the Problem
        2. 9.3.2. Class Breaks
        3. 9.3.3. Requirements
        4. 9.3.4. The Theoretically Uncrackable Model
      4. 9.4. Types of Protection
        1. 9.4.1. Media-Based Protections
        2. 9.4.2. Serial Numbers
        3. 9.4.3. Challenge Response and Online Activations
        4. 9.4.4. Hardware-Based Protections
        5. 9.4.5. Software as a Service
      5. 9.5. Advanced Protection Concepts
        1. 9.5.1. Crypto-Processors
      6. 9.6. Digital Rights Management
        1. 9.6.1. DRM Models
          1. 9.6.1.1. The Windows Media Rights Manager
          2. 9.6.1.2. Secure Audio Path
      7. 9.7. Watermarking
      8. 9.8. Trusted Computing
      9. 9.9. Attacking Copy Protection Technologies
      10. 9.10. Conclusion
    2. 10. Antireversing Techniques
      1. 10.1. Why Antireversing?
      2. 10.2. Basic Approaches to Antireversing
      3. 10.3. Eliminating Symbolic Information
      4. 10.4. Code Encryption
      5. 10.5. Active Antidebugger Techniques
        1. 10.5.1. Debugger Basics
        2. 10.5.2. The IsDebuggerPresent API
        3. 10.5.3. SystemKernelDebuggerInformation
        4. 10.5.4. Detecting SoftICE Using the Single-Step Interrupt
        5. 10.5.5. The Trap Flag
        6. 10.5.6. Code Checksums
      6. 10.6. Confusing Disassemblers
        1. 10.6.1. Linear Sweep Disassemblers
        2. 10.6.2. Recursive Traversal Disassemblers
        3. 10.6.3. Applications
      7. 10.7. Code Obfuscation
      8. 10.8. Control Flow Transformations
        1. 10.8.1. Opaque Predicates
        2. 10.8.2. Confusing Decompilers
        3. 10.8.3. Table Interpretation
        4. 10.8.4. Inlining and Outlining
        5. 10.8.5. Interleaving Code
        6. 10.8.6. Ordering Transformations
      9. 10.9. Data Transformations
        1. 10.9.1. Modifying Variable Encoding
        2. 10.9.2. Restructuring Arrays
      10. 10.10. Conclusion
    3. 11. Breaking Protections
      1. 11.1. Patching
      2. 11.2. Keygenning
      3. 11.3. Ripping Key-Generation Algorithms
      4. 11.4. Advanced Cracking: Defender
        1. 11.4.1. Reversing Defender's Initialization Routine
        2. 11.4.2. Analyzing the Decrypted Code
        3. 11.4.3. SoftICE's Disappearance
        4. 11.4.4. Reversing the Secondary Thread
        5. 11.4.5. Defeating the "Killer" Thread
        6. 11.4.6. Loading KERNEL32.DLL
        7. 11.4.7. Reencrypting the Function
        8. 11.4.8. Back at the Entry Point
        9. 11.4.9. Parsing the Program Parameters
        10. 11.4.10. Processing the Username
        11. 11.4.11. Validating User Information
        12. 11.4.12. Unlocking the Code
        13. 11.4.13. Brute-Forcing Your Way through Defender
      5. 11.5. Protection Technologies in Defender
        1. 11.5.1. Localized Function-Level Encryption
          1. 11.5.1.1. Relatively Strong Cipher Block Chaining
          2. 11.5.1.2. Reencrypting
        2. 11.5.2. Obfuscated Application/Operating System Interface
        3. 11.5.3. Processor Time-Stamp Verification Thread
        4. 11.5.4. Runtime Generation of Decryption Keys
          1. 11.5.4.1. Interdependent Keys
          2. 11.5.4.2. User-Input-Based Decryption Keys
        5. 11.5.5. Heavy Inlining
      6. 11.6. Conclusion
  9. IV. Beyond Disassembly
    1. 12. Reversing .NET
      1. 12.1. Ground Rules
      2. 12.2. .NET Basics
        1. 12.2.1. Managed Code
        2. 12.2.2. .NET Programming Languages
        3. 12.2.3. Common Type System (CTS)
      3. 12.3. Intermediate Language (IL)
        1. 12.3.1. The Evaluation Stack
        2. 12.3.2. Activation Records
        3. 12.3.3. IL Instructions
        4. 12.3.4. IL Code Samples
          1. 12.3.4.1. Counting Items
          2. 12.3.4.2. A Linked List Sample
            1. 12.3.4.2.1. The ListItem Class
            2. 12.3.4.2.2. The LinkedList Class
            3. 12.3.4.2.3. The StringItem Class
      4. 12.4. Decompilers
      5. 12.5. Obfuscators
        1. 12.5.1. Renaming Symbols
        2. 12.5.2. Control Flow Obfuscation
        3. 12.5.3. Breaking Decompilation and Disassembly
      6. 12.6. Reversing Obfuscated Code
        1. 12.6.1. XenoCode Obfuscator
        2. 12.6.2. DotFuscator by Preemptive Solutions
        3. 12.6.3. Remotesoft Obfuscator and Linker
        4. 12.6.4. Remotesoft Protector
        5. 12.6.5. Precompiled Assemblies
        6. 12.6.6. Encrypted Assemblies
      7. 12.7. Conclusion
    2. 13. Decompilation
      1. 13.1. Native Code Decompilation: An Unsolvable Problem?
      2. 13.2. Typical Decompiler Architecture
      3. 13.3. Intermediate Representations
        1. 13.3.1. Expressions and Expression Trees
        2. 13.3.2. Control Flow Graphs
      4. 13.4. The Front End
        1. 13.4.1. Semantic Analysis
        2. 13.4.2. Generating Control Flow Graphs
      5. 13.5. Code Analysis
        1. 13.5.1. Data-Flow Analysis
          1. 13.5.1.1. Single Static Assignment (SSA)
          2. 13.5.1.2. Data Propagation
          3. 13.5.1.3. Register Variable Identification
          4. 13.5.1.4. Data Type Propagation
        2. 13.5.2. Type Analysis
          1. 13.5.2.1. Primitive Data Types
          2. 13.5.2.2. Complex Data Types
        3. 13.5.3. Control Flow Analysis
        4. 13.5.4. Finding Library Functions
      6. 13.6. The Back End
      7. 13.7. Real-World IA-32 Decompilation
      8. 13.8. Conclusion
  10. A. Deciphering Code Structures
    1. A.1. Understanding Low-Level Logic
      1. A.1.1. Comparing Operands
        1. A.1.1.1. Signed Comparisons
        2. A.1.1.2. Unsigned Comparisons
      2. A.1.2. The Conditional Codes
        1. A.1.2.1. Signed Conditional Codes
        2. A.1.2.2. Unsigned Conditional Codes
    2. A.2. Control Flow & Program Layout
      1. A.2.1. Deciphering Functions
        1. A.2.1.1. Internal Functions
        2. A.2.1.2. Imported Functions
      2. A.2.2. Single-Branch Conditionals
      3. A.2.3. Two-Way Conditionals
      4. A.2.4. Multiple-Alternative Conditionals
      5. A.2.5. Compound Conditionals
        1. A.2.5.1. Logical Operators
        2. A.2.5.2. Simple Combinations
        3. A.2.5.3. Complex Combinations
      6. A.2.6. n-way Conditional (Switch Blocks)
        1. A.2.6.1. Table Implementation
        2. A.2.6.2. Tree Implementation
      7. A.2.7. Loops
        1. A.2.7.1. Pretested Loops
        2. A.2.7.2. Posttested Loops
        3. A.2.7.3. Loop Break Conditions
        4. A.2.7.4. Loop Skip-Cycle Statements
        5. A.2.7.5. Loop Unrolling
      8. A.2.8. Branchless Logic
        1. A.2.8.1. Pure Arithmetic Implementations
        2. A.2.8.2. Predicated Execution
          1. A.2.8.2.1. Set Byte on Condition (SETcc)
          2. A.2.8.2.2. Conditional Move (CMOVcc)
    3. A.3. Effects of Working-Set Tuning on Reversing
      1. A.3.1. Function-Level Working-Set Tuning
      2. A.3.2. Line-Level Working-Set Tuning
  11. B. Understanding Compiled Arithmetic
    1. B.1. Arithmetic Flags
      1. B.1.1. The Overflow Flags (CF and OF)
      2. B.1.2. The Zero Flag (ZF)
      3. B.1.3. The Sign Flag (SF)
      4. B.1.4. The Parity Flag (PF)
    2. B.2. Basic Integer Arithmetic
      1. B.2.1. Addition and Subtraction
      2. B.2.2. Multiplication and Division
        1. B.2.2.1. Multiplication
        2. B.2.2.2. Division
          1. B.2.2.2.1. Understanding Reciprocal-Multiplications
          2. B.2.2.2.2. Deciphering Reciprocal-Multiplications
      3. B.2.3. Modulo
    3. B.3. 64-Bit Arithmetic
      1. B.3.1. Addition
      2. B.3.2. Subtraction
      3. B.3.3. Multiplication
      4. B.3.4. Division
    4. B.4. Type Conversions
      1. B.4.1. Zero Extending
      2. B.4.2. Sign Extending
        1. B.4.2.1. To 32 Bits
        2. B.4.2.2. To 64 Bits
  12. C. Deciphering Program Data
    1. C.1. The Stack
      1. C.1.1. Stack Frames
      2. C.1.2. The ENTER and LEAVE Instructions
      3. C.1.3. Calling Conventions
        1. C.1.3.1. The cdecl Calling Convention
        2. C.1.3.2. The fastcall Calling Convention
        3. C.1.3.3. The stdcall Calling Convention
        4. C.1.3.4. The C++ Class Member Calling Convention (thiscall)
    2. C.2. Basic Data Constructs
      1. C.2.1. Global Variables
      2. C.2.2. Local Variables
        1. C.2.2.1. Stack-Based
          1. C.2.2.1.1. Overwriting Passed Parameters
        2. C.2.2.2. Register-Based
      3. C.2.3. Imported Variables
      4. C.2.4. Constants
      5. C.2.5. Thread-Local Storage (TLS)
    3. C.3. Data Structures
      1. C.3.1. Generic Data Structures
        1. C.3.1.1. Alignment
      2. C.3.2. Arrays
        1. C.3.2.1. Generic Data Type Arrays
        2. C.3.2.2. Data Structure Arrays
      3. C.3.3. Linked Lists
        1. C.3.3.1. Singly Linked Lists
        2. C.3.3.2. Doubly Linked Lists
      4. C.3.4. Trees
    4. C.4. Classes
      1. C.4.1. Data Members
      2. C.4.2. Data Members in Inherited Classes
      3. C.4.3. Class Methods
      4. C.4.4. Virtual Functions
      5. C.4.5. Identifying Virtual Function Calls
      6. C.4.6. Identifying Constructors of Objects with Inheritance
  13. D. Citations