Social Network Analysis for Startups

Book description

SNA techniques are derived from sociological and social-psychological theories and take into account the whole network (or, in the case of very large networks such as Twitter, a large segment of the network). As a result, we may arrive at findings that seem counter-intuitive: for example, that Justin Bieber (7.5 mil. followers) and Lady Gaga (7.2 mil. followers) have relatively little actual influence despite their celebrity status, while a middle-of-the-road blogger with 30K followers is able to generate tweets that "go viral" and result in millions of impressions. O'Reilly's "Mining Social Media" and "Programming Collective Intelligence" books are an excellent start for people interested in SNA. This book builds on their foundations to teach a new, pragmatic way of doing SNA. It aims to link theory ("why is this important?", "how do various concepts interact?", "how do I interpret quantitative results?") and practice: gathering, analyzing, and visualizing data using Python and other open-source tools.

Table of Contents

  1. Social Network Analysis for Startups
  2. A Note Regarding Supplemental Files
  3. Preface
    1. Prerequisites
    2. Open-Source Tools
    3. Conventions Used in This Book
    4. Using Code Examples
    5. Safari® Books Online
    6. How to Contact Us
    7. Content Updates
      1. March 16, 2012
    8. Thanks
  4. 1. Introduction
    1. Analyzing Relationships to Understand People and Groups
      1. Binary and Valued Relationships
      2. Symmetric and Asymmetric Relationships
      3. Multimode Relationships
    2. From Relationships to Networks—More Than Meets the Eye
    3. Social Networks vs. Link Analysis
    4. The Power of Informal Networks
    5. Terrorists and Revolutionaries: The Power of Social Networks
      1. Social Networks in Prison
      2. Informal Networks in Terrorist Cells
      3. The Revolution Will Be Tweeted
        1. Social Media and Social Networks
        2. Egyptian Revolution and Twitter
  5. 2. Graph Theory—A Quick Introduction
    1. What Is a Graph?
      1. Adjacency Matrices
      2. Edge-Lists and Adjacency Lists
      3. 7 Bridges of Königsberg
    2. Graph Traversals and Distances
      1. Depth-First Traversal
        1. Implementation
        2. DFS with NetworkX
      2. Breadth-First Traversal
        1. Algorithm
        2. BFS with NetworkX
      3. Paths and Walks
      4. Dijkstra’s Algorithm
    3. Graph Distance
      1. Graph Diameter
    4. Why This Matters
    5. 6 Degrees of Separation is a Myth!
    6. Small World Networks
  6. 3. Centrality, Power, and Bottlenecks
    1. Sample Data: The Russians are Coming!
      1. Get Oriented in Python and NetworkX
      2. Read Nodes and Edges from LiveJournal
      3. Snowball Sampling
      4. Saving and Loading a Sample Dataset from a File
    2. Centrality
      1. Who Is More Important in this Network?
      2. Find the “Celebrities”
        1. Degree centrality in the LiveJournal network
      3. Find the Gossipmongers
      4. Find the Communication Bottlenecks and/or Community Bridges
      5. Putting It Together
      6. Who Is a “Gray Cardinal?”
        1. In practice
      7. Klout Score
      8. PageRank—How Google Measures Centrality
        1. Simplified PageRank algorithm
    3. What Can’t Centrality Metrics Tell Us?
  7. 4. Cliques, Clusters and Components
    1. Components and Subgraphs
      1. Analyzing Components with Python
      2. Islands in the Net
    2. Subgraphs—Ego Networks
      1. Extracting and Visualizing Ego Networks with Python
    3. Triads
      1. Fraternity Study—Tie Stability and Triads
      2. Triads and Terrorists
      3. The “Forbidden Triad” and Structural Holes
      4. Structural Holes and Boundary Spanning
      5. Triads in Politics
      6. Directed Triads
      7. Analyzing Triads in Real Networks
      8. Real Data
    4. Cliques
      1. Detecting Cliques
    5. Hierarchical Clustering
      1. The Algorithm
      2. Clustering Cities
      3. Preparing Data and Clustering
      4. Block Models
    6. Triads, Network Density, and Conflict
  8. 5. 2-Mode Networks
    1. Does Campaign Finance Influence Elections?
    2. Theory of 2-Mode Networks
      1. Affiliation Networks
      2. Attribute Networks
      3. A Little Math
      4. 2-Mode Networks in Practice
      5. PAC Networks
      6. Candidate Networks
    3. Expanding Multimode Networks
      1. Exercise
  9. 6. Going Viral! Information Diffusion
    1. Anatomy of a Viral Video
      1. What Did Facebook Do Right?
      2. How Do You Estimate Critical Mass?
      3. Wikinomics of Critical Mass
      4. Content is (Still) King
        1. Heterogenous Preferences
    2. How Does Information Shape Networks (and Vice Versa)?
      1. Birds of a Feather?
      2. Homophily vs. Curiosity
        1. Boundary Spanners
      3. Weak Ties
      4. Dunbar Number and Weak Ties
    3. A Simple Dynamic Model in Python
      1. Influencers in the Midst
      2. Exercises for the Reader
    4. Coevolution of Networks and Information
      1. Exercises for the Reader
      2. Why Model Networks?
  10. 7. Graph Data in the Real World
    1. Medium Data: The Tradition
    2. Big Data: The Future, Starting Today
    3. “Small Data”—Flat File Representations
      1. EdgeList Files
      2. .net Format
      3. GML, GraphML, and other XML Formats
      4. Ancient Binary Format—##h Files
    4. “Medium Data”: Database Representation
      1. What are Cursors?
      2. What are Transactions?
      3. Names
      4. Nodes as Data, Attributes as ?
      5. The Class
      6. Functions and Decorators
        1. Decorator notation
      7. The Adaptor
    5. Working with 2-Mode Data
      1. Exercises for the Reader
    6. Social Networks and Big Data
      1. NoSQL
      2. Structural Realities
        1. Plain text is king
        2. The freedom to store
      3. Computational Complexities
      4. Big Data is Big
    7. Big Data at Work
      1. What Are We Distributing?
      2. Hadoop, S3, and MapReduce
      3. Hive
      4. SQL is Still Our Friend
  11. A. Data Collection
    1. A Note on the Ethics of Data Collection
    2. The Old-Fashioned Way
    3. Mining Server Logs
    4. Mining Social Media Sites
      1. Business and Investments
      2. Politics, Elections, and Courts
      3. Blogosphere and Social Bookmarking
    5. Twitter Data Collection
    6. Facebook
      1. Private Ego-Networks
      2. Facebook Social Graph API
  12. B. Installing Software
    1. Why (We Love) Python?
    2. Exploratory Programming
    3. Python
    4. IPython
    5. NetworkX
    6. matplotlib
      1. pylab: matplotlib with IPython
  13. About the Authors
  14. Copyright