Analyzing Social Media Networks with NodeXL

Book Description

Analyzing Social Media Networks with NodeXL offers background drawn from information studies, computer science, and sociology. This book is divided into three parts: analyzing social media, a NodeXL tutorial, and social media network analysis case studies.
Part I provides background in the history and concepts of social media and social networks. It also introduces social network analysis, which flows from measuring to mapping to modeling collections of connections. Part II focuses on the detailed operation of the free and open-source NodeXL extension of Microsoft Excel, which is used in all exercises throughout this book. In Part III, each chapter presents one form of social media, such as email, Twitter, Facebook, Flickr, and YouTube. Each chapter describes the system, the nature of the networks that form when people interact, and types of analysis for identifying people, documents, groups, and events.

* Walks you through NodeXL while explaining the theory and development behind each step, providing takeaways that can apply to any social network analysis (SNA)

* Demonstrates how visual analytics research can be applied to SNA tools for the mass market

* Includes case studies from researchers who use NodeXL on popular networks like email, Facebook, Twitter, and wikis

* Provides companion materials and resources for download at https://nodexl.codeplex.com/documentation

Table of Contents

  1. Copyright
    1. Dedication
  2. Preface
  3. Acknowledgments
  4. About the Authors
  5. Contributors
  6. I. Getting Started with Analyzing Social Media Networks
    1. 1. Introduction to Social Media and Social Networks
      1. 1.1. Introduction
      2. 1.2. A Historical Perspective
      3. 1.3. The Rise of Social Media as Consumer Applications
      4. 1.4. Individual Contributions Generate Public Wealth
      5. 1.5. Who Should Read This Book
      6. 1.6. Applying Social Media to National Priorities
      7. 1.7. Worldwide Efforts
      8. 1.8. Practitioner’s Summary
      9. 1.9. Researcher’s Agenda
        1. References
        2. Additional Resources
    2. 2. Social Media: New Technologies of Collaboration
      1. 2.1. Introduction
      2. 2.2. Social Media Defined
      3. 2.3. Social Media Design Framework
        1. 2.3.1. Size of Producer and Consumer Population
        2. 2.3.2. Pace of Interaction
        3. 2.3.3. Genre of Basic Elements
        4. 2.3.4. Control of Basic Elements
        5. 2.3.5. Types of Connections
        6. 2.3.6. Retention of Content
      4. 2.4. Social Media Examples
        1. 2.4.1. Asynchronous Threaded Conversation
          1. Email
          2. Email Lists
          3. Usenet Newsgroups
          4. From Bulletin Board Systems (BBS) to Discussion Forums
        2. 2.4.2. Synchronous Conversation
          1. Chat
          2. Instant Messaging
          3. Texting
          4. Audio Conferencing
          5. Videoconferencing
        3. 2.4.3. World Wide Web
        4. 2.4.4. Collaborative Authoring
          1. Wikis
          2. Shared Documents
        5. 2.4.5. Blogs and Podcasts
          1. Microblogs and Activity Streams
          2. Multimedia Blogs and Podcasts
        6. 2.4.6. Social Sharing
          1. Video and TV
          2. Photo and Art
          3. Music
          4. Bookmarks, News, and Books
        7. 2.4.7. Social Networking Services
          1. Social and Dating
          2. Professional
          3. Niche Networks
        8. 2.4.8. Online Markets and Production
          1. Financial Transaction
          2. User-Generated Products
          3. Review Sites
        9. 2.4.9. Idea Generation
        10. 2.4.10. Virtual Worlds
          1. Virtual Reality Worlds
          2. Massively Multiplayer Games
        11. 2.4.11. Mobile-Based Services
          1. Location Sharing, Annotation, and Games
      5. 2.5. Practitioner’s Summary
      6. 2.6. Researcher’s Agenda
        1. References
        2. Additional Resources
    3. 3. Social Network Analysis: Measuring, Mapping, and Modeling Collections of Connections
      1. 3.1. Introduction
      2. 3.2. The Network Perspective
        1. 3.2.1. A Simple Twitter Network Example
        2. 3.2.2. Vertices
        3. 3.2.3. Edges
        4. 3.2.4. Network Data Representations
      3. 3.3. Types of Networks
        1. 3.3.1. Full, Partial, and Egocentric Networks
        2. 3.3.2. Unimodal, Multimodal, and Affiliation Networks
        3. 3.3.3. Multiplex Networks
      4. 3.4. The Network Analysis Research and Practitioner Landscape
      5. 3.5. Network Analysis Metrics
        1. 3.5.1. Aggregate Networks Metrics
        2. 3.5.2. Vertex-Specific Networks Metrics
          1. Degree Centrality
          2. Betweenness Centralities: Bridge Scores for Boundary Spanners
          3. Closeness Centrality: Distance Scores for Broadly Connected People
          4. Eigenvector Centrality: Influence Scores for Strategically Connected People
          5. Clustering Coefficient: How Connected Are My Friends?
        3. 3.5.3. Clustering and Community Detection Algorithms
        4. 3.5.4. Structures, Network Motifs, and Social Roles
      6. 3.6. Social Networks in the Era of Abundant Computation
      7. 3.7. The Era of Abundant Social Networks: From the Desktop to Your Pocket
      8. 3.8. Tools for Network Analysis
      9. 3.9. Node-Link Diagrams: Visually Mapping Social Networks
      10. 3.10. Common Network Analysis Questions Applied to Social Media
      11. 3.11. Practitioner’s Summary
      12. 3.12. Researcher’s Agenda
        1. References and Resources
        2. Additional Resources
  7. II. NodeXL Tutorial: Learning by Doing
    1. 4. Getting Started with NodeXL, Layout, Visual Design, and Labeling
      1. 4.1. Introduction
      2. 4.2. Downloading and Installing NodeXL
      3. 4.3. Getting Started with NodeXL
        1. 4.3.1. Data Entry
        2. 4.3.2. Showing the Graph
        3. 4.3.3. Highlighting an Edge
        4. 4.3.4. Importing an Edge List
        5. 4.3.5. Resizing and Moving the Graph Pane
      4. 4.4. Layout: Arranging Vertices in the Graph Pane
        1. 4.4.1. Automatic Layout
        2. 4.4.2. Directed Graph Type
        3. 4.4.3. Updating the Graph Pane
        4. 4.4.4. Manual Layout
        5. 4.4.5. Preserving Manual Layout
        6. 4.4.6. Zooming and Scale
      5. 4.5. Visual Design: Making Network Displays Meaningful
        1. 4.5.1. Vertex Colors
        2. 4.5.2. Adding Descriptive Data
        3. 4.5.3. Changing Vertex Size (and Other Visual Properties)
        4. 4.5.4. Autofilling Columns
        5. 4.5.5. Legend
        6. 4.5.6. Changing General Graph Appearance
      6. 4.6. Labeling: Adding Text Labels to Vertices and Edges
        1. 4.6.1. Adding Vertex Labels
        2. 4.6.2. Viewing Hidden Columns
        3. 4.6.3. Adding Labels as the Shape
        4. 4.6.4. Adding Labels Alongside a Shape
        5. 4.6.5. Adding Tooltips
        6. 4.6.6. Adding Edge Labels
        7. 4.6.7. Saving a NodeXL File
      7. 4.7. Practitioner’s Summary
      8. 4.8. Researcher’s Agenda
        1. References
        2. NodeXL Papers
    2. 5. Calculating and Visualizing Network Metrics
      1. 5.1. Introduction
      2. 5.2. Kite Network Example
        1. 5.2.1. Opening an Existing NodeXL File
      3. 5.3. Computing Graph Metrics
        1. 5.3.1. Vertex-Specific Metrics
          1. Degree
          2. Betweenness Centrality
          3. Closeness Centrality
        2. 5.3.2. Eigenvector Centrality
          1. Clustering Coefficient
        3. 5.3.3. Overall Graph Metrics
      4. 5.4. Les Misérables Co-Appearance Network
        1. 5.4.1. Sorting the Edge Weight Column
        2. 5.4.2. Visualizing Edge Weights
        3. 5.4.3. Calculating and Visualizing Vertex Metrics to Find Important Individuals
        4. 5.4.4. Mapping Graph Metrics to X and Y Coordinates
        5. 5.4.5. Viewing and Hiding Graph Elements Such as Axes
      5. 5.5. Practitioner’s Summary
      6. 5.6. Researcher’s Agenda
        1. References
    3. 6. Preparing Data and Filtering
      1. 6.1. Introduction
      2. 6.2. Serious Eats Network Example
        1. 6.2.1. Working with Multimodal Network Data
        2. 6.2.2. Merging Duplicate Edges
        3. 6.2.3. Using Shapes and Colors to Identify Vertices of Different Types
        4. 6.2.4. Sorting Vertices from A to Z
        5. 6.2.5. Auto-Filling Data Columns
        6. 6.2.6. Customizing Layout Options
      3. 6.3. Filtering to Reduce Clutter and Reveal Important Features
        1. 6.3.1. Dynamic Filters
        2. 6.3.2. Filtering by Controlling Visibility with Autofill Columns
        3. 6.3.3. Creating Subgraph Images with NodeXL
      4. 6.4. Putting It All Together
      5. 6.5. Practitioner’s Summary
      6. 6.6. Researcher’s Agenda
        1. References
    4. 7. Clustering and Grouping
      1. 7.1. Introduction
      2. 7.2. The 2007 Senate Voting Analysis
        1. 7.2.1. Filtering Edges to Identify Groups within a Network
        2. 7.2.2. Automatically Identifying Clusters
        3. 7.2.3. Creating Clusters Manually
      3. 7.3. Les Misérables Character Clusters
      4. 7.4. Federal Communications Commission (FCC) Lobbying Coalition Network
      5. 7.5. Practitioner’s Summary
      6. 7.6. Researcher’s Agenda
        1. References
        2. Additional Resources
  8. III. Social Media Network Analysis Case Studies
    1. 8. Email: The Lifeblood of Modern Communication
      1. 8.1. Introduction
      2. 8.2. History and Definition of Email
      3. 8.3. Email Networks
      4. 8.4. What Questions Can Be Answered by Analyzing Email Networks?
        1. 8.4.1. Personal Email Network Questions
        2. 8.4.2. Organizational Email Network Questions
      5. 8.5. Working with Email Data
        1. 8.5.1. Preparing Email
        2. 8.5.2. Importing Email Networks into NodeXL
      6. 8.6. Cleaning Email Data in NodeXL
        1. 8.6.1. Remove Duplicate Email Addresses for the Same Individual
        2. 8.6.2. Merge Duplicate Edges
      7. 8.7. Analyzing Personal Email Networks
        1. 8.7.1. Creating an Email Overview Visualization
          1. Step 1: Import Data into NodeXL
          2. Step 2: Clean Data
          3. Step 3: Filter Data
          4. Step 4: Compute Graph Metrics and Add New Columns
          5. Step 5: Visualize the Email Social Network
          6. Step 6: Understand Social Network Visualizations and Metrics Data
        2. 8.7.2. Creating an Expertise Network Email Graph
          1. Step 1: Import Email Social Network Data into NodeXL
          2. Step 2: Clean Data
          3. Step 3: Compute Graph Metrics and Add New Columns
          4. Step 4: Filter Data
          5. Step 5: Visualize Network
          6. Step 6: Understanding the Network Visualization and Data
      8. 8.8. Creating a Living Org-Chart with an Organizational Email Network
        1. 8.8.1. TechABC’s Organizational Unit Email Network
        2. 8.8.2. Normalizing and Filtering TechABC’s Data
        3. 8.8.3. Creating an Overview of TechABC’s Communication Patterns
        4. 8.8.4. Examining TechABC’s Research Division
      9. 8.9. Historical and Legal Analysis of Enron Email
        1. 8.9.1. Identifying Key Individuals Using Content Networks
      10. 8.10. Practitioner’s Summary
      11. 8.11. Researcher’s Agenda
        1. References
    2. 9. Thread Networks: Mapping Message Boards and Email Lists
      1. 9.1. Introduction
      2. 9.2. History and Definition of Threaded Conversation
      3. 9.3. What Questions Can Be Asked
      4. 9.4. Threaded Conversation Networks
      5. 9.5. Analyzing a Technical Support Email List: CSS-D
        1. 9.5.1. Preparing Email List Network Data
        2. 9.5.2. Identifying Important People and Social Roles at CSS-D
      6. 9.6. Finding a New Community Administrator for the ABC-D Email List
      7. 9.7. Understanding Groups at Ravelry
      8. 9.8. Practitioner’s Summary
      9. 9.9. Researcher’s Agenda
        1. References
    3. 10. Twitter: Conversation, Entertainment, and Information, All in One Network!
      1. 10.1. Introduction
      2. 10.2. The Nuts and Bolts of Twitter
        1. 10.2.1. @replies and @mentions
        2. 10.2.2. #hashtags
        3. 10.2.3. Retweeting
      3. 10.3. Networks in Twitter
        1. 10.3.1. Friends, Followers, Information, and Attention
        2. 10.3.2. Attention, Importance, and Eigenvector Centrality
        3. 10.3.3. Information, Advantage, and Betweenness Centrality
        4. 10.3.4. @replies and Symmetric Connections
        5. 10.3.5. Retweets, #hashtags, and the Diffusion of Information
      4. 10.4. Acquiring Data
      5. 10.5. Discovery with Twitter
        1. 10.5.1. The Ego Network
        2. 10.5.2. A Trending Topic
      6. 10.6. Practitioner’s Summary
      7. 10.7. Researcher’s Agenda
        1. References
        2. Additional Resources
    4. 11. Visualizing and Interpreting Facebook Networks
      1. 11.1. Introduction to Facebook: The World’s Social Graph
      2. 11.2. The History of Facebook
      3. 11.3. Why Map a Facebook Network?
      4. 11.4. What Kind of Network Is a Facebook Friendship Network?
      5. 11.5. Getting Your Data into NodeXL
      6. 11.6. Creating a Basic Facebook Visualization
        1. 11.6.1. Hiding Yourself from Your Own Network
        2. 11.6.2. That Networky Look
      7. 11.7. Ordered and Nonordered Data and Attributes
        1. 11.7.1. Visualizing Nonordered Data: Clusters and Categories
        2. 11.7.2. Visualizing Ordered Data
          1. Degree
          2. Betweenness
      8. 11.8. Friend Wheel to Pinwheel: A Facebook Visualization the NodeXL Way
      9. 11.9. Practitioner’s Summary
      10. 11.10. Researcher’s Agenda
        1. References
        2. Additional Resources
    5. 12. WWW Hyperlink Networks: Robert Ackland
      1. 12.1. Introduction
      2. 12.2. Hyperlink Networks
        1. 12.2.1. Theory of Hyperlinking
        2. 12.2.2. Methodological Issues
      3. 12.3. The VOSON Data Provider
      4. 12.4. Example 1: Who Links to My Organization’s Web Site?
        1. 12.4.1. Preliminaries
        2. 12.4.2. Constructing Your Own Hyperlink Network Using the VOSON Data Provider
      5. 12.5. Example 2: What Is the Hyperlink Network of this Field/Industry/Sector?
      6. 12.6. Blogs, Temporal Changes, and Network Flow
      7. 12.7. Practitioner’s Summary
      8. 12.8. Researcher’s Agenda
        1. References
    6. 13. Flickr: Linking People, Photos, and Tags
      1. 13.1. Introduction
      2. 13.2. Flickr Social Media
        1. 13.2.1. What Can You Do with Flickr?
        2. 13.2.2. Flickr Sets, Collections, and Tags
        3. 13.2.3. Flickr Groups
        4. 13.2.4. Sharing and Social Interaction
      3. 13.3. Flickr Networks
        1. 13.3.1. Tag Networks
        2. 13.3.2. User Networks
          1. User Contacts
          2. Users That Comment on Other Users’ Photos
      4. 13.4. What Questions Can Be Answered by Analyzing Flickr Networks?
        1. 13.4.1. Personal Sphere
        2. 13.4.2. Community Sphere
        3. 13.4.3. Application Sphere
          1. E-commerce
          2. Service and Infrastructure
          3. Geo-Tagged Applications
      5. 13.5. Importing Flickr Data into NodeXL
        1. 13.5.1. Related Tags Network
        2. 13.5.2. Flickr User Networks
      6. 13.6. Working with the Flickr Data
        1. 13.6.1. Graph Type
        2. 13.6.2. Merging Duplicate Edges
      7. 13.7. Analyzing Flickr Networks with NodeXL
        1. 13.7.1. Revealing Landmarks and Tourist Attractions Using a Location Tag
          1. Step 1. Import Data into NodeXL
          2. Step 2. Prepare Data
          3. Step 3. Compute Graph Metrics and Filter Data
          4. Step 4. Visualize the Tag Cloud Graph
          5. Step 5. Interpreting the Visualizations
        2. 13.7.2. Identifying Tag Clusters to Disambiguate the Meaning of a Tag
          1. Step 1. Import Data into NodeXL
          2. Step 2. Cluster Vertices
          3. Step 3. Interpreting the Visualizations
        3. 13.7.3. Flickr User Networks
          1. Step 1. Import Data into NodeXL
          2. Step 2. Prepare Data and Create New Columns
          3. Step 3. Filter Data and Set Visual Properties
          4. Step 4. Visualize the User Comments Graph
          5. Step 5. Get User Contacts Information
          6. Step 6. Visualize the Scatter Graph of Self versus Others’ Comments
          7. Step 7. Interpreting the Visualizations
      8. 13.8. Practitioner’s Summary
        1. 13.8.1. Data Preparation
        2. 13.8.2. Graph Layouts
          1. Manual and Multiple Layouts
      9. 13.9. Researcher’s Agenda
        1. 13.9.1. Exploration and Validation
        2. 13.9.2. Integration
        3. References
    7. 14. YouTube: Contrasting Patterns of Content, Interaction, and Prominence
      1. 14.1. Introduction
      2. 14.2. What Is YouTube?
      3. 14.3. YouTube’s Structure
        1. 14.3.1. Videos
          1. Descriptors
        2. 14.3.2. Users’ Channels
      4. 14.4. Networks in YouTube
        1. 14.4.1. Video Networks
        2. 14.4.2. Users’ Networks
      5. 14.5. Hubs, Groups, and Layers: What Questions Can Social Network Analysis of YouTube Answer?
        1. 14.5.1. Video Network
        2. 14.5.2. User Networks
      6. 14.6. Importing YouTube Data into NodeXL
        1. 14.6.1. Importing Video Data
        2. 14.6.2. Importing User Data
        3. 14.6.3. Ethical Considerations
        4. 14.6.4. Problems with YouTube Network Data
      7. 14.7. Preparing YouTube Network Data
      8. 14.8. Analyzing YouTube Networks
        1. 14.8.1. User Networks
        2. 14.8.2. Video Networks
        3. 14.8.3. The YouTube “Makeup” Video Network
        4. 14.8.4. Healthcare Reform YouTube Video Networks
      9. 14.9. Practitioner’s Summary
      10. 14.10. Researcher’s Agenda
        1. References
    8. 15. Wiki Networks: Connections of Creativity and Collaboration
      1. 15.1. Introduction
      2. 15.2. Key Features of Wiki Systems
      3. 15.3. Wiki Networks from Edit Activity
        1. 15.3.1. Wiki Networks of General Interest
      4. 15.4. Identifying Different Types of Editors within a Wiki Project
        1. 15.4.1. Wiki Social Network Sampling Frame and Data Collection
        2. 15.4.2. Defining Edges and Attributes in Wiki Social Networks
        3. 15.4.3. Wiki Network Data Collection
      5. 15.5. NodeXL Visualization Strategies for Revealing Distinct User Types
        1. 15.5.1. Making Top Wiki Editors Stand Out by Visually Formatting the Network Graph
        2. 15.5.2. Interpreting Wiki Network Graphs for Evidence of Distinctive Social Roles
        3. 15.5.3. Using Subgraph Images to Distinguish between User Types
        4. 15.5.4. Seeing the Trees and Forest with Wiki Network Analysis
      6. 15.6. Identifying High-Quality Contributors in Article Talk Pages
        1. 15.6.1. Tasks and Strategies for Identifying Types of Contributors by Visualizing Article Discussion Page Networks
        2. 15.6.2. Searching for Structural Signatures of Confrontation and Deliberation in Wiki Article Talk Page Networks
      7. 15.7. Navigating Lostpedia: Using NodeXL to Reveal the Large-Scale Collaborative Structure of Wiki Systems
        1. 15.7.1. Creating an Overview Network Map of Lostpedia Content in NodeXL
        2. 15.7.2. Creating an Overview Map of Lostpedia Users
        3. 15.7.3. Normalizing Data to Infer Stronger Connections
      8. 15.8. Data Collection from Wiki Systems
      9. 15.9. Practitioner’s Summary
      10. 15.10. Researcher’s Agenda
        1. References
  9. NodeXL for Programmers
    1. A.1. Importing Custom Graph Data
      1. A.1.1. Getting Started
      2. A.1.2. The IGraphDataProvider Interface
      3. A.1.3. GraphML
      4. A.1.4. Sample Code
    2. A.2. Custom Graphing Applications
      1. A.2.1. Getting Started
      2. A.2.2. Populating the Graph
      3. A.2.3. Controlling the Appearance of the Graph
      4. A.2.4. Controlling the Layout
      5. A.2.5. More Features
      6. A.2.6. Sample Code