Open Source Software in Life Science Research

Book Description

The free/open source approach has grown from a minor activity to become a significant producer of robust, task-orientated software for a wide variety of situations and applications. To life science informatics groups, these systems present an appealing proposition: high-quality software at a very attractive price. Open Source Software in Life Science Research considers how industry and applied research groups have embraced these resources, discussing practical implementations that address real-world business problems.

The book is divided into five parts. Part one looks at laboratory data management and chemical informatics, covering software such as Bioclipse, OpenTox, ImageJ and KNIME. In part two, the focus turns to genomics and bioinformatics tools, with chapters examining GenomicTools and EBI Atlas software, as well as the practicalities of setting up an ‘omics’ platform and managing large volumes of data. Chapters in part three examine information and knowledge management, covering a range of topics including software for web-based collaboration, open source search and visualisation technologies for scientific business applications, and specific software such as Design Tracker and Utopia Documents. Part four looks at semantic technologies such as Semantic MediaWiki, TripleMap and Chem2Bio2RDF, before part five examines clinical analytics, and validation and regulatory compliance of free/open source software. Finally, the book concludes by looking at future perspectives and the economics of free/open source software in industry.

  • Discusses a broad range of applications from a variety of sectors
  • Provides a unique perspective on work normally performed behind closed doors
  • Highlights the criteria used to compare and assess different approaches to solving problems

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. List of figures and tables
  7. Foreword
  8. About the editors
  9. About the contributors
  10. Introduction
  11. Chapter 1: Building research data handling systems with open source tools
    1. Abstract:
    2. 1.1 Introduction
    3. 1.2 Legacy
    4. 1.3 Ambition
    5. 1.4 Path chosen
    6. 1.5 The ‘ilities
    7. 1.6 Overall vision
    8. 1.7 Lessons learned
    9. 1.8 Implementation
    10. 1.9 Who uses LSP today?
    11. 1.10 Organisation
    12. 1.11 Future aspirations
  12. Chapter 2: Interactive predictive toxicology with Bioclipse and OpenTox
    1. Abstract:
    2. 2.1 Introduction
    3. 2.2 Basic Bioclipse-OpenTox interaction examples
    4. 2.3 Use Case 1: Removing toxicity without interfering with pharmacology
    5. 2.4 Use Case 2: Toxicity prediction on compound collections
    6. 2.5 Discussion
    7. 2.6 Availability
  13. Chapter 3: Utilizing open source software to facilitate communication of chemistry at RSC
    1. Abstract:
    2. 3.1 Introduction
    3. 3.2 Project Prospect and open ontologies
    4. 3.3 ChemSpider
    5. 3.4 ChemDraw Digester
    6. 3.5 Learn Chemistry Wiki
    7. 3.6 Conclusion
    8. 3.7 Acknowledgments
  14. Chapter 4: Open source software for mass spectrometry and metabolomics
    1. Abstract:
    2. 4.1 Introduction
    3. 4.2 A short mass spectrometry primer
    4. 4.3 Metabolomics and metabonomics
    5. 4.4 Data types
    6. 4.5 Metabolomics data processing
    7. 4.6 Metabolomics data processing using the open source workflow engine, KNIME
    8. 4.7 Open source software for multivariate analysis
    9. 4.8 Performing PCA on metabolomics data in R/KNIME
    10. 4.9 Other open source packages
    11. 4.10 Perspective
    12. 4.11 Acknowledgments
  15. Chapter 5: Open source software for image processing and analysis: picture this with ImageJ
    1. Abstract:
    2. 5.1 Introduction
    3. 5.2 ImageJ
    4. 5.3 ImageJ macros: an overview
    5. 5.4 Graphical user interface
    6. 5.5 Industrial applications of image analysis
    7. 5.6 Summary
  16. Chapter 6: Integrated data analysis with KNIME
    1. Abstract:
    2. 6.1 The KNIME platform
    3. 6.2 The KNIME success story
    4. 6.3 Benefits of 'professional open source'
    5. 6.4 Application examples
    6. 6.5 Conclusion and outlook
    7. 6.6 Acknowledgments
  17. Chapter 7: Investigation-Study-Assay, a toolkit for standardizing data capture and sharing
    1. Abstract:
    2. 7.1 The growing need for content curation in industry
    3. 7.2 The BioSharing initiative: cooperating standards needed
    4. 7.3 The ISA framework – principles for progress
    5. 7.4 Lessons learned
    6. 7.5 Acknowledgments
  18. Chapter 8: GenomicTools: an open source platform for developing high-throughput analytics in genomics
    1. Abstract:
    2. 8.1 Introduction
    3. 8.2 Data types
    4. 8.3 Tools overview
    5. 8.4 C++ API for developers
    6. 8.5 Case study: a simple ChIP-seq pipeline
    7. 8.6 Performance
    8. 8.7 Conclusion
    9. 8.8 Resources
  19. Chapter 9: Creating an in-house ’omics data portal using EBI Atlas software
    1. Abstract:
    2. 9.1 Introduction
    3. 9.2 Leveraging ’omics data for drug discovery
    4. 9.3 The EBI Atlas software
    5. 9.4 Deploying Atlas in the enterprise
    6. 9.5 Conclusion and learnings
    7. 9.6 Acknowledgments
  20. Chapter 10: Setting up an ’omics platform in a small biotech
    1. Abstract:
    2. 10.1 Introduction
    3. 10.2 General changes over time
    4. 10.3 The hardware solution
    5. 10.4 Maintenance of the system
    6. 10.5 Backups
    7. 10.6 Keeping up-to-date
    8. 10.7 Disaster recovery
    9. 10.8 Personnel skill sets
    10. 10.9 Conclusion
    11. 10.10 Acknowledgements
  21. Chapter 11: Squeezing big data into a small organisation
    1. Abstract:
    2. 11.1 Introduction
    3. 11.2 Our service and its goals
    4. 11.3 Manage the data: relieving the burden of data-handling
    5. 11.4 Organising the data
    6. 11.5 Standardising to your requirements
    7. 11.6 Analysing the data: helping users work with their own data
    8. 11.7 Helping biologists to stick to the rules
    9. 11.8 Running programs
    10. 11.9 Helping the user to understand the details
    11. 11.10 Summary
  22. Chapter 12: Design Tracker: an easy to use and flexible hypothesis tracking system to aid project team working
    1. Abstract:
    2. 12.1 Overview
    3. 12.2 Methods
    4. 12.3 Technical overview
    5. 12.4 Infrastructure
    6. 12.5 Review
    7. 12.6 Acknowledgements
  23. Chapter 13: Free and open source software for web-based collaboration
    1. Abstract:
    2. 13.1 Introduction
    3. 13.2 Application of the FLOSS assessment framework
    4. 13.3 Conclusion
    5. 13.4 Acknowledgements
  24. Chapter 14: Developing scientific business applications using open source search and visualisation technologies
    1. Abstract:
    2. 14.1 A changing attitude
    3. 14.2 The need to make sense of large amounts of data
    4. 14.3 Open source search technologies
    5. 14.4 Creating the foundation layer
    6. 14.5 Visualisation technologies
    7. 14.6 Prefuse visualisation toolkit
    8. 14.7 Business applications
    9. 14.8 Other applications
    10. 14.9 Challenges and future developments
    11. 14.10 Reflections
    12. 14.11 Thanks and Acknowledgements
  25. Chapter 15: Utopia Documents: transforming how industrial scientists interact with the scientific literature
    1. Abstract:
    2. 15.1 Utopia Documents in industry
    3. 15.2 Enabling collaboration
    4. 15.3 Sharing, while playing by the rules
    5. 15.4 History and future of Utopia Documents
  26. Chapter 16: Semantic MediaWiki in applied life science and industry: building an Enterprise Encyclopaedia
    1. Abstract:
    2. 16.1 Introduction
    3. 16.2 Wiki-based Enterprise Encyclopaedia
    4. 16.3 Semantic MediaWiki
    5. 16.4 Conclusion and future directions
    6. 16.5 Acknowledgements
  27. Chapter 17: Building disease and target knowledge with Semantic MediaWiki
    1. Abstract:
    2. 17.1 The Targetpedia
    3. 17.2 The Disease Knowledge Workbench (DKWB)
    4. 17.3 Conclusion
    5. 17.4 Acknowledgements
  28. Chapter 18: Chem2Bio2RDF: a semantic resource for systems chemical biology and drug discovery
    1. Abstract:
    2. 18.1 The need for integrated, semantic resources in drug discovery
    3. 18.2 The Semantic Web in drug discovery
    4. 18.3 Implementation challenges
    5. 18.4 Chem2Bio2RDF architecture
    6. 18.5 Tools and methodologies that use Chem2Bio2RDF
    7. 18.6 Conclusions
  29. Chapter 19: TripleMap: a web-based semantic knowledge discovery and collaboration application for biomedical research
    1. Abstract:
    2. 19.1 The challenge of Big Data
    3. 19.2 Semantic technologies
    4. 19.3 Semantic technologies overview
    5. 19.4 The design and features of TripleMap
    6. 19.5 TripleMap Generated Entity Master ('GEM') semantic data core
    7. 19.6 TripleMap semantic search interface
    8. 19.7 TripleMap collaborative, dynamic knowledge maps
    9. 19.8 Comparison and integration with third-party systems
    10. 19.9 Conclusions
  30. Chapter 20: Extreme scale clinical analytics with open source software
    1. Abstract:
    2. 20.1 Introduction
    3. 20.2 Interoperability
    4. 20.3 Mirth
    5. 20.4 Mule ESB
    6. 20.5 Unified Medical Language System (UMLS)
    7. 20.6 Open source databases
    8. 20.7 Analytics
    9. 20.8 Final architectural overview
  31. Chapter 21: Validation and regulatory compliance of free/open source software
    1. Abstract:
    2. 21.1 Introduction
    3. 21.2 The need to validate open source applications
    4. 21.3 Who should validate open source software?
    5. 21.4 Validation planning
    6. 21.5 Risk management and open source software
    7. 21.6 Key validation activities
    8. 21.7 Ongoing validation and compliance
    9. 21.8 Conclusions
  32. Chapter 22: The economics of free/open source software in industry
    1. Abstract:
    2. 22.1 Introduction
    3. 22.2 Background
    4. 22.3 Open source innovation
    5. 22.4 Open source software in the pharmaceutical industry
    6. 22.5 Open source as a catalyst for pre-competitive collaboration in the pharmaceutical industry
    7. 22.6 The Pistoia Alliance Sequence Services Project
    8. 22.7 Conclusion
  33. Index