Data Analysis with Open Source Tools

By Philipp K. Janert. Published by O'Reilly Media, Inc.

Chapter 17. Financial Calculations and Modeling

I recently received a notice from a magazine reminding me that my subscription was running out. It’s a relatively expensive weekly magazine, and they offered me three different plans to renew my subscription: one year (52 issues) for $130, two years for $220, or three years for $275. Table 17-1 summarizes these options and also shows the respective cost per issue.

Table 17-1. Pricing plans for a magazine subscription

Subscription    Total price ($)    Price per issue ($)
Single issue    n/a                6.00
1 year          130                2.50
2 years         220                2.12
3 years         275                1.76
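
The per-issue prices in Table 17-1 follow from dividing each total price by the number of issues delivered. The following is a minimal Python sketch of that arithmetic, assuming 52 issues in every subscription year (the text states this figure only for the one-year plan). The names `plans` and `price_per_issue` are illustrative and not part of the original text.

```python
# Figures taken from Table 17-1.
ISSUES_PER_YEAR = 52        # assumption: the same number of issues every year
SINGLE_ISSUE_PRICE = 6.00   # dollars

# (label, subscription length in years, total price in dollars)
plans = [("1 year", 1, 130.0), ("2 years", 2, 220.0), ("3 years", 3, 275.0)]


def price_per_issue(total_price, years, issues_per_year=ISSUES_PER_YEAR):
    """Total price divided by the number of issues in the subscription."""
    return total_price / (years * issues_per_year)


# Per-issue price for each plan, rounded to cents.
per_issue = {
    label: round(price_per_issue(total, years), 2)
    for label, years, total in plans
}

for label, years, total in plans:
    p = price_per_issue(total, years)
    saving = 1.0 - p / SINGLE_ISSUE_PRICE  # fraction saved vs. a single issue
    print(f"{label:8s} total ${total:6.2f}  per issue ${p:.2f}  "
          f"saving vs. single issue {saving:.0%}")
```

The sketch only reproduces the table. It says nothing yet about which plan is the better deal, because it treats a dollar paid today and a dollar paid in two years as equivalent.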

Assuming that I want to continue the subscription, which of these three options makes the most sense? From Table 17-1, we can see that each issue of the magazine becomes cheaper as I commit myself to a longer subscription period, but is this a good deal? In fact, what does it mean for a proposal like this to be a “good deal”? Somehow, stumping up nearly three hundred dollars right now seems like a stretch, even if I remind myself that it saves me more than half the price on each issue.

This little story demonstrates the central topic of this chapter: the time value of money, which expresses the notion that a hundred dollars today are worth more than a hundred dollars a year from now. In this chapter, I shall introduce some standard concepts and calculational tools that are required whenever we need to make a choice between different investment decisions—whether they involve our own personal finances or the evaluation of business ...
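
To make the notion of the time value of money concrete, the following Python sketch discounts future payments back to today. It is an illustration under stated assumptions, not the calculation developed in the chapter. The 5 percent annual rate is hypothetical, the function name `present_value` is made up for this example, and the comparison assumes that the one-year price stays at $130 and is paid at the start of each year.

```python
def present_value(amount, rate, years):
    """Value today of `amount` paid or received `years` from now,
    discounted at a constant annual `rate` (standard discounting formula)."""
    return amount / (1.0 + rate) ** years


RATE = 0.05  # hypothetical annual rate; not taken from the text

# A hundred dollars a year from now, expressed in today's dollars.
pv_100_next_year = present_value(100.0, RATE, 1)

# Three successive one-year renewals at $130, paid now, in one year,
# and in two years (assumes the price does not change).
pv_three_renewals = sum(present_value(130.0, RATE, t) for t in range(3))

# The three-year plan is paid in full today, so it needs no discounting.
three_year_plan = 275.0

print(f"$100 in one year is worth ${pv_100_next_year:.2f} today")
print(f"Three yearly renewals: ${pv_three_renewals:.2f} in today's dollars")
print(f"Three-year plan:       ${three_year_plan:.2f} in today's dollars")
```

Changing `RATE` changes the comparison, which is the point: whether a longer commitment is a good deal depends on what else the money could do in the meantime.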
