Let’s step back for a moment and consider how far we’ve come. At this point, we’ve created a database of records: the shelve and per-record pickle file approaches of the prior section both suffice for basic data storage tasks. As is, our records are represented as simple dictionaries, which provide easier-to-understand access to fields than do lists (by key, rather than by position). Dictionaries, however, still have some limitations that may become more critical as our program grows over time.
For one thing, there is no central place for us to collect record processing logic. Extracting last names and giving raises, for instance, can be accomplished with code like the following:
>>> import shelve
>>> db = shelve.open('people-shelve')
>>> bob = db['bob']
>>> bob['name'].split()[-1]                 # get bob's last name
'Smith'
>>> sue = db['sue']
>>> sue['pay'] *= 1.25                      # give sue a raise
>>> db['sue'] = sue
This works, and it might suffice for some short programs. But if we ever need to change the way last names and raises are implemented, we might have to update this kind of code in many places in our program. In fact, even finding all such magical code snippets could be a challenge; hardcoding or cutting and pasting bits of logic redundantly like this in more than one place will almost always come back to haunt you eventually.
It would be better to somehow hide—that is, encapsulate—such bits of code. Functions in a module would allow us to implement such operations in a single place and thus avoid code redundancy, but still wouldn’t naturally associate them with the records themselves. What we’d like is a way to bind processing logic with the data stored in the database in order to make it easier to understand, debug, and reuse.
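The function-based middle ground mentioned here might look like the following minimal sketch. The function names and the sample record are illustrative assumptions, not code from this chapter. The logic lives in one place, but nothing ties the functions to the dictionaries they process: the caller must know which functions apply to which records.

```python
# Hypothetical helper functions for dictionary-based records.
# The names last_name and give_raise are assumptions made for illustration.

def last_name(record):
    """Return the last word of the record's 'name' field."""
    return record['name'].split()[-1]

def give_raise(record, percent):
    """Increase the record's 'pay' field in place by the given fraction."""
    record['pay'] *= (1.0 + percent)

# An assumed sample record in the dictionary format used so far.
sue = {'name': 'Sue Jones', 'age': 45, 'pay': 40000, 'job': 'hardware'}
give_raise(sue, .25)
print(last_name(sue), sue['pay'])      # Jones 50000.0
```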
Another downside to using dictionaries for records is that they are difficult to expand over time. For example, suppose that the set of data fields or the procedure for giving raises is different for different kinds of people (perhaps some people get a bonus each year and some do not). If we ever need to extend our program, there is no natural way to customize simple dictionaries. For future growth, we’d also like our software to support extension and customization in a natural way.
If you’ve already studied Python in any sort of depth, you probably already know that this is where its OOP support begins to become attractive:
With OOP, we can naturally associate processing logic with record data—classes provide both a program unit that combines logic and data in a single package and a hierarchy that allows code to be easily factored to avoid redundancy.
With OOP, we can also wrap up details such as name processing and pay increases behind method functions—i.e., we are free to change method implementations without breaking their users.
And with OOP, we have a natural growth path. Classes can be extended and customized by coding new subclasses, without changing or breaking already working code.
That is, under OOP, we program by customizing and reusing, not by rewriting. OOP is an option in Python and, frankly, is sometimes better suited for strategic than for tactical tasks. It tends to work best when you have time for upfront planning—something that might be a luxury if your users have already begun storming the gates.
But especially for larger systems that change over time, its code reuse and structuring advantages far outweigh its learning curve, and it can substantially cut development time. Even in our simple case, the customizability and reduced redundancy we gain from classes can be a decided advantage.
OOP is easy to use in Python, thanks largely to Python’s dynamic typing model. In fact, it’s so easy that we’ll jump right into an example: Example 1-14 implements our database records as class instances rather than as dictionaries.
Example 1-14. PP4E\Preview\person_start.py
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job

if __name__ == '__main__':
    bob = Person('Bob Smith', 42, 30000, 'software')
    sue = Person('Sue Jones', 45, 40000, 'hardware')
    print(bob.name, sue.pay)

    print(bob.name.split()[-1])
    sue.pay *= 1.10
    print(sue.pay)
There is not much to this class—just a constructor method that fills out the instance with data passed in as arguments to the class name. It’s sufficient to represent a database record, though, and it can already provide tools such as defaults for pay and job fields that dictionaries cannot. The self-test code at the bottom of this file creates two instances (records) and accesses their attributes (fields); here is this file’s output when run under IDLE (a system command-line works just as well):
Bob Smith 40000
Smith
44000.0
This isn’t a database yet, but we could stuff these objects into a list or dictionary as before in order to collect them as a unit:
>>> from person_start import Person
>>> bob = Person('Bob Smith', 42)
>>> sue = Person('Sue Jones', 45, 40000)
>>> people = [bob, sue]                     # a "database" list
>>> for person in people:
...     print(person.name, person.pay)
...
Bob Smith 0
Sue Jones 40000
>>> x = [(person.name, person.pay) for person in people]
>>> x
[('Bob Smith', 0), ('Sue Jones', 40000)]
>>> [rec.name for rec in people if rec.age >= 45]     # SQL-ish query
['Sue Jones']
>>> [(rec.age ** 2 if rec.age >= 45 else rec.age) for rec in people]
[42, 2025]
Notice that Bob’s pay defaulted to zero this time because we didn’t pass in a value for that argument (maybe Sue is supporting him now?). We might also implement a class that represents the database, perhaps as a subclass of the built-in list or dictionary types, with insert and delete methods that encapsulate the way the database is implemented. We’ll abandon this path for now, though, because it will be more useful to store these records persistently in a shelve, which already encapsulates stores and fetches behind an interface for us. Before we do, though, let’s add some logic.
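As an aside, the database class idea floated above could be sketched roughly as follows. This is only one possible shape: the PeopleDB name, the choice of dict as the base class, and the exact behavior of insert and delete are all assumptions made for illustration.

```python
class Person:
    # Same record class as in Example 1-14, repeated to keep this sketch self-contained.
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

class PeopleDB(dict):
    """Hypothetical in-memory database keyed by a caller-supplied string."""
    def insert(self, key, person):
        # Refuse to overwrite silently; plain dict assignment would replace the record.
        if key in self:
            raise KeyError('duplicate key: %r' % key)
        self[key] = person

    def delete(self, key):
        # Remove and return the record; raises KeyError if the key is absent.
        return self.pop(key)

db = PeopleDB()
db.insert('bob', Person('Bob Smith', 42))
db.insert('sue', Person('Sue Jones', 45, 40000))
removed = db.delete('bob')
print(sorted(db), removed.name)        # ['sue'] Bob Smith
```

Because clients call insert and delete rather than touching the container directly, the storage could later change without changing their code.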
So far, our class is just data: it replaces dictionary keys with object attributes, but it doesn’t add much to what we had before. To really leverage the power of classes, we need to add some behavior. By wrapping up bits of behavior in class method functions, we can insulate clients from changes. And by packaging methods in classes along with data, we provide a natural place for readers to look for code. In a sense, classes combine records and the programs that process those records; methods provide logic that interprets and updates the data (we say they are object-oriented, because they always process an object’s data).
For instance, Example 1-15 adds the last-name
and raise logic as class methods; methods use the
self argument to access or update the
instance (record) being processed.
Example 1-15. PP4E\Preview\person.py
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

if __name__ == '__main__':
    bob = Person('Bob Smith', 42, 30000, 'software')
    sue = Person('Sue Jones', 45, 40000, 'hardware')
    print(bob.name, sue.pay)

    print(bob.lastName())
    sue.giveRaise(.10)
    print(sue.pay)
The output of this script is the same as the last, but the results are being computed by methods now, not by hardcoded logic that appears redundantly wherever it is required:
Bob Smith 40000
Smith
44000.0
One last enhancement to our records before they become
permanent: because they are implemented as classes now, they
naturally support customization through the inheritance search
mechanism in Python. Example 1-16, for instance,
customizes the last section’s
Person class in order to give a 10 percent
bonus by default to managers whenever they receive a raise (any
relation to practice in the real world is purely coincidental):
Example 1-16. PP4E\Preview\manager.py
from person import Person

class Manager(Person):
    def giveRaise(self, percent, bonus=0.1):
        self.pay *= (1.0 + percent + bonus)

if __name__ == '__main__':
    tom = Manager(name='Tom Doe', age=50, pay=50000)
    print(tom.lastName())
    tom.giveRaise(.20)
    print(tom.pay)
When run, this script’s self-test prints the following:
Doe
65000.0
Here, the Manager class appears in a module of its own, but it could have been added to the person module instead (Python doesn’t require just one class per file). It inherits the constructor and last-name methods from its superclass, but it customizes just the giveRaise method (there are a variety of ways to code this extension, as we’ll see later). Because this change is being added as a new subclass, the original Person class, and any objects generated from it, will continue working unchanged. Bob and Sue, for example, inherit the original raise logic, but Tom gets the custom version because of the class from which he is created. In OOP, we program by customizing, not by changing.
In fact, code that uses our objects doesn’t need to be at all
aware of what the raise method does—it’s up to the object to do the
right thing based on the class from which it is created. As long as
the object supports the expected interface (here, a method called
giveRaise), it will be compatible
with the calling code, regardless of its specific type, and even if
its method works differently than others.
If you’ve already studied Python, you may know this behavior as polymorphism; it’s a core property of the language, and it accounts for much of your code’s flexibility. When the following code calls the giveRaise method, for example, what happens depends on the obj object being processed; Tom gets a 20 percent raise instead of 10 percent because of the Manager class’s customization:
>>> from person import Person
>>> from manager import Manager
>>> bob = Person(name='Bob Smith', age=42, pay=10000)
>>> sue = Person(name='Sue Jones', age=45, pay=20000)
>>> tom = Manager(name='Tom Doe', age=55, pay=30000)
>>> db = [bob, sue, tom]
>>> for obj in db:
...     obj.giveRaise(.10)                  # default or custom
...
>>> for obj in db:
...     print(obj.lastName(), '=>', obj.pay)
...
Smith => 11000.0
Jones => 22000.0
Doe => 36000.0
As a first alternative, notice that we have introduced some
redundancy in Example 1-16: the raise
calculation is now repeated in two places (in the two classes). We
could also have implemented the customized
Manager class by
augmenting the inherited raise method instead
of replacing it completely:
class Manager(Person):
    def giveRaise(self, percent, bonus=0.1):
        Person.giveRaise(self, percent + bonus)
The trick here is to call back the superclass’s version of
the method directly, passing in the
self argument explicitly. We still
redefine the method, but we simply run the general version after
adding 10 percent (by default) to the passed-in percentage. This
coding pattern can help reduce code redundancy (the original raise
method’s logic appears in only one place and so is easier to
change) and is especially handy for kicking off superclass
constructor methods in practice.
If you’ve already studied Python OOP, you know that this coding scheme works because we can always call methods through either an instance or the class name. In general, the following are equivalent, and both forms may be used explicitly:
instance.method(arg1, arg2)
class.method(instance, arg1, arg2)
In fact, the first form is mapped to the second: when calling through the instance, Python determines the class by searching the inheritance tree for the method name and passes in the instance automatically. Either way, within giveRaise, self refers to the instance that is the subject of the call.
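The equivalence of the two call forms can be checked with a short sketch. The class below is a trimmed-down stand-in for the chapter’s Person (fewer constructor arguments, an assumption made to keep the example small), and the variable names are illustrative.

```python
class Person:
    # Simplified stand-in for the chapter's Person class.
    def __init__(self, name, pay=0):
        self.name = name
        self.pay = pay

    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

a = Person('Bob Smith', 1000)
b = Person('Bob Smith', 1000)

a.giveRaise(.5)               # call through the instance: self is passed automatically
Person.giveRaise(b, .5)       # call through the class: the instance is passed explicitly

print(a.pay, b.pay)           # 1500.0 1500.0
```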
For more object-oriented fun, we could also add a few
operator overloading methods to our people classes. For example, a
__str__ method, shown here,
could return a string to give the display format for our objects
when they are printed as a whole—much better than the default
display we get for an instance:
class Person:
    def __str__(self):
        return '<%s => %s>' % (self.__class__.__name__, self.name)

tom = Manager('Tom Jones', 50)
print(tom)                                  # prints: <Manager => Tom Jones>
Here, __class__ gives the lowest class from which self was made, even though __str__ may be inherited. The net effect is that __str__ allows us to print instances directly instead of having to print specific attributes. We could extend __str__ to loop through the instance’s __dict__ attribute dictionary to display all attributes generically; for this preview we’ll leave this as a suggested exercise.
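For readers who want to compare notes on that exercise, here is one possible shape for a generic display method. The output format and the decision to sort attribute names are assumptions of this sketch, not something the chapter prescribes.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

    def __str__(self):
        # Walk the instance's attribute dictionary instead of naming fields;
        # sorting makes the display independent of assignment order.
        attrs = ', '.join('%s=%r' % (key, value)
                          for key, value in sorted(self.__dict__.items()))
        return '<%s: %s>' % (self.__class__.__name__, attrs)

class Manager(Person):
    pass                      # inherits __str__ unchanged

tom = Manager('Tom Jones', 50)
print(tom)    # <Manager: age=50, job=None, name='Tom Jones', pay=0>
```

Any attribute added to an instance later shows up in the display without a change to __str__.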
We might even code an __add__ method to make + expressions automatically call the giveRaise method. Whether we should is another question; the fact that a + expression gives a person a raise might seem more magical to the next person reading our code than it should.
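To make the trade-off concrete, an __add__ method of this kind might be sketched as follows. Having it change the object in place and return self is a design assumption of this sketch; returning a new object would be an equally plausible reading.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

    def __add__(self, percent):
        # Assumed design: apply the raise in place and return the same object,
        # so that "sue = sue + .10" leaves sue bound to the updated record.
        self.giveRaise(percent)
        return self

sue = Person('Sue Jones', 45, 40000)
sue = sue + .10               # runs __add__, which runs giveRaise
print(round(sue.pay, 2))
```

The call site reads like arithmetic while actually updating a record, which is the kind of surprise the text warns about.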
Finally, notice that we didn’t pass the
job argument when making a manager in
Example 1-16; if we had,
it would look like this with keyword arguments:
tom = Manager(name='Tom Doe', age=50, pay=50000, job='manager')
The reason we didn’t include a job in the example is that it’s redundant with the class of the object: if someone is a manager, their class should imply their job title. Instead of leaving this field blank, though, it may make more sense to provide an explicit constructor for managers, which fills in this field automatically:
class Manager(Person):
    def __init__(self, name, age, pay):
        Person.__init__(self, name, age, pay, 'manager')
Now when a manager is created, its job is filled in automatically. The trick here is to call the superclass’s version of the method explicitly, just as we did for the giveRaise method earlier in this section; the only difference here is the unusual name for the constructor method.
We won’t use any of this section’s three extensions in later examples, but to demonstrate how they work, Example 1-17 collects these ideas in an alternative implementation of our Person classes.
Example 1-17. PP4E\Preview\person_alternative.py
"""
Alternative implementation of person classes, with data, behavior,
and operator overloading (not used for objects stored persistently)
"""

class Person:
    """
    a general person: data+logic
    """
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)
    def __str__(self):
        return ('<%s => %s: %s, %s>' %
               (self.__class__.__name__, self.name, self.job, self.pay))

class Manager(Person):
    """
    a person with custom raise
    inherits general lastname, str
    """
    def __init__(self, name, age, pay):
        Person.__init__(self, name, age, pay, 'manager')
    def giveRaise(self, percent, bonus=0.1):
        Person.giveRaise(self, percent + bonus)

if __name__ == '__main__':
    bob = Person('Bob Smith', 44)
    sue = Person('Sue Jones', 47, 40000, 'hardware')
    tom = Manager(name='Tom Doe', age=50, pay=50000)
    print(sue, sue.pay, sue.lastName())
    for obj in (bob, sue, tom):
        obj.giveRaise(.10)                  # run this obj's giveRaise
        print(obj)                          # run common __str__ method
Notice the polymorphism in this module’s self-test loop: all three objects share the constructor, last-name, and printing methods, but the raise method called is dependent upon the class from which an instance is created. When run, Example 1-17 prints the following to standard output—the manager’s job is filled in at construction, we get the new custom display format for our objects, and the new version of the manager’s raise method works as before:
<Person => Sue Jones: hardware, 40000> 40000 Jones
<Person => Bob Smith: None, 0.0>
<Person => Sue Jones: hardware, 44000.0>
<Manager => Tom Doe: manager, 60000.0>
Such refactoring (restructuring) of code is common as class hierarchies grow and evolve. In fact, as is, we still can’t give someone a raise if his pay is zero (Bob is out of luck); we probably need a way to set pay, too, but we’ll leave such extensions for the next release. The good news is that Python’s flexibility and readability make refactoring easy—it’s simple and quick to restructure your code. If you haven’t used the language yet, you’ll find that Python development is largely an exercise in rapid, incremental, and interactive programming, which is well suited to the shifting needs of real-world projects.
It’s time for a status update. We have now encapsulated customizable implementations of our records and their processing logic in the form of classes. Making our class-based records persistent is a minor last step. We could store them in per-record pickle files again; a shelve-based storage medium will do just as well for our goals and is often easier to code. Example 1-18 shows how.
Example 1-18. PP4E\Preview\make_db_classes.py
import shelve
from person import Person
from manager import Manager

bob = Person('Bob Smith', 42, 30000, 'software')
sue = Person('Sue Jones', 45, 40000, 'hardware')
tom = Manager('Tom Doe', 50, 50000)

db = shelve.open('class-shelve')
db['bob'] = bob
db['sue'] = sue
db['tom'] = tom
db.close()
This file creates three class instances (two from the original class and one from its customization) and assigns them to keys in a newly created shelve file to store them permanently. In other words, it creates a shelve of class instances; to our code, the database looks just like a dictionary of class instances, but the top-level dictionary is mapped to a shelve file again. To check our work, Example 1-19 reads the shelve and prints fields of its records.
Example 1-19. PP4E\Preview\dump_db_classes.py
import shelve
db = shelve.open('class-shelve')
for key in db:
    print(key, '=>\n ', db[key].name, db[key].pay)

bob = db['bob']
print(bob.lastName())
print(db['tom'].lastName())
Note that we don’t need to reimport the
Person class here in order to fetch its
instances from the shelve or run their methods. When instances are
shelved or pickled, the underlying pickling system records both
instance attributes and enough information to locate their classes
automatically when they are later fetched (the class’s module simply
has to be on the module search path when an instance is loaded).
This is on purpose; because the class and its instances in the
shelve are stored separately, you can change the class to modify the
way stored instances are interpreted when loaded (more on this later
in the book). Here is the shelve dump script’s output just after
creating the shelve with the maker script:
bob =>
  Bob Smith 30000
sue =>
  Sue Jones 40000
tom =>
  Tom Doe 50000
Smith
Doe
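The separation of stored instances from their class, described just before this output, can be seen with pickle alone. In the sketch below the class is a simplified stand-in for Person, and replacing a method at runtime simulates editing the class’s source file between program runs; both are assumptions made to keep the example in a single file.

```python
import pickle

class Person:
    # Simplified stand-in for the chapter's Person class.
    def __init__(self, name, pay=0):
        self.name = name
        self.pay = pay

    def lastName(self):
        return self.name.split()[-1]

# The pickled bytes hold the instance's attributes plus the class's module
# and name, not the code of the class itself.
data = pickle.dumps(Person('Bob Smith', 30000))

# Simulate changing the class after the instance was saved.
def shouting_last_name(self):
    return self.name.split()[-1].upper()

Person.lastName = shouting_last_name

bob = pickle.loads(data)      # the class is looked up again at load time
print(bob.lastName())         # SMITH
```

The loaded object keeps its saved attribute values but picks up the current version of the method.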
As shown in Example 1-20, database updates are as simple as before (compare this to Example 1-13), but dictionary keys become attributes of instance objects, and updates are implemented by class method calls instead of hardcoded logic. Notice how we still fetch, update, and reassign to keys to update the shelve.
Example 1-20. PP4E\Preview\update_db_classes.py
import shelve
db = shelve.open('class-shelve')

sue = db['sue']
sue.giveRaise(.25)
db['sue'] = sue

tom = db['tom']
tom.giveRaise(.20)
db['tom'] = tom
db.close()
And last but not least, here is the dump script again after running the update script; Tom and Sue have new pay values, because these objects are now persistent in the shelve. We could also open and inspect the shelve by typing code at Python’s interactive command line; despite its longevity, the shelve is just a Python object containing Python objects.
bob =>
  Bob Smith 30000
sue =>
  Sue Jones 50000.0
tom =>
  Tom Doe 65000.0
Smith
Doe
Tom and Sue both get a raise this time around, because they are persistent objects in the shelve database. Although shelves can also store simpler object types such as lists and dictionaries, class instances allow us to combine both data and behavior for our stored items. In a sense, instance attributes and class methods take the place of records and processing programs in more traditional schemes.
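The fetch, update, and reassign pattern used by the update script matters because, by default, each fetch from a shelve returns a fresh in-memory copy of the stored object. The following sketch shows the difference. It uses a temporary directory and a file name of its own, both assumptions chosen so it does not touch the chapter’s class-shelve file.

```python
import os
import shelve
import tempfile

class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo-shelve')     # throwaway shelve file

    db = shelve.open(path)
    db['sue'] = Person('Sue Jones', 45, 40000, 'hardware')

    db['sue'].giveRaise(.25)      # changes a temporary copy only
    unchanged = db['sue'].pay     # the stored record is as before

    sue = db['sue']               # fetch
    sue.giveRaise(.25)            # update in memory
    db['sue'] = sue               # reassign: stores the changed object
    db.close()

    db = shelve.open(path)        # reopen, as a later program run would
    updated = db['sue'].pay
    db.close()

print(unchanged, updated)         # 40000 50000.0
```

The shelve module also accepts a writeback=True argument to shelve.open that caches fetched objects and writes them back on close, at some cost in memory; the explicit reassignment shown here is the pattern this chapter uses.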
At this point, we have a full-fledged database system: our classes simultaneously implement record data and record processing, and they encapsulate the implementation of the behavior. And the Python pickle and shelve modules provide simple ways to store our database persistently between program executions. This is not a relational database (we store objects, not tables, and queries take the form of Python object processing code), but it is sufficient for many kinds of programs.
If we need more functionality, we could migrate this application to even more powerful tools. For example, should we ever need full-blown SQL query support, there are interfaces that allow Python scripts to communicate with relational databases such as MySQL, PostgreSQL, and Oracle in portable ways.
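As a rough taste of what the SQL route looks like, the sketch below uses the standard library’s sqlite3 module with an in-memory database. SQLite stands in here for the server-based systems named above, and the table layout and column names are assumptions invented for this illustration.

```python
import sqlite3

# An in-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(':memory:')
curs = conn.cursor()

# Assumed table layout mirroring the fields of our Person records.
curs.execute('create table people (name text, age integer, pay real, job text)')
rows = [('Bob Smith', 42, 30000, 'software'),
        ('Sue Jones', 45, 40000, 'hardware'),
        ('Tom Doe',   50, 50000, 'manager')]
curs.executemany('insert into people values (?, ?, ?, ?)', rows)
conn.commit()

# The SQL counterpart of the list comprehension query used earlier.
curs.execute('select name from people where age >= ? order by name', (45,))
names = [row[0] for row in curs.fetchall()]
print(names)                  # ['Sue Jones', 'Tom Doe']
conn.close()
```

Here the records are rows rather than objects, so logic such as giveRaise would have to live outside the stored data.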
ORMs (object relational mappers) such as SQLObject and SQLAlchemy offer another approach that retains the Python class view but translates it to and from relational database tables. In a sense, this provides the best of both worlds, with Python class syntax on top and enterprise-level databases underneath.
Moreover, the open source ZODB system provides a more comprehensive object database for Python, with support for features missing in shelves, including concurrent updates, transaction commits and rollbacks, automatic updates on in-memory component changes, and more. We’ll explore these more advanced third-party tools in Chapter 17. For now, let’s move on to putting a good face on our system.