# code for loading the format for the notebook
import os
# path : store the current path to convert back to it later
path = os.getcwd()
os.chdir(os.path.join('..', 'notebook_format'))
from formats import load_style
load_style(plot_style=False)
os.chdir(path)
# 1. magic to print version
# 2. magic so that the notebook will reload external python modules
%load_ext watermark
%load_ext autoreload
%autoreload 2
import json
import xml.etree.ElementTree as et
%watermark -a 'Ethen' -d -t -v
Let's look at an example where we need to convert the Song object to a string representation according to a user-specified format parameter.
class Song:
"""
By default Python uses a dict (__dict__) to store an object’s instance attributes.
This is really helpful as it allows the user to set arbitrary new attributes at runtime.
However, for small classes with known attributes it might be a bottleneck as dict wastes a lot of RAM
due to the fact that Python can’t just allocate a static amount of memory at object creation to store
all the attributes. Therefore it sucks a lot of RAM if we create a lot of objects. One way to circumvent
this issue involves the usage of __slots__ to tell Python not to use a dict and only allocate space for
a fixed set of attributes. By adding it we no longer have the ability to add new attributes to
the class at run time.
http://book.pythontips.com/en/latest/__slots__magic.html
"""
__slots__ = ['song_id', 'title', 'artist']
def __init__(self, song_id, title, artist):
self.song_id = song_id
self.title = title
self.artist = artist
def serialize_song(song, format):
if format == 'JSON':
song_info = {
'id': song.song_id,
'title': song.title,
'artist': song.artist
}
return json.dumps(song_info)
elif format == 'XML':
song_info = et.Element('song', attrib={'id': song.song_id})
title = et.SubElement(song_info, 'title')
title.text = song.title
artist = et.SubElement(song_info, 'artist')
artist.text = song.artist
return et.tostring(song_info, encoding='unicode')
else:
raise ValueError(format)
song = Song('1', 'Water of Love', 'Dire Straits')
print(serialize_song(song, 'JSON'))
print(serialize_song(song, 'XML'))
The code above works fine but can benefit from refactoring. One of the best practices behind writing clean code is Single Responsibility Principle. Here, instead of using a complex if/elif/else conditional structure to determine the concrete implementation, the application delegates that decision to a separate component that creates the concrete object. Then concrete implementation of the interface is identified by some parameter. With this approach, we can have a class/method that does one thing only and one thing well, making it more reusable and easier to maintain.
This type of creational design pattern is so called Factory Design Pattern.
Let's take a look at how we can refactor the code above. The first step when we see complex conditional code in an application is to identify the common goal of each of the execution paths (or logical paths), and separate out implementations for each logical path. With the factory pattern, we let the client, our serialize
method depend on a creator, get_serializer
method, which returns the actual implementation using some sort of identifier.
def serialize_song(song, format):
serializer = get_serializer(format)
return serializer(song)
def get_serializer(format):
if format == 'JSON':
return _serialize_to_json
elif format == 'XML':
return _serialize_to_xml
else:
raise ValueError(format)
def _serialize_to_json(song):
payload = {
'id': song.song_id,
'title': song.title,
'artist': song.artist
}
return json.dumps(payload)
def _serialize_to_xml(song):
song_element = et.Element('song', attrib={'id': song.song_id})
title = et.SubElement(song_element, 'title')
title.text = song.title
artist = et.SubElement(song_element, 'artist')
artist.text = song.artist
return et.tostring(song_element, encoding='unicode')
song = Song('1', 'Water of Love', 'Dire Straits')
serialize_song(song, 'JSON')