From charlesreid1

Line 1: Line 1:
=Goodrich Python Chapter 5 Notes=
=Notes=
 
==Notes from Goodrich et al Data Structures in Python Chapter 5==


Basic usage of lists, strings, and tuples seems straightforward, but there are ''several important subtleties regarding behaviors associated with these classes''
Basic usage of lists, strings, and tuples seems straightforward, but there are ''several important subtleties regarding behaviors associated with these classes''

Revision as of 05:34, 29 May 2017

Notes

Notes from Goodrich et al Data Structures in Python Chapter 5

Basic usage of lists, strings, and tuples seems straightforward, but there are several important subtleties regarding behaviors associated with these classes

Basics

encapsulation - noting when the user of the class does not need to know the internal details of the implementation

byte - a typical unit of storage, equivalent to 8 bits

memory address - how the computer keeps track of what information is stored in what byte

Individual bytes of memory can be stored or retrieved in O(1) time

Python characters are represented using the Unicode character set. Python internally represents each Unicode character with 16 bits (2 bytes). Thus, a six character string would be stored in 12 consecutive bytes in memory.

Can use the sys.getsizeof() function from the sys module to get the size in bytes of a particular variable.

See Arrays/Python/Sizeof for details and script examples.

Referential lists

Python lists are referential - they contain pointers to addresses in memory

The number of bits used to store the memory address of each element of a list is fixed - 64 bits per address. The None object can be used as an element of the list to represent an empty/missing item, it still takes up 64 bits of address space.

Referential nature is important to understanding semantics:

Integer lists are going to refer to an immutable object in memory, representing that particular integer value. So, if we have two or three lists of integers, and they share numbers, those are actually addresses in memory, that point to immutable values of integers.

When we run a command like counters[2] += 1 we technically do not change the value of the existing integer, we end up pointing to space in memory for the new, incremented integer value.

If we want to create a new list as a copy, we can do a shallow copy or a deep copy.

  • Shallow copy: backup = list(primes) - this produces a new list that references the same addresses in memory as the original list
  • Deep copy: from copy import deep_copy; backup = deep_copy(primes) - this produces a new version that is duplicated element-by-element

if we want to add all the elements from one list onto the end of another list, we use the extend() command, as in a.extend(b)

Compact arrays

By default, most lists are referential. However, string arrays are not referential - they store the 16-bit Unicode character (2 bytes apiece) instead of a 64-bit address (8 bytes), making it cheaper. It also uses contiguous memory.

A referential integer array, on the other hand, can potentially use up a whole lot of space. For example, each time we add a new item to an integer list, it has to set aside space in memory for the value of that new integer (if it has not already), and then it has to set aside space for the 64-bit (or 8-byte) address where that integer lives.

To illustrate - if we want to store a sequence of 1M 64-bit integers, it won't take 64 x 1M = 64 M bits. It will actually take 4-5 times that amount of space.

Each new element creates a new 64-bit address. But that address has to point to an integer value, which is stored somewhere else in memory.

We can use the getsizeof method to get the actual size of things in memory (see Arrays/Python/Sizeof).

Compact structure forces the integers to be stored directly in memory, and not as references.

The advantage to compact structures is high performance. They live closer in memory.

To use, import the module named array. The array constructor requires a type code to be specified for the first argument.

primes = array('i', [2,3,5,7,11,13,17,19])

See Arrays/Python/CompactArrays

Goodrich Python Chapter 5 Questions

R-5.1, R-5.2, R-5.3: Arrays/Python/Sizeof

R-5.4: Arrays/Python/DynamicArray

R-5.10: Arrays/Python/CaesarCipher