Programming in Python

Youjun Hu

Institute of Plasma Physics, Chinese Academy of Sciences

Email: yjhu@ipp.cas.cn

1.Introduction

Python's emphasis on flexibility and readability, combined with high-level built-in data structures, dynamically typed variables, and versatile third-party libraries that are easily accessible, makes it attractive for rapid prototyping, as well as for use as a scripting language to connect existing components together.

Python is case sensitive. Python was designed for convenience and hence uses dynamically typed variables. Python is lexically scoped, which means the binding of a free variable inside a function can be inferred from where the function is defined, without considering where the function is called (but the value of the free variable can depend on where and when the function is called, because the free variable can be rebound elsewhere before the call).

Python functions are first-class citizens, which means: they can be treated like any other variable, can be passed to a function, and can be returned from a function.

Python's syntax emphasizes readability and is similar to natural English (in contrast to Perl). Python uses indentation for statement grouping. Whitespace (spaces and tabs) at the beginning of a logical line is used to compute the indentation level of the line, which is used to determine the grouping of statements (i.e., code blocks such as loops, branches, and function/class bodies). Indentation-based syntax is believed to enhance source code readability, since humans intuitively use visual layout to perceive logical structure.

Other reasons why Python is popular: All Python interpreter releases are open source. Python has an active community and lots of libraries and documentation. Python works on many platforms (e.g., Linux, Windows, Mac).

Comments in Python: (1) anything after a hash (#) is a comment. A comment signifies the end of the logical line unless the implicit line joining rules are invoked. (2) triple quotes introduce string literals, which can span multiple lines and thus can serve as multi-line comments.

A Python statement ends at a hash or at the end of a line (except inside open brackets, quotes, or parentheses). A semicolon can be used to put multiple statements on the same line. (This works only for simple statements and does not work for compound statements, due to the requirements of the layout syntax.) This syntax makes it legal to put a semicolon at the end of a single statement:

print("hello");

This statement can be read as print("hello") followed by an empty statement; strictly speaking, the grammar simply permits an optional trailing semicolon after a simple statement. Some interactive environments (e.g., IPython and Jupyter) treat a trailing semicolon as a hint to suppress the display of the result.

1.1.Python reserved words

The reserved words (keywords) in Python are as follows.

Value: True, False, None

Operator: and, or, not, in, is

Control flow: if, elif, else, for, while, break, continue, pass, return, yield

Structure definition: def, class, lambda, with, as

Variable handling: del, global, nonlocal

Import module: import, from, as

Exception-handling: try, except, raise, finally, else, with, assert

Asynchronous programming: async, await

The keywords else and as have additional uses beyond their initial use cases. The else keyword is used with conditionals and loops, as well as with try and except. The as keyword is used in import statements as well as in with statements and except clauses.

Some identifiers are only reserved under specific contexts. These are known as soft keywords. The identifiers match, case and _ can syntactically act as keywords in contexts related to the pattern matching statement, but this distinction is done at the parser level, not when tokenizing.

2.Run Python code

2.1.Interactive mode

2.2.Script file mode

3.Python modules

In Python, a plain text file containing Python code that is intended to be directly executed by the user is usually called a script, which is an informal term that means a top-level program file. On the other hand, a plain text file containing Python code that is intended to be imported and used from another Python file (or from the interactive mode) is called a module.

3.1.Install a new module

The command-line tool pip (packaged as python3-pip on some Linux distributions) can be used to install a new package. For example, in a Linux terminal,

pip3 install numpy

will install the numpy package.

3.2.Use modules

Use import to get access to Python modules. For example:

import foo

Python then searches for a module named foo: after checking the built-in modules, it looks for a file named foo.py in the directories listed in sys.path, which starts with the directory of the running script (or the current directory in interactive mode). If the file is found, it is loaded. Here foo is called the module name.

It is customary to place all import statements at the beginning of a Python file. More examples:
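The search order can be inspected from within Python. The sketch below prints the first few entries of sys.path and asks importlib where the standard-library module json would be loaded from; json is used here only as an example of a module that is always available.

```python
import importlib.util
import sys

# sys.path is the ordered list of locations searched for modules.
for directory in sys.path[:3]:
    print(repr(directory))

# find_spec reports where a top-level module would be loaded from,
# without importing it.
spec = importlib.util.find_spec("json")
print(spec.name)     # json
print(spec.origin)   # location of the json package's __init__.py
```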

import numpy as np
import matplotlib.pyplot as plt

Modules can import other modules. A module usually contains only function definitions, but it can also contain executable statements. These statements are intended to initialize the module. They are executed only the first time the module is imported in a program.

Each module has its own namespace, which is used as the global namespace by all functions defined in the module. Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user's global variables (this rule is just the lexical scoping). A user of a module can access global variables of a module by prefixing the variable with the namespace, i.e., modname.itemname.

We can also import names from a module directly into the importing module's symbol table. For example:

from numpy import sin

which imports a specific function into the current namespace. In this case, the name numpy itself is not bound in the current namespace, so sin cannot be accessed as numpy.sin.

Another form of import is

from math import *

This imports all names (except those beginning with an underscore) defined in the math module directly into the calling module's namespace. In most cases Python programmers do not use this form, since it introduces an unknown set of names into the namespace and may shadow names you have already defined if there are name collisions.

4.Objects, types, values, and name binding

Objects are Python's abstraction for data. All data in a Python program is represented by objects. Every object has an identity, a type, and a value.

An object's identity never changes once it has been created; you may think of it as the object's address in memory. The id() function returns an integer representing its identity. The is operator compares the identity of two objects.

An object's type determines the operations that the object supports and also defines the possible values for objects of that type. The type() function returns an object's type. Like its identity, an object's type is also unchangeable.

Some objects contain other objects; these are called containers. Examples of containers are tuples, lists, sets, and dictionaries.

Objects whose (first-level) elements are unchangeable once they are created are called immutable. An immutable object can contain an element that is a mutable object. When the elements of the contained mutable object are changed, the value of the immutable object changes as well. However, the container is still considered immutable because the identities (the id() values) of its elements do not change. So, immutability is not the same as having an unchangeable value.
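A small sketch of this distinction, using a tuple that holds a list (the variable names are only illustrative):

```python
t = (1, [2, 3])
inner_id = id(t[1])

# Mutating the list held by the tuple changes the tuple's value...
t[1].append(4)
print(t)                        # (1, [2, 3, 4])

# ...but the tuple still refers to the very same list object.
print(id(t[1]) == inner_id)     # True

# Rebinding one of the tuple's own slots is not allowed.
try:
    t[0] = 5
except TypeError as err:
    print("TypeError:", err)
```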

An object's mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.

For immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists.

In CPython (the reference implementation), every object is represented by a C data structure, no matter whether the object is something simple such as an integer or something more complicated such as a class.

At a high level of abstraction, concepts in Python can be categorized into two kinds: names and objects. Names refer to objects. Names are usually called variables. A name is introduced by a name binding operation, which binds the name to an object, i.e., names the object. The following constructs bind names: assignments, class definitions, function definitions, formal parameters to functions, import statements, for loop headers, and capture patterns in structural pattern matching.

(The physical representation of a name is most likely a pointer, but that's simply an implementation detail. Name is actually an abstract notion at heart.)

Objects have individuality, and multiple names can be bound to the same object. This is known as aliasing.

Let us first discuss the most basic name binding operation, assignment:

a = 1
b = 1

Then a and b may or may not refer to the same object (this can be checked with the built-in function id() or using a is b, which, in my case, returns True).

a = []
b = []

Here two objects (empty lists) are created in memory, and are named as a and b, respectively. Therefore a and b are guaranteed to refer to two different objects. Consider another example:

a = []
b = a

Here the second line gives an alias to the object named a, rather than copying the object. Therefore a and b refer to the same object.

Objects are never explicitly destroyed; they may be implicitly destroyed: when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether. It is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
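The consequence of aliasing is easiest to see by mutating the object through one of its names. The following sketch also contrasts an alias with a shallow copy made by list():

```python
a = []
b = a              # alias: a second name for the same list
b.append(1)
print(a)           # [1]
print(a is b)      # True

c = []
d = []
print(c == d)      # True: equal values
print(c is d)      # False: two distinct objects

e = list(a)        # shallow copy: a new list with the same elements
e.append(2)
print(a, e)        # [1] [1, 2]
```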

Python assignment statements do not return values.

Python adopts dynamic typing, which means type checking happens at run time. In dynamically typed languages, the type is associated with the object that a variable refers to rather than with the variable itself. Therefore there is no type declaration for variables, and a variable referring to an object of one type can later be rebound to an object of another type.
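A minimal illustration: the same name is rebound to objects of different types, and a type error shows up only when the offending operation is actually executed.

```python
x = 1
print(type(x).__name__)    # int

x = "one"                  # the same name now refers to a str object
print(type(x).__name__)    # str

# The mismatch is detected at run time, when the operation is attempted.
try:
    x + 1
except TypeError as err:
    print("TypeError:", err)
```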

Here is the type hierarchy of Python:

* Numbers:

  * Integral: integers (int), Booleans (bool).

  * Non-integral: floating point numbers, Decimals and Fractions, complex numbers.

* Collections:

  * Sequences: string, tuple, list, bytes.

  * Sets: set.

  * Maps: dictionary.

* Callables: functions, generators, classes, methods.

* Singletons: None, NotImplemented, Ellipsis (...).

* Exceptions.

4.1.Parallel assignments, comma operator, unpacking

a, b = 0, 1

This is known as parallel assignment. If the right-hand side of the assignment is a single variable (e.g. a list or tuple), the feature is called unpacking or destructuring assignment:

a, b = (2, 3)

This unpacks a tuple of two elements into two variables.

In a for loop, unpacking can be used along with the built-in zip(), which allows you to iterate through two or more sequences at the same time. On each iteration, zip() yields a tuple that collects one element from each of the sequences:

>>> first = ["a", "b", "c"]
>>> second = ["d", "e", "f"]
>>> third = ["g", "h", "i"]
>>> for one, two, three in zip(first, second, third):
...     print(one, two, three)
...
a d g
b e h
c f i

More examples of unpacking:

(a, b) = (10, (20,30))

a, b= (10, (20,30))

[x,y] = [2,3]

x,y= [2,3]

a,b,c = 'Hey'

p, *q = "Hello" # p will be "H" and q will be a list of the remaining characters

p, *q, j = [1,2,3,4,5]
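The results of some of these forms can be checked directly. The sketch below repeats a few of them with more descriptive (purely illustrative) names; note that a starred target always receives a list, even when the right-hand side is a string or tuple.

```python
a, b = (10, (20, 30))
print(a, b)               # 10 (20, 30)

c1, c2, c3 = "Hey"
print(c1, c2, c3)         # H e y

head, *rest = "Hello"
print(head, rest)         # H ['e', 'l', 'l', 'o']

first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)   # 1 [2, 3, 4] 5
```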

This kind of unpacking appears often in everyday coding, in a disguised way, when several variables are bound to the return value of a function that returns multiple items. For example:

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
fig, ((ax1, ax2),(ax3,ax4)) = plt.subplots(nrows=2, ncols=2)

5.Flow control

5.1.Logical expressions

Operation                   Operator
Equals                      ==
Not equals                  !=
Less than                   <
Less than or equal to       <=
Greater than                >
Greater than or equal to    >=
Belongs to                  in
Object identity             is

Table 1. Relational operators

Operation      Operator
Logical and    and
Logical or     or
Logical not    not

Table 2. Logical operators

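The relational operators above and the not operator return values of logical type (called bool in Python, which is a subclass of the integer type). The and and or operators return one of their operands, which is a bool whenever both operands are bools. Logical expressions can be used in several ways, most commonly in conditionals and loops.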
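A few one-line checks of how these operators behave, as a sketch:

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
print(isinstance(True, int))          # True
print(True + True)                    # 2

# Comparisons and "not" produce bool values.
print(3 < 5, type(3 < 5).__name__)    # True bool
print(not "a")                        # False

# "and" / "or" return one of their operands, not necessarily a bool.
print(0 or "default")                 # default
print("a" and "b")                    # b

# Comparisons can be chained.
print(1 < 2 < 3)                      # True
```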

5.2.Conditionals

The Python if structure takes the following form:

if a==0:
    print("a is zero")
elif a<0:
    print("a is negative")
else:
    print("a is positive")

The blocks in an if structure do not introduce a new scope. Therefore, if new variables are defined in the blocks, they are defined in the enclosing namespace (the global namespace for module-level code) and thus will still be visible outside the if structure. The same applies to the loop structure. (This is different from C.)

One-line if structure:

if a > b: print("a is greater than b")

If you have only one statement to execute, one for if, and one for else, you can put it all on the same line:

print("A") if a > b else print("B")

The advantage of the one-line if-else form (a conditional expression) is that it yields a value, which can be passed seamlessly to other operations (functional style). For example:

abs_a = (a if a>=0 else -a)

Another usage of the one-line if structure is in the list comprehension (although not very readable):

>>> original_prices = [1.25, -9.45, 10.22, 3.78, -5.92, 1.16]
>>> prices = [i if i > 0 else 0 for i in original_prices]
>>> prices
[1.25, 0, 10.22, 3.78, 0, 1.16]

A more readable way would be defining a function and using that function in the comprehension:

def f(i):
    return i if i>0 else 0
prices = [f(i) for i in original_prices]

5.3.Loop structure

Python has two primitive loops: the while loop and the for loop.

5.3.1.While-loop

i = 1  # a while loop requires the relevant variables to be ready
while i < 6:
    print(i)
    i = i + 1

The above while loop is similar to those in Fortran and C.

5.3.2.Collection-based loop: For-loop

Another python loop structure is the for loop, which iterates over an iterable object, capturing each element to a local variable for use by the attached block. For example:

fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)

In the above, x acts as a dummy variable, which does not need to be defined before the loop. The for loop does not form a new scope. This means that the variable x is accessible outside the loop. On exit from the for loop, the value of x is equal to the value obtained in the last iteration. If x is defined before the for loop, its value will be modified by the for loop.
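The following sketch shows both effects: a name bound before the loop is overwritten, and names bound inside the loop survive it. It also shows a corner case that follows from the same rule: if the iterable is empty, the loop variable is never bound (the name item_never_bound is just an illustration and is assumed not to exist beforehand).

```python
x = "initial"
for x in ["apple", "banana", "cherry"]:
    last_len = len(x)

print(x)           # cherry: the loop rebound the existing name
print(last_len)    # 6: names bound in the loop body remain visible

# With an empty iterable the body never runs and the variable is never bound.
for item_never_bound in []:
    pass
try:
    item_never_bound
except NameError as err:
    print("NameError:", err)
```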

Note that the keyword in is used in Python for two different (but related) purposes: (1) The in keyword is used to check if a value is present in a sequence (list, range, string etc.). (2) Combined with for, it is used to iterate through a sequence.

The continue statement skips the remainder of the current iteration and continues with the next. The break statement exits a loop.

The advantage of collection-based iteration is that it helps avoid the off-by-one error, one of the most common bugs in index-based loops in other programming languages. Collection-based iteration also appears in shell languages such as bash.

Collection-based loops can also be nested. For example:

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for row in list_of_lists:  # avoid naming the variable "list", which would shadow the built-in
    for x in row:
        print(x)

The collection-based for loop is unlike the for loop in C, which looks like the following:

for(i=start; i<some_threshold; some_operation_modifying_i) {loop_body}

and which is essentially the while loop discussed above.
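When an index-driven loop in the C style is really needed, the usual Python idiom is to iterate over a range object, and enumerate() supplies an index alongside each element. A short sketch (the values of start, stop, and step are arbitrary):

```python
start, stop, step = 0, 10, 3

# Roughly equivalent to the C loop: for (i = start; i < stop; i += step)
for i in range(start, stop, step):
    print(i)                  # prints 0, 3, 6, 9 on separate lines

# enumerate() pairs each element with its index, so no manual counter is needed.
fruits = ["apple", "banana", "cherry"]
for index, fruit in enumerate(fruits):
    print(index, fruit)
```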

6.Functions

6.1.Define and call a function

Use the keyword def to define a function:

def func_name ( arg1, arg2, argN ):
    some_codes
    return something

Call a function:

func_name(arg1, arg2, argN) #the returned value is not captured
a = func_name(arg1, arg2, argN)

Note that a bare function name without a parameter tuple refers to a function object rather than calling a function.

Functions can return multiple items:

def func():
    return 'a', 3, (1,2,3)  # returns a tuple of 3 elements
x1, x2, x3 = func()  # unpacks the tuple of 3 elements into 3 vars
# x1: 'a'
# x2: 3
# x3: (1,2,3)

6.2.Arguments are passed by assignment

Some languages (e.g. Fortran) handle function arguments as references to existing variables, which is known as pass by reference. Other languages (e.g. C) handle them as independent values, an approach known as pass by value.

Python assigns a unique identifier to each object, and this identifier can be found by using Python's built-in id() function. It is easy to verify that the actual and formal arguments in a function call have the same id value, which indicates that the dummy argument and the actual argument refer to the same object.

Note that the actual argument and the corresponding dummy argument are two names referring to the same object. If you re-bind a dummy argument to a new value/object in the function scope, this does not affect the fact that the actual argument still points to the original object, because the actual argument and the dummy argument are two separate names.

Using the semantics of assignment in Python, the above facts can be summarized as “arguments are passed by assignment”, i.e.,

dummy_argument = actual_argument

If you re-bind dummy_argument to a new object in the function body, the actual_argument still refers to the original object. If you use dummy_argument[0] = some_thing, then this will also modify actual_argument[0]. Therefore the effect of “pass by reference” can be achieved by modifying the components/attributes of the object reference passed in. (In practice, returning multiple values from functions is usually better than employing the effect of passing by reference.)
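These three facts can be checked with a few lines. The helper functions below (same_object, rebind, mutate) are hypothetical names introduced only for this sketch.

```python
def same_object(arg, expected_id):
    # The formal argument refers to the very object the caller passed in.
    return id(arg) == expected_id

def rebind(seq):
    seq = [0]        # rebinds the local name only; the caller is unaffected

def mutate(seq):
    seq[0] = 99      # modifies the shared object; the caller sees the change

data = [1, 2, 3]
print(same_object(data, id(data)))   # True

rebind(data)
print(data)                          # [1, 2, 3]

mutate(data)
print(data)                          # [99, 2, 3]
```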

For comparison with other languages, you can say that Python passes arguments by value in the same way as C does when a pointer is passed: what looks like passing "by reference" is actually passing the reference (i.e., the pointer) by value.

6.3.Parameter list

The format of the parameter list of a function definition determines the possible ways in which the user provides actual arguments. The following are typical parameter lists (the bodies are elided with ...):

def f(x, y): ...  #case1
def f(x, y=0): ...  #case2: optional parameters with default values
def f(x, y, *, z): ...  #case3
def f(x, y, /, w, q, *, z): ...  #case4
def f(x, y, *arg, **kwarg): ...  #case5
def f(x, y=0, *arg, **kwarg): ...  #case6

For case1, f can be called with positional arguments (e.g., f(2, 3)), keyword arguments (e.g., f(x=2, y=3)), or in a mixed way (e.g., f(2, y=3)). In the mixed way, positional arguments must go first, followed by keyword arguments. The advantage of keyword arguments is that they may be specified in arbitrary order.

In case2, we use par = value to define a parameter with a default value. Default parameters are introduced to allow some arguments to be omitted when the function is called (i.e., they are optional). Non-default parameters are not allowed after a default parameter. For example, def f(x, y=0, z): will generate a SyntaxError.

Default parameter values are evaluated only once, when the function is defined (that is, when the def statement is executed). If you specify a mutable object as the default value for a parameter and the function is called multiple times with that argument omitted, then the effects of operations on the object accumulate across calls (e.g., appending to a list).
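The accumulation effect, and the common workaround of using None as a sentinel, can be sketched as follows (append_shared and append_fresh are illustrative names):

```python
def append_shared(item, bucket=[]):
    # The same list object is reused on every call that omits bucket.
    bucket.append(item)
    return bucket

print(append_shared(1))    # [1]
print(append_shared(2))    # [1, 2]

def append_fresh(item, bucket=None):
    # A new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_fresh(1))     # [1]
print(append_fresh(2))     # [2]
```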

Note that optional/default parameters and keyword arguments take similar forms. The difference is that optional/default parameters belong to the function definition while keyword arguments belong to the function invocation.

Cases 3 and 4: A bare star “*” indicates that all the following parameters are keyword-only parameters. A “/” indicates that all parameters before it are positional-only parameters.
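A sketch of how a signature like case4 constrains the caller (positional-only parameters require Python 3.8 or later; the function body here is only a placeholder that returns its arguments):

```python
def f(x, y, /, w, q, *, z):
    return (x, y, w, q, z)

# x, y: positional only; w, q: either way; z: keyword only.
print(f(1, 2, 3, 4, z=5))          # (1, 2, 3, 4, 5)
print(f(1, 2, q=4, w=3, z=5))      # (1, 2, 3, 4, 5)

# Passing a positional-only parameter by keyword is rejected.
try:
    f(1, y=2, w=3, q=4, z=5)
except TypeError as err:
    print("TypeError:", err)

# Passing a keyword-only parameter positionally is rejected too.
try:
    f(1, 2, 3, 4, 5)
except TypeError as err:
    print("TypeError:", err)
```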

Case 5 and case 6 are explained in the following section.

6.3.1.Arguments packing

A dummy argument, which appears in a function definition, is often referred to as a “parameter”. An actual argument, which appears in a function call, is often referred to as an “argument”. But most people do not stick to this rule and use “argument”/“parameter” without specifying whether it is a dummy or actual argument. Therefore, when seeing the word “argument”, we need to judge from the context whether it refers to a dummy argument or an actual argument. I will use “formal”/“actual” to distinguish between them.

Besides standard positional arguments, there is a special formal argument that can accept a group of positional arguments (not keyword arguments) and pack them into a single tuple:

def my_sum(*args):
    result = 0
    for x in args:
        result += x
    return result
>>> my_sum(1, 2, 3)
6

Here my_sum() takes all the actual arguments and packs them into a single iterable object (named args in this case). The name args does not matter. All that matters here is that you use the unpacking operator (*) before args.

**kwargs works just like *args, but instead of accepting positional arguments it accepts keyword (or named) arguments. Take the following example:

def concatenate(**kwargs):
    result = ""
    for arg in kwargs.values():
        result += arg
    return result
print(concatenate(a="Hello", b="World", c="!"))

Note that the keyword arguments are packed into a standard dict. If you iterate over the dictionary and want to return its values, like in the example shown, then you must use kwargs.values().

To recap, in a function definition, use *args and **kwargs to accept a variable number of positional arguments and keyword arguments, respectively.

When defining a function, the correct order for parameters is:

  1. non-default parameters

  2. default parameters

  3. *args arguments

  4. **kwargs arguments

When calling a function, the correct order for arguments is:

  1. positional arguments

  2. keyword arguments

6.3.2.Arguments unpacking

In the above, we see that, when a parameter name in a Python function definition is preceded by an asterisk (*), it indicates that any corresponding arguments in the function call are packed into a tuple that the function can refer to by the given parameter name. An analogous operation is available when calling a function. When an argument in a function call is preceded by an asterisk (*), it indicates that the argument is a tuple that should be unpacked and passed to the function as separate values. The asterisk (*) operator can be applied to any iterable in a Python function call.

Preceding a parameter in a Python function definition by a double asterisk (**) indicates that the corresponding keyword arguments should be packed into a dictionary. An analogous operation is available when calling a function. When the double asterisk (**) precedes an argument in a Python function call, it specifies that the argument is a dictionary that should be unpacked, with the resulting items passed to the function as keyword arguments.

6.4.Generator Function using yield
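Both forms of unpacking in a call can be sketched with a small hypothetical function, volume(), together with a helper show() that simply returns what it received:

```python
def volume(length, width, height):
    return length * width * height

dims = (2, 3, 4)
print(volume(*dims))                        # 24: tuple unpacked into positional arguments

opts = {"length": 2, "width": 3, "height": 4}
print(volume(**opts))                       # 24: dict unpacked into keyword arguments

print(volume(*[2, 3], **{"height": 4}))     # 24: both forms can be combined

def show(*args, **kwargs):
    return args, kwargs

# Unpacking at the call site followed by packing in the definition.
print(show(*"ab", **{"k": 1}))              # (('a', 'b'), {'k': 1})
```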

Generators, in essence, are a special kind of iterator that is evaluated lazily, producing elements one by one without the need to load all elements into memory.

The regular definition of generators involves the use of the yield keyword, which means sending values back to the caller. For example:

# a generator yields items instead of returning a list
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1
sum_of_first_n = sum(firstn(1000000))

When you call a function that contains a yield statement anywhere, you get a generator object, but no code runs. Then each time you extract an object from the generator, Python executes code in the function until it comes to a yield statement, then pauses and delivers the object. When you extract another object, Python resumes just after the yield and continues until it reaches another yield (often the same one, but one iteration later). This continues until the function runs off the end, at which point the generator is deemed exhausted.
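This pause-and-resume behavior can be made visible by adding a print call before the yield and driving the generator by hand with next(). The sketch below is a variant of firstn() written for that purpose.

```python
def firstn(n):
    num = 0
    while num < n:
        print("about to yield", num)
        yield num
        num += 1

gen = firstn(2)      # creates the generator object; nothing is printed yet

print(next(gen))     # prints "about to yield 0", then 0
print(next(gen))     # resumes after the yield; prints "about to yield 1", then 1

# Running off the end of the function signals exhaustion with StopIteration.
try:
    next(gen)
except StopIteration:
    print("exhausted")
```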

Generator expressions provide a shortcut to build generators out of expressions.

doubles = list(2*n for n in range(50))

By allowing generator expressions, we don't have to write a generator function.

List comprehensions are closer to syntactic sugar for a generator expression inside a list().
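As a small check of this relationship, the sketch below builds the same list both ways, and also shows one practical difference between the two: a generator can be consumed only once.

```python
squares_list = [n * n for n in range(5)]
squares_via_gen = list(n * n for n in range(5))
print(squares_list == squares_via_gen)    # True

gen = (n * n for n in range(5))
print(sum(gen))    # 30
print(sum(gen))    # 0: the generator is already exhausted
```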

6.5.Nested function, return function as value, closure

In Python, we can define a function inside the body of another function (a nested function). The inner functions have access to all variables of the enclosing functions. The inner functions can be called in the body of the enclosing function. This feature is commonplace and can be found in most languages. What is interesting is that the inner function can also be returned as the return value of the enclosing function. This feature is only available in languages that treat functions as first-class citizens. For example:

def power_factory(exp):
    def power(base):
        return base**exp
    return power

square = power_factory(2)
square(10)  #give 100
cube = power_factory(3)
cube(10) #give 1000

Variables like exp in the inner function are called free variables. They are variables that are used in a code block but not defined there. When you return the inner function as the return value of the enclosing function, Python needs to remember the bindings of these free variables (otherwise the returned function would be meaningless). In other words, when you handle a nested function as a value, the inner function is packaged together with the environment in which it executes. The resulting object is known as a closure. In other words, a closure is an inner function that carries information about its enclosing scope, even though its enclosing scope has completed its execution.
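In CPython, the captured environment can be inspected through attributes of the function object. The sketch below repeats power_factory and looks at the free variable names and the value stored in the closure cell.

```python
def power_factory(exp):
    def power(base):
        return base ** exp
    return power

square = power_factory(2)

# Names of the free variables used by the inner function.
print(square.__code__.co_freevars)             # ('exp',)

# The value captured for exp lives in a closure cell.
print(square.__closure__[0].cell_contents)     # 2

print(square(10))                              # 100
```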

Another famous example of making use of closure is to generate the derivative function of a given function:

>>> import math
>>> def derivative(f, dx):
...     def prime(x):
...         return (f(x+dx)-f(x))/dx
...     return prime
...
>>> dx = 0.01
>>> mycos = derivative(math.sin, dx)
>>> mycos(2.0)  # forward-difference approximation of cos(2.0)

A more interesting example is an accumulator generator that makes use of the nonlocal keyword:

def foo(n):
    def inc(x):
        nonlocal n
        n += x
        return n
    return inc

acc = foo(10)
print(acc(1)) # Output: 11
print(acc(2)) # Output: 13 

6.6.Function decorators

A decorator wraps a function, modifying its behavior. For example:

def my_decorator(say_hi):
    def wrapper(name):
        print('before')
        say_hi(name)
        print('after')
    return wrapper

The above function accepts a function and returns a function very similar to the original one. Next, define a function, pass it to the above function, and capture the result with the same function name:

def say_hi(name):
    print(f"hi!, {name}")
say_hi = my_decorator(say_hi)

Python introduces the function decorator syntax to achieve the same effect:

@my_decorator
def say_hi(name):
    print(f"hi!, {name}")

To recap, preceding a function definition with @dec is equivalent to defining that function first, then passing the function to dec, and binding the returned function to the original function name.
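The wrapper shown earlier works only for functions of one argument and discards the return value. A more general sketch, under the assumption that we want the decorator to work for any signature, forwards *args and **kwargs, returns the result, and uses functools.wraps from the standard library so that the wrapped function keeps its name. The names logged and add are hypothetical.

```python
import functools

def logged(func):
    @functools.wraps(func)            # copy func's name and docstring onto wrapper
    def wrapper(*args, **kwargs):
        print("before")
        result = func(*args, **kwargs)
        print("after")
        return result                 # pass the return value through
    return wrapper

@logged
def add(a, b):
    return a + b

print(add(1, 2))       # prints "before", "after", then 3
print(add.__name__)    # add
```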

6.7.Function call vs keyword structure

Arguments in a function call are enclosed in round-brackets whereas arguments to a keyword statement are usually provided without round-brackets. For example:

del x

where del is a keyword statement and thus its argument x is not enclosed in round brackets. In Python 3, print is a function (in Python 2, it was a keyword statement). Therefore, arguments to the Python 3 print must be enclosed in round brackets: print("hello").

6.8.Built-in functions

The complete list of Python built-in functions can be found at https://docs.python.org/3/library/functions.html. As an example of a useful built-in function, dir(), without arguments, returns the list of names in the current local scope. With an argument, dir() attempts to return a list of attributes and methods of that object. For example:

dir(__builtins__)

returns all names defined in __builtins__.

7.Namespace and Scoping rule

A variable defined inside a function's body is known as a local variable. Formal argument identifiers also behave as local variables. A variable created outside of functions is known as a global variable, which can be called a free variable from the perspective of a function.

Python adopts lexical scoping, which means the binding of a free variable inside a function can be inferred without considering where the function is called (but the value of the free variable can depend on where and when the function is called, because the free variable can be rebound elsewhere before the call).

If a variable is assigned a value anywhere within the function's body, it is assumed to be local unless explicitly declared as global or nonlocal. The locality of the variable applies even before the assignment that creates the local variable, and thus if we use it before the assignment, we will get UnboundLocalError (e.g., "local variable 'x' referenced before assignment"; the exact wording varies between Python versions). I.e., you cannot use a name as a global variable first and then use it as a local variable in the same function.

A global statement indicates that the corresponding name lives in the global scope. It does not require that the name is already bound.
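A minimal sketch of this rule; counter, read_only, and read_then_assign are illustrative names. The second function fails because the assignment on its second line makes counter local throughout the whole function body.

```python
counter = 1

def read_only():
    return counter            # no assignment in the body: counter is the global

def read_then_assign():
    value = counter + 1       # counter is local here, and not yet bound
    counter = value           # this assignment is what makes counter local
    return value

print(read_only())            # 1

try:
    read_then_assign()
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
```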

In a nested scope, we can use keyword nonlocal to declare that a variable is from enclosing scopes (scopes enclosing the present scope) but not a global. In other words, nonlocal means “not a global or a local variable”. The nonlocal statement causes corresponding names to refer to previously bound variables in the nearest enclosing function scope. SyntaxError is raised at compile time if the given name does not exist in any enclosing function scope.

A namespace is a mapping from names to objects. Most namespaces are currently implemented as Python dictionaries, i.e., a namespace is a dictionary mapping names (as strings) to values. If you have access to a namespace, then you can access all the names in that namespace by using the dot notation, e.g., numpy.sin.

A scope rule defines which namespaces will be looked in and in what order. When you make a reference, e.g., print(foo), Python looks through a list of namespaces to try to find one with the name as a key. Note that we are not specifying which namespace foo belongs to (because we are not using the dot notation). Therefore Python will determine which namespace foo belongs to by searching a chain of namespaces. This is the scoping rule, i.e., the order in which a language looks up names. In Python, the scoping rule is as follows. If you reference a name, then Python will look that name up sequentially in the Local, Enclosing, Global, and Built-in namespaces. If the name exists, then you'll get the first occurrence of it. Otherwise, you'll get an error. This rule is often called the LEGB rule.

When we say x is in a function's namespace, we mean it is defined there, locally within the function. When we say x is in the function's scope, we mean x is either in the function's namespace or in any of the outer namespaces that the function's namespace is nested inside.

Whenever you define a function (using def or lambda), you create a new namespace and a new scope. The namespace is the new, local hash of names. The scope is the implied chain of LEGB.

When we say that a variable in a namespace is in scope in a code block, we mean it is directly accessible from the block. “Directly accessible” here means that you do not need to qualify the name with a namespace prefix.

A class definition introduces a new namespace but does not introduce a new scope that functions defined in the class can see. Therefore, if a function defined in a class wants to refer to a class attribute, the name must be qualified with the class name (or with the instance, e.g., self) using the dot notation.
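The following sketch illustrates the point with a hypothetical class Config. The method bare() fails because the class namespace is not part of the LEGB chain seen by the method (assuming no global variable named unit_scale exists).

```python
class Config:
    unit_scale = 2

    def bare(self):
        return unit_scale           # looked up via LEGB; the class body is skipped

    def qualified(self):
        return Config.unit_scale    # qualified with the class name

    def via_instance(self):
        return self.unit_scale      # qualified with the instance

cfg = Config()
print(cfg.qualified())       # 2
print(cfg.via_instance())    # 2

try:
    cfg.bare()
except NameError as err:
    print("NameError:", err)
```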

Create a namespace using a dictionary:

>>> from argparse import Namespace
>>> x = {'a': 1, 'b': 2}
>>> ns = Namespace(**x)
>>> ns.a
1

Lexical scoping:

x = 1
def myfun():
    return x
x = 10
myfun()  # return 10 

Q: Is the above behavior consistent with lexical scoping?

A: Yes, it is. myfun uses the x variable from the environment where it was defined, but that x variable now holds a new value. Lexical scoping means functions resolve variables from the scope where they were defined, but it does not mean they take a snapshot of the values of those variables at the time of function definition. The line x = 1 can be dropped and the code is still valid, i.e., x can be unbound when the function is defined.

Lexical scoping means functions resolve free variables from the scope where they were defined, not from the scope where they are called.

The local namespace for a function is created when the function is called, and deleted when the function returns or raises an exception that is not handled within the function. Recursive invocations each have their own local namespace.

8.Class

The class concept is a way of bundling data and functionality together, and of supporting extension (i.e., inheritance).

Defining a new class defines a new type, allowing new instances of that type to be created. The process of creating an object of that type is called instantiation. A class allows independent instances with different data (and the same functionality) to be created.

Compared with C++, Python introduces class concepts with a minimum of new syntax and semantics. Class definition is syntactically similar to function definition. The class statement executes a block of code and attaches its local namespace to a class. The following is an example of defining a class:

class Vector:
    scale = 1
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
    def show(self):
        print(f'{self.x}+{self.y}j')
        print(f'scale={self.scale}')

The above defines a class named Vector, with two methods and three attributes named scale, x, and y.

In practice, the statements inside a class definition will usually be function definitions. The function definitions inside a class must follow a peculiar form of argument list: the first parameter of a method is assumed by Python to be the object on which the method is called. This parameter is usually named self. The name is just a convention: you can use an arbitrary name here (i.e., what matters is its position in the argument list, not its name). When the method is called using the dot notation object.method(args), users must omit the object from the argument list; Python inserts it as the first argument behind the scenes.

In a method, attributes of a class instance are created or referred to by using the dot notation: self.attribute. In the above, self.x and self.y are attributes of an instance, whereas scale is an attribute of the class. Note that, if the usual scoping rule applied (i.e., if the class definition formed a parent scope for its methods), the attribute scale would be accessible in the method by the bare name scale. It is not: we can refer to scale, but the name must be qualified, either with the class name (Vector.scale) or, as in show above, with the instance (self.scale). This indicates that the scope of names defined in a class block is limited to the class block, i.e., they do not create an enclosing scope for methods. This is one of the new rules introduced to facilitate class definition, i.e., the class definition body is not used as a parent scope for methods. (Otherwise, class inheritance would be confusing: e.g., a method inherited by a subclass would have access to the subclass scope.) Put another way: although a class defines its own local namespace, it does not create an enclosing scope for its methods.

Class Attribute vs. Instance Attribute

Python classes and objects have different namespaces. The name resolving rule is that the object namespace is before the class namespace: When you access an attribute of an object using the dot convention, it searches first in the namespace of that object for that attribute name. If it is found, it returns the value, otherwise, it searches in the namespace of the class. If nothing is found there as well, it raises an AttributeError. When you access an attribute as a class property, it searches directly in the class namespace for the name of that attribute. If it is found, it returns the value, otherwise, it raises an AttributeError.
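
The lookup order described above can be checked with a minimal sketch. The class Particle and its attributes are hypothetical names used only for illustration.

```python
class Particle:
    species = "electron"          # class attribute, stored in Particle.__dict__

    def __init__(self, x):
        self.x = x                # instance attribute, stored in the instance's __dict__

p = Particle(1.0)
q = Particle(2.0)

from_class = p.species            # not in p.__dict__, so it is found in the class
p.species = "ion"                 # creates an instance attribute that shadows the class one
from_instance = p.species         # now found in p.__dict__
unchanged = (Particle.species, q.species)   # the class and other instances are unaffected

try:
    p.mass                        # in neither the instance nor the class namespace
    raised = False
except AttributeError:
    raised = True

print(from_class, from_instance, unchanged, raised)
```

Assigning through the instance never modifies the class attribute; it only adds an entry to the instance namespace, which is searched first.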

Now we can use the class Vector defined above to create an object:

a = Vector(2,3)

Class instantiation uses function notation with the class name serving as a function name. The returned value is a new instance of the class.

Class functions whose names begin and end with double underscores (e.g., __init__) are called special methods, as they have special meaning. Of particular interest is the __init__() method. This special method gets called whenever a new object of that class is instantiated. This kind of method is called a constructor in other programming languages. We normally use it to initialize the attributes of the instance.

Next, examine the instance we created above:

a.show() # a is implicitly provided as the first argument
print(a.x)

In Python, attributes may be referenced/created by methods as well as by users of an object (i.e., attributes of a class/object can be created on the fly). There are no “private” instance variables (variables that can only be accessed within methods of an object). In other words, it is impossible to enforce data hiding in Python; it is all based upon convention. The convention (followed by most Python code) is: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a method or a data member).

Note that clients may add data attributes of their own to an instance object. As long as name conflicts are avoided, adding new data attributes does not affect the validity of the methods.

The global scope associated with a method is the module containing its definition. (A class is never used as a global scope.) Typical global entities used in a method are functions and modules imported in the global scope.

instance method vs. class method vs static method

Any method we create in a class will automatically be created as an instance method. We must explicitly tell Python that it is a class method or static method.

Use the @classmethod decorator or the classmethod() function to define a class method.

Use the @staticmethod decorator or the staticmethod() function to define a static method.

A class method receives the class itself as its first argument, which is what a method needs when it must reference its own class (for example, to construct new instances).
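
The three kinds of method can be compared in a small sketch. The classes Interval and ClosedInterval and all of their methods are hypothetical, introduced only for illustration.

```python
class Interval:
    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def length(self):                          # instance method: first argument is the instance
        return self.hi - self.lo

    @classmethod
    def from_center(cls, center, half_width):  # class method: first argument is the class
        return cls(center - half_width, center + half_width)

    @staticmethod
    def is_valid(lo, hi):                      # static method: no implicit first argument
        return lo <= hi

class ClosedInterval(Interval):
    pass

iv = Interval.from_center(5, 2)
sub = ClosedInterval.from_center(0, 1)   # cls is ClosedInterval here

print(iv.lo, iv.hi, iv.length())         # 3 7 4
print(type(sub).__name__)                # ClosedInterval
print(Interval.is_valid(3, 1))           # False
```

Because from_center builds its result with cls rather than with a hard-coded class name, calling it on a subclass returns an instance of that subclass.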

Finally, notice that the class .__dict__ and the instance .__dict__ are totally different and independent dictionaries. That's why class attributes are available immediately after you run or import the module in which the class was defined. In contrast, instance attributes come to life only after an object or instance is created.

Classes themselves are objects, which means that a class name can be rebound to new names. This makes importing easy, where we can rename a class name to a new simple name.

Most built-in operators with special syntax (arithmetic operators, subscripting etc.) can be redefined for class instances.

Python Iterators: An iterator is an object that contains a countable number of values and can be iterated upon, meaning that we can traverse through all the values. Technically, in Python, an iterator is an object which implements the iterator protocol, which consists of the methods __iter__() and __next__(). The for loop actually creates an iterator object and calls next() on it for each iteration.
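
A minimal sketch of the iterator protocol follows. The class Countdown is a hypothetical example; the while loop only approximates what a for loop does internally.

```python
class Countdown:
    """Iterator that yields n, n-1, ..., 1."""
    def __init__(self, n):
        self.current = n

    def __iter__(self):
        return self               # an iterator returns itself from __iter__

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # signals that the values are exhausted
        value = self.current
        self.current -= 1
        return value

looped = [v for v in Countdown(3)]

# Roughly what a for loop does behind the scenes:
it = iter(Countdown(3))
manual = []
while True:
    try:
        manual.append(next(it))
    except StopIteration:
        break

print(looped, manual)
```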

Here is a famous problem mentioned in Paul Graham's essay: we want to write a function that generates accumulators, i.e., a function that takes a number n and returns a function that takes another number i and returns the accumulated value each time it is invoked (each time, i is added to the previous result, starting from n). Here is the Python code:

def Ac(n):
   class dd:
       def __init__(self,n):
           self.value = n
       def inc(self,i):
           self.value += i
           return self.value
       
   return dd(n).inc

ac = Ac(10)
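print(ac(1), ac(2), ac(3))  # Output: 11 13 16

An alternative is to make the instances themselves callable by defining the special method __call__: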

class Ac:
    def __init__(self,n):
        self.value = n
    def __call__(self,i):
        self.value += i
        return self.value

ac = Ac(10)
print(ac(1),ac(2),ac(3))

Python 3 introduced the nonlocal keyword. Using this, the solution is:

def foo(n):
    def inc(x):
        nonlocal n
        n += x
        return n
    return inc

acc = foo(10)
print(acc(1)) # Output: 11
print(acc(2)) # Output: 13 

8.1.Subclass, inheritance

A language feature would not be worthy of the name “class” if it does not support inheritance.

class Shape:
    def __init__(self, color):
        self.color = color
 
    def area(self):
        pass
 
class Circle(Shape):
    def __init__(self, color, radius):
        super().__init__(color)
        self.radius = radius
 
    def area(self):
        return 3.14 * self.radius ** 2

By default, all Python classes are the subclasses of the object class.

8.2. Polymorphism

Function Polymorphism

Class Polymorphism

Inheritance Class Polymorphism
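
Polymorphism means that the same name (a function, a method, or an operator) works on objects of different types. The three cases listed above can be sketched as follows; the classes Cat, Dog, and Puppy are hypothetical.

```python
# Function polymorphism: the built-in len() works on different types.
lengths = [len("hello"), len([1, 2, 3]), len({"a": 1})]

# Class polymorphism: unrelated classes provide a method with the same name.
class Cat:
    def sound(self):
        return "meow"

class Dog:
    def sound(self):
        return "woof"

# Inheritance polymorphism: a subclass overrides a method of its parent.
class Puppy(Dog):
    def sound(self):
        return "yip"

# The caller does not need to know which class each object belongs to.
sounds = [animal.sound() for animal in (Cat(), Dog(), Puppy())]

print(lengths, sounds)
```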

The following example implements automatic differentiation using the backward (or reverse) mode.

class Expression:
    def __add__(exp1, exp2):
        return Plus(exp1,exp2)
    def __mul__(exp1, exp2):
        return Multiply(exp1, exp2)
        
class Variable(Expression):
    def __init__(self,value):
        self.value = value
        self.partial = 0
    def evaluate(self):
        return self.value
    def derive(self,seed):
        self.partial += seed

class Plus(Expression):
    def __init__(self,exp1,exp2):
        self.a = exp1
        self.b = exp2
    def evaluate(self):
        return self.a.evaluate() + self.b.evaluate()
    def derive(self,seed):
        self.a.derive(seed)
        self.b.derive(seed)
        
class Multiply(Expression):
    def __init__(self,exp1,exp2):
        self.a = exp1
        self.b = exp2
    def evaluate(self):
        return self.a.evaluate() * self.b.evaluate()
    def derive(self,seed):
        self.a.derive(seed * self.b.evaluate())
        self.b.derive(seed * self.a.evaluate())

x = Variable(2)
y = Variable(3)
z = x * (x + y) + y * y
z.derive(1)
print(x.partial)  # dz/dx Output:  7
print(y.partial)  # dz/dy Output:  8

This simple example illustrates several important concepts:

* Class, subclass, inheritance

* Operator overloading

* Polymorphism

* Recursion

9.Built-in data structure

Python primitive data structures: list, tuple, set, dict, string.

L = [3, "hello", 0.5] # list
t = (1,2,"apple",3) # tuple
t =  1,2, "apple",3  # tuple
s = {"apple", "banana", 1} #set
d = {"a":1, "b":2} #dict
st = "hello"

Examining the above code, we can summarize the syntax for the different data structures: (1) elements in lists, tuples, sets, and dicts are all separated by commas; (2) lists are written with square brackets, tuples are written with optional round brackets, sets and dicts are written with curly brackets; (3) a dict (or hash table) is like a set in which each element has two fields, key and value, which are separated by a colon.

Lists and tuples are ordered collections and thus support indexing by position (i.e., they are subscriptable by integer index), whereas sets are unordered and do not support indexing. Dicts are subscripted by key rather than by position (since Python 3.7 they preserve insertion order). (You can loop through the items of a set using a for loop, or ask whether a specified value is present in a set by using the in keyword.)

The primitive type string can also be considered a nontrivial data structure, which supports indexing, similar to a list.

An object whose contents can be changed after it is created is called a mutable object. Lists, sets and dictionaries are mutable. Strings and tuples are immutable.
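
The difference between mutable and immutable objects can be checked directly. This is a minimal sketch; the variable names follow the examples above.

```python
L = [3, "hello", 0.5]
L[0] = 4                          # lists are mutable: item assignment works

t = (1, 2, "apple")
try:
    t[0] = 10                     # tuples are immutable
    tuple_error = False
except TypeError:
    tuple_error = True

st = "hello"
try:
    st[0] = "H"                   # strings are immutable as well
    string_error = False
except TypeError:
    string_error = True

s = {"apple", 1}
s.add("banana")                   # sets are mutable, but are changed through methods
d = {"a": 1}
d["b"] = 2                        # dicts are mutable: assignment by key

print(L, tuple_error, string_error, len(s), d)
```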

9.1.List

Since a list may contain items of mixed types, not just numbers, some operations on lists differ from the array operations that we expect in an array language. For example:

In [12]: a=[1,2,3]
In [13]: 2*a
Out[13]: [1, 2, 3, 1, 2, 3] #rather than doubling each element

Adding lists concatenates them, just as the “+” operator concatenates strings. The Python standard library defines an array type (array.array), which behaves like a list except that the objects stored in it are constrained to a single numeric type. This array type is not as versatile, efficient, or useful as the NumPy array.

List elements can be accessed by using indexes. Indexes of a list start from zero. Elements of a list can also be accessed by using negative index. For example myList[-1] refers to the last element of the list, and myList[-2] refers to the next-to-last element of the list, etc.

We can use slicing notation to pick out a sublist, e.g., b = myList[0:2]. Python uses the convention that the final element specified, i.e. myList[2] in this case, is not included in a list slice. If the lower and/or upper limit is omitted, the corresponding end of the list is used, e.g., myList[:] refers to the whole list.
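
A few indexing and slicing cases, as a minimal sketch with a hypothetical five-element list:

```python
myList = ['a', 'b', 'c', 'd', 'e']

first_two = myList[0:2]     # ['a', 'b']; myList[2] is not included
from_two = myList[2:]       # ['c', 'd', 'e']; upper limit omitted
up_to_two = myList[:2]      # ['a', 'b']; lower limit omitted
whole = myList[:]           # a new list with the same elements
last = myList[-1]           # 'e'
last_two = myList[-2:]      # ['d', 'e']

print(first_two, from_two, up_to_two, whole, last, last_two)
```

Note that a slice always creates a new list, so myList[:] is a (shallow) copy rather than another name for the same object.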

Nested lists (multidimensional lists, lists of lists) are indexed with multiple indexes, such as myList[0][1], not myList[0,1]. The latter notation only works for NumPy array objects.

Python is an object-oriented language, and as such it uses classes to define data types, including its primitive types. A list object has some predefined methods. These methods are invoked using the dot notation, e.g., instance.method(arg1,arg2,…). For example:

mylist.append('d') #will add 'd' to the list
mylist.pop(2) # will remove items by index, remove the 3rd item
mylist.remove(x) # Remove the first item from the list whose value is x.
mylist.index(x) #return the index of the first item whose value is x
mylist.count(x) # Return the number of times x appears
mylist.insert(i, x) # insert an item before the element with index i
mylist.extend(list2) # append all the items of list2 to mylist

We can view all the methods defined for an object by using the built-in function dir. For example:

L = [] # define a list object
dir(L) # view all the methods of the list object

List Comprehensions

A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses. For example,

>>> import math
>>> ls = [1, 2, -3, -4]
>>> [math.sqrt(x) for x in ls if x>0]
[1.0, 1.4142135623730951]

The following code combines the elements of two lists if they are not equal:

>>> [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

List comprehensions are more declarative than loops, which means they're easier to read and understand. Loops require you to focus on how the list is created. You have to manually create an empty list, loop over the elements, and add each of them to the end of the list. With a list comprehension in Python, you can instead focus on what you want to go in the list and trust that Python will take care of how the list construction takes place.
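
For comparison, here is the square-root example written both ways, as a minimal sketch. The names roots_loop and roots_comp are chosen only for illustration.

```python
import math

ls = [1, 2, -3, -4]

# Explicit loop: create an empty list, test each element, append.
roots_loop = []
for x in ls:
    if x > 0:
        roots_loop.append(math.sqrt(x))

# List comprehension: state what goes into the list.
roots_comp = [math.sqrt(x) for x in ls if x > 0]

print(roots_loop == roots_comp)
```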

9.2.String

Single or double quotation marks are used to define a string value:

In [17]: a = 'hello, world' # string
In [18]: a = "hello, world" # string

Each character in a string corresponds to an index and can be accessed using index notation (similar to a list). String methods:

a = "hello, world" # string
a.split(",")  #Splits the string at the specified separator, and returns a list
a.find("o") #Searches the string for "o" and returns the position where it was found

Again, we can view all the methods defined for a string object by using the built-in function dir. Note that all string methods return new values. They do not change the original string.
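
A short sketch of indexing and of the fact that string methods leave the original string unchanged:

```python
a = "hello, world"

first, last = a[0], a[-1]        # 'h' and 'd'
parts = a.split(",")             # ['hello', ' world']
pos = a.find("o")                # 4, the index of the first "o"
missing = a.find("z")            # -1 when the substring is not found
shouted = a.upper()              # a new string; a itself is unchanged

print(first, last, parts, pos, missing, shouted, a)
```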

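f-strings (formatted string literals) are string literals prefixed with f, in which expressions enclosed in curly brackets are replaced by their values, e.g., f'{self.x}+{self.y}j' in the Vector class above.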

9.3.Tuple

Another data structure similar to a list is the tuple, which is an ordered collection of items separated by commas and (optionally) enclosed in round brackets (parentheses):

t=(1,2,"apple")

The round brackets are optional. Slicing and indexing a tuple are similar to those of a list. Like a list, we can loop through the tuple items by using a for loop. Unlike lists, tuples are immutable: once a tuple is created, we cannot change its items.

Tuple methods

mytuple.count("apple") # Returns the number of times a specified value occurs in a tuple
mytuple.index("apple") # Searches the tuple for a specified value and returns the index

9.4.Dictionary

A dictionary is a collection of key-value pairs enclosed in curly brackets:

d={"a":1, "b":2} 

Each element has two fields, key and value, separated by a colon.

In other languages, this type may be called “hashmaps” or “associative arrays”.

Dictionaries can be built up and added to in a straightforward manner:

In [8]: d = {}
In [9]: d["last name"] = "Alberts"
In [10]: d["first name"] = "Marie"
In [11]: d["birthday"] = "January 27"
In [12]: d
Out[12]: {'last name': 'Alberts', 'first name': 'Marie', 'birthday': 'January 27'}

As shown above, the values in a dict can be referred to by using keys. The keys of a dictionary can be of any hashable type, e.g., string, int, float, or even a tuple. For example:

In [15]: A={(1,2):4, "b":5}
In [16]: A[(1,2)]
Out[16]: 4

It is interesting to note that referencing a dictionary item is very similar to referencing a list element if the keys are of int type.

Properties and methods of a dict can be shown by using dir(dict). Some examples:

d.keys() # return all the keys 
d.values() # return all the values
d.items() # return all the tuples (key,value)
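
A short usage sketch of these methods, together with the in keyword and the get method. The names total, has_a, and default are hypothetical.

```python
d = {"a": 1, "b": 2}

keys = list(d.keys())        # ['a', 'b']
values = list(d.values())    # [1, 2]
items = list(d.items())      # [('a', 1), ('b', 2)]

total = 0
for key, value in d.items(): # loop over (key, value) pairs
    total += value

has_a = "a" in d             # membership is tested against the keys
default = d.get("c", 0)      # get() returns a default instead of raising KeyError

print(keys, values, items, total, has_a, default)
```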

9.5.Set

You can think of sets in a technical sense, namely that they are unordered, unindexed, mutable, and contain unique (do not allow duplicate values) and hashable values.

Think of sets as dictionaries without any values. Whatever applies to dict keys also applies to the elements of a set. A list can not be an element of a set because lists are unhashable.

a = {"dd", 1, (3,4)} # an item in a set can be a tuple, but cannot be a list

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

Most of Python's immutable built-in objects are hashable, while mutable containers (such as lists or dictionaries) are not; immutable containers (such as tuples) are hashable only if their elements are hashable. Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().
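
The following sketch shows a user-defined class that overrides the default behavior so that equal objects have equal hashes. The class Point is hypothetical, and its attributes are assumed not to change after creation (otherwise the hash value would change too).

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))   # equal points get equal hash values

p, q = Point(1, 2), Point(1, 2)
same_hash = hash(p) == hash(q)
points = {p, q}                 # p == q, so the set keeps only one of them
table = {p: "found"}
lookup = table[q]               # q is an equal key, so the lookup succeeds

try:
    hash([1, 2])                # lists are mutable and therefore unhashable
    list_hashable = True
except TypeError:
    list_hashable = False

print(same_hash, len(points), lookup, list_hashable)
```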

We can not access an item in a set by referring to an index, since sets are unordered and have no index. But we can loop through a set using for loop, or ask if a specified value is present in a set by using the in keyword.

myset = {"apple", "banana", "cherry"}
for x in myset:
    print(x)
print("banana" in myset) #True

Some methods of the set object:

myset.add("apple") # adds an element to the set
myset.remove("apple") # removes a particular element from the set
myset.pop() # removes an arbitrary element from the set, returns the removed item

Similar to list comprehension, there is set comprehension:

>>> s = {v for v in 'abcdabcd' if v not in 'cb'}
>>> print(s)
{'a', 'd'}

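Summary of usage of various brackets in Python:

* Square brackets [ ]: list literals and list comprehensions; indexing and slicing, e.g., myList[0:2], d["a"]

* Round brackets ( ): tuple literals (optional); function calls; grouping of expressions

* Curly brackets { }: set and dict literals; set and dict comprehensions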

10.File Handling

10.1.Create file object

Python's built-in function open() takes two main parameters, the file name and the mode, and returns a file object. For example:

f = open('t.txt', 'r')

There are four different modes for opening a file:

"r" - Read - Default value. Opens a file for reading, raises an error if the file does not exist.

"a" - Append - Opens a file for appending, creates the file if it does not exist.

"w" - Write - Opens a file for writing, creates the file if it does not exist and discards its content if it does.

"x" - Create - Creates the specified file, raises an error if the file exists.

In addition, you can specify whether the file should be handled in text or binary mode:

"t" - Text - Default value.

"b" - Binary - Binary mode

10.2.Methods of file object

A Python file object has several predefined methods. To read a file, we can use its methods, such as read, readline, readlines. To write to a file, we can use write. However, write is limited to string data, which makes it inconvenient to use in most cases. A more convenient way is to use print and use the file object as an argument:

print("This is x1: ", x1, file=f)
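
The modes of section 10.1 and the two ways of writing can be tried out in a small sketch. It works in a temporary directory so that no existing file is touched; the file name t.txt and the variable names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "t.txt")       # hypothetical file name

    f = open(path, "x")                     # "x": create; fails if the file exists
    f.write("x1 = " + str(1.5) + "\n")      # write() accepts only strings
    f.close()

    try:
        open(path, "x")                     # the file exists now
        exists_error = False
    except FileExistsError:
        exists_error = True

    f = open(path, "a")                     # "a": append to the end
    print("x2 =", 2.5, file=f)              # print() converts its arguments for us
    f.close()

    f = open(path, "r")
    after_append = f.read()
    f.close()

    f = open(path, "w")                     # "w": existing content is discarded
    f.write("fresh\n")
    f.close()

    f = open(path, "r")
    after_write = f.read()
    f.close()

print(exists_error, repr(after_append), repr(after_write))
```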

More examples:

txt = f.read() #readin the entire file, return a string
f.close()
f = open("t.txt", "w")
f.write("hello") #write to the file
f.write("\n")
f.close() 
f = open('t.txt', 'r')
txt1 = f.readline() #readin a line from the file, return a string
txt2 = f.readline() #readin another line
f.close()
f = open('t.txt', 'r')
txt = f.readlines() #read the entire file, return a list of string
                    #(one string=>one line)

A file object can also be converted to a list:

f = open('t.txt', 'r')
a = list(f) # return a list, the same as a = f.readlines()

A file object is also an iterator, which means that we can loop over it:

f=open('t.txt', 'r')
for line in f:
    print(line, end='')

Besides built-in functions, there are lots of third-party libraries for handling files. For reading/writing arrays from/to a file, we can use the NumPy functions np.loadtxt() for reading and np.savetxt() for writing:

data = np.loadtxt('myfile.txt')
np.savetxt('myfile.txt', (x,y,z))
np.savetxt('myfile.txt', np.transpose([x,y,z]))
np.savetxt('myfile.txt', np.column_stack([x,y,z]))
np.savetxt('myfile.txt', np.c_[x,y,z])

For reading files, we can also use np.genfromtxt(), which is more flexible.

11.With

The with statement clarifies code that previously would use try…finally blocks to ensure that clean-up code is executed.

with expression [as variable]:
    with-block

The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods). Objects of this kind are called context managers.

The classic example is opening a file, manipulating the file, then closing it:

with open('output.txt', 'w') as f:
    f.write('Hi there!')

The above with statement will automatically close the file after the nested block of code. The advantage of using a with statement is that it guarantees that the file will be closed no matter how the nested block exits. If an exception occurs before the end of the block, it will close the file before the exception is caught by an outer exception handler. If the nested block were to contain a return statement, or a continue or break statement, the with statement would automatically close the file in those cases, too.
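
The same protocol can be implemented by any class. Below is a minimal sketch of a context manager that records the order of events; the class Tracker is hypothetical and only illustrates when __enter__() and __exit__() are called.

```python
class Tracker:
    """Hypothetical context manager that records what happens to it."""
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self                    # this value is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")     # runs even if the block raised an exception
        return False                   # False: do not suppress the exception

tracker = Tracker()
try:
    with tracker as t:
        t.events.append("body")
        raise ValueError("boom")
except ValueError:
    tracker.events.append("handled")

print(tracker.events)   # ['enter', 'body', 'exit', 'handled']
```

The "exit" entry appears before "handled", which shows that the clean-up code runs before the exception reaches the outer handler.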

12.Numpy

NumPy is the reason why Python stands among the ranks of R, Matlab, and Julia, as one of the most popular languages for doing STEM-related computing. It is a third-party library (i.e. it is not part of Python's standard library) that facilitates numerical computing in Python by providing users with N-dimensional array objects for storing data, and mathematical functions for operating on them. NumPy makes use of a process known as vectorization, that enables a degree of computational efficiency that is otherwise unachievable by the Python language.

Many libraries depend on NumPy's N-dimensional arrays. NumPy also fundamentally impacts the designs of these libraries and the way that they interface with their users.

import numpy as np

The n-dimensional array object in NumPy is called ndarray, all elements of which are of the same type. The data of an ndarray object are stored in a contiguous block of system memory, which makes NumPy arrays efficient.

ndarray objects can be created by using various functions, such as np.ndarray(), np.array(), np.zeros(), np.empty(), np.linspace(), np.meshgrid(), np.mgrid[], np.ogrid[].

np.meshgrid is modelled after Matlab's meshgrid command. I stick to using indexing='ij', which is less confusing than the case with indexing='xy'.

>>> m = 4; n = 3;
>>> x = np.linspace(1,4,m)
>>> y = np.linspace(5,7,n)
>>> xx, yy = np.meshgrid(x,y, indexing='ij')

Numpy has another function ix_ that's similar to meshgrid:

xx, yy = np.ix_(x,y)

In this case, you get compact arrays that can be broadcasted between each other.

Another way of getting a mesh-grid is to use mgrid and ogrid, which are instances that return a mesh-grid when indexed. The difference between them is that mgrid returns a dense (i.e., fleshed out) mesh-grid while ogrid returns a compact one.

>>> import numpy as np
>>> m=4;  n=3;
>>> XX, YY = np.mgrid[1:4:m*1j, 5:7:n*1j]
>>> XX
array([[1., 1., 1.],
       [2., 2., 2.],
       [3., 3., 3.],
       [4., 4., 4.]])
>>> YY
array([[5., 6., 7.],
       [5., 6., 7.],
       [5., 6., 7.],
       [5., 6., 7.]])
# Here the shape of XX and YY is (m,n). 
>>> a, b = np.ogrid[1:4:m*1j, 5:7:n*1j]
>>> a
array([[1.],
       [2.],
       [3.],
       [4.]])
>>> b
array([[5., 6., 7.]])
>>> XX, YY = np.broadcast_arrays(a,b)

mgrid and ogrid are less versatile than meshgrid in that they are limited to the uniform case and thus cannot use arbitrary grids (e.g., grids read from numerical files). mgrid and ogrid can also be used to generate a 1D grid, in place of linspace().

NumPy is essentially about vectorization (avoiding explicit looping/indexing in Python so that the loops can be done efficiently in C). As a result, we want the relevant arrays to be of the same shape or to be broadcastable against each other.

When operating on two arrays, NumPy compares their shapes element-wise, starting with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when they are equal, or one of them is 1. When broadcasting, arrays do not need to have the same number of dimensions.
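
The compatibility rule can be written down as a short pure-Python sketch. The function broadcast_shape is a hypothetical helper written for this illustration; it is not part of NumPy and only mimics the rule stated above.

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the shape that results from broadcasting two shapes,
    or raise ValueError if the shapes are not compatible."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or a == 1 or b == 1:
            result.append(b if a == 1 else a)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(reversed(result))

print(broadcast_shape((4, 1), (1, 3)))   # (4, 3), as for the ogrid arrays a and b
print(broadcast_shape((4, 3), (3,)))     # (4, 3): the shorter shape is padded on the left
```

Shapes such as (4, 3) and (4,) are rejected, because the trailing dimensions 3 and 4 differ and neither is 1.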

ndarray.tolist(): returns a copy of the array data as a (nested) Python list.

––––––––––––

While a CPU-bound task is characterized by the computer's cores continually working hard from start to finish, an IO-bound job is dominated by a lot of waiting on input/output to complete.

non-blocking function

Over the last few years, a separate design has been more comprehensively built into CPython: asynchronous IO, enabled through the standard library's asyncio package and the new async and await language keywords. (async IO is not a newly invented concept, and it has existed or is being built into other languages and runtime environments, such as Go, C#, or Scala.)

Async IO is not threading, nor is it multiprocessing. It is not built on top of either of these. In fact, async IO is a single-threaded, single-process design: it uses cooperative multitasking.
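
Cooperative multitasking can be seen in a minimal asyncio sketch. The coroutine worker, its arguments, and the delays are hypothetical; asyncio.sleep stands in for a real IO wait.

```python
import asyncio

async def worker(name, delay, log):
    log.append(f"{name} start")
    await asyncio.sleep(delay)      # gives control back to the event loop while waiting
    log.append(f"{name} done")
    return name

async def main():
    log = []
    # Both workers run in the same thread; they take turns at each await.
    results = await asyncio.gather(
        worker("slow", 0.05, log),
        worker("fast", 0.01, log),
    )
    return results, log

results, log = asyncio.run(main())
print(results)   # results follow the order of the arguments to gather
print(log)       # the log shows that the waits overlapped
```

The second worker starts before the first one finishes, although only one thread is involved: each await is a point where a task voluntarily hands control to the event loop.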

––––-

When developing this document, I read the following materials:

https://docs.python.org/

https://www.w3schools.com/python/

https://physics.nyu.edu/pine/pymanual/html/
