Python

Following this guide will make python default to the newest version of Python, which is version 3. For backwards compatibility reasons, python may still refer to Python 2 on some systems. However, Python 2 reached end of life on January 1, 2020.
On macOS

Installing Python on macOS via Homebrew, at the time of writing, will not install the most current version of Python available, despite it being released 5 months ago.

brew install python@3.8

# Add to .zshenv
path=(/usr/local/opt/python@3.8/libexec/bin ${path})

On Debian

sudo apt install python3.8
sudo apt install python3-pip
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
sudo update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1

On Ubuntu

Do you really need to close?

In business, yes. In Python, however, no*. For more information, see below for a copy of the explanation I gave to a collaborator on GitHub:

TL;DR:

Once the reference count of file reaches zero, which happens at the end of this try block, CPython will call the __del__() method of the underlying file object, which makes the call to close() therein. An argument can be made that how Python manages memory depends on which Python you're using. Most people use CPython, which uses this reference count system, so we have the guarantee that the file will receive the necessary call to close(). But since this is running within a Docker container, and we know we're using CPython, I don't think it's worth changing this line of code to preserve niche cross-Python compatibility concerns, like maintaining support for Jython 🎤💧
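
For comparison, here is a minimal sketch of the portable alternative: a with-block closes the file on exit, whichever Python implementation is running. The file name close_demo.txt is made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'close_demo.txt'  # hypothetical file name

    # Leaving the with-block calls close(), no reference counting required
    with open(path, 'w') as handle:
        handle.write('hello\n')

    closed_after_with = handle.closed
    contents = path.read_text()

print(closed_after_with)
```

The trade-off is one extra level of indentation in exchange for not depending on how the interpreter manages memory.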

Silencing the `python` Console Welcome Message

Normally, when you open the python console, the following welcome message appears.

Python 3.7.3 (default, Jun 19 2019, 07:38:49)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

To disable it, use the following command to enter the python console:

python -q

`set()`

A set in Python is implemented as a hash table. It has O(1) lookup and insertion time.
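
A small sketch of what the hash table implementation means in practice, using made-up values: membership tests use the in operator, and every element has to be hashable.

```python
bag = {1, 2, 3}

# Membership tests hash the value instead of scanning every element
has_two = 2 in bag
has_nine = 9 in bag

# Elements must be hashable, so a mutable list is rejected
try:
    bag.add([4, 5])
    rejected = False
except TypeError:
    rejected = True

# Equal values share a single entry, so duplicates collapse
unique = set([1, 1, 2, 2, 3])

print(has_two, has_nine, rejected, unique)
```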

Creating a set and inserting values:

# Creating an empty set ({} would create an empty dict, not a set)
bag = set()
# Creating a set from a list of items
bag = {1, 2}
bag = set([1, 2])

Adding items to a set:

bag = {1, 2}
array = [1, 2, 3, 4, 4, 5]

# Adding 1 item with add()
bag.add(3)

# Adding a list of items using update()
bag.update(array)

# Adding a set of items using |=
bag |= set(array)

Create a dictionary, using the elements of a set as the keys

names = {'Tommy', 'Tina', 'Traveler'}
default_value = '+1 (123) 456-7890'
contacts = {name: default_value for name in names}

Create a dictionary from all the key: value pairs of an existing dictionary, excluding keys a, b, and c:

d = {'a': 'alpha', 'b': 'bravo', 'c': 'charlie', 'd': 'delta', 'e': 'echo', 'f': 'foxtrot'}
print({k:d[k] for k in d.keys() - {'a', 'b', 'c'}})

Output:

{'e': 'echo', 'd': 'delta', 'f': 'foxtrot'}

Testing for membership in dicts/sets:

hashmap = {
    'a': 1,
    'b': 2,
    'c': 3
}

keys = hashmap.keys()

# Test that {'a', 'b', 'c'} is a proper subset of hashmap's keys
print({'a', 'b', 'c'} < hashmap.keys())
# => False

# Test that {'a', 'b', 'c'} is a proper superset of hashmap's keys
print({'a', 'b', 'c'} > hashmap.keys())
# => False

# Test that 'a' and 'b' are a valid subset of hashmap's keys
print({'a', 'b'} <= hashmap.keys())
# => True

# Test that {'a', 'b', 'c'} is a superset of hashmap's keys
print({'a', 'b', 'c'} >= hashmap.keys())
# => True

`os.path`

from os.path import *
from glob import glob

home = expanduser('~')
# => /Users/tommy

sym_folder = join(home, 'tmp')
# => /Users/tommy/tmp

abs_folder = realpath(sym_folder)
# => /Users/tommy/real/location/folder

files = glob(expanduser('~/notes/*.md'))
# => ['/Users/tommy/notes/file1.md', '/Users/tommy/notes/file2.md']

Jupyter Notebook

A Jupyter Notebook is an open source web application that allows you to create and share documents containing snippets of pre-executed code. On macOS, you can install it with brew install jupyter.

The Jupyter Notebook supports the following languages, among others:

python
ruby
nodejs
c
c++
bash
zsh
go
perl
php
redis

The Jupyter Notebook will also render LaTeX and Markdown, allowing for flexible formatting of data.

You can execute bash scripts in your Jupyter notebook by putting %%bash on the first line of any given cell.

To add configuration files to your local machine's Jupyter notebook, run the following command. This will generate the folder ~/.jupyter and insert a file into it called jupyter_notebook_config.py.

jupyter notebook --generate-config

pass="letmein"
python -c "from notebook.auth import passwd; print(passwd('${pass}'))"

`pylint`

Setting Up `pylint`

On macOS

pip install pylint
# Use ${HOME} here, since ~ is not expanded inside double quotes
export PATH="${HOME}/Library/Python/3.7/bin:${PATH}"

On Debian
```
sudo apt install pylint
```

pylint --generate-rcfile > ~/.pylintrc

If pylint notifies you about a linting error that you don't like, add it as a comma-separated entry to the disable setting in ~/.pylintrc, e.g.

disable=missing-docstring,
        invalid-name,
        bare-except

Plotly

Getting Started

Plotly is a visualization tool you can install with pip3 install plotly. You can create an account at plot.ly and generate an API key. Once you have one, edit the file ~/.plotly/.credentials and insert the following information.

{
  "username": "tommytrojan",
  "api_key": "Fou4dE18o4TUtCz91n6O"
}

Plotting Data

Plotting Online

import pandas as pd
import plotly.plotly as py
import plotly.figure_factory as ff

df = pd.read_csv("earnings.csv")
table = ff.create_table(df)
py.iplot(table, filename='table1')

Plotting Offline

import plotly.offline as py
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, plot, iplot
init_notebook_mode()

# df is assumed to be a DataFrame with 'height' and 'weight' columns
data = go.Bar(x=df.height, y=df.weight)
figure = [data]
py.iplot(figure, filename='height_weight')

The only big difference between plotting online and plotting offline is whether to use plotly.plotly or plotly.offline for your import. Both will include the plot() and iplot() methods.

Regular Expressions

Use the re library for regular expressions in Python. Regular expression patterns are created by calling re.compile(), which accepts two arguments: the pattern (usually written as a raw string) and optional flags. To specify multiple flags, combine them with the bitwise OR operator |

Regular Expression Flags

Flag	Function
`re.A`	Make the pattern match only ASCII characters
`re.I`	Make the pattern case insensitive
`re.M`	The `^` & `$` special characters match the start/end of each line in a string, instead of the start/end of the string itself
`re.S`	Allow the `.` character to match newline characters
`re.X`	Ignore whitespace in the pattern definition, and allow for comments
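
As a rough sketch of combining flags, the pattern below uses re.X so it can be spread over several lines with comments, and re.I so the scheme matches in any letter case. The group names scheme and host and the sample URL are made up for illustration.

```python
import re

pattern = re.compile(
    r"""
    (?P<scheme> https? )   # "http" or "https"
    ://
    (?P<host> [^/?:]+ )    # everything up to the next /, ? or :
    """,
    re.X | re.I,
)

match = pattern.search('HTTPS://Helpful.Wiki/python')
scheme = match.group('scheme')
host = match.group('host')
print(scheme, host)
```

Because re.X ignores whitespace in the pattern, a literal space would have to be written as an escaped space or placed inside a character class.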

Special Characters

Special Character	Function
+	Match 1 or more
*	Match 0 or more
?	Match 0 or 1
{k}	Match k consecutive occurrences of the preceding pattern
{m,n}	Match from m to n consecutive occurrences (inclusive) of the preceding pattern (as many as possible)
{m,n}?	Match from m to n consecutive occurrences (inclusive) of the preceding pattern (as few as possible)
.	Match any character except a newline `\n`
^	Match the start of the string
$	Match the end of string
(	Specify the start of capture group
)	Specify the end of capture group
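
To make the difference between the greedy and lazy quantifiers concrete, here is a short sketch on made-up input strings:

```python
import re

text = 'aaaaa'

# {2,4} is greedy: it takes as many repetitions as it can (up to 4)
greedy = re.search(r'a{2,4}', text).group(0)

# {2,4}? is lazy: it stops as soon as the minimum (2) is satisfied
lazy = re.search(r'a{2,4}?', text).group(0)

# ? makes the preceding character optional
spellings = re.findall(r'colou?r', 'color colour')

print(greedy, lazy, spellings)
```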

Escaped Characters

Escaped Character	Matches
`\t`	Horizontal tab
`\v`	Vertical tab
`\n`	Newline
`\f`	Form feed
`\r`	Carriage return
`\w`	`[a-zA-Z0-9_]`
`\d`	`[0-9]`
`\s`	`[ \t\n\r\f\v]`
`\b`	Specify a word boundary

Coding example

# Import the RegEx library
import re

# Create a raw-string pattern
pattern = re.compile(r'http[s]?://([^/?:]+)', re.A|re.I)
# Search for the pattern in the string "text"
text = 'https://helpful.wiki/python'

match = pattern.search(text)
print(match.group(0))

# => https://helpful.wiki

print(match.group(1))

# => helpful.wiki

Dates, Times, Timestamps

ISO8601 and RFC3339 are the documents that outline how date and time should be denoted on computers. 1996-12-20T00:39:57Z is an example of an ISO8601 timestamp. The Z denotes that this timestamp is in zulu time, the UTC timezone. The UTC timezone is Coordinated Universal Time and, depending on the time of year, is either 7 or 8 hours ahead of the time in California. The equivalent time in California would be represented as 1996-12-19T16:39:57-08:00
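
The conversion described above can be sketched with the standard library. This is a minimal example that assumes a fixed -08:00 offset for California (so it ignores daylight saving), and it writes the UTC offset as +00:00 because, as far as I know, datetime.fromisoformat() only accepts a trailing Z from Python 3.11 onward.

```python
from datetime import datetime, timedelta, timezone

# The example timestamp from above, with Z spelled out as +00:00
utc_moment = datetime.fromisoformat('1996-12-20T00:39:57+00:00')

# Fixed offset standing in for Pacific Standard Time (an assumption)
pacific_standard = timezone(timedelta(hours=-8))

local_moment = utc_moment.astimezone(pacific_standard)
local_text = local_moment.isoformat()
print(local_text)
# => 1996-12-19T16:39:57-08:00

# Both objects describe the same instant, so they compare equal
same_instant = utc_moment == local_moment
```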

`strftime` and `strptime`

The datetime module provides a few classes:

datetime.date
datetime.time
datetime.datetime
datetime.timedelta
datetime.timezone
datetime.tzinfo

Working with ISO formatted timestamp strings

import datetime
import time

event = datetime.date.fromisoformat("2018-12-31")
# -00:00 is an illegal format but Python will interpret it successfully
event = datetime.datetime.fromisoformat("2018-12-31T12:31:58-08:00")
# Using a whitespace separator instead of the 'T' character
event = datetime.datetime.fromisoformat("2018-12-31 04:31:58+00:00")

Parsing a non-ISO date into a datetime object
```
moment = datetime.datetime.strptime('05-19-2018', '%m-%d-%Y')
print(moment)
```
# => 2018-05-19 00:00:00

Printing a timezone-aware UTC timestamp in RFC3339 format

from datetime import datetime, timezone
datetime.now(timezone.utc).isoformat(sep=' ', timespec='seconds')
# => '2023-04-21 23:50:30+00:00'

Printing a timezone-aware UTC timestamp in ISO8601 format

from datetime import datetime, timezone
datetime.now(timezone.utc).isoformat(sep='T', timespec='seconds')
# => '2023-04-21T23:50:30+00:00'

Printing a timezone-unaware UTC timestamp in RFC3339 format

datetime.utcnow().isoformat(sep=' ', timespec='seconds')
# => '2023-04-21 23:47:55'

Printing a timezone-unaware UTC timestamp in ISO8601 format

datetime.utcnow().isoformat(timespec='seconds')
# => '2023-04-21T23:47:55'

Parsing a non-ISO timestamp into a datetime object

moment = datetime.strptime('10-05-2017 05:00:00 PM', '%m-%d-%Y %I:%M:%S %p')
print(moment)
# => 2017-10-05 17:00:00

Working with daylight savings

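# time.daylight only says whether the local zone defines DST at all,
# so check tm_isdst to see whether DST is in effect right now
if time.localtime().tm_isdst > 0:
    tz = time.altzone
    # => 25200 (# of seconds offset from UTC)
else:
    tz = time.timezone
    # => 28800 (# of seconds offset from UTC)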

print(time.tzname)
# => ('PST', 'PDT')

Working with local timezones

print(time.localtime())
# => time.struct_time(tm_year=2019, tm_mon=7, tm_mday=17, tm_hour=11, tm_min=42, tm_sec=15, tm_wday=2, tm_yday=198, tm_isdst=1)

moment = datetime.datetime.utcnow()

print(moment.isoformat(timespec='seconds'))
# => 2019-07-17T11:25:07

print(moment.isoformat(timespec='milliseconds'))
# => 2019-07-31T02:21:15.125

print(moment.isoformat(sep=' ', timespec='microseconds'))
# => 2019-07-31 02:21:15.125991

Getting the local timezone information

import datetime
timezone = datetime.datetime.now().astimezone().tzinfo

Setting the timezone

import time
import os
os.environ['TZ'] = 'US/Eastern'
time.tzset()
print(time.tzname)
# => ('EST', 'EDT')

Printing out all available timezones

```py
from zoneinfo import ZoneInfo
from pathlib import Path
for area in ['America', 'Europe', 'Asia', 'Africa', 'Australia', 'Antarctica', 'Etc']:
    print(area)
    for zone in Path('/usr/share/zoneinfo').glob(f'{area}/*'):
        print(f'\t/{zone.name}')
```
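
As an alternative sketch that avoids hard-coding /usr/share/zoneinfo, the zoneinfo module (Python 3.9+) can list the zone names it knows about. The result depends on the time zone database installed on the machine (or the tzdata package), so no particular output is assumed here.

```python
from zoneinfo import available_timezones

# A set of IANA zone names such as 'America/Los_Angeles'
zones = available_timezones()

# Filter one area and sort it for stable printing
american = sorted(name for name in zones if name.startswith('America/'))

print(len(zones), american[:3])
```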

Some common timezones have been included below:

```py
from zoneinfo import ZoneInfo
# Constructing a timezone object
tz = ZoneInfo('America/Los_Angeles')

# Other valid timezones included below
ZoneInfo('Pacific/Honolulu') # -10:00
ZoneInfo('America/Juneau') # -09:00
ZoneInfo('America/Los_Angeles') # -08:00
ZoneInfo('America/Denver') # -07:00
ZoneInfo('America/Chicago') # -06:00
ZoneInfo('America/New_York') # -05:00
ZoneInfo('Europe/London') # +00:00
ZoneInfo('Europe/Paris') # +01:00
ZoneInfo('Europe/Athens') # +02:00
ZoneInfo('Europe/Moscow') # +03:00
ZoneInfo('Asia/Tehran') # +03:30
ZoneInfo('Asia/Dubai') # +04:00
ZoneInfo('Asia/Kabul') # +04:30 (capital of Afghanistan)
ZoneInfo('Asia/Dushanbe') # +05:00 (capital of Tajikistan)
ZoneInfo('Asia/Kathmandu') # +05:45 (capital of Nepal)
ZoneInfo('Asia/Dhaka') # +06:00 (capital of Bangladesh)
ZoneInfo('Asia/Bangkok') # +07:00
ZoneInfo('Asia/Shanghai') # +08:00
ZoneInfo('Asia/Tokyo') # +09:00
ZoneInfo('Australia/Sydney') # +10:00
ZoneInfo('Pacific/Noumea') # +11:00 (capital of New Caledonia)
ZoneInfo('Pacific/Fiji') # +12:00
```

Custom Module Locations

If you have your own modules that you want to use, there's a way to tell Python where to look for modules that are loaded with the import keyword.

To do this, set the environment variable PYTHONPATH in your shell, and export that variable. For example, export PYTHONPATH=~/example/python/modules. Now, all of the folders within this directory will be considered a module. For instance, if ~/example/python/modules/ex was a folder containing python code, now you'd be able to type import ex in future python programs.
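
Under the hood, the directories listed in PYTHONPATH are added to sys.path when the interpreter starts. The sketch below imitates that at runtime with a throwaway directory, so it can run anywhere; the package name ex matches the example above, while the GREETING constant is made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as modules_dir:
    # Create modules_dir/ex/__init__.py, a tiny importable package
    package = Path(modules_dir) / 'ex'
    package.mkdir()
    (package / '__init__.py').write_text("GREETING = 'hello from ex'\n")

    # Roughly what PYTHONPATH does for you at startup
    sys.path.insert(0, modules_dir)
    ex = importlib.import_module('ex')
    greeting = ex.GREETING
    sys.path.remove(modules_dir)

print(greeting)
```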

Sockets

Server-Side TCP Socket

import socket
# Create a TCP socket
mysocket = socket.socket(
    type=socket.SOCK_STREAM
)

# Create an address tuple, the interface to bind to, and a port number
address = ('0.0.0.0', 1234)

# Bind the socket to the address
mysocket.bind(address)

# Listen, allowing 1 pending connection
mysocket.listen(1)

# Listen until the process is killed
while True:

    # Save the accepted connection
    connection, address = mysocket.accept()

    print(f"Accepted connection from {address}")

    # Continue until the transmission has no more data
    while True:
        # recv() is enough on a connected TCP socket
        data = connection.recv(4096)

        # If there's no data being transmitted, exit
        if not data:
            break
        else:
            # For now, reply to the same connection, echoing the message
            reply = f"echo \'{data.decode()}\'".encode()
            connection.sendall(reply)

    # Close the connection now that the message has been replied to
    connection.close()

Client-Side TCP Socket

import socket

# Create a TCP socket
mysocket = socket.socket()

address = ('127.0.0.1', 1234)

# Connect the socket to the server
mysocket.connect(address)

# Send a message to the server, registering the name of the client
message = f'register {args.name}'

# Send the message over the established connection
mysocket.sendall(message.encode())

# Save the reply from the server
reply = mysocket.recv(4096)

# Decode the reply's binary encoding, store as UTF-8 string
reply = reply.decode()

print(reply)

mysocket.close()

ofile = open(args.logfile, 'w')
ofile.write('connected to server and registered\n')
ofile.write('waiting for messages...\n')
ofile.write('exit')
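
To try the server and client together without two terminals, here is a self-contained sketch that runs a one-connection echo server in a background thread on the loopback interface. It is an illustration rather than the original programs: the function name echo_once, the use of port 0 (which asks the operating system for any free port), and the client name tommy are all assumptions.

```python
import socket
import threading


def echo_once(server: socket.socket) -> None:
    """Accept a single connection and echo back whatever it sends."""
    connection, _ = server.accept()
    with connection:
        # recv() returns b'' once the client closes its side
        while data := connection.recv(4096):
            connection.sendall(b"echo '" + data + b"'")


server = socket.socket(type=socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

thread = threading.Thread(target=echo_once, args=(server,))
thread.start()

# The with-block closes the client socket, which ends the server loop
with socket.create_connection(('127.0.0.1', port)) as client:
    client.sendall(b'register tommy')
    reply = client.recv(4096).decode()

thread.join()
server.close()
print(reply)
```

A real client would loop on recv() until it has the whole reply, since TCP does not preserve message boundaries; a single call is used here only because the message is tiny.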

Find the IPv4 address for hostname google.com

import socket
print(socket.gethostbyname('google.com'))

Output

172.217.5.110

Pandas

from pandas import DataFrame, Series, read_json
barset["EMA12"] = barset["o"].ewm(span=12).mean()
barset.to_json('ofile.json', date_format='iso', date_unit='s', orient='index')
df = read_json('ofile.json', orient='index', date_unit='s')
print(df.head())

Type Hints

from typing import List, Set, Dict, Tuple, Optional, Callable, Iterator, Union

# For simple built-in types, just use the name of the type
x: int = 1
x: float = 1.0
x: bool = True
x: str = "test"
x: bytes = b"test"

# For collections, the name of the type is capitalized, and the
# name of the type inside the collection is in brackets
x: List[int] = [1]
x: Set[int] = {6, 7}

# Same as above, but with type comment syntax
x = [1]  # type: List[int]

# For mappings, we need the types of both keys and values
x: Dict[str, float] = {'field': 2.0}

# For tuples, we specify the types of all the elements
x: Tuple[int, str, float] = (3, "yes", 7.5)

# Use Optional[] for values that could be None
x: Optional[str] = some_function()
# Mypy understands a value can't be None in an if-statement
if x is not None:
    print(x.upper())
# If a value can never be None due to some invariants, use an assert
assert x is not None
print(x.upper())
# This is how you annotate a function definition
def stringify(num: int) -> str:
    return str(num)

# And here's how you specify multiple arguments
def plus(num1: int, num2: int) -> int:
    return num1 + num2

# Add default value for an argument after the type annotation
def f(num1: int, my_float: float = 3.5) -> float:
    return num1 + my_float

# This is how you annotate a callable (function) value
x: Callable[[int, float], float] = f

# A generator function that yields ints is secretly just a function that
# returns an iterator of ints, so that's how we annotate it
def g(n: int) -> Iterator[int]:
    i = 0
    while i < n:
        yield i
        i += 1

# You can of course split a function annotation over multiple lines
def send_email(address: Union[str, List[str]],
               sender: str,
               cc: Optional[List[str]],
               bcc: Optional[List[str]],
               subject='',
               body: Optional[List[str]] = None
               ) -> bool:
    ...

Subprocesses

Using the subprocess library, you can execute other commands from within your script, and capture the standard output and standard error of those commands.

Capturing the standard output and standard error of the command hello

Shell script

#!/bin/zsh
# `hello` program

# Print one & two, separated by newline, to stdout
print 'one\ntwo' >&1

# Print 'three' to stderr
echo 'three' >&2

Python script

from subprocess import run

# Capture the output in a variable named 'result'
result = run(args=['hello'], capture_output=True)

# Decode the output
standard_output = result.stdout.decode()
print(standard_output)
# => 'one'
# => 'two'

# Decode the error
standard_error = result.stderr.decode()
print(standard_error)
# => 'three'
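
If you don't have a hello program on your PATH, the same idea can be sketched with a child Python process. This variant uses sys.executable to launch the child, and text=True so that stdout and stderr arrive as strings and no decode() call is needed; the child script is made up to mirror the shell script above.

```python
import subprocess
import sys

# A stand-in for the `hello` program: two lines to stdout, one to stderr
child = "import sys; print('one'); print('two'); print('three', file=sys.stderr)"

result = subprocess.run(
    [sys.executable, '-c', child],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the captured bytes into str
)

print(result.returncode)
print(result.stdout.splitlines())
print(result.stderr.strip())
```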

Writing a program that prints to stdout and stderr

from sys import stdout, stderr

# Method 1
print('standard output', file=stdout)
print('standard error', file=stderr)

# Method 2
stdout.write("standard output\n")
stderr.write("standard error\n")

Filepaths

from pathlib import Path

filepath = Path.home() / 'Downloads' / 'meme.jpg'
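
Once a path is built with the / operator, its pieces are available as attributes. The sketch below uses PurePosixPath with a hypothetical home directory, /Users/tommy, so the results are the same on every machine; a regular Path behaves the same way on macOS and Linux.

```python
from pathlib import PurePosixPath

home = PurePosixPath('/Users/tommy')  # hypothetical home directory
filepath = home / 'Downloads' / 'meme.jpg'

name = filepath.name             # final component, with extension
stem = filepath.stem             # final component, without extension
suffix = filepath.suffix         # the extension, including the dot
folder = str(filepath.parent)    # the containing directory
renamed = str(filepath.with_suffix('.png'))

print(name, stem, suffix, folder, renamed)
```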

Function Parameters

Python 3.8 introduced a function parameter syntax, /, to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments

def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)
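
A quick sketch of which calls this signature accepts. The function is redefined here to return its arguments instead of printing them, purely so the result is easy to inspect.

```python
def f(a, b, /, c, d, *, e, f):
    return (a, b, c, d, e, f)


# a and b are positional-only, c and d can go either way,
# e and f are keyword-only
valid = f(10, 20, 30, d=40, e=50, f=60)

# Passing b by keyword is rejected, because b sits before the /
try:
    f(10, b=20, c=30, d=40, e=50, f=60)
    error = None
except TypeError as exc:
    error = exc

print(valid)
print(type(error).__name__)
```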

One use case for this notation is that it allows pure Python functions to fully emulate behaviors of existing C coded functions. For example, the built-in divmod() function does not accept keyword arguments:

def divmod(a, b, /):
    "Emulate the built in divmod() function"
    return (a // b, a % b)

Python Imaging Library

pip install pillow

Create Blank White PNG File

from PIL import Image
Image.new('RGB', (1000,1000), (0xff, 0xff, 0xff)).save("image.png", "PNG")

Comprehensions

List Comprehension

text = "some text"
letters = [char for char in text if char != " "]

print(letters)
# output: ['s', 'o', 'm', 'e', 't', 'e', 'x', 't']

Perform set comprehension using a conditional statement:

cubed_even_numbers = {value**3 for value in range(1, 10) if value % 2 == 0}

print(cubed_even_numbers)
# output: {8, 64, 512, 216}

Iterables

An iterable is a collection of data that you can iterate over using a loop.

The simplest example of an iterable is a list of integers, such as [1, 2, 3, 4, 5, 6, 7]

However, it’s possible to iterate over other types of data like a str(), dict(), tuple(), set(), etc.

Verify an object is iterable by checking that it defines the __iter__() method

print(hasattr(str, '__iter__'))
# => True

print(hasattr(bool, '__iter__'))
# => False
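
What a for loop does with an iterable can be sketched by hand: iter() asks the object for an iterator, and next() pulls one item at a time until StopIteration is raised. The sample string is made up for illustration.

```python
letters = iter('abc')

first = next(letters)   # pulls 'a'
rest = list(letters)    # drains what is left: 'b' and 'c'

# An exhausted iterator signals the end with StopIteration
try:
    next(letters)
    exhausted = False
except StopIteration:
    exhausted = True

# Objects without __iter__ (such as a bool) are rejected by iter()
try:
    iter(True)
    bool_is_iterable = True
except TypeError:
    bool_is_iterable = False

print(first, rest, exhausted, bool_is_iterable)
```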

`argparse`

Attached below is a program I made to import CSV data exported from my Apple Card into the budgeting software YNAB

import webbrowser
from datetime import datetime
from sys import exit
from json import load, dumps
from os import getenv
from sys import stdin, stdout, stderr, argv
from csv import DictReader, DictWriter
from urllib.request import Request, urlopen

from argparse import ArgumentParser, FileType

parser = ArgumentParser(
    prog='ynab',
    usage='%(prog)s [CSV_FILE]',
    description='%(prog)s: a data pipeline'
)
parser.add_argument(
    '-v',
    "--verbose",
    dest='verbose',
    action='store_true',
    help='option to print CSV to stdout'
)
options = parser.parse_args()

endpoint = 'https://api.youneedabudget.com/v1/budgets/last-used'


def get_account_id():
    account_request = Request(
        url=f'{endpoint}/accounts'
    )

    account_request.add_header('Authorization', f'Bearer {api_token}')

    account_response = urlopen(account_request)
    accounts = load(account_response)['data']['accounts']

    ynab_account_id = None

    for account in accounts:
        if account['name'] == 'Apple Card':
            ynab_account_id = account['id']

    if ynab_account_id is None:
        raise (ValueError('ynab: unable to find account "Apple Card"\n'))

    return ynab_account_id


if (api_token := getenv('YNAB_TOKEN')) is None:
    exit('ynab: expected environment variable ${YNAB_TOKEN}')

# Force input to be provided via file redirection
if stdin.isatty():
    if len(argv) == 1:
        stderr.writelines([
            'ynab: please supply the CSV file via standard input\n',
            '\tusage: `ynab < ./Downloads/apple.csv > ~/ynab.csv`\n'
        ])
        exit(2)

apple_csv = DictReader(
    f=stdin,  # The file to read from (standard input)
    fieldnames=None,  # Assume the CSV file's first row contains the field names
    dialect='unix',  # Specify the encoding method for the CSV file
)

expected_fields = [
    'Transaction Date',
    'Clearing Date',
    'Description',
    'Merchant',
    'Category',
    'Type',
    'Amount (USD)'
]

if apple_csv.fieldnames != expected_fields:
    stderr.writelines([
        'ynab: problem reading CSV header row\n',
        f'\texpected:\t{expected_fields}\n',
        f'\treceived:\t{apple_csv.fieldnames}\n'
    ])
    exit(1)

ynab_csv = DictWriter(
    f=stdout,  # The file to write to (standard output)
    fieldnames=['Date', 'Payee', 'Memo', 'Amount'],  # Specify the field names
    dialect='unix',  # Specify the encoding method for the CSV file
)

csv_transactions = list()

api_transactions = list()

account_id = get_account_id()

# Create a list of api_transactions
for row in apple_csv:

    # Format the date from 01/13/2020 to 2020-01-13
    date = datetime.strptime(
        row['Transaction Date'], '%m/%d/%Y'
    ).date().isoformat()

    # Write an entry to the CSV file
    csv_transactions.append({
        'Date': date,
        'Payee': row['Merchant'],
        'Amount': '{:.2f}'.format(float(row['Amount (USD)']) * -1),
        'Memo': ''
    })

    # Store the next transaction as a dictionary, append it to the list
    api_transactions.append({
        'account_id': account_id,
        'date': date,
        'payee_name': row['Merchant'],
        'cleared': 'cleared',
        'approved': False,
        'amount': int(float(row['Amount (USD)']) * -1_000)
    })


if options.verbose:
    # Write the header row to the CSV file (the field names)
    ynab_csv.writeheader()
    # Write each transaction in the list to a row in the CSV file
    ynab_csv.writerows(csv_transactions)

data = {
    'transactions': api_transactions
}

transaction_request = Request(
    headers={
        'Authorization': f'Bearer {api_token}',
        "Content-Type": 'application/json',
    },
    url=f'{endpoint}/transactions',
    data=dumps(data).encode('utf-8')
)

transaction_response = urlopen(transaction_request)
# print(f'ynab: successfully imported {len(api_transactions)} into YNAB')

# Open YNAB for the user on their default web browser
webbrowser.open('https://app.youneedabudget.com')

Strings

There are two main ways to substitute values into a template string. You can use formatted string literals (more commonly known simply as "f-strings"), or you can use the old string formatting method, which uses the modulo % operator, reminiscent of the C-style printf() syntax.

Example using old-school string formatting:

# Set variables
age_of_austin = 23
age_of_val = 22

# Format the string
output = 'Austin is %d and Val is %d' % (age_of_austin, age_of_val)

# Print the formatted string
print(output)

Austin is 23 and Val is 22

Example using formatted string literals:

# Set variables
age_of_austin = 23
age_of_val = 22

# Format the string
output = f'Austin is {age_of_austin} and Val is {age_of_val}'

# Print the formatted string
print(output)

Austin is 23 and Val is 22

Python Caching

You can disable python caching entirely, preventing .pyc files from being written when source modules are imported.

As an environment variable
```
PYTHONDONTWRITEBYTECODE=1
```
As a command line argument
```
python -B script.py
```

With either setting, Python won’t try to write .pyc files when importing source modules.

Starting from Python 3.8, you can configure the environment to prevent Python from reading and writing __pycache__ directories, sourcing them instead from a separate location on the filesystem, specified by you.

As an environment variable
```
PYTHONPYCACHEPREFIX=path
```
As a command-line option
```
python -X pycache_prefix=path
```

Setting from within pythonrc.py

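import sys
from pathlib import Path
# Example location only; any writable directory will do
sys.pycache_prefix = str(Path.home() / '.cache' / 'pycache')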

`plistlib`

Convert an Apple Property List (.plist) file into a Python dictionary dict() object


import plistlib
from pathlib import Path
filepath = '/System/Applications/Utilities/Terminal.app/Contents/Info.plist'
path = Path(filepath)
plist = plistlib.load(path.open('rb'))
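
The path above only exists on macOS. As a portable sketch of the same conversion, plistlib can also work on bytes in memory: dumps() serializes a dictionary to the XML property list format, and loads() parses it back. The keys and values below are made up for illustration.

```python
import plistlib

# Hypothetical settings, standing in for the contents of an Info.plist
settings = {'CFBundleName': 'Terminal', 'LSMinimumSystemVersion': '10.15'}

# Serialize to XML property list bytes, then parse them back into a dict
raw = plistlib.dumps(settings)
restored = plistlib.loads(raw)

print(restored['CFBundleName'])
```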

Email

Gmail's API requires each email to be a MIME message, and their documentation teaches you how to create one.

Creating a MIME type message:

import base64
from email.mime.text import MIMEText


def create_message(sender, to, subject, message_text):
    """Create a message for an email.

    Args:
        sender: Email address of the sender.
        to: Email address of the receiver.
        subject: The subject of the email message.
        message_text: The text of the email message.

    Returns:
        An object containing a base64url encoded email object.
    """
    message = MIMEText(message_text)
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    # b64encode needs bytes in Python 3; decode so the value is a plain string
    return {'raw': base64.urlsafe_b64encode(message.as_bytes()).decode()}

Adding attachments to a MIME type message


import base64
import mimetypes
import os
from email import encoders
from email.mime.audio import MIMEAudio
from email.mime.base import MIMEBase
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def create_message_with_attachment(
        sender, to, subject, message_text, file):
    """Create a message for an email.

    Args:
        sender: Email address of the sender.
        to: Email address of the receiver.
        subject: The subject of the email message.
        message_text: The text of the email message.
        file: The path to the file to be attached.

    Returns:
        An object containing a base64url encoded email object.
    """
    message = MIMEMultipart()
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject

    msg = MIMEText(message_text)
    message.attach(msg)

    content_type, encoding = mimetypes.guess_type(file)

    if content_type is None or encoding is not None:
        content_type = 'application/octet-stream'
    main_type, sub_type = content_type.split('/', 1)

    # Read the attachment once; the with-block closes the file
    with open(file, 'rb') as fp:
        payload = fp.read()

    if main_type == 'text':
        # MIMEText expects str in Python 3 (assumes a UTF-8 text file)
        msg = MIMEText(payload.decode(), _subtype=sub_type)
    elif main_type == 'image':
        msg = MIMEImage(payload, _subtype=sub_type)
    elif main_type == 'audio':
        msg = MIMEAudio(payload, _subtype=sub_type)
    else:
        msg = MIMEBase(main_type, sub_type)
        msg.set_payload(payload)
        # Raw bytes must be base64 encoded before the message is serialized
        encoders.encode_base64(msg)
    filename = os.path.basename(file)
    msg.add_header('Content-Disposition', 'attachment', filename=filename)
    message.attach(msg)

    return {'raw': base64.urlsafe_b64encode(message.as_bytes()).decode()}

Sending messages

# Assumes the google-api-python-client package, which provides googleapiclient
from googleapiclient import errors


def send_message(service, user_id, message):
    """Send an email message.

    Args:
        service: Authorized Gmail API service instance.
        user_id: User's email address. The special value "me"
        can be used to indicate the authenticated user.
        message: Message to be sent.

    Returns:
        Sent Message.
    """
    try:
        message = (service.users().messages().send(userId=user_id, body=message)
                   .execute())
        print('Message Id: %s' % message['id'])
        return message
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
Jupyter Notebook

To get started, you'll need to install some packages

pip install notebook ipywidgets

By default, when you launch a Jupyter notebook, it will be hosted at 127.0.0.1 on port 8888

Launching a Jupyter notebook without opening a browser window:
```
jupyter notebook --no-browser
```

Passwords

Changing a notebook's password the proper way

First, enter a Python shell
```
python
```

Run the passwd() function in the notebook library

from notebook.auth import passwd
passwd()
# Enter password:
# Verify password:
# => 'sha1:67c9e60bb8b6:9ffede0825894254b2e042ea597d771089e11aed'

Edit your jupyter_notebook_config.py file

# The password should be of the form 'type:salt:hash'
c.NotebookApp.password = 'sha1:0827b2390e3d:b54ee3e38895aaccc182705ad174bfb3c6e86a10'

Changing a notebook's password the lazy way

Edit your jupyter_notebook_config.py file

```py
from notebook.auth import passwd
c.NotebookApp.password = passwd('lol_nobody_will_see_this')
```

Plotly

Installing plotly
```
pip install plotly # lol go figure
```

Render a bar graph figure

import plotly.graph_objects as go
figure = go.Figure(data=go.Bar(y=[2, 3, 1]))
figure.show()

Pandas

import pandas as pd
import sys

# Define an empty DataFrame with labeled rows and columns
df = pd.DataFrame(
  index=['a', 'b', 'c'],
  columns=['time', 'date', 'name']
  )

# access the first row
df.loc['a']
# equivalent
df.iloc[0]

# select the date column from all rows, starting at the row labeled 'b' (label slices include the start)
df.loc['b': , 'date']
# equivalent
df.iloc[1: , 1]

# select all rows from the column labeled "time"
df['time']
# equivalent
df.loc[:, 'time']

# select two columns, 'time' and 'date', from all rows
df[['time', 'date']]

# select the 1st & 3rd rows only, and the column 'date'
bool_array = [True, False, True]
df.loc[bool_array , 'date']

# select the 1st & 3rd columns only, and all rows
df.loc[: , bool_array]

Package Management

`pyproject.toml`

Relevant Python documentation: Declaring Project Metadata

Relevant setuptools documentation: Configuring setuptools using pyproject.toml files

Example pyproject.toml boilerplate:

[project]
name = "myproject"
authors = [
    {name = "Austin Traver", email = "austintraver@gmail.com"},
]
maintainers = [
    {name = "Austin Traver", email = "austintraver@gmail.com"},
]
readme = "README.md"
license = {file = "LICENSE"}
description = "An example project to copy-paste into future projects."
# Reference:https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata
dynamic = ["version"]
requires-python = ">=3.10,<3.12"
dependencies = [
    "jsonschema ~= 3.2",
    "PyYAML ~= 6.0",
]
# Specify the PEP-508 "extras"
# This package, along with its extras,
# can be installed in 'editable mode' using the following command:
# `pip install --editable '.[dev]'`
[project.optional-dependencies]
dev = [
    "autopep8",
    "mypy",
    "pytest",
]

[project.urls]
homepage = "https://github.com/austintraver/myexample"
documentation = "https://github.com/austintraver/myexample/tree/main/docs"
repository = "https://github.com/austintraver/myexample"
changelog = "https://github.com/austintraver/myexample/releases"

[project.scripts]
# This will create the command `mycli` in your shell, and calling it will invoke the function `main()` within your package.
mycli = "mypackage.__main__:main"

[build-system]
requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"

# Reference: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata
[tool.setuptools]
packages = [
    "mypackage",
    "mypackage.cli",
    "mypackage.templates"
]
# Reference: https://setuptools.pypa.io/en/latest/userguide/datafiles.html
# Default value for 'include-package-data': true
include-package-data = true
# If not specified, setuptools will try to guess a reasonable default for the package
# Reference: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html
zip-safe = false

# Documentation: https://github.com/pypa/setuptools_scm
[tool.setuptools_scm]

# Documentation: https://docs.pytest.org/en/latest/reference/customize.html#pyproject-toml
# Configurations: https://docs.pytest.org/en/latest/reference/reference.html#ini-options-ref
[tool.pytest.ini_options]
minversion = "7.2"

Dynamic Version Numbering

The article Dynamic Versioning in the setuptools documentation explains more about dynamic fields.

[project]
dynamic = ["version"]

[build-system]
requires = ["setuptools>=61", "wheel", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"

[tool.setuptools_scm]

from argparse import ArgumentParser
from importlib.metadata import version
parser = ArgumentParser()
parser.add_argument(
    '--version',
    help='installed version of `mypackage`',
    action='version',
    version=f"v{version('mypackage')}"
)
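
When the package is run from a source checkout that has not been installed, `importlib.metadata.version` raises `PackageNotFoundError`. A minimal sketch of a fallback follows. The helper name `get_version` and the placeholder string `0+unknown` are assumptions made for illustration, not something taken from the setuptools documentation.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(distribution: str, fallback: str = "0+unknown") -> str:
    """Return the installed version of `distribution`, or `fallback`.

    Both the helper name and the fallback string are illustrative assumptions.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Raised when no installed distribution has this name,
        # e.g. when running straight from a source checkout.
        return fallback


if __name__ == "__main__":
    print(f"v{get_version('mypackage')}")
```

The returned string can be passed to the `version=` argument of the `--version` action in place of the direct call to `version('mypackage')`.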

Installing packages from GitHub

You can install packages from GitHub repositories (even private ones!) via SSH by using the following syntax:¹

pip install 'PACKAGE_NAME @ git+ssh://git@github.com/OWNER/REPO@REF'

Example of a branch endpoint:

package @ git+https://github.com/OWNER/REPO@main

Example of a tag endpoint:

package @ git+https://github.com/OWNER/REPO@v0.1.0

Example of a commit hash endpoint:

package @ git+https://github.com/OWNER/REPO@fd80709

Example of a pull-request endpoint:

package @ git+https://github.com/OWNER/REPO@refs/pull/123/head

Here, REF is the git reference (branch, tag, commit hash, or pull-request ref) that you want to install. The examples above use HTTPS; the same @REF suffix works with git+ssh. You can even omit the @REF, and pip will install from the head of the repository's default branch.

Note: You can check what the latest release is by going to this endpoint:

https://github.com/OWNER/REPO/releases/latest
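
The requirement strings in this section can also be assembled programmatically, for example when generating a requirements file. A minimal sketch, assuming a hypothetical helper named `github_requirement` (it is not part of pip):

```python
from typing import Optional


def github_requirement(
    name: str,
    owner: str,
    repo: str,
    ref: Optional[str] = None,
    ssh: bool = True,
) -> str:
    """Build a 'NAME @ git+...' requirement string for a GitHub repository."""
    base = "git+ssh://git@github.com" if ssh else "git+https://github.com"
    url = f"{base}/{owner}/{repo}"
    if ref:
        # Branch, tag, commit hash, or a ref such as refs/pull/123/head.
        url = f"{url}@{ref}"
    return f"{name} @ {url}"


if __name__ == "__main__":
    print(github_requirement("PACKAGE_NAME", "OWNER", "REPO", "v0.1.0"))
    # prints: PACKAGE_NAME @ git+ssh://git@github.com/OWNER/REPO@v0.1.0
```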

Installing development dependencies

PEP 508 lets you declare extras for a package: named sets of optional dependencies or features that are installed only on request. For example, we can create a dev extra for the developer dependencies by adding the following section to pyproject.toml, and then install it with pip install --editable '.[dev]':

[project.optional-dependencies]
dev = [
    "autopep8",
    "mypy",
    "pytest",
]

First-time contributors to the project can then initialize the repository on their local machine using the following commands:

python3.10 -m venv .venv \
    --clear \
    --upgrade-deps

source .venv/bin/activate

python -m pip install \
    --upgrade \
    --upgrade-strategy 'only-if-needed' \
    wheel

python -m pip install \
    --editable \
    --upgrade \
    --upgrade-strategy 'only-if-needed' \
    '.[dev]'

`pytest`

Testing that an invocation of code causes a specific exit code:

import re
import pytest
from mymodule import mycli

def test_help(capsys):
    with pytest.raises(SystemExit) as tested_exit:
        mycli(['--help'])
    # These assertions must sit outside the `with` block:
    # anything after the raising call inside the block never runs.
    assert tested_exit.type == SystemExit
    assert tested_exit.value.code == 0
    captured = capsys.readouterr()
    assert '-h, --help' in captured.out

def test_version(capsys):
    with pytest.raises(SystemExit) as tested_exit:
        mycli(['--version'])
    assert tested_exit.type == SystemExit
    assert tested_exit.value.code == 0
    captured = capsys.readouterr()
    # Escape the dots so they match a literal '.' rather than any character.
    assert re.fullmatch(r'v\d+\.\d+\.\d+', captured.out.strip())
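
For context, here is a sketch of the kind of entry point these tests exercise. The body of `mycli`, the hard-coded version string, and the helper `exit_code_of` are all assumptions made for illustration; the real `mymodule.mycli` may look different. It relies only on `argparse`, whose `--help` and `version` actions print to standard output and then raise `SystemExit` with code 0.

```python
import argparse
from typing import Optional, Sequence


def mycli(argv: Optional[Sequence[str]] = None) -> None:
    """Hypothetical entry point; the version is hard-coded for this sketch."""
    parser = argparse.ArgumentParser(prog="mycli")
    parser.add_argument("--version", action="version", version="v0.1.0")
    parser.parse_args(argv)


def exit_code_of(argv: Sequence[str]):
    """Run `mycli` and return the code carried by SystemExit, or None."""
    try:
        mycli(argv)
    except SystemExit as exc:
        return exc.code
    return None


if __name__ == "__main__":
    print(exit_code_of(["--version"]))  # argparse exits with 0 after printing
    print(exit_code_of(["--bogus"]))    # argparse exits with 2 on a usage error
```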

`mypy`

python -m mypy \
    --install-types \
    --non-interactive \
    "${ROOT_DIR}/package_1_top_folder/" \
    "${ROOT_DIR}/package_2_top_folder/"

Python Wheels and Docker Images

You can use a multi-stage Docker build to build the wheels for your Python package and its dependencies in a dedicated stage, and then copy only the wheels into the final image. The compilers and headers needed to build the wheels stay behind in the builder stage, which keeps the final image small. Installing from pre-built wheels also means the final stage does not have to compile anything.

# ---------------------------------------------------
# Build the wheels in a dedicated builder stage.
FROM public.ecr.aws/docker/library/python:3.10-alpine AS builder

# Set environment values.
ENV WORKDIR=/opt/worker
WORKDIR ${WORKDIR}

# Install needed system packages for `grpcio`
RUN apk add --no-cache build-base linux-headers

# Build wheels for the SDK and its dependencies.
COPY src/sdk ${WORKDIR}/sdk
RUN python -m pip wheel \
    --disable-pip-version-check \
    --no-cache-dir \
    --wheel-dir /wheel \
    ${WORKDIR}/sdk

# Build wheels for the dependencies of this package.
COPY src/requirements.txt ${WORKDIR}/
RUN python -m pip wheel \
    --disable-pip-version-check \
    --no-cache-dir \
    --wheel-dir /wheel \
    --requirement ${WORKDIR}/requirements.txt

# ---------------------------------------------------
# Build the final image.
FROM public.ecr.aws/docker/library/python:3.10-alpine

ENV WORKDIR=/opt/worker
WORKDIR ${WORKDIR}

# Include 'nmap', which is needed by the SDK.
RUN apk add --no-cache git nmap nmap-scripts

# Copy the wheels over from the builder stage.
COPY --from=builder /wheel/ /wheel

# Install the dependencies of this package,
# using the wheels from the builder stage.
COPY src/requirements.txt ${WORKDIR}/
RUN python -m pip install \
    --disable-pip-version-check \
    --progress-bar off \
    --no-cache-dir \
    --no-index \
    --find-links /wheel \
    --requirement ${WORKDIR}/requirements.txt

# Install the SDK and its dependencies,
# using the wheels from the builder stage.
COPY src/sdk ${WORKDIR}/sdk
RUN python -m pip install \
    --disable-pip-version-check \
    --progress-bar off \
    --no-cache-dir \
    --no-index \
    --find-links /wheel \
    ${WORKDIR}/sdk

# Include source code
COPY src/pkg/ ${WORKDIR}/

ENTRYPOINT python ${WORKDIR}/main.py

In order to get it to run on multiple architectures, I had to consult the Docker documentation for buildx, which suggested that I deploy a custom builder:

docker buildx create \
    --name 'custom-builder' \
    --driver 'docker-container' \
    --platform 'darwin,linux/amd64,linux/arm64' \
    --bootstrap

Then, I could use this custom builder:

docker buildx use 'custom-builder'

Once I had done so, the following command successfully built images for multiple architectures, and no longer produced error messages on my M1 Mac:

docker buildx build \
    --platform 'linux/amd64,linux/arm64' \
    --progress plain \
    --build-arg BUILDKIT_CONTEXT_KEEP_GIT_DIR=1 \
    --build-arg BUILDKIT_MULTI_PLATFORM=1 \
    --file path/to/Dockerfile \
    --tag 'localhost/imagename' \
    --output=docker \
    .
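
To confirm which variant of a multi-architecture image a container is actually running, the Python process inside it can report its own platform. A minimal sketch, assuming a hypothetical helper named `docker_platform` and a small lookup table of my own that covers only the two architectures built above:

```python
import platform
from typing import Optional

# Hypothetical lookup table: platform.machine() values -> Docker architecture names.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def docker_platform(system: Optional[str] = None, machine: Optional[str] = None) -> str:
    """Return an 'os/arch' string such as 'linux/arm64' for the running interpreter."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    # Fall back to the raw machine name for architectures not in the table.
    return f"{system}/{ARCH_ALIASES.get(machine, machine)}"


if __name__ == "__main__":
    print(docker_platform())
```

Running this inside each image variant should print the platform string that variant was built for.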

If you need to push a multi-architecture image to Amazon ECR, you'll first need to authenticate to the private registry:

aws ecr get-login-password --region REGION \
| docker login \
    --username 'AWS' \
    --password-stdin 'AWS_ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com'

Note: The username AWS is supposed to stay unchanged. Don't replace it with your own username.

Once you have authenticated, you can push the Docker image to the Amazon ECR registry:

docker push AWS_ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/MY_ECR_REPO:TAG

If you can't get Podman to build AMD64 images on your M1 Mac, you can install QEMU user-mode emulation inside the Podman machine and then reboot it:

podman machine ssh sudo rpm-ostree install qemu-user-static
podman machine ssh sudo systemctl reboot

After the machine reboots, your build issues should go away, and a build like the following should succeed:

podman build \
    --no-cache \
    --platform linux/amd64 \
    --tag myimage:latest \
    --file "path/to/Dockerfile" \
    .

¹ pip documentation: VCS Support
