pugh_torch.modules package¶

Submodules¶

pugh_torch.modules.activation module¶

Easy interface for swapping out activation functions, especially those that may have different weight initialization methods.

Layer order: weights -> normalization -> activation. The weight initialization depends on the activation function that follows.

To create a new activation, do the following:

Inherit from ActivationModule to register
[optional] implement init_layer method
[optional] implement init_first_layer method

Once this is done, your activation function will be available as:

Activation("myactivationlowercase", **kwargs)
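The registration steps above can be sketched without torch. This is only a minimal illustration of name-based registration, not the pugh_torch implementation: `ActivationBase`, `MyActivation`, and `make_activation` are hypothetical stand-ins for `ActivationModule`, a user-defined activation, and the `Activation` factory, and the use of `__init_subclass__` is an assumption about how such a registry could work.

```python
class ActivationBase:
    """Hypothetical stand-in for ActivationModule (no torch dependency)."""

    registry = {}  # lowercase class name -> class

    def __init_subclass__(cls, **kwargs):
        # Every subclass registers itself under its lowercase name.
        super().__init_subclass__(**kwargs)
        ActivationBase.registry[cls.__name__.lower()] = cls

    def init_layer(self, m):
        """Optional hook: initialize a layer that feeds this activation."""

    def init_first_layer(self, m):
        """Optional hook: initialize the first layer of the network."""


class MyActivation(ActivationBase):
    """Toy scaled-ReLU activation on plain floats."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def __call__(self, x):
        return self.scale * max(x, 0.0)


def make_activation(name, **kwargs):
    """Look up a registered activation by name and construct it."""
    return ActivationBase.registry[name.lower()](**kwargs)


act = make_activation("myactivation", scale=2.0)
```

Inheriting from the base class is the only step needed for registration in this sketch; the two `init_*` hooks stay optional, as in the steps above.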

pugh_torch.modules.activation.Activation(name, init_layers=None, *, first=False, **kwargs)[source]¶

Activation Factory Function

Parameters

name (str) – Activation function type
init_layers (nn.Module or list of nn.Module) – Layers whose weights need initialization based on the activation function.
kwargs (dict) – Passed along to activation function constructor.

class pugh_torch.modules.activation.ActivationModule[source]¶

Bases: torch.nn.modules.module.Module

Only used to automatically register activation functions.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_first_layer(m)[source]¶: Override this in child activation function

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.CELU(alpha: float = 1.0, inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.CELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

alpha = None¶

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.ELU(alpha: float = 1.0, inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.ELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

alpha = None¶

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.GELU[source]¶

Bases: torch.nn.modules.activation.GELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.Hardshrink(lambd: float = 0.5)[source]¶

Bases: torch.nn.modules.activation.Hardshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

lambd = None¶

class pugh_torch.modules.activation.Hardsigmoid(inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.Hardsigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.Hardswish(inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.Hardswish, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.Hardtanh(min_val: float = -1.0, max_val: float = 1.0, inplace: bool = False, min_value: Optional[float] = None, max_value: Optional[float] = None)[source]¶

Bases: torch.nn.modules.activation.Hardtanh, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

max_val = None¶

min_val = None¶

class pugh_torch.modules.activation.LeakyReLU(negative_slope: float = 0.01, inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.LeakyReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

negative_slope = None¶

class pugh_torch.modules.activation.LogSigmoid[source]¶

Bases: torch.nn.modules.activation.LogSigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None)[source]¶

Bases: torch.nn.modules.activation.MultiheadAttention, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

bias_k = None¶

bias_v = None¶

class pugh_torch.modules.activation.Noop[source]¶

Bases: pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training = None¶

class pugh_torch.modules.activation.PReLU(num_parameters: int = 1, init: float = 0.25)[source]¶

Bases: torch.nn.modules.activation.PReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

num_parameters = None¶

class pugh_torch.modules.activation.RReLU(lower: float = 0.125, upper: float = 0.3333333333333333, inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.RReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

lower = None¶

upper = None¶

class pugh_torch.modules.activation.ReLU(inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.ReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.ReLU6(inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.ReLU6, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

max_val = None¶

min_val = None¶

class pugh_torch.modules.activation.SELU(inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.SELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

inplace = None¶

class pugh_torch.modules.activation.Sigmoid[source]¶

Bases: torch.nn.modules.activation.Sigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.Sine(frequency=30)[source]¶

Bases: pugh_torch.modules.activation.ActivationModule

Implicit Neural Representations with Periodic Activation Functions https://arxiv.org/pdf/2006.09661.pdf

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_first_layer(m)[source]¶: Override this in child activation function

init_layer(m)[source]¶: Override this in child activation function

training = None¶
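As a rough illustration of the periodic activation from the cited paper, the sketch below applies sin(frequency * x) to a scalar and computes the uniform initialization bounds the paper proposes. The helper names are hypothetical, and whether `init_layer` and `init_first_layer` in this class use exactly these bounds is an assumption.

```python
import math


def sine(x, frequency=30):
    """Periodic activation: sin(frequency * x)."""
    return math.sin(frequency * x)


def siren_uniform_bound(fan_in, frequency=30, first=False):
    """Bound c such that weights are drawn from U(-c, c).

    First layer: 1 / fan_in.
    Later layers: sqrt(6 / fan_in) / frequency.
    """
    if first:
        return 1.0 / fan_in
    return math.sqrt(6.0 / fan_in) / frequency


first_bound = siren_uniform_bound(256, first=True)
hidden_bound = siren_uniform_bound(256)
```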

class pugh_torch.modules.activation.Softplus(beta: int = 1, threshold: int = 20)[source]¶

Bases: torch.nn.modules.activation.Softplus, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

beta = None¶

init_layer(m)[source]¶: Override this in child activation function

threshold = None¶

class pugh_torch.modules.activation.Softshrink(lambd: float = 0.5)[source]¶

Bases: torch.nn.modules.activation.Softshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

lambd = None¶

class pugh_torch.modules.activation.Softsign[source]¶

Bases: torch.nn.modules.activation.Softsign, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.Tanh[source]¶

Bases: torch.nn.modules.activation.Tanh, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.Tanhshrink[source]¶

Bases: torch.nn.modules.activation.Tanhshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]¶: Override this in child activation function

training = None¶

class pugh_torch.modules.activation.Threshold(threshold: float, value: float, inplace: bool = False)[source]¶

Bases: torch.nn.modules.activation.Threshold, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

inplace = None¶

threshold = None¶

value = None¶

pugh_torch.modules.conv module¶

pugh_torch.modules.conv.conv1x1(in_planes, out_planes, stride=1)[source]¶: 1x1 convolution

pugh_torch.modules.conv.conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1)[source]¶: 3x3 convolution with padding
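To see what "with padding" implies for spatial size, the sketch below evaluates the standard convolution output-size formula. It assumes that `conv3x3` sets its padding equal to `dilation` and that `conv1x1` uses no padding; both are assumptions, as the source does not state the padding values. `conv_out_size` is a hypothetical helper.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# 3x3 convolution, padding == dilation (assumed): size is preserved at stride 1.
same_3x3 = conv_out_size(32, kernel=3, padding=1, dilation=1)
dilated_3x3 = conv_out_size(32, kernel=3, padding=2, dilation=2)
strided_3x3 = conv_out_size(32, kernel=3, stride=2, padding=1)

# 1x1 convolution, no padding (assumed).
same_1x1 = conv_out_size(32, kernel=1)
strided_1x1 = conv_out_size(32, kernel=1, stride=2)
```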

pugh_torch.modules.hash module¶

Right now, only RandHashProj is recommended for use.

PyTorch hashing code is based on code from: https://github.com/ma3oun/hrn

Values are stored as gradient-less parameters so they get properly saved.

class pugh_torch.modules.hash.BinaryMHash(*args, **kwargs)[source]¶

Bases: pugh_torch.modules.hash.MHash

Special case of MHash where the output is in the set {-1, 1}

Parameters: p (int) – Large prime number, larger than the size of the universe input set. The default value is a random large prime number that should be sufficient for most use-cases.

hash(x)[source]¶

training = None¶

class pugh_torch.modules.hash.Hash(m)[source]¶

Bases: torch.nn.modules.module.Module

Base module for other pytorch hash functions.

Variables: dim (torch.Tensor) – Scalar output dimensionality (output hash size)
Parameters: m (int) – Output size of this hash function

forward(x)[source]¶

Parameters: x (torch.Tensor) – Tensor of any shape; this hash function will be applied element-wise.

hash(x)[source]¶

training = None¶

class pugh_torch.modules.hash.MHash(m, p=None, a=None, b=None)[source]¶

Bases: pugh_torch.modules.hash.Hash

Multiplicative Universal Hashing

Universal hash function first described by Lawrence Carter and Mark Wegman.

See: https://jeffe.cs.illinois.edu/teaching/datastructures/notes/12-hashing.pdf

output = ((a * input + b) % p) % m

Parameters

m (int) – Size of output hash.
p (int) – Large prime number, larger than the size of the universe input set. Defaults to a random prime at least 10x bigger than m.
a (int) – Salt A. If not explicitly set, randomly initialized.
b (int) – Salt B. If not explicitly set, randomly initialized.
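The formula above can be checked by hand with small numbers. The sketch below is plain Python rather than the module itself; `mhash` is a hypothetical helper, and the fixed values of `p`, `a`, and `b` are chosen only for illustration.

```python
def mhash(x, m, p, a, b):
    """Multiplicative universal hash: ((a * x + b) % p) % m."""
    return ((a * x + b) % p) % m


m, p = 10, 101  # output size, and a prime larger than any input used below
a, b = 3, 7     # fixed salts for reproducibility

hashed = [mhash(x, m, p, a, b) for x in (5, 50, 100)]
# hashed == [2, 6, 4]
```

Every output lies in {0, 1, ..., m-1}, and changing the salts `a` and `b` gives a different member of the same hash family.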

classmethod from_offset(m, p, *args, **kwargs)[source]¶

Set prime number via index into a list of primes starting from 3.

Parameters: p (int) – Index into list of primes to use.

hash(x)[source]¶

training = None¶

class pugh_torch.modules.hash.MHashProj(out_feat)[source]¶

Bases: torch.nn.modules.container.ParameterDict

Hashes and projects an arbitrary-feature-length input into a fixed-feature-length output.

Applies a random feature hashing function.

This is the function PHI described in section 2 of: https://arxiv.org/abs/2010.05880

Parameters: out_feat (int) – The output hash embedding size

forward(x)[source]¶

Parameters: x (torch.Tensor) – (b, input_feat) Tensor to hash
Returns: (b, output_feat) Hashed tensor
Return type: torch.Tensor

classmethod from_hashers(hash_h, hash_xi)[source]¶

More advanced initialization from externally defined hashers.

Parameters

hash_h (pugh_torch.modules.Hash) – Hashing function that outputs in set {0, 1, ..., out_feat-1}
hash_xi (pugh_torch.modules.Hash) – Binary hashing function that outputs in set {-1, 1}
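How the two hashers could combine is sketched below in plain Python, following the usual feature-hashing construction: `hash_h` picks the output slot for each input index and `hash_xi` picks its sign. All function names are hypothetical, and hashing the feature index (rather than the feature value) is an assumption about how the module applies them.

```python
def make_mhash(m, p, a, b):
    """Return h(i) = ((a * i + b) % p) % m with outputs in {0, ..., m-1}."""
    return lambda i: ((a * i + b) % p) % m


def make_binary_mhash(p, a, b):
    """Return xi(i) with outputs in {-1, 1}, built from a 2-bucket hash."""
    base = make_mhash(2, p, a, b)
    return lambda i: 2 * base(i) - 1


def hash_project(x, hash_h, hash_xi, out_feat):
    """Sum each signed input element into the slot chosen by hash_h."""
    out = [0.0] * out_feat
    for i, value in enumerate(x):
        out[hash_h(i)] += hash_xi(i) * value
    return out


hash_h = make_mhash(4, 101, 3, 7)
hash_xi = make_binary_mhash(103, 5, 1)
y = hash_project([1.0, 2.0, 3.0, 4.0, 5.0], hash_h, hash_xi, 4)
# y == [-4.0, 3.0, -2.0, 6.0]
```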

prime_offset = 2000¶

to(*args, **kwargs)[source]¶: Records the device to self.device

training = None¶

class pugh_torch.modules.hash.RandHashProj(out_feat, sparse=None)[source]¶

Bases: torch.nn.modules.module.Module

We can just extend a single projection matrix without the need for two separate hash functions.

This algorithm deterministically maps an arbitrarily long in_feat vector into a fixed-length out_feat vector. It accomplishes this by the following algorithm:

For each element in the input feature vector:

Based on the index, deterministically multiply it by 1 or -1

Based on the index, deterministically map it to a single element in the output feature vector.

Each element in the output feature vector is the sum of all the input elements mapped to it.
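The algorithm above can be sketched as a projection matrix with exactly one +1 or -1 per column. This is a plain-Python illustration under stated assumptions, not the module's code: `build_projection` and `project` are hypothetical names, and seeding Python's `random.Random` is just one way to make the mapping deterministic.

```python
import random


def build_projection(out_feat, in_feat, seed=0):
    """Return an (out_feat, in_feat) matrix with one +1/-1 entry per column."""
    rng = random.Random(seed)
    proj = [[0.0] * in_feat for _ in range(out_feat)]
    for col in range(in_feat):
        row = rng.randrange(out_feat)     # output element for this input index
        sign = rng.choice((-1.0, 1.0))    # sign for this input index
        proj[row][col] = sign
    return proj


def project(proj, x):
    """Each output element is the signed sum of the inputs mapped to it."""
    return [sum(w * v for w, v in zip(row, x)) for row in proj]


proj = build_projection(out_feat=4, in_feat=10)
y = project(proj, [1.0] * 10)
```

Because the columns are drawn in order from the same seed, a wider matrix agrees with a narrower one on their shared columns, which is what lets a single matrix be extended as longer inputs arrive.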

Variables

proj (torch.nn.Parameter) – (out_feat, in_feat) where in_feat is the maximum input feature size fed through yet.

Parameters

out_feat (int) – Output feature size
sparse (bool) – Use a sparse representation for the internal projection matrix. Saves a good amount of memory when out_feat>5, which is a pretty typical use-case. Defaults to whichever representation would be more memory efficient.

forward(x)[source]¶

Parameters: x (torch.Tensor) – (B, N) feature vector

property sparse¶

training = None¶

pugh_torch.modules.hash.primes(n, copy=False, cache=True)[source]¶

Returns an array of primes, 3 <= p < n

This is very fast; the following takes <1 second::

    res = primes(100_000_000)
    assert len(res) == 5_761_454
    assert res[0] == 3
    assert res[-1] == 99_999_989

Caches the largest n array for future calls.

Modified from: https://stackoverflow.com/a/3035188/13376237

Parameters

n (int) – Generate primes up to this number
copy (bool) – Copy the output array from the internal cache. Only set to true if you intend to modify the returned array inplace. Defaults to False.
cache (bool) – Use the internal cache for generating/storing prime values. Defaults to True.

Returns

Array of output primes. Do not modify this array inplace unless you set copy=True

Return type

numpy.ndarray
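For reference, the contract described above (all primes p with 3 <= p < n) can be reproduced with a basic sieve of Eratosthenes. This pure-Python sketch is far slower than the NumPy version and has no cache; `primes_from_3` is a hypothetical name.

```python
def primes_from_3(n):
    """Return all primes p with 3 <= p < n using a simple sieve."""
    if n <= 3:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p * p.
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return [p for p in range(3, n) if is_prime[p]]


small = primes_from_3(30)
# small == [3, 5, 7, 11, 13, 17, 19, 23, 29]
```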

pugh_torch.modules.hash.primes_index(i)[source]¶

Get the prime value at index i.

Parameters: i (int) – Index into the list of primes (starting at 3) to get.
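The indexing convention, where index 0 is the prime 3, can be illustrated with a small generator. This sketch uses trial division and is not how the module computes the value; `odd_primes` and `prime_at_index` are hypothetical names.

```python
from itertools import count, islice


def odd_primes():
    """Yield 3, 5, 7, 11, ... by trial division against earlier primes."""
    found = []
    for candidate in count(3, 2):
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate


def prime_at_index(i):
    """Return the i-th prime in the list that starts at 3."""
    return next(islice(odd_primes(), i, None))


first_five = [prime_at_index(i) for i in range(5)]
# first_five == [3, 5, 7, 11, 13]
```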

pugh_torch.modules.init module¶

Equivalent names:

he == kaiming
xavier == glorot

Rules of thumb collected from various sources:

Use He for ReLU
Use xavier for tanh
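The difference between the two schemes shows up in the standard deviation each one targets. The sketch below computes the textbook values for the normal-distribution variants (He with fan_in mode, Xavier with gain 1); the helper names are hypothetical, and the exact variants used by `he` and `xavier` in this module are not stated in the source.

```python
import math


def he_std(fan_in):
    """He/Kaiming normal std for ReLU layers: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def xavier_std(fan_in, fan_out):
    """Xavier/Glorot normal std: sqrt(2 / (fan_in + fan_out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


relu_layer_std = he_std(512)           # 0.0625
tanh_layer_std = xavier_std(512, 512)  # about 0.0442
```

For a square layer, He initialization is larger by a factor of sqrt(2), which compensates for ReLU zeroing roughly half of its inputs.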

pugh_torch.modules.init.he(m, mode='fan_in', **kwargs)[source]¶

pugh_torch.modules.init.xavier(m, **kwargs)[source]¶

pugh_torch.modules.lightning_module module¶

Extends pytorch-lightning’s LightningModule for some quality-of-life improvements.

class pugh_torch.modules.lightning_module.LightningModule(*args, **kwargs)[source]¶

Bases: pugh_torch.modules.load_state_dict_mixin.LoadStateDictMixin, pytorch_lightning.core.lightning.LightningModule

configure_optimizers()[source]¶: Pretty good defaults, can be easily overridden

training = None¶

pugh_torch.modules.load_state_dict_mixin module¶

class pugh_torch.modules.load_state_dict_mixin.LoadStateDictMixin[source]¶

Bases: object

load_state_dict(state_dict, strict=True)[source]¶

When strict=False, confirms and logs which weights were actually loaded, so you can check them against the weights you expected to load.

Returns

changed (list of str) – Parameter names that were updated via loading the state_dict
unchanged (list of str) – Parameter names that were NOT updated via loading the state_dict
shape_mismatch (list of str) – Parameter names that were NOT updated via loading the state_dict due to the model’s parameter shape not matching the state_dict. This is strictly a subset of unchanged.
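The relationship between the three returned lists can be sketched with shapes alone. This is a simplified stand-in that compares parameter names and shapes held in plain dicts; the real method works on tensors, and treating every shape-compatible entry as "changed" is an assumption. `compare_state` is a hypothetical helper.

```python
def compare_state(model_shapes, state_shapes):
    """Classify model parameters by whether a state dict could update them."""
    changed, unchanged, shape_mismatch = [], [], []
    for name, shape in model_shapes.items():
        if name not in state_shapes:
            unchanged.append(name)        # nothing to load for this parameter
        elif state_shapes[name] != shape:
            unchanged.append(name)        # skipped because shapes differ
            shape_mismatch.append(name)
        else:
            changed.append(name)
    return changed, unchanged, shape_mismatch


model = {"conv.weight": (64, 3, 3, 3), "fc.weight": (10, 512), "fc.bias": (10,)}
checkpoint = {"conv.weight": (64, 3, 3, 3), "fc.weight": (1000, 512)}
changed, unchanged, shape_mismatch = compare_state(model, checkpoint)
```

Here `fc.weight` appears in both `unchanged` and `shape_mismatch`, while `fc.bias` is only in `unchanged` because the checkpoint has no entry for it.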

pugh_torch.modules.meta module¶

class pugh_torch.modules.meta.BatchLinear(in_features: int, out_features: int, bias: bool = True)[source]¶

Bases: torch.nn.modules.linear.Linear

Linear layer that can take batched weights and bias at runtime.

Technically a little wasteful because we might be allocating some parameters that aren’t used, but it’s usually a very small amount of memory.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, weight=None, bias=None)[source]¶

Parameters

x (torch.Tensor) – (B, *, feat_in) Some input tensor
weight (torch.Tensor) – (B, feat_out, feat_in) If provided, doesn’t use internal weights
bias (torch.Tensor) – (B, feat_out) If provided, doesn’t use internal bias
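The shapes above imply that each batch element gets its own weight matrix and bias. The sketch below shows that computation on nested lists; it is an illustration only, drops the optional middle dimensions of `x`, and `batch_linear` is a hypothetical name.

```python
def batch_linear(x, weight, bias=None):
    """x: (B, feat_in); weight: (B, feat_out, feat_in); bias: (B, feat_out)."""
    out = []
    for b, (x_b, w_b) in enumerate(zip(x, weight)):
        # Per-sample matrix-vector product with this sample's own weights.
        y_b = [sum(w * v for w, v in zip(row, x_b)) for row in w_b]
        if bias is not None:
            y_b = [y + c for y, c in zip(y_b, bias[b])]
        out.append(y_b)
    return out


x = [[1.0, 2.0], [3.0, 4.0]]
weight = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [1.0, -1.0]],
]
bias = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = batch_linear(x, weight, bias)
# y == [[1.0, 2.0, 4.0], [7.0, 9.0, 0.0]]
```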

in_features = None¶

out_features = None¶

weight = None¶
