pugh_torch.modules package

Submodules

pugh_torch.modules.activation module

Easy interface for swapping out activation functions, especially those that may have different weight initialization methods.

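A typical layer is ordered as follows; the weight initialization depends on the activation function at the end of the stack:

  • weights (initialization depends on the activation function)

  • normalization

  • activation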

To create a new activation, do the following:
  • Inherit from ActivationModule to register

  • [optional] implement init_layer method

  • [optional] implement init_first_layer method

Once this is done, your activation function will be available as:

Activation("myactivationlowercase", **kwargs)

pugh_torch.modules.activation.Activation(name, init_layers=None, *, first=False, **kwargs)[source]

Activation Factory Function

Parameters
  • name (str) – Activation function type

  • init_layers (nn.Module or list of nn.Module) – Weights that need initialization based on the activation function.

  • kwargs (dict) – Passed along to activation function constructor.

class pugh_torch.modules.activation.ActivationModule[source]

Bases: torch.nn.modules.module.Module

Only used to automatically register activation functions.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_first_layer(m)[source]

Override this in child activation function

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.CELU(alpha: float = 1.0, inplace: bool = False)[source]

Bases: torch.nn.modules.activation.CELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

alpha = None
init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.ELU(alpha: float = 1.0, inplace: bool = False)[source]

Bases: torch.nn.modules.activation.ELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

alpha = None
init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.GELU[source]

Bases: torch.nn.modules.activation.GELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.Hardshrink(lambd: float = 0.5)[source]

Bases: torch.nn.modules.activation.Hardshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

lambd = None
class pugh_torch.modules.activation.Hardsigmoid(inplace: bool = False)[source]

Bases: torch.nn.modules.activation.Hardsigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.Hardswish(inplace: bool = False)[source]

Bases: torch.nn.modules.activation.Hardswish, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.Hardtanh(min_val: float = -1.0, max_val: float = 1.0, inplace: bool = False, min_value: Optional[float] = None, max_value: Optional[float] = None)[source]

Bases: torch.nn.modules.activation.Hardtanh, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
max_val = None
min_val = None
class pugh_torch.modules.activation.LeakyReLU(negative_slope: float = 0.01, inplace: bool = False)[source]

Bases: torch.nn.modules.activation.LeakyReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
negative_slope = None
class pugh_torch.modules.activation.LogSigmoid[source]

Bases: torch.nn.modules.activation.LogSigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None)[source]

Bases: torch.nn.modules.activation.MultiheadAttention, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

bias_k = None
bias_v = None
class pugh_torch.modules.activation.Noop[source]

Bases: pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training = None
class pugh_torch.modules.activation.PReLU(num_parameters: int = 1, init: float = 0.25)[source]

Bases: torch.nn.modules.activation.PReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

num_parameters = None
class pugh_torch.modules.activation.RReLU(lower: float = 0.125, upper: float = 0.3333333333333333, inplace: bool = False)[source]

Bases: torch.nn.modules.activation.RReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
lower = None
upper = None
class pugh_torch.modules.activation.ReLU(inplace: bool = False)[source]

Bases: torch.nn.modules.activation.ReLU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.ReLU6(inplace: bool = False)[source]

Bases: torch.nn.modules.activation.ReLU6, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
max_val = None
min_val = None
class pugh_torch.modules.activation.SELU(inplace: bool = False)[source]

Bases: torch.nn.modules.activation.SELU, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

inplace = None
class pugh_torch.modules.activation.Sigmoid[source]

Bases: torch.nn.modules.activation.Sigmoid, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.Sine(frequency=30)[source]

Bases: pugh_torch.modules.activation.ActivationModule

Implicit Neural Representations with Periodic Activation Functions https://arxiv.org/pdf/2006.09661.pdf

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_first_layer(m)[source]

Override this in child activation function

init_layer(m)[source]

Override this in child activation function

training = None
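As a rough illustration of why Sine overrides both init_layer and init_first_layer, the sketch below computes the uniform weight bounds used in the reference implementation of the cited paper. It assumes the forward pass is sin(frequency * x) and that pugh_torch follows the same bounds; neither is confirmed by this page, and `sine_activation` and `siren_weight_bound` are hypothetical helper names.

```python
import math


def sine_activation(x, frequency=30):
    # Assumed forward pass: a periodic activation sin(frequency * x).
    return math.sin(frequency * x)


def siren_weight_bound(fan_in, frequency=30, first=False):
    # Bound b for drawing weights from U(-b, b).
    if first:
        # First layer: U(-1/fan_in, 1/fan_in).
        return 1.0 / fan_in
    # Later layers: U(-sqrt(6/fan_in)/frequency, sqrt(6/fan_in)/frequency).
    return math.sqrt(6.0 / fan_in) / frequency


first_bound = siren_weight_bound(2, first=True)
hidden_bound = siren_weight_bound(256)
```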
class pugh_torch.modules.activation.Softplus(beta: int = 1, threshold: int = 20)[source]

Bases: torch.nn.modules.activation.Softplus, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

beta = None
init_layer(m)[source]

Override this in child activation function

threshold = None
class pugh_torch.modules.activation.Softshrink(lambd: float = 0.5)[source]

Bases: torch.nn.modules.activation.Softshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

lambd = None
class pugh_torch.modules.activation.Softsign[source]

Bases: torch.nn.modules.activation.Softsign, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.Tanh[source]

Bases: torch.nn.modules.activation.Tanh, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.Tanhshrink[source]

Bases: torch.nn.modules.activation.Tanhshrink, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_layer(m)[source]

Override this in child activation function

training = None
class pugh_torch.modules.activation.Threshold(threshold: float, value: float, inplace: bool = False)[source]

Bases: torch.nn.modules.activation.Threshold, pugh_torch.modules.activation.ActivationModule

Initializes internal Module state, shared by both nn.Module and ScriptModule.

inplace = None
threshold = None
value = None

pugh_torch.modules.conv module

pugh_torch.modules.conv.conv1x1(in_planes, out_planes, stride=1)[source]

1x1 convolution

pugh_torch.modules.conv.conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1)[source]

3x3 convolution with padding
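To make "with padding" concrete, the following sketch applies the standard convolution output-size formula along one spatial axis. It assumes the 3x3 helper pads by the dilation amount (the common ResNet convention), which this page does not state; the function names are hypothetical.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    # Standard convolution output-size formula along one spatial axis.
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv3x3_out_size(size, stride=1, dilation=1):
    # Assumption: padding equals dilation, so stride 1 preserves the size.
    return conv_out_size(size, kernel=3, stride=stride,
                         padding=dilation, dilation=dilation)


def conv1x1_out_size(size, stride=1):
    # A 1x1 kernel needs no padding to preserve the size at stride 1.
    return conv_out_size(size, kernel=1, stride=stride)
```

Under that assumption, both helpers keep the spatial size at stride 1 and halve an even size at stride 2.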

pugh_torch.modules.hash module

Right now, only RandHashProj is recommended for use.

PyTorch Hashing code is based on code from:

https://github.com/ma3oun/hrn

Values are stored as gradient-less parameters so they get properly saved.

class pugh_torch.modules.hash.BinaryMHash(*args, **kwargs)[source]

Bases: pugh_torch.modules.hash.MHash

Special case of MHash where the output is in the set {-1, 1}

Parameters

p (int) – Large prime number, larger than the size of the universe input set. The default value is a random large prime number that should be sufficient for most use-cases.

hash(x)[source]
training = None
class pugh_torch.modules.hash.Hash(m)[source]

Bases: torch.nn.modules.module.Module

Base module for other pytorch hash functions.

Variables

dim (torch.Tensor) – Scalar output dimensionality (output hash size)

Parameters

m (int) – Output size of this hash function

forward(x)[source]
Parameters

x (torch.Tensor) – Tensor of any shape; this hash function will be applied element-wise.

hash(x)[source]
training = None
class pugh_torch.modules.hash.MHash(m, p=None, a=None, b=None)[source]

Bases: pugh_torch.modules.hash.Hash

Multiplicative Universal Hashing

A universal hash function first described by Lawrence Carter and Mark Wegman.

See:

https://jeffe.cs.illinois.edu/teaching/datastructures/notes/12-hashing.pdf

output = ((a * input + b) % p) % m

Parameters
  • m (int) – Size of output hash.

  • p (int) – Large prime number, larger than the size of the universe input set. Defaults to a random prime at least 10x bigger than m.

  • a (int) – Salt A. If not explicitly set, randomly initialized.

  • b (int) – Salt B. If not explicitly set, randomly initialized.
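The formula above can be tried out with a small pure-Python sketch. `make_mhash` and `to_sign` are hypothetical helpers; the random salt ranges and the sign mapping are assumptions, not details taken from pugh_torch.

```python
import random


def make_mhash(m, p, a=None, b=None, seed=0):
    # p should be a prime larger than any input value.
    rng = random.Random(seed)
    a = rng.randrange(1, p) if a is None else a  # salt A in [1, p - 1]
    b = rng.randrange(0, p) if b is None else b  # salt B in [0, p - 1]

    def mhash(x):
        # output = ((a * input + b) % p) % m
        return ((a * x + b) % p) % m

    return mhash


def to_sign(bit):
    # One way (an assumption) to map a 2-bucket hash {0, 1} onto {-1, 1}.
    return 2 * bit - 1


h = make_mhash(m=8, p=10007, a=3, b=5)
values = [h(x) for x in range(5)]  # [5, 0, 3, 6, 1]
```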

classmethod from_offset(m, p, *args, **kwargs)[source]

Set prime number via index into a list of primes starting from 3.

Parameters

p (int) – Index into list of primes to use.

hash(x)[source]
training = None
class pugh_torch.modules.hash.MHashProj(out_feat)[source]

Bases: torch.nn.modules.container.ParameterDict

Hashes and projects an arbitrary-feature-length input into a fixed-feature-length output.

Applies a random feature hashing function.

This is the function PHI described in section 2 of:

https://arxiv.org/abs/2010.05880

Parameters

out_feat (int) – The output hash embedding size

forward(x)[source]
Parameters

x (torch.Tensor) – (b, input_feat) Tensor to hash

Returns

(b, output_feat) Hashed tensor

Return type

torch.Tensor

classmethod from_hashers(hash_h, hash_xi)[source]

More advanced initialization from externally defined hashers.

Parameters
  • hash_h (pugh_torch.modules.Hash) – Hashing function that outputs in set {0, 1, ..., out_feat-1}

  • hash_xi (pugh_torch.modules.Hash) – Binary hashing function that outputs in set {-1, 1}
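How the two hashers combine can be sketched for a single feature vector: hash_h picks the output slot for each input index and hash_xi picks its sign. This is a plain-Python illustration of feature hashing under those assumptions; `hash_project`, `bucket`, and `sign` are hypothetical names, and the toy hashers stand in for properly constructed ones.

```python
def hash_project(x, out_feat, hash_h, hash_xi):
    # out[j] = sum over all i with hash_h(i) == j of hash_xi(i) * x[i]
    out = [0.0] * out_feat
    for i, value in enumerate(x):
        out[hash_h(i)] += hash_xi(i) * value
    return out


def bucket(i):
    # Toy index hash with outputs in {0, 1, 2, 3}.
    return ((7 * i + 3) % 101) % 4


def sign(i):
    # Toy binary hash with outputs in {-1, 1}.
    return 2 * (((5 * i + 1) % 103) % 2) - 1


projected = hash_project([1.0, 2.0, 3.0, 4.0, 5.0], 4, bucket, sign)
# projected == [-4.0, 3.0, -2.0, 6.0]
```

Inputs 0 and 4 collide in slot 3 here, which is the expected behavior: colliding entries are summed with their signs.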

prime_offset = 2000
to(*args, **kwargs)[source]

Records the device to self.device

training = None
class pugh_torch.modules.hash.RandHashProj(out_feat, sparse=None)[source]

Bases: torch.nn.modules.module.Module

We can just extend a single projection matrix without the need for two separate hash functions.

This algorithm deterministically maps an arbitrarily long in_feat vector into a fixed-length out_feat vector. It accomplishes this by the following algorithm:

For each element in the input feature vector:
  1. Based on the index, deterministically multiply it by 1 or -1

  2. Based on the index, deterministically map it to a single element in the output feature vector.

Each element in the output feature vector is the sum of all the input elements mapped to it.
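The two steps above, together with the idea of extending one projection table as longer inputs arrive, can be sketched without torch. `RandHashProjSketch` is a hypothetical class; seeding a generator with the input index is just one way to get a deterministic mapping and is not necessarily how pugh_torch does it.

```python
import random


class RandHashProjSketch:
    def __init__(self, out_feat):
        self.out_feat = out_feat
        # columns[i] = (output index, sign) for input index i.
        self.columns = []

    def _extend(self, in_feat):
        # Grow the table up to the largest input size seen so far.
        for i in range(len(self.columns), in_feat):
            rng = random.Random(i)  # seeded by index, so the mapping is deterministic
            self.columns.append((rng.randrange(self.out_feat), rng.choice((-1, 1))))

    def __call__(self, x):
        self._extend(len(x))
        out = [0.0] * self.out_feat
        for value, (j, sign) in zip(x, self.columns):
            # Each input lands in exactly one output slot, with a fixed sign.
            out[j] += sign * value
        return out


proj = RandHashProjSketch(4)
short = proj([1.0, 2.0, 3.0])
longer = proj([1.0, 2.0, 3.0, 0.0, 0.0])  # table grows; earlier columns are unchanged
```

Because existing columns are never regenerated, padding an input with zeros leaves its projection unchanged.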

Variables

proj (torch.nn.Parameter) – (out_feat, in_feat) where in_feat is the maximum input feature size fed through yet.

Parameters
  • out_feat (int) – Output feature size

  • sparse (bool) – Use a sparse representation for the internal projection matrix. Saves a good amount of memory when out_feat>5, which is a pretty typical use-case. Defaults to whichever representation would be more memory efficient.

forward(x)[source]
Parameters

x (torch.Tensor) – (B, N) feature vector

property sparse
training = None
pugh_torch.modules.hash.primes(n, copy=False, cache=True)[source]

Returns an array of primes, 3 <= p < n

This is very fast, the following takes <1 second:

    res = primes(100_000_000)
    assert len(res) == 5_761_454
    assert res[0] == 3
    assert res[-1] == 99_999_989

Caches the largest n array for future calls.

Modified from:

https://stackoverflow.com/a/3035188/13376237

Parameters
  • n (int) – Generate primes up to this number

  • copy (bool) – Copy the output array from the internal cache. Only set to true if you intend to modify the returned array inplace. Defaults to False.

  • cache (bool) – Use the internal cache for generating/storing prime values. Defaults to True.

Returns

Array of output primes. Do not modify this array inplace unless you set copy=True

Return type

numpy.ndarray
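For readers who want to see the contract without numpy, here is a small Sieve of Eratosthenes that returns the same range of values (3 <= p < n) as a Python list. It is a simplified stand-in with a hypothetical name, with no caching and none of the speed of the numpy-based function.

```python
def primes_from_3(n):
    # Sieve of Eratosthenes; returns all primes p with 3 <= p < n.
    if n <= 3:
        return []
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross out multiples of i, starting at i * i.
            is_prime[i * i :: i] = bytes(len(range(i * i, n, i)))
    # 2 is prime but is deliberately excluded, matching "3 <= p < n".
    return [p for p in range(3, n) if is_prime[p]]


small = primes_from_3(30)  # [3, 5, 7, 11, 13, 17, 19, 23, 29]
```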

pugh_torch.modules.hash.primes_index(i)[source]

Get the prime value at index.

Parameters

i (int) – Index into the list of primes (starting at 3) to get.

pugh_torch.modules.init module

Equivalent names:
  • he == kaiming

  • xavier == glorot

Rules of thumb collected from various sources:
  • Use He for ReLU

  • Use xavier for tanh

pugh_torch.modules.init.he(m, mode='fan_in', **kwargs)[source]
pugh_torch.modules.init.xavier(m, **kwargs)[source]
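The difference between the two schemes comes down to how the weight scale is derived from the layer's fan-in and fan-out. The sketch below computes the textbook values for the normal He variant and the uniform Xavier variant; whether pugh_torch's he and xavier use the normal or uniform forms is not stated here, so treat the choice as an assumption. The function names are hypothetical.

```python
import math


def he_normal_std(fan_in):
    # He/Kaiming normal init for ReLU: std = sqrt(2 / fan_in).
    return math.sqrt(2.0 / fan_in)


def xavier_uniform_bound(fan_in, fan_out):
    # Xavier/Glorot uniform init: weights ~ U(-b, b), b = sqrt(6 / (fan_in + fan_out)).
    return math.sqrt(6.0 / (fan_in + fan_out))


std = he_normal_std(512)
bound = xavier_uniform_bound(3, 3)
```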

pugh_torch.modules.lightning_module module

Extends pytorch-lightning’s LightningModule for some quality of life improvements.

class pugh_torch.modules.lightning_module.LightningModule(*args, **kwargs)[source]

Bases: pugh_torch.modules.load_state_dict_mixin.LoadStateDictMixin, pytorch_lightning.core.lightning.LightningModule

configure_optimizers()[source]

Pretty good defaults; can be easily overridden.

training = None

pugh_torch.modules.load_state_dict_mixin module

class pugh_torch.modules.load_state_dict_mixin.LoadStateDictMixin[source]

Bases: object

load_state_dict(state_dict, strict=True)[source]

Confirms and logs the weights that you expect are loaded when strict=False.

Returns

  • changed (list of str) – Parameter names that were updated via loading the state_dict

  • unchanged (list of str) – Parameter names that were NOT updated via loading the state_dict

  • shape_mismatch (list of str) – Parameter names that were NOT updated via loading the state_dict due to the model’s parameter shape not matching the state_dict. This is strictly a subset of unchanged.
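The relationship between the three returned lists can be illustrated by working on shapes alone. This sketch assumes "changed" means "present in the state_dict with a matching shape"; `classify_load` is a hypothetical helper, not the mixin's implementation.

```python
def classify_load(model_shapes, incoming_shapes):
    # Both arguments map parameter name -> shape tuple.
    changed, unchanged, shape_mismatch = [], [], []
    for name, shape in model_shapes.items():
        if name not in incoming_shapes:
            unchanged.append(name)       # nothing to load for this parameter
        elif incoming_shapes[name] != shape:
            unchanged.append(name)       # skipped because the shapes differ
            shape_mismatch.append(name)
        else:
            changed.append(name)         # loaded from the state_dict
    return changed, unchanged, shape_mismatch


model = {"backbone.weight": (64, 3), "head.weight": (10, 64), "head.bias": (10,)}
incoming = {"backbone.weight": (64, 3), "head.weight": (1000, 64)}
changed, unchanged, shape_mismatch = classify_load(model, incoming)
```

This mirrors a common fine-tuning case: the backbone loads, while a resized head is reported as both unchanged and a shape mismatch.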

pugh_torch.modules.meta module

class pugh_torch.modules.meta.BatchLinear(in_features: int, out_features: int, bias: bool = True)[source]

Bases: torch.nn.modules.linear.Linear

Linear layer that can take batched weights and bias at runtime.

Technically a little wasteful because we might be allocating some parameters that aren’t used, but it’s usually a very small amount of memory.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, weight=None, bias=None)[source]
Parameters
  • x (torch.Tensor) – (B, *, feat_in) Some input tensor

  • weight (torch.Tensor) – (B, feat_out, feat_in) If provided, doesn’t use internal weights

  • bias (torch.Tensor) – (B, feat_out) If provided, doesn’t use internal bias
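The shapes above imply that each batch element is multiplied by its own weight matrix. A nested-list sketch for inputs of shape (B, N, feat_in) makes that explicit; `batch_linear` is a hypothetical function written for illustration and does not use torch.

```python
def batch_linear(x, weight, bias=None):
    # x: (B, N, feat_in), weight: (B, feat_out, feat_in), bias: (B, feat_out)
    out = []
    for b, rows in enumerate(x):
        w = weight[b]  # this batch element's own (feat_out, feat_in) matrix
        offset = bias[b] if bias is not None else [0.0] * len(w)
        out.append([
            [sum(wi * xi for wi, xi in zip(w_row, row)) + o
             for w_row, o in zip(w, offset)]
            for row in rows
        ])
    return out  # (B, N, feat_out)


x = [[[1.0, 2.0]], [[3.0, 4.0]]]
weight = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [1.0, -1.0]],
]
bias = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = batch_linear(x, weight, bias)  # [[[1.0, 2.0, 4.0]], [[7.0, 9.0, 0.0]]]
```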

in_features = None
out_features = None
weight = None

Module contents