pugh_torch.modules package¶
Submodules¶
pugh_torch.modules.activation module¶
Easy interface for swapping out activation functions, especially those that may have different weight initialization methods.
weights       <- initialization depends on activation function <--+
normalization                                                     |
activation    <---------------------------------------------------+
- To create a new activation, do the following:
Inherit from ActivationModule to register it.
[optional] implement the init_layer method.
[optional] implement the init_first_layer method.
- Once this is done, your activation function will be available as:
Activation("myactivationlowercase", **kwargs)
-
pugh_torch.modules.activation.Activation(name, init_layers=None, *, first=False, **kwargs)[source]¶ Activation Factory Function
- Parameters
name (str) – Activation function type
init_layers (nn.Module or list of nn.Module) – Weights that need initialization based on
kwargs (dict) – Passed along to activation function constructor.
-
class
pugh_torch.modules.activation.ActivationModule[source]¶ Bases:
torch.nn.modules.module.Module
Only used to automatically register activation functions.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.CELU(alpha: float = 1.0, inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.CELU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
alpha= None¶
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.ELU(alpha: float = 1.0, inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.ELU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
alpha= None¶
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.GELU[source]¶ Bases:
torch.nn.modules.activation.GELU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.Hardshrink(lambd: float = 0.5)[source]¶ Bases:
torch.nn.modules.activation.Hardshrink, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
lambd= None¶
-
-
class
pugh_torch.modules.activation.Hardsigmoid(inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.Hardsigmoid, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.Hardswish(inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.Hardswish, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.Hardtanh(min_val: float = -1.0, max_val: float = 1.0, inplace: bool = False, min_value: Optional[float] = None, max_value: Optional[float] = None)[source]¶ Bases:
torch.nn.modules.activation.Hardtanh, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
max_val= None¶
-
min_val= None¶
-
-
class
pugh_torch.modules.activation.LeakyReLU(negative_slope: float = 0.01, inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.LeakyReLU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
negative_slope= None¶
-
-
class
pugh_torch.modules.activation.LogSigmoid[source]¶ Bases:
torch.nn.modules.activation.LogSigmoid, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None)[source]¶ Bases:
torch.nn.modules.activation.MultiheadAttention, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
bias_k= None¶
-
bias_v= None¶
-
-
class
pugh_torch.modules.activation.Noop[source]¶ Bases:
pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
training= None¶
-
-
class
pugh_torch.modules.activation.PReLU(num_parameters: int = 1, init: float = 0.25)[source]¶ Bases:
torch.nn.modules.activation.PReLU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
num_parameters= None¶
-
-
class
pugh_torch.modules.activation.RReLU(lower: float = 0.125, upper: float = 0.3333333333333333, inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.RReLU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
lower= None¶
-
upper= None¶
-
-
class
pugh_torch.modules.activation.ReLU(inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.ReLU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.ReLU6(inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.ReLU6, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
max_val= None¶
-
min_val= None¶
-
-
class
pugh_torch.modules.activation.SELU(inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.SELU, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
-
class
pugh_torch.modules.activation.Sigmoid[source]¶ Bases:
torch.nn.modules.activation.Sigmoid, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.Sine(frequency=30)[source]¶ Bases:
pugh_torch.modules.activation.ActivationModule
Implicit Neural Representations with Periodic Activation Functions https://arxiv.org/pdf/2006.09661.pdf
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward(input)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
training= None¶
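As a rough illustration of the Sine activation and of why it needs its own weight initialization, here is a torch-free sketch. It assumes the activation computes `sin(frequency * x)` and uses the uniform initialization bounds from the referenced SIREN paper; whether pugh_torch's `Sine` does exactly this is an assumption, and the names `sine_activation` and `siren_init_bound` are hypothetical.

```python
import math


def sine_activation(x, frequency=30.0):
    """Assumed forward pass: a sine with a frequency scale on the input."""
    return math.sin(frequency * x)


def siren_init_bound(fan_in, frequency=30.0, first=False):
    """Half-width of the uniform init interval, per the SIREN paper.

    First layer: weights drawn from U(-1/fan_in, 1/fan_in).
    Other layers: U(-sqrt(6/fan_in)/frequency, sqrt(6/fan_in)/frequency).
    """
    if first:
        return 1.0 / fan_in
    return math.sqrt(6.0 / fan_in) / frequency


print(siren_init_bound(256, first=True))
print(siren_init_bound(256))
```

The first layer gets a different bound from the rest, which is the kind of case the `first` flag of the factory and the `init_first_layer` hook appear intended to cover.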
-
-
class
pugh_torch.modules.activation.Softplus(beta: int = 1, threshold: int = 20)[source]¶ Bases:
torch.nn.modules.activation.Softplus, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
beta= None¶
-
threshold= None¶
-
-
class
pugh_torch.modules.activation.Softshrink(lambd: float = 0.5)[source]¶ Bases:
torch.nn.modules.activation.Softshrink, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
lambd= None¶
-
-
class
pugh_torch.modules.activation.Softsign[source]¶ Bases:
torch.nn.modules.activation.Softsign, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.Tanh[source]¶ Bases:
torch.nn.modules.activation.Tanh, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.Tanhshrink[source]¶ Bases:
torch.nn.modules.activation.Tanhshrink, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
training= None¶
-
-
class
pugh_torch.modules.activation.Threshold(threshold: float, value: float, inplace: bool = False)[source]¶ Bases:
torch.nn.modules.activation.Threshold, pugh_torch.modules.activation.ActivationModule
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
inplace= None¶
-
threshold= None¶
-
value= None¶
-
pugh_torch.modules.conv module¶
pugh_torch.modules.hash module¶
Right now, only RandHashProj is recommended for use.
- PyTorch Hashing code is based on code from:
Values are stored as gradient-less parameters so they get properly saved.
-
class
pugh_torch.modules.hash.BinaryMHash(*args, **kwargs)[source]¶ Bases:
pugh_torch.modules.hash.MHash
Special case of MHash where the output is in the set {-1, 1}
- Parameters
p (int) – Large prime number, larger than the size of the universe input set. The default value is a random large prime number that should be sufficient for most use-cases.
-
training= None¶
-
class
pugh_torch.modules.hash.Hash(m)[source]¶ Bases:
torch.nn.modules.module.Module
Base module for other PyTorch hash functions.
- Variables
dim (torch.Tensor) – Scalar output dimensionality (output hash size)
- Parameters
m (int) – Output size of this hash function
-
forward(x)[source]¶ - Parameters
x (torch.Tensor) – Tensor of any shape; this hash function will be applied element-wise.
-
training= None¶
-
class
pugh_torch.modules.hash.MHash(m, p=None, a=None, b=None)[source]¶ Bases:
pugh_torch.modules.hash.Hash
Multiplicative Universal Hashing
First described by Lawrence Carter and Mark Wegman (Universal Hash Function).
output = ((a * input + b) % p) % m
- Parameters
m (int) – Size of output hash.
p (int) – Large prime number, larger than the size of the universe input set. Defaults to a random prime at least 10x bigger than m.
a (int) – Salt A. If not explicitly set, randomly initialized.
b (int) – Salt B. If not explicitly set, randomly initialized.
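The formula `output = ((a * input + b) % p) % m` can be checked by hand with small numbers. The sketch below is plain Python on integers rather than the tensor-based module, and the values of `p`, `a`, `b`, and `m` are arbitrary illustrative choices. The final mapping to {-1, 1} is only one plausible way to obtain a binary hash from `m = 2`; it is an assumption, not necessarily how BinaryMHash does it.

```python
def mhash(x, m, p, a, b):
    """Multiplicative universal hash of a non-negative integer x."""
    return ((a * x + b) % p) % m


# Illustrative constants: p is a prime larger than every input we hash.
p, a, b, m = 101, 7, 3, 10

values = [mhash(x, m, p, a, b) for x in range(5)]
print(values)  # [3, 0, 7, 4, 1]

# Assumed binary variant: hash into {0, 1}, then rescale to {-1, 1}.
signs = [2 * mhash(x, 2, p, a, b) - 1 for x in range(5)]
print(signs)
```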
-
classmethod
from_offset(m, p, *args, **kwargs)[source]¶ Set prime number via index into a list of primes starting from 3.
- Parameters
p (int) – Index into list of primes to use.
-
training= None¶
-
class
pugh_torch.modules.hash.MHashProj(out_feat)[source]¶ Bases:
torch.nn.modules.container.ParameterDict
Hashes and projects an arbitrary-feature-length input into a fixed-feature-length output.
Applies a random feature hashing function.
- This is the function PHI described in section 2 of:
- Parameters
out_feat (int) – The output hash embedding size
-
forward(x)[source]¶ - Parameters
x (torch.Tensor) – (b, input_feat) Tensor to hash
- Returns
(b, output_feat) Hashed tensor
- Return type
torch.Tensor
-
classmethod
from_hashers(hash_h, hash_xi)[source]¶ More advanced initialization from externally defined hashers.
- Parameters
hash_h (pugh_torch.modules.Hash) – Hashing function that outputs in the set {0, 1, ..., out_feat-1}
hash_xi (pugh_torch.modules.Hash) – Binary hashing function that outputs in the set {-1, 1}
-
prime_offset= 2000¶
-
training= None¶
-
class
pugh_torch.modules.hash.RandHashProj(out_feat, sparse=None)[source]¶ Bases:
torch.nn.modules.module.Module
We can just extend a single projection matrix without the need for two separate hash functions.
This algorithm deterministically maps an arbitrarily long in_feat vector into a fixed-length out_feat vector. It accomplishes this by the following algorithm:
- For each element in the input feature vector:
Based on the index, deterministically multiply it by 1 or -1.
Based on the index, deterministically map it to a single element in the output feature vector.
- Each element in the output feature vector is the sum of all the input elements mapped to it.
- Variables
proj (torch.nn.Parameter) – (out_feat, in_feat) where in_feat is the maximum input feature size fed through yet.
- Parameters
out_feat (int) – Output feature size
sparse (bool) – Use a sparse representation for the internal projection matrix. Saves a good amount of memory when out_feat > 5, which is a pretty typical use-case. Defaults to whichever representation would be more memory efficient.
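The algorithm described for RandHashProj can be sketched without a projection matrix as a simple loop. This is an illustration only: the real module stores a growing `(out_feat, in_feat)` matrix, whereas the hypothetical `rand_hash_proj` below derives the sign and the output bucket from a per-index seeded generator, which is an assumption made to keep the example self-contained.

```python
import random


def rand_hash_proj(x, out_feat, seed=0):
    """Map a list of any length to a list of length out_feat."""
    out = [0.0] * out_feat
    for i, value in enumerate(x):
        # The generator depends only on (seed, index), so index i always
        # gets the same sign and the same output bucket.
        rng = random.Random(seed * 1_000_003 + i)
        sign = rng.choice((-1.0, 1.0))
        bucket = rng.randrange(out_feat)
        # Each output element is the signed sum of the inputs mapped to it.
        out[bucket] += sign * value
    return out


short = rand_hash_proj([1.0, 2.0, 3.0], out_feat=4)
padded = rand_hash_proj([1.0, 2.0, 3.0, 0.0, 0.0], out_feat=4)
print(short == padded)
```

Because the mapping depends only on the index, appending zeros to the input leaves the output unchanged, which is what allows inputs of different lengths to share one embedding space.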
-
property
sparse¶
-
training= None¶
-
pugh_torch.modules.hash.primes(n, copy=False, cache=True)[source]¶ Returns an array of primes, 3 <= p < n
- This is very fast, the following takes <1 second:
res = primes(100_000_000)
assert len(res) == 5_761_454
assert res[0] == 3
assert res[-1] == 99_999_989
Caches the largest n array for future calls.
- Modified from:
- Parameters
n (int) – Generate primes up to this number
copy (bool) – Copy the output array from the internal cache. Only set to True if you intend to modify the returned array inplace. Defaults to False.
cache (bool) – Use the internal cache for generating/storing prime values. Defaults to True.
- Returns
Array of output primes. Do not modify this array inplace unless you set copy=True
- Return type
numpy.ndarray
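For readers who want to see what "primes with 3 <= p < n" means in practice, here is a small pure-Python Sieve of Eratosthenes. It is not the NumPy implementation used by the library and has no caching; `primes_sketch` is a hypothetical name, and it returns a list rather than a `numpy.ndarray`.

```python
def primes_sketch(n):
    """Return the primes p with 3 <= p < n (2 is deliberately excluded)."""
    if n <= 3:
        return []
    sieve = bytearray([1]) * n  # sieve[k] == 1 means "k is still a candidate"
    sieve[0] = sieve[1] = 0
    i = 2
    while i * i < n:
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, n, i)))
        i += 1
    return [p for p in range(3, n) if sieve[p]]


print(primes_sketch(20))  # [3, 5, 7, 11, 13, 17, 19]
```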
pugh_torch.modules.init module¶
- Equivalent names:
he == kaiming
xavier == glorot
- Rules of thumb collected from various sources:
Use He for ReLU
Use xavier for tanh
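To make the rules of thumb concrete, the sketch below computes the standard deviation that each scheme prescribes for a normally distributed weight matrix. The formulas are the commonly cited ones (He with fan-in mode and a ReLU gain, Xavier with unit gain); the function names are hypothetical and are not part of this module.

```python
import math


def he_std(fan_in):
    """He (Kaiming) normal init for ReLU: std = sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def xavier_std(fan_in, fan_out):
    """Xavier (Glorot) normal init: std = sqrt(2 / (fan_in + fan_out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


print(he_std(512))
print(xavier_std(512, 512))
```

For a square layer, the He value is larger than the Xavier value by a factor of sqrt(2), which compensates for ReLU zeroing roughly half of its inputs.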
pugh_torch.modules.lightning_module module¶
Extends pytorch-lightning’s LightningModule for some quality of life improvements.
pugh_torch.modules.load_state_dict_mixin module¶
-
class
pugh_torch.modules.load_state_dict_mixin.LoadStateDictMixin[source]¶ Bases:
object
-
load_state_dict(state_dict, strict=True)[source]¶ Confirms and logs the weights that you expect are loaded when strict=False.
- Returns
changed (list of str) – Parameter names that were updated via loading the state_dict
unchanged (list of str) – Parameter names that were NOT updated via loading the state_dict
shape_mismatch (list of str) – Parameter names that were NOT updated via loading the state_dict due to the model’s parameter shape not matching the state_dict. This is strictly a subset of unchanged.
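The relationship between the three returned lists can be illustrated with a torch-free sketch that works on parameter shapes only. This is not the mixin's code: `classify_state_dict` is a hypothetical helper, and treating "a matching entry exists" as "changed" is a simplifying assumption.

```python
def classify_state_dict(model_shapes, loaded_shapes):
    """Sort parameter names into changed / unchanged / shape_mismatch.

    Both arguments map parameter names to shape tuples.
    """
    changed, unchanged, shape_mismatch = [], [], []
    for name, shape in model_shapes.items():
        if name not in loaded_shapes:
            unchanged.append(name)       # nothing to load for this name
        elif loaded_shapes[name] != shape:
            unchanged.append(name)       # skipped because shapes differ
            shape_mismatch.append(name)
        else:
            changed.append(name)         # shapes agree, value is loaded
    return changed, unchanged, shape_mismatch


model = {"conv.weight": (8, 3, 3, 3), "fc.weight": (10, 64), "fc.bias": (10,)}
loaded = {"conv.weight": (8, 3, 3, 3), "fc.weight": (5, 64)}
changed, unchanged, shape_mismatch = classify_state_dict(model, loaded)
print(changed, unchanged, shape_mismatch)
```

A typical situation is loading a pretrained backbone into a model whose final layer has a different number of classes: the final layer lands in both `unchanged` and `shape_mismatch`.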
-
pugh_torch.modules.meta module¶
-
class
pugh_torch.modules.meta.BatchLinear(in_features: int, out_features: int, bias: bool = True)[source]¶ Bases:
torch.nn.modules.linear.Linear
Linear layer that can take batched weights and bias at runtime.
Technically a little wasteful because we might be allocating some parameters that aren’t used, but it’s usually a very small amount of memory.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward(x, weight=None, bias=None)[source]¶ - Parameters
x (torch.Tensor) – (B, *, feat_in) Some input tensor
weight (torch.Tensor) – (B, feat_out, feat_in) If provided, doesn’t use internal weights
bias (torch.Tensor) – (B, feat_out) If provided, doesn’t use internal bias
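The shapes above imply that each batch element is multiplied by its own weight matrix. The sketch below spells that out with nested Python lists for an input of shape (B, N, feat_in). It is only a readable reference for the arithmetic: `batch_linear` is a hypothetical name, and the actual layer operates on tensors.

```python
def batch_linear(x, weight, bias=None):
    """Per-sample linear layer on nested lists.

    x:      (B, N, feat_in)
    weight: (B, feat_out, feat_in)
    bias:   (B, feat_out) or None
    Returns a nested list of shape (B, N, feat_out).
    """
    out = []
    for b, rows in enumerate(x):
        batch_out = []
        for row in rows:
            # y = W_b @ row, using the weight matrix of batch element b.
            y = [sum(w * v for w, v in zip(w_row, row)) for w_row in weight[b]]
            if bias is not None:
                y = [yi + bi for yi, bi in zip(y, bias[b])]
            batch_out.append(y)
        out.append(batch_out)
    return out


x = [[[1.0, 2.0]], [[3.0, 4.0]]]                          # (2, 1, 2)
weight = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
          [[2.0, 0.0], [0.0, 2.0], [1.0, -1.0]]]          # (2, 3, 2)
bias = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]                 # (2, 3)
result = batch_linear(x, weight, bias)
print(result)
```

Passing weights at call time like this is useful when another network (for example a hypernetwork) produces a different set of weights for every sample in the batch.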
-
in_features= None¶
-
out_features= None¶
-
weight= None¶
-