methods

Shared method families for signal shaping, search pressure, and structural policy.

This folder is the library's reusable vocabulary shelf. The heavier controller chapters in neat/ decide when a policy should be applied. The methods/ folder defines what those policy choices actually are. That split keeps the rest of the repo readable: examples, architecture helpers, and evolutionary controllers can reuse the same small method objects without each subsystem inventing its own private dialect.

The quickest way to understand the chapter is to split it into three reader questions. How should a unit transform signal? That is Activation. How should error or learning tempo be interpreted? That is Cost and Rate. How should evolutionary pressure and wiring structure be adjusted? That is selection, mutation, crossover, gating, and groupConnection.

That boundary matters because it lets you change policy without changing orchestration. A Neat controller can switch from gentle to aggressive selection, or an architecture builder can swap activation families, without rewriting evaluation loops or graph code. The method objects stay small on purpose so experiments can compose them instead of hiding them in conditionals.

The structural pair deserves special attention because the names sound close while the responsibilities are different. groupConnection answers how two groups should be wired before any runtime signal exists. gating answers how an already-existing connection should be modulated once the network is running. One is about topology layout. The other is about runtime control.

Two compact background bridges help here. See Wikipedia contributors, Activation function, for the signal-shaping side of the shelf, and Wikipedia contributors, Selection (genetic algorithm), for the search-pressure side. Together they frame the two big forces this folder keeps in play: how nodes respond to signal and how search decides which traits survive.

Read the chapter in three passes:

  1. start with Activation, Cost, and Rate when you are thinking like a trainer tuning signal flow, error shape, and optimization tempo,
  2. continue to selection, mutation, and crossover when you are thinking like an evolutionary controller tuning search pressure,
  3. finish with gating and groupConnection when you need lower-level structural vocabulary and want to distinguish routing control from raw wiring layout.
flowchart TD
  classDef base fill:#08131f,stroke:#1ea7ff,color:#dff6ff,stroke-width:1px;
  classDef accent fill:#0f2233,stroke:#ffd166,color:#fff4cc,stroke-width:1.5px;

  Methods[methods chapter]:::accent --> Training[Training and optimization vocabulary]:::base
  Methods --> Evolution[Evolutionary search vocabulary]:::base
  Methods --> Structure[Structural control vocabulary]:::base
  Training --> Activation[Activation]:::base
  Training --> Cost[Cost]:::base
  Training --> Rate[Rate]:::base
  Evolution --> Selection[selection]:::base
  Evolution --> Mutation[mutation]:::base
  Evolution --> Crossover[crossover]:::base
  Structure --> Gating[gating]:::base
  Structure --> Connection[groupConnection]:::base
flowchart LR
  classDef base fill:#08131f,stroke:#1ea7ff,color:#dff6ff,stroke-width:1px;
  classDef accent fill:#0f2233,stroke:#ffd166,color:#fff4cc,stroke-width:1.5px;

  Signals["How should signal behave?"]:::accent --> SignalShelf["Activation + Cost + Rate"]:::base
  Search["How should search behave?"]:::accent --> SearchShelf["selection + mutation + crossover"]:::base
  Structure["How should wiring be controlled?"]:::accent --> StructureShelf["gating + groupConnection"]:::base

Example: assemble one compact training vocabulary for signal shape, loss, and tempo.

const trainingPolicy = {
  activation: Activation.relu,
  loss: Cost.mse,
  schedule: Rate.step(0.9, 100),
};

Example: assemble a stronger search-pressure and routing vocabulary without changing the surrounding controller code.

const evolutionaryPolicy = {
  parentSelection: { ...selection.POWER, power: 6 },
  gatePlacement: gating.SELF,
  denseBridge: groupConnection.ALL_TO_ALL,
};

methods/methods.ts

Activation

Runtime registry of built-in and custom activation functions.

Read this surface as a behavior shelf for neurons rather than as a loose bag of math helpers. The chosen activation determines what each node can express: whether it saturates, stays sparse, preserves negative values, or responds smoothly enough for gradient-based updates.

The built-in functions cluster into a few useful families:

flowchart TD
  Shelf[Activation shelf] --> Bounded[Bounded classics]
  Shelf --> Piecewise[Piecewise gates]
  Shelf --> Specialized[Shape-specialized]
  Shelf --> Smooth[Smooth modern]
  Bounded --> BoundedExamples[logistic sigmoid tanh]
  Piecewise --> PiecewiseExamples[relu hardTanh step]
  Specialized --> SpecializedExamples[gaussian sinusoid bentIdentity]
  Smooth --> SmoothExamples[softplus swish gelu mish]

Every activation shares the same calling convention: pass the input value as the first argument and optionally pass true as the second argument when you want the local derivative instead of the forward value. That derivative mode keeps the registry compatible with the classic Neataptic API shape while also making the individual implementations easy to test in isolation.

Minimal workflow:

const hiddenValue = Activation.relu(weightedSum);
const outputSlope = Activation.logistic(weightedSum, true);

registerCustomActivation(
  'cube',
  (inputValue, shouldComputeDerivative = false) =>
    shouldComputeDerivative ? 3 * inputValue * inputValue : inputValue ** 3,
);

const customValue = Activation.cube(0.5);

crossover

Crossover methods for genetic algorithms.

These methods implement the crossover strategies described in the Instinct algorithm, enabling the creation of offspring with unique combinations of parent traits.

Read this file as an inheritance-policy shelf: each method answers a different question about how aggressively two parents should be mixed.

Minimal workflow:

const broadMixing = crossover.UNIFORM;

const oneCut = crossover.SINGLE_POINT;

const twoCut = {
  ...crossover.TWO_POINT,
  config: [0.25, 0.75],
};

const blendedOffspring = crossover.AVERAGE;
flowchart LR
  Parents[Two parent genomes] --> Segment[Segment-preserving crossover]
  Parents --> GeneWise[Gene-wise crossover]
  Parents --> Blend[Numeric blending]
  Segment --> Single[SINGLE_POINT]
  Segment --> Double[TWO_POINT]
  GeneWise --> Uniform[UNIFORM]
  Blend --> Average[AVERAGE]

gating

Defines the small routing shelf that decides where a gater applies control.

Gating is one of the lightest structural policies in the library: the graph stays the same, but another neuron or group gets to modulate how strongly a connection participates in the current computation. That makes gating useful when a network needs context-sensitive routing, soft memory behavior, or a way to expose only part of an otherwise valid intermediate result.

Read this file as an answer to one placement question: which part of the connection should the gater influence? INPUT gates the signal arriving at the target, OUTPUT gates the signal the target sends onward, and SELF lets the gater modulate the connection weight itself.

Those choices matter because they create different control surfaces. Some experiments need a gate that behaves like an evidence filter, some need a gate that behaves like an output valve, and some need the weight itself to become state-dependent instead of fixed.

A practical chooser for first experiments:

flowchart LR
  Source[Source neuron] --> Connection[Connection weight]
  Connection --> Target[Target neuron]
  Gater[Gater]
  Gater -. INPUT .-> Target
  Gater -. OUTPUT .-> Target
  Gater -. SELF .-> Connection

Minimal workflow:

const routingShelf = {
  incomingGate: gating.INPUT,
  outgoingGate: gating.OUTPUT,
  adaptiveWeightGate: gating.SELF,
};

groupConnection

Defines the small wiring-policy shelf for connecting one node group to another.

Read this file as a topology chooser rather than a bag of connection names. These policies do not decide weights, learning, or mutation pressure; they answer a narrower structural question first: what edge pattern should exist between the source group and the target group before later optimization details matter?

The three built-ins answer three different wiring intents: ALL_TO_ALL builds a dense mesh from every source node to every target node, ALL_TO_ELSE builds the same dense mesh while skipping self-connections, and ONE_TO_ONE pairs each source node with the target node at the same position.

Those choices matter because they create very different starting biases. A dense bridge maximizes routing freedom, a dense-without-self-links bridge is often the cleanest way to describe intra-group recurrence, and one-to-one wiring preserves explicit alignment instead of encouraging cross-talk.

A practical chooser for first experiments:

flowchart LR
  Dense[Dense mesh] --> AllToAll[ALL_TO_ALL]
  Dense --> AllToElse[ALL_TO_ELSE]
  Paired[Positional pairing] --> OneToOne[ONE_TO_ONE]

Minimal workflow:

const wiringShelf = {
  denseBridge: groupConnection.ALL_TO_ALL,
  denseWithoutSelfLoops: groupConnection.ALL_TO_ELSE,
  alignedBridge: groupConnection.ONE_TO_ONE,
};

mutation

Defines various mutation methods used in neuroevolution algorithms.

Mutation introduces genetic diversity into the population by randomly altering parts of an individual's genome (the neural network structure or parameters). This is crucial for exploring the search space and escaping local optima.

Common mutation strategies include adding or removing nodes and connections, modifying connection weights and node biases, and changing node activation functions. These operations allow the network topology and parameters to adapt over generations.

The methods listed here are inspired by techniques used in algorithms like NEAT and particularly the Instinct algorithm, providing a comprehensive set of tools for evolving network architectures.

Read this file as a mutation toolbox organized by what kind of change you want evolution to make: growing structure, pruning structure, tuning parameters, reshaping behavior, or adding memory blocks.

A practical reading order is:

  1. start with MOD_WEIGHT and MOD_BIAS to understand the gentlest search moves,
  2. then compare ADD_CONN and ADD_NODE to see how structure starts to grow,
  3. then read the recurrent and gating operators when you want temporal behavior or context-sensitive routing,
  4. finish with ALL and FFW, which summarize which operators belong in a broad search versus a strictly feedforward one.

A practical chooser for first experiments:

flowchart TD
  Mutation[Mutation toolbox] --> Grow[Grow structure]
  Mutation --> Prune[Prune structure]
  Mutation --> Tune[Tune parameters]
  Mutation --> Shape[Reshape behavior]
  Mutation --> Memory[Add memory blocks]
  Grow --> GrowItems[ADD_NODE ADD_CONN ADD_SELF_CONN ADD_BACK_CONN]
  Prune --> PruneItems[SUB_NODE SUB_CONN SUB_SELF_CONN SUB_BACK_CONN]
  Tune --> TuneItems[MOD_WEIGHT MOD_BIAS REINIT_WEIGHT]
  Shape --> ShapeItems[MOD_ACTIVATION ADD_GATE SUB_GATE SWAP_NODES]
  Memory --> MemoryItems[ADD_LSTM_NODE ADD_GRU_NODE]

Minimal workflow:

const safeFeedforwardShelf = mutation.FFW;

const structuralSearchShelf = [
  mutation.ADD_CONN,
  mutation.ADD_NODE,
  mutation.MOD_WEIGHT,
  mutation.MOD_BIAS,
];

const recurrentSearchShelf = [
  ...structuralSearchShelf,
  mutation.ADD_GATE,
  mutation.ADD_BACK_CONN,
];

selection

Defines various selection methods used in genetic algorithms to choose individuals for reproduction based on their fitness scores.

Selection is a crucial step that determines which genetic traits are passed on to the next generation. Different methods offer varying balances between exploration (maintaining diversity) and exploitation (favoring high-fitness individuals).

The choice of selection method significantly impacts the algorithm's convergence speed and the diversity of the population. High selection pressure (strongly favoring the fittest) can lead to faster convergence but may result in premature stagnation at suboptimal solutions. Conversely, lower pressure maintains diversity but can slow down the search process.

Read this file as a compact pressure ladder: FITNESS_PROPORTIONATE applies the gentlest pressure by letting probability track score share, TOURNAMENT applies tunable local pressure through small brackets, and POWER leans hardest on the front of the ranking.

Those strategies are not only different implementations. They encode different ideas about what "deserves another child" means in an evolutionary run.

Minimal workflow:

const conservativeSelection = selection.FITNESS_PROPORTIONATE;

const aggressiveSelection = {
  ...selection.POWER,
  power: 6,
};

const bracketSelection = {
  ...selection.TOURNAMENT,
  size: 7,
  probability: 0.75,
};
flowchart LR
  Population[Scored population] --> Proportionate[FITNESS_PROPORTIONATE<br/>probability tracks score share]
  Population --> Power[POWER<br/>front of ranking gets extra pressure]
  Population --> Tournament[TOURNAMENT<br/>small local bracket decides]

default

binary

binary(
  targets: number[],
  outputs: number[],
): number

Calculates the Binary Error rate, often used as a simple accuracy metric for classification.

This function calculates the proportion of misclassifications by comparing the rounded network outputs (thresholded at 0.5) against the target labels. It assumes target values are 0 or 1, and outputs are probabilities between 0 and 1. Note: This is equivalent to 1 - accuracy for binary classification.

Returns: The proportion of misclassified samples (error rate, between 0 and 1).
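
The described computation is small enough to sketch inline. This is an illustrative reimplementation (the name binaryErrorRate is ours, not the library's): round each output at 0.5 and report the fraction that disagrees with the 0/1 targets.

```typescript
// Illustrative sketch of the binary error rate described above,
// not the library's implementation.
function binaryErrorRate(targets: number[], outputs: number[]): number {
  let misclassified = 0;
  for (let i = 0; i < targets.length; i++) {
    // Math.round thresholds probabilities at 0.5.
    if (Math.round(outputs[i]) !== targets[i]) misclassified++;
  }
  return misclassified / targets.length;
}
```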

cosineAnnealing

cosineAnnealing(
  period: number,
  minimumRate: number,
): (baseRate: number, iteration: number) => number

Implements a Cosine Annealing learning rate schedule.

This schedule varies the learning rate cyclically according to a cosine function. It starts at the baseRate and smoothly anneals down to minimumRate over a specified period of iterations, then potentially repeats. This can help the model escape local minima and explore the loss landscape more effectively. Often used with "warm restarts" where the cycle repeats. The mental model is deliberate breathing: ramp down to settle, then restart high enough to explore again.

Formula: learning_rate = minimumRate + 0.5 * (baseRate - minimumRate) * (1 + cos(pi * current_cycle_iteration / period))

Returns: A function that calculates the learning rate for a given iteration based on the cosine annealing schedule.
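
The formula above can be sketched as a standalone schedule factory. The modulo-based cycle repeat is an assumption here; the library may handle restarts differently.

```typescript
// Standalone sketch of the cosine annealing formula, assuming the
// cycle simply repeats every `period` iterations.
function cosineAnnealing(period: number, minimumRate: number) {
  return (baseRate: number, iteration: number): number => {
    const cycleIteration = iteration % period; // assumed restart behavior
    return (
      minimumRate +
      0.5 *
        (baseRate - minimumRate) *
        (1 + Math.cos((Math.PI * cycleIteration) / period))
    );
  };
}
```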

cosineAnnealingWarmRestarts

cosineAnnealingWarmRestarts(
  initialPeriod: number,
  minimumRate: number,
  periodGrowthMultiplier: number,
): (baseRate: number, iteration: number) => number

Cosine Annealing with Warm Restarts (SGDR style) where the cycle length can grow by a multiplier after each restart.

This variant keeps the exploratory reset behavior of cosine annealing while allowing later cycles to last longer. That makes it useful when early exploration should be frequent but later training should settle for longer stretches between restarts.

Returns: A function that replays cosine cycles whose length can grow after each restart.

crossEntropy

crossEntropy(
  targets: number[],
  outputs: number[],
): number

Calculates the Cross Entropy error, commonly used for classification tasks.

This function measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label.

It uses a small epsilon (PROB_EPSILON = 1e-15) to prevent log(0) which would result in NaN. Output values are clamped to the range [epsilon, 1 - epsilon] for numerical stability.

Returns: The mean cross-entropy error over all samples.
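
The clamped computation can be sketched directly from the description above (an illustrative reimplementation, not the library's code):

```typescript
// Epsilon clamp prevents log(0) from producing NaN, as described above.
const PROB_EPSILON = 1e-15;

// Illustrative sketch of mean binary cross-entropy.
function crossEntropy(targets: number[], outputs: number[]): number {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    // Clamp the predicted probability into [epsilon, 1 - epsilon].
    const p = Math.min(Math.max(outputs[i], PROB_EPSILON), 1 - PROB_EPSILON);
    sum += -(targets[i] * Math.log(p) + (1 - targets[i]) * Math.log(1 - p));
  }
  return sum / targets.length;
}
```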

exp

exp(
  decayFactor: number,
): (baseRate: number, iteration: number) => number

Implements an exponential decay learning rate schedule.

The learning rate decreases exponentially after each iteration, multiplying by the decay factor decayFactor. This provides a smooth, continuous reduction in the learning rate over time. Compared with step decay, the policy is less about distinct phases and more about a steady fade in aggressiveness.

Formula: learning_rate = baseRate * decayFactor ^ iteration

Returns: A function that calculates the exponentially decayed learning rate for a given iteration.
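
The formula is a one-liner, sketched here as a standalone factory (named expDecay to avoid shadowing; the library exports it as exp):

```typescript
// Standalone sketch of the exponential decay formula above.
function expDecay(decayFactor: number) {
  return (baseRate: number, iteration: number): number =>
    baseRate * Math.pow(decayFactor, iteration);
}
```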

fixed

fixed(): (baseRate: number, iteration: number) => number

Implements a fixed learning rate schedule.

The learning rate remains constant throughout the entire training process. This is the simplest schedule and serves as a baseline, but may not be optimal for complex problems. Use it when you want the rest of the system, not the schedule, to carry the full burden of training stability.

Returns: A function that takes the base learning rate and the current iteration number, and always returns the base learning rate.

focalLoss

focalLoss(
  targets: number[],
  outputs: number[],
  focalGamma: number,
  focalAlpha: number,
): number

Calculates the Focal Loss, which is useful for addressing class imbalance in classification tasks. Focal loss down-weights easy examples and focuses training on hard negatives.

Returns: The mean focal loss.
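
One common binary focal-loss formulation (Lin et al., 2017) can be sketched as follows; the library's exact variant, particularly how focalAlpha weights the negative class, may differ.

```typescript
// Hedged sketch of a standard binary focal loss:
// loss = -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over samples.
function focalLoss(
  targets: number[],
  outputs: number[],
  focalGamma: number,
  focalAlpha: number,
): number {
  const eps = 1e-15; // guard against log(0)
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    const p = Math.min(Math.max(outputs[i], eps), 1 - eps);
    // p_t is the probability assigned to the true class.
    const pt = targets[i] === 1 ? p : 1 - p;
    const alphaT = targets[i] === 1 ? focalAlpha : 1 - focalAlpha;
    sum += -alphaT * Math.pow(1 - pt, focalGamma) * Math.log(pt);
  }
  return sum / targets.length;
}
```

With focalGamma = 0 and focalAlpha = 1 this reduces to plain cross-entropy on the positive class, which is a useful sanity check.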

hinge

hinge(
  targets: number[],
  outputs: number[],
): number

Calculates the Mean Hinge loss, primarily used for "maximum-margin" classification, most notably for Support Vector Machines (SVMs).

Hinge loss is used for training classifiers. It penalizes predictions that are not only incorrect but also those that are correct but not confident (i.e., close to the decision boundary). Assumes target values are encoded as -1 or 1.

Returns: The mean hinge loss.
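
The margin computation can be sketched inline (an illustrative reimplementation assuming -1/1 targets, as noted above):

```typescript
// Illustrative sketch of mean hinge loss: max(0, 1 - target * output).
function hinge(targets: number[], outputs: number[]): number {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    sum += Math.max(0, 1 - targets[i] * outputs[i]);
  }
  return sum / targets.length;
}
```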

inv

inv(
  decayFactor: number,
  decayPower: number,
): (baseRate: number, iteration: number) => number

Implements an inverse decay learning rate schedule.

The learning rate decreases as the inverse of the iteration number, controlled by the decay factor decayFactor and exponent decayPower. The rate decreases more slowly over time compared to exponential decay. Use it when you want long training runs to keep some learning energy instead of cooling too quickly.

Formula: learning_rate = baseRate / (1 + decayFactor * iteration ** decayPower)

Returns: A function that calculates the inversely decayed learning rate for a given iteration.
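
The formula can be sketched as a standalone factory (named invDecay here; the library exports it as inv):

```typescript
// Standalone sketch of the inverse decay formula above.
function invDecay(decayFactor: number, decayPower: number) {
  return (baseRate: number, iteration: number): number =>
    baseRate / (1 + decayFactor * Math.pow(iteration, decayPower));
}
```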

labelSmoothing

labelSmoothing(
  targets: number[],
  outputs: number[],
  smoothingFactor: number,
): number

Calculates the Cross Entropy with Label Smoothing. Label smoothing prevents the model from becoming overconfident by softening the targets.

Returns: The mean cross-entropy loss with label smoothing.

linearWarmupDecay

linearWarmupDecay(
  totalStepCount: number,
  warmupStepCount: number | undefined,
  endRate: number,
): (baseRate: number, iteration: number) => number

Linear Warmup followed by Linear Decay to an end rate. Warmup linearly increases LR from near 0 up to baseRate over warmupStepCount, then linearly decays to endRate at totalStepCount. Iterations beyond totalStepCount clamp to endRate.

This schedule is common when the earliest steps are the most unstable: start gentle, reach full speed, then taper predictably.

Returns: A function that warms the learning rate up, then decays it toward a fixed floor.
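
The two-phase shape can be sketched as follows. The warmup default when warmupStepCount is undefined (10% of totalStepCount) and the exact ramp interpolation are assumptions here, not guaranteed to match the library.

```typescript
// Hedged sketch of linear warmup followed by linear decay to endRate.
function linearWarmupDecay(
  totalStepCount: number,
  warmupStepCount: number | undefined,
  endRate: number,
) {
  // Assumed default: warm up over the first 10% of training.
  const warmup = warmupStepCount ?? Math.floor(totalStepCount * 0.1);
  return (baseRate: number, iteration: number): number => {
    if (warmup > 0 && iteration < warmup) {
      // Linear ramp from near 0 up to baseRate.
      return baseRate * ((iteration + 1) / warmup);
    }
    if (iteration >= totalStepCount) return endRate; // clamp past the end
    // Linear decay from baseRate down to endRate.
    const progress = (iteration - warmup) / (totalStepCount - warmup);
    return baseRate + (endRate - baseRate) * progress;
  };
}
```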

mae

mae(
  targets: number[],
  outputs: number[],
): number

Calculates the Mean Absolute Error (MAE), another common loss function for regression tasks.

MAE measures the average of the absolute differences between predictions and actual values. Compared to MSE, it is less sensitive to outliers because errors are not squared.

Returns: The mean absolute error.

mape

mape(
  targets: number[],
  outputs: number[],
): number

Calculates the Mean Absolute Percentage Error (MAPE).

MAPE expresses the error as a percentage of the actual value. It can be useful for understanding the error relative to the magnitude of the target values. However, it has limitations: it's undefined when the target value is zero and can be skewed by target values close to zero.

Returns: The mean absolute percentage error, expressed as a proportion (e.g., 0.1 for 10%).
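
An illustrative sketch of the proportion-based computation (not the library's code; note it inherits the divide-by-zero caveat described above):

```typescript
// Illustrative sketch of MAPE as a proportion (0.1 means 10% error).
function mape(targets: number[], outputs: number[]): number {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    // Undefined when targets[i] is 0, as the caveat above notes.
    sum += Math.abs((targets[i] - outputs[i]) / targets[i]);
  }
  return sum / targets.length;
}
```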

mse

mse(
  targets: number[],
  outputs: number[],
): number

Calculates the Mean Squared Error (MSE), a common loss function for regression tasks.

MSE measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. It is sensitive to outliers due to the squaring of the error terms.

Returns: The mean squared error.
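
The definition can be sketched directly (an illustrative reimplementation, not the library's code):

```typescript
// Illustrative sketch of mean squared error.
function mse(targets: number[], outputs: number[]): number {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    const diff = targets[i] - outputs[i];
    sum += diff * diff; // squaring is what makes MSE outlier-sensitive
  }
  return sum / targets.length;
}
```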

msle

msle(
  targets: number[],
  outputs: number[],
): number

Calculates the Mean Squared Logarithmic Error (MSLE).

MSLE is often used in regression tasks where the target values span a large range or when penalizing under-predictions more than over-predictions is desired. It measures the squared difference between the logarithms of the predicted and actual values. Uses log(1 + x) instead of log(x) for numerical stability and to handle inputs of 0. Assumes both targets and outputs are non-negative.

Returns: The mean squared logarithmic error.
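
The log1p-based computation can be sketched inline (an illustrative reimplementation, not the library's code):

```typescript
// Illustrative sketch of MSLE using log(1 + x) for stability at 0.
function msle(targets: number[], outputs: number[]): number {
  let sum = 0;
  for (let i = 0; i < targets.length; i++) {
    const diff = Math.log1p(outputs[i]) - Math.log1p(targets[i]);
    sum += diff * diff;
  }
  return sum / targets.length;
}
```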

reduceOnPlateau

reduceOnPlateau(
  options: { factor?: number | undefined; patience?: number | undefined; minDelta?: number | undefined; cooldown?: number | undefined; minRate?: number | undefined; verbose?: boolean | undefined; } | undefined,
): (baseRate: number, iteration: number, lastError?: number | undefined) => number

ReduceLROnPlateau style scheduler (stateful closure) that monitors error signal (third argument if provided) and reduces rate by 'factor' if no improvement beyond 'minDelta' for 'patience' iterations. Cooldown prevents immediate successive reductions. NOTE: Requires the training loop to call with signature (baseRate, iteration, lastError).

This is the chapter's reactive option. Instead of following a pre-planned calendar, the schedule listens for stalled improvement and responds only when the run appears to flatten out.

Returns: A stateful schedule function that may lower the learning rate when the monitored error stops improving.
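
The stateful closure can be sketched as follows. The option names mirror the signature above, but the default values chosen here are assumptions, not the library's documented defaults.

```typescript
// Hedged sketch of a reduce-on-plateau closure. Defaults (factor 0.5,
// patience 10, etc.) are assumptions for illustration only.
function reduceOnPlateau(options: {
  factor?: number;
  patience?: number;
  minDelta?: number;
  cooldown?: number;
  minRate?: number;
} = {}) {
  const factor = options.factor ?? 0.5;
  const patience = options.patience ?? 10;
  const minDelta = options.minDelta ?? 1e-4;
  const cooldown = options.cooldown ?? 0;
  const minRate = options.minRate ?? 0;
  let bestError = Infinity; // best error seen so far
  let stagnantCount = 0; // iterations without meaningful improvement
  let cooldownLeft = 0; // iterations to wait after a reduction
  let scale = 1; // cumulative multiplier applied to baseRate
  return (baseRate: number, _iteration: number, lastError?: number): number => {
    if (lastError !== undefined) {
      if (lastError < bestError - minDelta) {
        bestError = lastError;
        stagnantCount = 0; // improvement resets the patience counter
      } else if (cooldownLeft > 0) {
        cooldownLeft--; // suppress back-to-back reductions
      } else if (++stagnantCount >= patience) {
        scale *= factor; // plateau detected: cut the rate
        stagnantCount = 0;
        cooldownLeft = cooldown;
      }
    }
    return Math.max(minRate, baseRate * scale);
  };
}
```

As the note above says, this only works if the training loop actually passes lastError as the third argument; without it the sketch degrades to a fixed rate.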

softmaxCrossEntropy

softmaxCrossEntropy(
  targets: number[],
  outputs: number[],
): number

Softmax Cross Entropy for mutually exclusive multi-class outputs given raw (pre-softmax or arbitrary) scores. Applies a numerically stable softmax to the outputs internally then computes -sum(target * log(prob)). Targets may be soft labels and are expected to sum to 1 (will be re-normalized if not).
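
The described pipeline (stable softmax, re-normalized targets, negative log-likelihood) can be sketched inline; this is an illustrative reimplementation, not the library's code.

```typescript
// Illustrative sketch of softmax cross-entropy over raw scores.
function softmaxCrossEntropy(targets: number[], outputs: number[]): number {
  // Subtracting the max score keeps exp() from overflowing.
  const maxScore = Math.max(...outputs);
  const exps = outputs.map((s) => Math.exp(s - maxScore));
  const expSum = exps.reduce((a, b) => a + b, 0);
  // Re-normalize targets so soft labels sum to 1, as noted above.
  const targetSum = targets.reduce((a, b) => a + b, 0) || 1;
  let loss = 0;
  for (let i = 0; i < targets.length; i++) {
    const prob = Math.max(exps[i] / expSum, 1e-15); // guard log(0)
    loss += -(targets[i] / targetSum) * Math.log(prob);
  }
  return loss;
}
```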

step

step(
  decayFactor: number,
  decayStepSize: number,
): (baseRate: number, iteration: number) => number

Implements a step decay learning rate schedule.

The learning rate is reduced by a multiplicative factor (decayFactor) at predefined intervals (decayStepSize iterations). This allows for faster initial learning, followed by finer adjustments as training progresses. It is a good fit when you want training to move through a few deliberate phases rather than one perfectly smooth curve.

Formula: learning_rate = baseRate * decayFactor ^ floor(iteration / decayStepSize)

Returns: A function that calculates the decayed learning rate for a given iteration.
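
The formula can be sketched as a standalone factory (named stepDecay here; the library exports it as step):

```typescript
// Standalone sketch of the step decay formula above.
function stepDecay(decayFactor: number, decayStepSize: number) {
  return (baseRate: number, iteration: number): number =>
    baseRate * Math.pow(decayFactor, Math.floor(iteration / decayStepSize));
}
```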

Generated from source JSDoc • GitHub