ASCII Maze (NeatapticTS)
This folder is the repository's clearest answer to a practical question the Flappy Bird example does not pose: what should neuroevolution look like when the hard part is not reflex timing, but deliberate navigation under sparse rewards?
The example uses an ASCII maze, but the maze is not the whole point. The real point is to show how NeatapticTS handles compact perception, reward shaping, curriculum transfer, telemetry-rich evolution, and browser or terminal presentation without turning the whole exercise into a single opaque training loop.
If Flappy Bird is the repo's lesson in fast control systems, ASCII Maze is its lesson in disciplined decision-making.
What This Folder Is Trying To Teach
This example is organized around four reader questions:
- How small can the observation space stay before the policy stops being learnable?
- How do you shape reward for a sparse-goal navigation problem without making the score meaningless?
- How do you evolve through a curriculum of increasingly harder mazes without throwing away useful structure each phase discovered?
- How do you make long-running search legible through dashboards, telemetry, and browser integration instead of waiting blindly for a lucky winner?
The folder matters because it treats those questions as one system. Perception, movement, scoring, curriculum evolution, and presentation are separate boundaries on purpose. That separation is what makes the example useful as a reference architecture rather than just a clever test.
Choose Your Route
Different readers arrive with different questions. Use the route that matches yours.
| If you want to... | Start here | Then read |
|---|---|---|
| Run a maze evolution programmatically | evolutionEngine.ts | evolutionEngine/README.md, asciiMaze.e2e.test.ts |
| Reuse the example from code | index.ts | evolutionEngine.ts, mazeUtils.ts, interfaces.ts |
| Understand how one agent episode works | mazeMovement.ts | mazeMovement/README.md, fitness.ts |
| Understand the browser demo and telemetry surface | browser-entry/browser-entry.ts or index.html | browser-entry.ts, browser-entry/README.md, dashboardManager/README.md |
| Tune reward shaping or progress semantics | fitness.ts | mazeMovement/README.md, mazeUtils.ts |
| Change curriculum, warm-start, or evolution policy | evolutionEngine/README.md | asciiMaze.e2e.test.ts |
| Understand the whole example as a system | this README | the module READMEs listed in Recommended Reading Order |
Run The Example
From the repo root:
Run the browser demo:

```
npm run start:local-server
```

Then open http://localhost:8080/examples/asciiMaze/index.html in a browser.

Important notes:

- `index.html` is a lightweight browser shell that loads the prebuilt demo bundle from `docs/assets`.
- If you want the real host orchestration source, start with browser-entry/browser-entry.ts.
- After browser-code changes, run `npm run docs` or `npm run build:ascii-maze` before expecting the hosted page to reflect them.

Run the curriculum-style end-to-end example with logs:

```
npm run test:e2e:logs
```

Build the browser bundle directly:

```
npm run build:ascii-maze
```

Refresh docs and copied example assets:

```
npm run docs
```

The Node engine requirement in this repo is >=22.
The Core Idea In One Glance
The architectural rule is simple: keep the policy input small, keep the simulation honest, keep the reward story inspectable, and keep long-running search observable.
```mermaid
flowchart LR
    subgraph EpisodePath[Single-episode path]
        Maze["maze state\nencoded layout + positions"] --> Vision["mazeVision.ts\n6-value observation"]
        Vision --> Network["Network.activate\n4 direction scores"]
        Network --> Movement["mazeMovement/\nmovement + shaping + stop rules"]
        Movement --> Fitness["fitness.ts\nscalar run fitness"]
    end
    subgraph EvolutionPath[Population path]
        Fitness --> Engine["evolutionEngine/\ncurriculum, warm-start, adaptive loop"]
        Engine --> Dashboard["dashboardManager/\nlive telemetry + archive"]
        Engine --> Browser["browser-entry/\nbrowser host and telemetry"]
        Engine --> NextMaze["curriculum phase outcome\nseed next maze with current best"]
    end
    Interfaces["interfaces.ts\nshared contracts"] -.-> Vision
    Interfaces -.-> Movement
    Interfaces -.-> Engine
    Mazes["mazes.ts\nstatic and procedural mazes"] -.-> Maze
    classDef boundary fill:#001522,stroke:#0fb5ff,color:#9fdcff,stroke-width:2px;
    classDef runtime fill:#03111f,stroke:#00e5ff,color:#d8f6ff,stroke-width:2px;
    classDef highlight fill:#2a1029,stroke:#ff4a8d,color:#ffd7e8,stroke-width:3px;
    class Maze,Vision,Movement,Fitness,Engine,Dashboard,Browser,Interfaces,Mazes boundary;
    class Network,NextMaze runtime;
    class Engine highlight;
```

Read the diagram as two connected stories:
- one network experiences one maze episode through perception, action, movement, and scoring,
- the evolution engine turns many such episodes into curriculum progress, telemetry, and next-phase seeding.
That split is the key to understanding why this folder teaches more than a single reward function ever could.
What Each Boundary Protects
`mazeVision.ts`: the policy's tiny sensor window
MazeVision compresses the maze into a six-value observation. That choice is deliberate. The example is not trying to win by giving the policy the whole map. It is trying to show how far a compact, carefully designed signal can go.
This boundary exists so observation design can evolve independently from movement rules, reward shaping, and UI code.
`mazeMovement/`: one episode of decision-making
The movement boundary owns direction selection, movement legality, visit bookkeeping, shaping-sensitive runtime state, and episode finalization.
This is where a raw policy stops being abstract and starts paying for bad local decisions.
`fitness.ts`: the reward story
The fitness layer turns one episode into a scalar score that the evolutionary loop can rank. In sparse-goal tasks, this layer matters enormously because a pure success/fail score often gives evolution too little gradient to learn from.
This boundary exists so reward design stays inspectable instead of getting buried in movement code.
`evolutionEngine/`: curriculum and population policy
The evolution engine is the outer orchestration layer. It normalizes run options, prepares the maze environment, seeds or warms the population, runs the generation loop, applies adaptive dynamics, and interprets when a phase has succeeded well enough to move forward.
This boundary exists so curriculum logic, deterministic mode, scratch-buffer management, telemetry, and stopping behavior can stay coherent at the population level.
`dashboardManager/`: make search visible
The dashboard boundary keeps long-running search understandable. It owns live summaries, archive views, telemetry snapshots, and redraw logic for browser and non-browser hosts.
This boundary exists because evolutionary search is much easier to trust when you can see trend lines, progress plateaus, and solved artifacts instead of only waiting for a final boolean.
`browser-entry/`: the browser host
The browser entry boundary assembles host elements, telemetry fan-out, resize behavior, and curriculum execution for the browser demo. It does not own the evolution algorithm itself. It owns the host experience around it.
That separation keeps the browser integration useful without making the engine depend on DOM concerns.
`interfaces.ts`, `mazes.ts`, and the facade files: shared footing
The supporting files matter too.
- `interfaces.ts` keeps cross-cutting contracts stable while implementation ownership shifts across dedicated subfolders.
- `mazes.ts` provides both static scenarios and procedural generation.
- `asciiMaze.ts` and `index.ts` keep the example easier to import from the outside.
Two Execution Stories
The same example tells two different runtime stories depending on where you enter.
Evolution story
- evolutionEngine.ts receives run options.
- The helpers under evolutionEngine/ (documented in evolutionEngine/README.md) normalize configuration, prepare the maze, create or seed NEAT state, and run the generation loop.
- mazeMovement.ts simulates candidate episodes.
- fitness.ts converts those episodes into comparable scores.
- The engine updates telemetry, decides whether the current phase is solved, and either stops or advances the curriculum with the best network so far.
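The phase loop in that story can be sketched in a heavily simplified form. Everything below is an illustrative assumption for this README: `runPhase`, `evaluate`, and the `Candidate` shape are invented names, not the engine's real API.

```typescript
// Hedged sketch of one curriculum phase: evolve until the best candidate's
// progress clears the phase gate, or give up after maxGenerations.
type Candidate = { fitness: number; progress: number };

function runPhase(
  evaluate: (seed?: Candidate) => Candidate[], // simulate episodes, score with fitness
  minProgressToPass: number,
  maxGenerations: number,
  seed?: Candidate, // warm-start from the previous phase's best network
): Candidate | undefined {
  let best: Candidate | undefined = seed;
  for (let gen = 0; gen < maxGenerations; gen++) {
    const population = evaluate(best);
    for (const candidate of population) {
      if (!best || candidate.fitness > best.fitness) best = candidate;
    }
    // A phase counts as solved when progress clears the curriculum bar.
    if (best && best.progress >= minProgressToPass) return best;
  }
  return undefined; // phase not solved; the caller decides whether to stop
}
```

The important detail the sketch preserves is the return value: a solved phase hands its best candidate forward as the seed of the next, harder maze.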
Host story
- index.html or a programmatic caller starts the browser host.
- browser-entry.ts exposes the stable `start(...)` surface.
- The host modules under browser-entry/ (see browser-entry/README.md) wire host elements, telemetry hubs, and curriculum control.
- The dashboard layer (see dashboardManager/README.md) keeps the run legible through live and archived summaries.
- The host exposes a lifecycle handle for stop, status, completion, and telemetry consumption.
The important teaching point is that browser and test hosts are consumers of the engine, not hidden owners of the search loop.
The Most Important Design Bets
Several design choices explain why this folder is shaped the way it is.
A tiny observation space can still support interesting behavior
The policy sees only six values. That makes the network easier to evolve, but it also forces the designer to decide what information genuinely matters. The example is teaching observation design, not just maze solving.
Reward shaping is necessary, but must stay interpretable
The example does not rely on success alone. It layers progress-sensitive shaping, exploration incentives, proximity-aware signals, and success or efficiency rewards. That helps learning, but the structure stays explicit enough to inspect and tune.
Curriculum transfer is treated as a first-class workflow
The engine is designed to carry forward useful policy structure as mazes grow harder. That is why the example reads less like one isolated run and more like a sequence of related phases.
Determinism and telemetry are educational tools, not just debugging extras
Deterministic mode, trend snapshots, and structured dashboard output make the system teachable. They let you compare runs, spot regressions, and understand plateaus with less guesswork.
The engine stays orchestration-first
The public EvolutionEngine facade stays thin while the heavy work lives in dedicated helpers under evolutionEngine/. That is an architectural choice: the top-level reader sees the policy flow first and the hot-path machinery second.
Observation And Reward Cheat Sheet
The policy is intentionally small enough to fit in your head.
Inputs
MazeVision builds a six-value observation:
| Input | Meaning | Why it exists |
|---|---|---|
| `compassScalar` | coarse direction toward the exit | Gives the policy a global hint without revealing the whole maze. |
| `openN` | whether North is traversable | Encodes immediate local affordance. |
| `openE` | whether East is traversable | Encodes immediate local affordance. |
| `openS` | whether South is traversable | Encodes immediate local affordance. |
| `openW` | whether West is traversable | Encodes immediate local affordance. |
| `progressDelta` | recent progress change toward the goal | Helps the policy distinguish productive movement from movement that merely changes position. |
Outputs
The network emits four directional scores:
- North
- East
- South
- West
MazeMovement interprets those outputs into a chosen move and also records diagnostics such as entropy and saturation so the run is easier to inspect.
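A minimal sketch of that interpretation step, assuming blocked directions are simply skipped during selection (the real MazeMovement boundary owns the actual rules and diagnostics; `pickDirection` and `DIRECTIONS` are hypothetical names for this README):

```typescript
// Pick the highest-scoring direction among the traversable ones.
const DIRECTIONS = ['North', 'East', 'South', 'West'] as const;
type Direction = (typeof DIRECTIONS)[number];

function pickDirection(
  scores: number[], // four network outputs, ordered N, E, S, W
  open: boolean[],  // the openN..openW observation values
): Direction | undefined {
  let bestIndex = -1;
  let bestScore = -Infinity;
  for (let i = 0; i < DIRECTIONS.length; i++) {
    if (!open[i]) continue; // blocked moves are never selected in this sketch
    if (scores[i] > bestScore) {
      bestScore = scores[i];
      bestIndex = i;
    }
  }
  return bestIndex >= 0 ? DIRECTIONS[bestIndex] : undefined;
}
```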
Fitness ingredients
The score composes several ideas rather than one blunt scalar:
- base movement or progress score,
- exploration reward for newly visited cells,
- proximity-sensitive weighting near promising areas,
- success reward for reaching the exit,
- efficiency reward when the successful path is short relative to the maze's baseline geometry.
The goal is not to produce the prettiest formula. The goal is to make sparse-goal search learnable without hiding the tradeoffs.
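As a concrete illustration of composing those ingredients, a score of this kind might look like the sketch below. All weights and field names here are invented for the example; fitness.ts uses its own constants and bookkeeping.

```typescript
// Hypothetical episode summary and score composition, mirroring the
// ingredient list above: progress, exploration, success, efficiency.
interface EpisodeSummary {
  progress: number;        // 0..100 toward the exit
  newCellsVisited: number; // exploration
  reachedExit: boolean;
  steps: number;
  baselinePathLength: number; // from the maze's baseline geometry
}

function scoreEpisode(e: EpisodeSummary): number {
  let score = e.progress;           // base progress score
  score += 0.5 * e.newCellsVisited; // exploration reward for new cells
  if (e.reachedExit) {
    score += 100;                   // success reward
    // Efficiency: paths shorter than twice the baseline earn a bonus.
    score += Math.max(0, e.baselinePathLength * 2 - e.steps) * 0.1;
  }
  return score;
}
```

Even this toy version shows the design point: a failed run still earns a rankable score from progress and exploration, so evolution has a gradient before the first success ever happens.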
If You Want To Change Something, Read This First
This is the shortest route to the right boundary when you are modifying the example.
| Change goal | Read first | Why |
|---|---|---|
| Change what the network sees | mazeVision.ts | Observation design lives here. |
| Change movement legality, action selection, or per-step shaping behavior | mazeMovement/README.md | The full episode runtime boundary lives here. |
| Change score composition | fitness.ts | Reward logic should stay explicit and separate from movement. |
| Change curriculum, warm-start, deterministic mode, or generation policy | evolutionEngine/README.md | This is the population-policy layer. |
| Change dashboards or telemetry presentation | dashboardManager/README.md | Search visibility lives here. |
| Change browser lifecycle or embed API behavior | browser-entry/README.md | The browser host surface lives here. |
| Change test-driven curriculum expectations | asciiMaze.e2e.test.ts | This file shows the current end-to-end usage pattern. |
Programmatic Starting Point
If you want to run the example from code instead of through the browser host, this is the smallest useful shape:
```typescript
import { EvolutionEngine } from './evolutionEngine';
import { MazeGenerator } from './mazes';
import { DashboardManager } from './dashboardManager';
import { TerminalUtility } from './terminalUtility';

const dashboard = new DashboardManager(
  TerminalUtility.createTerminalClearer(),
  (...args) => console.log(...args),
);

const result = await EvolutionEngine.runMazeEvolution({
  mazeConfig: { maze: new MazeGenerator(24, 24).generate() },
  agentSimConfig: { maxSteps: 2000 },
  evolutionAlgorithmConfig: {
    popSize: 40,
    maxGenerations: 100,
    maxStagnantGenerations: 50,
    minProgressToPass: 95,
    allowRecurrent: true,
  },
  reportingConfig: {
    dashboardManager: dashboard,
    logEvery: 1,
    label: 'demo-24x24',
  },
});

console.log(result.exitReason, result.bestResult?.progress);
```
Safe Tuning Knobs
If you are teaching, benchmarking, or experimenting, these are usually the highest-value first changes:
- `agentSimConfig.maxSteps` controls how much trajectory a policy gets before termination.
- `evolutionAlgorithmConfig.popSize` trades compute cost for search breadth.
- `evolutionAlgorithmConfig.maxGenerations` changes how patient the run is.
- `evolutionAlgorithmConfig.minProgressToPass` changes how strict each curriculum phase is.
- `lamarckianIterations` and `lamarckianSampleSize` adjust local supervised-style refinement pressure.
- Deterministic mode and seed settings make comparisons reproducible.
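For instance, a broader, more patient run on a large maze might combine the knobs like this. The field names follow the programmatic example in this README; the values are purely illustrative.

```typescript
// Illustrative tuning for a larger maze: more steps per episode,
// more search breadth, more patience, a slightly looser phase gate.
const experimentConfig = {
  agentSimConfig: { maxSteps: 4000 },
  evolutionAlgorithmConfig: {
    popSize: 80,              // more breadth, more compute per generation
    maxGenerations: 200,      // more patience before giving up
    maxStagnantGenerations: 60,
    minProgressToPass: 90,    // looser gate to keep the curriculum moving
    allowRecurrent: true,
  },
};
```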
For deeper experiments:
- swap or tune the fitness evaluator,
- inspect movement penalties and rewards in the maze movement boundary,
- study dashboard telemetry trends rather than only final success states.
Common Failure Modes
The most common ways to make this example confusing or brittle are practical, not exotic:
- a maze without `S` or `E` breaks setup because start or exit lookup cannot resolve,
- inconsistent row widths undermine encoding and path assumptions,
- overly harsh penalties can suppress exploration before the policy learns useful structure,
- very small `maxSteps` values on large mazes prevent meaningful trajectories,
- expecting fast convergence from a sparse-goal task leads to bad tuning decisions.
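The first two failure modes are cheap to catch before any evolution time is spent. A pre-flight check along these lines would do it (`validateMaze` is a hypothetical helper written for this README, not part of the example's API):

```typescript
// Validate a row-per-string ASCII maze before handing it to the engine.
function validateMaze(maze: string[]): string[] {
  const problems: string[] = [];
  const flat = maze.join('');
  if (!flat.includes('S')) problems.push('missing start cell S');
  if (!flat.includes('E')) problems.push('missing exit cell E');
  const width = maze[0]?.length ?? 0;
  if (maze.some((row) => row.length !== width)) {
    problems.push('inconsistent row widths');
  }
  return problems; // empty array means the maze passed the basic checks
}
```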
Recommended Reading Order
If you want the cleanest ramp into the example, this order usually pays off:
- this README for the system-level mental model,
- evolutionEngine/README.md for the outer training loop and curriculum policy,
- mazeMovement/README.md for single-episode runtime behavior,
- fitness.ts and mazeVision.ts for the observation and reward story,
- dashboardManager/README.md for telemetry and live visibility,
- browser-entry/README.md for browser hosting and embed behavior,
- asciiMaze.e2e.test.ts for the current end-to-end usage pattern.
If you only care about one slice:
- compact policy design: start with `mazeVision.ts`, `mazeMovement.ts`, and `fitness.ts`,
- curriculum evolution: start with `evolutionEngine/`,
- browser or telemetry work: start with `browser-entry/` and `dashboardManager/`.
Exercises For Curious Readers
If you want to use this folder as a lab rather than only as a demo, these experiments pay off quickly:
- Remove one input from `MazeVision` and measure how much learning quality drops.
- Reduce exploration reward and inspect whether the agent becomes more myopic or more stable.
- Compare direct training on a harder procedural maze against phased curriculum transfer.
- Fix a deterministic seed and compare outcomes while toggling one heuristic at a time.
- Compare a winner before and after refinement to see what local supervised pressure actually improves.
Why Start Here
Sparse-goal problems are where many neuroevolution demos become unconvincing. They either hide the scoring story, overfeed the policy with information, or offer too little telemetry to understand why a run worked.
This example matters because it does the opposite. It keeps the policy small, the reward story explicit, the evolutionary loop observable, and the host surfaces useful.
If you want one folder that shows how NeatapticTS approaches compact perception, curriculum transfer, and telemetry-rich search in a deliberate navigation task, this is the one.