COMPUTER VISION
R&D 2025

Frozen Moments of Machine Perception

These are visual snapshots captured mid-thought—neural activations extracted from the hidden layers of deep learning models and transformed into art. Each image is a window into how machines decode reality, translating pixels into patterns, edges into meaning, chaos into understanding.

STEP 1
INPUT
Data Ingestion & Preprocessing
Image tensors normalized via ImageNet statistics or audio converted to mel-spectrogram representation
• Shape: (3, 224, 224)
• Normalize: μ=[0.485,0.456,0.406]
• Audio: n_mels=128, sr=22050Hz
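Step 1 in a minimal PyTorch sketch. The mean values are the ones quoted above; the standard deviations (0.229, 0.224, 0.225) are the conventional ImageNet values, which the card omits:

```python
import torch

# ImageNet channel statistics (means from the card above; stds are the standard ones)
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def preprocess(img: torch.Tensor) -> torch.Tensor:
    """Normalize a (3, 224, 224) float image in [0, 1] with ImageNet stats."""
    assert img.shape == (3, 224, 224)
    return (img - MEAN) / STD

x = preprocess(torch.rand(3, 224, 224))  # ready for the CNN forward pass
```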
STEP 2
CNN
Neural Network Forward Pass
Convolutional feature extraction through progressively deeper layers with non-linear activations
• Models: ResNet18 | CSM-1B
• Layers: conv1 → layer1...layer4
• Activation: f(x) = max(0, x)
STEP 3
HOOK
Activation Map Extraction
PyTorch forward hooks capture intermediate feature tensors before pooling, batch norm, or output layers
• Hook: register_forward_hook()
• Outputs: (B, C, H, W) tensors
• Layers: 6 convolutional blocks
STEP 4
RENDER
Visualization & Encoding
Feature maps transformed via custom codecs—spatial gradients, colormap encoding, pattern synthesis algorithms
• Modes: 8 render pipelines
• Normalize: (x - min) / (max - min)
• Output: RGB uint8 [0-255]
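Step 4's min-max normalization in NumPy, with a grayscale encoding as a stand-in for the eight stylized pipelines:

```python
import numpy as np

def to_rgb(fmap: np.ndarray) -> np.ndarray:
    """Min-max normalize one (H, W) activation map and encode as RGB uint8."""
    lo, hi = fmap.min(), fmap.max()
    norm = (fmap - lo) / (hi - lo + 1e-8)  # epsilon guards constant maps
    gray = (norm * 255).astype(np.uint8)   # [0, 255]
    return np.stack([gray, gray, gray], axis=-1)  # (H, W, 3)

img = to_rgb(np.random.randn(56, 56))
```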

Extracting the In-Between

Neural Vision captures the moments before a model makes its decision—the raw, unfiltered activations that exist in the hidden layers. What we see is normally invisible: millions of activations firing, transforming input into understanding. By extracting these activation maps and rendering them as visual art, we reveal the alien beauty of machine perception.

Upload Anything
Feed the network an image or record live audio. Watch as it dissects what you give it—layer by layer, neuron by neuron—and renders its internal processing as visual art.
8 Render Styles
Matrix rain, circuit boards, hexadecimal memory, ASCII terminals, thermal spectrums, binary code, pointillist dots, raw RGB channels—each mode tells a different story about how the machine interprets your input.
Deep Layer Extraction
Neural Vision hooks directly into ResNet18 and CSM-1B models, extracting activation maps from 6 convolutional layers. You're seeing the network's raw thoughts before it reaches a conclusion.
Creative Control
Adjust temperature, intensity, noise, edge detection, contrast, pattern density. Turn machine learning into generative art—controllable, repeatable, endlessly curious.

Built With PyTorch & Flask

Full-stack application combining deep learning visualization with real-time audio processing. Flask backend handles model inference while WebAudio API enables live microphone capture and waveform display.
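The backend's inference endpoint might look like this minimal Flask sketch; the route name and response fields are illustrative, not the project's actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/visualize", methods=["POST"])  # route name is hypothetical
def visualize():
    # In the real app: decode the uploaded image or audio clip, run PyTorch
    # inference with forward hooks, render the chosen mode, return images.
    mode = request.form.get("mode", "neural_rgb")
    return jsonify({"mode": mode, "layers": 6})

if __name__ == "__main__":
    app.run(debug=True)
```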

Neural Models
ResNet18
CSM-1B
PyTorch
Activation Hooks
Layer Visualization
Audio Processing
Librosa
Mel-Spectrogram
FFmpeg
WebAudio API
MediaRecorder
Real-time Waveforms
Backend & Processing
Flask
OpenCV
NumPy
Pillow
Matplotlib
Visualization Modes
Matrix Code Rain
Circuit Board PCB
Hexadecimal Memory
ASCII Terminal Art
Thermal Spectrum
Binary Patterns
Pointillist Dots
Neural RGB

Progressive Feature Extraction

Neural Vision captures activations from 6 key convolutional layers, showing how the network builds increasingly complex representations from simple edges to high-level semantic features.

Layer 01
Early Convolution
Edge detection, basic shapes, color gradients
Layer 02-03
Mid-Level Features
Textures, patterns, simple object parts
Layer 04-06
High-Level Semantics
Complex shapes, object recognition, contextual understanding

Educational & Creative Tool

Neural Vision makes deep learning interpretable and accessible. Perfect for understanding how convolutional neural networks process visual and audio information, teaching AI concepts, or creating generative art from neural activations.

Education
Demonstrate how CNNs extract features at different layers, from edge detection to high-level patterns.
Research
Analyze model behavior, compare activation patterns, and debug neural network architectures visually.
Generative Art
Create unique artwork from neural activations with customizable aesthetic parameters and render modes.
Audio Visualization
Transform speech, music, or ambient sound into visual representations through mel-spectrogram analysis.

Neural Vision FAQ

What is Neural Vision?

Neural Vision is an interactive tool that extracts hidden-layer activations from deep learning models and renders them as visual art. It reveals how convolutional neural networks perceive and process images and audio.

Does Neural Vision require a GPU?

No. The Flask backend handles PyTorch inference server-side, so no GPU is needed on the client. Users need only a modern web browser to upload input and view the rendered activations.

What CNN concepts does Neural Vision demonstrate?

Neural Vision demonstrates convolutional feature extraction, activation maps, forward hooks, progressive layer abstraction from edges to semantics, and 8 distinct rendering pipelines for visualizing internal network states.

Can Neural Vision process audio inputs?

Yes. Audio is converted to mel-spectrograms using Librosa at 22050 Hz sample rate with 128 mel bands, then fed through the same convolutional pipeline as images for visualization.

Is Neural Vision available to use online?

Neural Vision is currently a research and development project. A public demo is planned. The codebase uses Flask, PyTorch, OpenCV, and WebAudio API for real-time processing.