Technical Infrastructure

The guts

Production infrastructure that powers the things I design and build. I use a wide variety of tools, systems, and custom-built models, each adapted specifically to what YOU need.

Technology Stack

Core capabilities

Backend

  • Python (FastAPI, async/await, Celery)
  • Go/Golang for high-performance services
  • Node.js / Express for real-time systems
  • SQLAlchemy + Alembic migrations

Frontend

  • React 18+ / Next.js 14
  • TypeScript for type safety
  • React Native for mobile
  • TailwindCSS, shadcn/ui, Radix UI
  • Vite & Webpack build systems

AI & Machine Learning

  • Ollama local model deployment
  • OpenAI & Anthropic Claude APIs
  • LangChain orchestration
  • PyTorch & TorchVision
  • OpenCV for computer vision
  • scikit-learn for ML pipelines

Databases

  • PostgreSQL (optimized queries, partitioning)
  • Redis (caching, pub/sub, queues)
  • Neo4j for graph relationships
  • Weaviate vector database (2.8M+ vectors)

Infrastructure

  • Docker Swarm (30+ services, 6 nodes)
  • Docker Compose for orchestration
  • Terraform for IaC
  • Nginx reverse proxy & load balancing
  • CI/CD pipelines (GitHub Actions)
  • Linux server management

Security

  • JWT & OAuth 2.0 authentication
  • bcrypt password hashing
  • Rate limiting & DDoS protection
  • Vulnerability scanning (SCAFU)
  • SSL/TLS certificate management

System Architecture

Production infrastructure

30+ Docker Services
8.5K Daily Executions
150+ n8n Workflows
2.8M Vector Embeddings

A modular production stack: gateway at the edge, services in the middle, knowledge and storage beneath. The specifics change per project—the structure stays consistent.

[01] ENTRY

Edge & Gateway

Traffic routing · Rate limiting · TLS termination

Receives all external traffic, routes to internal services, enforces global policies, and handles TLS. First line of defense and traffic control.

Nginx · CDN · WAF

[02] COMPUTE

Application Services

APIs · WebSocket · Background jobs

Stateless services handling REST/GraphQL APIs, real-time channels, and async workers. Message queues decouple workloads for resilience.

FastAPI · WebSocket · Redis Queue

[03] AUTOMATION

Orchestration

Workflows · Scheduling · Cross-service coordination

150+ workflows orchestrate data movement, scheduled tasks, and event-driven automations. 8.5K daily executions coordinate the entire stack.

n8n · Cron · Webhooks

[04] INTELLIGENCE

AI & ML Layer

Local models · Embeddings · Retrieval

Privacy-first AI with local Ollama models and 2.8M vector embeddings. Semantic search, code generation, and intelligent automation—all running in-house.

Ollama · Weaviate · Embeddings

[05] PERSISTENCE

Data Plane

Relational · Graph · Cache

Multi-model persistence: PostgreSQL for transactions, Neo4j for relationships, Redis for sub-millisecond reads. Purpose-built for each data pattern.

PostgreSQL · Neo4j · Redis

[06] OBSERVABILITY

Monitoring & Security

Metrics · Tracing · Alerts

End-to-end observability with Prometheus metrics, Grafana dashboards, and real-time alerting. Security policies enforced at every layer.

Prometheus · Grafana · Alerts

All services communicate over an internal Docker network with automatic service discovery. Prometheus monitors health metrics, Grafana visualizes performance, and automated backups run daily.

AI Infrastructure

Local AI deployment

Privacy-first AI architecture. Sensitive workloads run on local models via Ollama, so that data never leaves your infrastructure. Cloud APIs (OpenAI, Claude) are used only for non-sensitive workloads.

llama3.1:8b
General Intelligence
Primary model for SCAFU, AURA, and general reasoning tasks. The 8B parameter size is a sweet spot for balancing speed and quality. Runs at ~40 tokens/sec on consumer hardware.

deepseek-coder:6.7b
Code Generation
Security remediation code generation in SCAFU. Framework-specific fixes (React, Django, Express). 85% accuracy on CVE patch suggestions.

mistral:7b
Fast Inference
Quick responses for conversational systems. Optimized for low-latency applications. Used in AURA for real-time style adaptation.

voyage-large-2
Embeddings
Vector embeddings for semantic search in Weaviate. 1024 dimensions, optimized for retrieval accuracy. Powers Nuculair's context matching.

gpt-4o
Complex Reasoning
Cloud fallback for multi-step reasoning, security analysis deep-dives, and complex chain-of-thought tasks when local models are insufficient.

claude-3.5-sonnet
Long Context
200K context window for document analysis, large codebase reasoning, and comprehensive security audits. Used in SCAFU's full-stack scanning.

# Model routing logic
if task.sensitive_data:
    model = "llama3.1:8b"  # Local only
elif task.requires_code:
    model = "deepseek-coder:6.7b"
elif task.context_length > 10000:
    model = "claude-3.5-sonnet"
else:
    model = "mistral:7b"  # Fast default
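
The routing rules above can be wrapped in a small, testable function. This is a minimal sketch, not the production router: the `Task` dataclass and the `route_model` name are hypothetical, and `context_length` is assumed to be a token count.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task descriptor; field names mirror the routing rules."""
    sensitive_data: bool = False
    requires_code: bool = False
    context_length: int = 0  # assumption: measured in tokens


def route_model(task: Task) -> str:
    """Return a model name, checking the rules in priority order."""
    if task.sensitive_data:
        return "llama3.1:8b"  # local only
    if task.requires_code:
        return "deepseek-coder:6.7b"
    if task.context_length > 10_000:
        return "claude-3.5-sonnet"
    return "mistral:7b"  # fast default


if __name__ == "__main__":
    print(route_model(Task(sensitive_data=True, requires_code=True)))
    print(route_model(Task(context_length=50_000)))
```

Because the sensitivity check comes first, a sensitive task that also needs code or a long context still stays on the local general-purpose model.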

Vector Database

RAG & semantic search

Weaviate vector database with 2.8M+ embeddings enables sub-100ms semantic search across millions of entities. Hybrid search combines vector similarity with keyword filtering for precision.

2.8M+ Vector Embeddings
<100ms Query Latency
1024d Vector Dimensions
95%+ Recall Accuracy

Embedding Pipeline: Documents chunked to 512 tokens → Voyage AI embeddings → Weaviate index → Redis cache for hot queries. Nuculair uses this for instant profile context retrieval across 300+ data sources.
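
The first stage of that pipeline, fixed-size chunking, might look roughly like the sketch below. It assumes whitespace-separated words as a stand-in for real tokenizer tokens and uses non-overlapping chunks; the function names are illustrative, and the embedding, indexing, and caching stages are not shown.

```python
from typing import Iterator, List


def chunk_tokens(tokens: List[str], max_tokens: int = 512) -> Iterator[List[str]]:
    """Yield consecutive, non-overlapping slices of at most max_tokens tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    for start in range(0, len(tokens), max_tokens):
        yield tokens[start:start + max_tokens]


def chunk_document(text: str, max_tokens: int = 512) -> List[str]:
    """Split a document into chunks ready to send to an embedding model."""
    # Assumption: whitespace splitting approximates the real tokenizer.
    tokens = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(tokens, max_tokens)]


if __name__ == "__main__":
    chunks = chunk_document("word " * 1100)
    print(len(chunks), [len(c.split()) for c in chunks])
```

A production pipeline would count tokens with the embedding model's own tokenizer, since word counts and token counts can differ noticeably.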

Hybrid Search: Vector similarity (cosine distance) + BM25 keyword matching + metadata filters. Weighted fusion algorithm combines scores for optimal relevance ranking.
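
One common way to do weighted fusion is to min-max normalize each score list and blend the two with a single weight. The sketch below illustrates that generic technique, not necessarily the exact algorithm used here: `alpha`, `min_max`, and `fuse` are hypothetical names, vector scores are assumed to be similarities (higher is better), and metadata filters are assumed to have been applied before scoring.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two score types are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend normalized scores; alpha=1.0 is pure vector, 0.0 is pure keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    v, b = min_max(vector_scores), min_max(bm25_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1.0 - alpha) * b.get(doc_id, 0.0)
        for doc_id in set(v) | set(b)
    }
    # Highest fused score first; ties broken by document id for stable output.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector = {"a": 0.9, "b": 0.5, "c": 0.1}
    keyword = {"b": 12.0, "c": 3.0}
    print(fuse(vector, keyword, alpha=0.5))
```

In this example document "b" ranks first because it scores reasonably on both signals, while "a" has the best vector score but no keyword match.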

Workflow Orchestration

n8n automation platform

150+ n8n workflows handle data ingestion, processing, and delivery. 8,500+ daily executions power OSINT aggregation, security scanning, and AI model orchestration.

OSINT Workflows

  • Social media scraping (120+ platforms)
  • Professional network aggregation (80+ sources)
  • Public records collection (60+ databases)
  • Real-time alerts & monitoring
  • Data enrichment pipelines

Security Automation

  • Scheduled vulnerability scans
  • CVE database synchronization
  • Exploit code generation triggers
  • Remediation report delivery
  • False positive filtering

AI Orchestration

  • Model selection routing
  • Prompt template management
  • Response quality validation
  • Context aggregation
  • Embedding generation batch jobs