- research
- journey
- bug-fixes
- reflection
-
Findings: Karpathy-style autoresearch on a crypto backtester (local LLM)
Local Qwen 3.5 autoresearch on my crypto DB + Nautilus-style backtester (~2h, 30+ iterations, $0 API cost): tool-calling blocker, run observations, human-in-the-loop steering, GA contrast, diversity, gates.
-
Qwen 3.6 35B-A3B on vLLM: do the Qwen 3.5 tool-calling fixes carry over?
Follow-up testing: same qwen3_xml parser, qwen3.5-enhanced.jinja template, and mixed-GPU tuning as Qwen 3.5-27B, plus three agentic runs comparing official vs enhanced configs on Qwen3.6-35B-A3B-FP8.
-
Claude Code with local vLLM: client validation, model aliases, and a working settings.json
Run Claude Code against local vLLM without Anthropic API access: why common env-only recipes fail, the alias + settings.json pattern that works, and when this matters if you cannot register or use the Claude API.
-
Stable tool calling for Qwen 3.5 27B/35B on vLLM: template, parser, and mixed-GPU fixes
Debugging notes on Jinja chat templates, qwen3_xml vs qwen3_coder parsers, mixed-GPU FP8 drift, and SFT-distilled checkpoints when running Qwen 3.5 27B/35B-class models for long agentic sessions on vLLM.
-
Workaround for Enabling NCCL P2P Communication for NVIDIA RTX 4090 Workstations
What NCCL P2P means, why it matters on multi-GPU workstations, how Resizable BAR fits in, and a concrete setup path for RTX 4090.