April 20, 2026

Agentic Generation of Music-Processor Templates

LLM-Driven Patch Design for a Hardware Guitar Multi-FX, Without Training

Author: Alex Nix
Status: Working draft, for public release

Abstract

This paper reports a three-day hardware-in-the-loop study in which a language model designs and applies concrete multi-effects patches to a hardware guitar processor (Valeton GP-100) from plain-English prompts. The model receives no training and no dataset; it relies on its general priors on genre, artist, and sound design, mapped onto a whitelist of pedal-specific effect types and knob values produced by reverse-engineering the firmware's parameter table. A thin Python layer handles MIDI transport, validation, and crash safety. Twenty-plus patches were generated and validated live on hardware across jazz fusion, pop-punk, sci-fi, horror, and pure sound design. The contributions are: (a) confirmation that an LLM can design working patches without training when given a structured, validated parameter space; (b) a crash taxonomy of three distinct out-of-range error classes specific to this firmware, each reproducible; (c) the empirical insight that parsing one real preset reshapes the model's defaults more than any prompt-engineering effort; (d) the observation that interactive iteration consistently beats single-shot autonomous generation for creative musical work. We describe the architecture, findings, and the open gaps that would require additional reverse-engineering to close.

1. Motivation

Hardware guitar multi-FX processors expose a parameter space of low intrinsic complexity (a few hundred values across nine signal-chain blocks) but high semantic difficulty: "make it sound like Frusciante on Californication" maps to a specific configuration of effect types and knob values that no menu in the device documents directly. The user community treats this as a craft skill: patches are traded as exported files, tutorials walk through one tone at a time, and no general algorithm bridges a plain-English description to a working patch.

The hypothesis explored here is narrow and testable: a sufficiently capable LLM, given a structured and validated parameter space, can produce musically coherent patches from natural-language descriptions, without a trained model, dataset, or learned mapping. The LLM's priors on music, artists, and sound design substitute for training; the validated parameter space prevents the model from emitting values the device cannot accept; the hardware-in-the-loop test confirms or rejects the output by ear.

This is a small claim. Its interest is in what it implies for a broader class of problems: any hardware with a documentable parameter space and a human listening to outputs might be addressable by the same architecture, without any model training.

2. Target hardware

Valeton GP-100. Class-compliant USB-MIDI, firmware v1.5.0, internal algorprefxconstdata version 1.9.
Exposes SysEx (parameter editing), Program Change (preset selection), and Control Change (block enable/disable, pedals, tempo, tuner).
9 FX blocks in fixed signal-chain order: PRE, DST, AMP, NR, CAB, EQ, MOD, DLY, RVB.

The choice of platform is incidental — the architecture transfers to any MIDI-controllable multi-FX with a documented parameter table. The GP-100 was selected for availability and for the fact that its parameter table was partially reverse-engineered by the open-source community.

3. System architecture

Three layers, each trivially replaceable by a stub or mock. Only the middle and top layers depend on the reverse-engineered SysEx map; the transport layer is a dumb pipe.

flowchart TB
    subgraph L3[Layer 3 — LLM]
        LLM[llm.py<br/>Claude tool-use<br/>text → PatchSpec]
    end
    subgraph L2[Layer 2 — typed API]
        PS[patches.py<br/>PatchSpec + apply_patch]
        DEV[device.py<br/>GP100 class<br/>typed, validated]
        EFF[(effects.py<br/>146 effect types<br/>from algorithm.xml)]
    end
    subgraph L1[Layer 1 — transport]
        TR[transport.py<br/>mido wrapper<br/>raw SysEx / CC / PC]
    end
    HW[(Valeton GP-100<br/>USB-MIDI)]

    LLM -->|JSON| PS
    PS -->|typed calls| DEV
    DEV -->|lookup| EFF
    DEV -->|bytes| TR
    TR -->|USB-MIDI| HW

3.1 Ground-truth source selection

Three reverse-engineering sources were evaluated for the parameter table:

| Source | Content | Outcome |
|---|---|---|
| rubberplayer/gp100expressioncc | Recovered 2026-04-20 from Codeberg mirror. Not a parameter dump (earlier secondary references misrepresented it). Narrow Linux/ALSA utility that decodes the expression-pedal SysEx into a normal MIDI CC. | Confirms firmware v1.9 (matches our algorprefxconstdata data-file version). Documents expression-pedal SysEx format: 34-byte message, position packed as 8 × 4-bit nibbles at offset 25, assembled MSB-first into a 32-bit value (appears IEEE 754 float). 31 discrete positions recorded. Strong hypothesis: this nibble-packing is the encoding the pedal uses for all special-range knobs (MOD Rate in Hz, delay Time in ms, pitch in semitones, EQ bands in dB). Testable with a focused 8-byte-payload bisection next session. |
| navytuner/gp100-editor | PySide6 editor in Python, 152 effect-model tuples. | Systematically inaccurate on our firmware. Dumped from a later algorprefxconstdata version. All CAB tuples shifted; NR Gate 1/2 swapped; 6 DST models don't exist on v1.9. Retained as a starting skeleton only. |
| joseloc300/GP-100-App-Fix v1.2.0 algorithm.xml | 1255-line XML, 146 named effect types with per-knob metadata (idx, name, control_kind, Type 1/2/3, Dmin, Dmax, default, step). | Authoritative for firmware v1.5. Used as source of truth. Parsed into a typed EFFECTS table at build time. |
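The nibble packing described for the expression-pedal message can be illustrated with a short sketch. It assumes that offset 25 is counted from the leading 0xF0 byte and that the assembled 32-bit word is a big-endian IEEE 754 single, which the source only describes as apparent. The function names and the `encode_nibbles` test helper are hypothetical and are not part of any of the cited projects.

```python
import struct

MESSAGE_LENGTH = 34  # from the table: 34-byte expression-pedal message
NIBBLE_OFFSET = 25   # assumption: counted from the leading 0xF0 byte
NIBBLE_COUNT = 8     # 8 x 4-bit nibbles -> one 32-bit word


def decode_pedal_position(message: bytes) -> float:
    """Assemble eight nibbles MSB-first into a 32-bit word and read it as a float."""
    if len(message) != MESSAGE_LENGTH:
        raise ValueError(f"expected {MESSAGE_LENGTH} bytes, got {len(message)}")
    word = 0
    for nibble in message[NIBBLE_OFFSET:NIBBLE_OFFSET + NIBBLE_COUNT]:
        if nibble > 0x0F:
            raise ValueError(f"byte {nibble:#04x} is not a 4-bit nibble")
        word = (word << 4) | nibble
    # Assumption: the word is a big-endian IEEE 754 single-precision float.
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


def encode_nibbles(value: float) -> bytes:
    """Inverse helper (hypothetical): float -> eight nibbles, MSB first."""
    word = int.from_bytes(struct.pack(">f", value), "big")
    return bytes((word >> shift) & 0x0F for shift in range(28, -4, -4))
```

If the hypothesis about special-range knobs holds, the same decoder would apply to those payloads, but that remains untested here.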

Later additions:

mikeliddle/PRSTDecoder — decoder/encoder for Valeton .prst preset files. GP-100 uses plain XML (GP-200 is binary). Yielded one real parsed preset (Frusciante / RHCP rock) for ground-truth knob-value patterns. The role of this single preset turned out to be larger than expected — see §4.2.
Amish-03/Patch-Switcher — independent pcap-based protocol analysis. Byte-level confirmation of the SysEx header F0 21 25 7F 47 50 2D 64 12 ... F7, plus an alternative preset-change path via address 00 02 00.

3.2 Transport layer

Python 3.13, python-rtmidi + mido. Minimal wrapper with context-managed port open/close, SysEx header prepending, CC/PC helpers with MIDI range checks, input capture. No pedal-specific semantics — keeps wire-level diffs provable against captures of the official desktop app.
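As a rough illustration of what header prepending and MIDI range checks involve, the sketch below builds raw byte strings with the standard library only; it does not open a port or use mido. The header bytes are the ones quoted in §3.1. Treating the trailing 0x12 as part of the fixed header is an assumption, and the function names are hypothetical.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7
# Bytes following 0xF0, as quoted in section 3.1. Assumption: 0x12 belongs to
# the fixed header rather than being a per-message command byte.
GP100_HEADER = bytes([0x21, 0x25, 0x7F, 0x47, 0x50, 0x2D, 0x64, 0x12])


def _check_7bit(name: str, value: int) -> None:
    """MIDI data bytes must fit in 7 bits."""
    if not 0 <= value <= 0x7F:
        raise ValueError(f"{name}={value} is outside 0..127")


def frame_sysex(payload: bytes) -> bytes:
    """Wrap a payload as F0 <header> <payload> F7."""
    for i, byte in enumerate(payload):
        _check_7bit(f"payload[{i}]", byte)
    return bytes([SYSEX_START]) + GP100_HEADER + bytes(payload) + bytes([SYSEX_END])


def control_change(channel: int, control: int, value: int) -> bytes:
    """Build a 3-byte Control Change message (channel is 0-based)."""
    if not 0 <= channel <= 15:
        raise ValueError(f"channel={channel} is outside 0..15")
    _check_7bit("control", control)
    _check_7bit("value", value)
    return bytes([0xB0 | channel, control, value])


def program_change(channel: int, program: int) -> bytes:
    """Build a 2-byte Program Change message (channel is 0-based)."""
    if not 0 <= channel <= 15:
        raise ValueError(f"channel={channel} is outside 0..15")
    _check_7bit("program", program)
    return bytes([0xC0 | channel, program])
```

Keeping these helpers free of pedal semantics means their output can be compared byte for byte against captured traffic.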

3.3 Device layer

A GP100 class with typed helpers (load_preset(bank, slot), set_block_enabled, set_block_type, set_param_raw, set_param(value_0_10), set_patch_volume, set_bpm, …). Each public method validates inputs against the XML-derived EFFECTS table before anything hits MIDI, because the pedal's firmware has been observed to crash on out-of-range values (see §5).

3.4 Patch specification

A Pydantic PatchSpec — strict schema per block (enabled, type, params) with validators that:

Reject effect-model names not in the legal table for that block.
Reject params longer than the block's Type-1 knob count.
Reject float values outside 0.0–10.0.
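The three rules above can be sketched with a standard-library dataclass standing in for the Pydantic model, so the example runs without extra dependencies. The `LEGAL_EFFECTS` excerpt is a placeholder: the effect names and knob counts are illustrative, not values taken from algorithm.xml, and `BlockSpec` is a hypothetical name.

```python
from dataclasses import dataclass, field

# Placeholder excerpt: block -> {effect name: Type-1 knob count}.
# Illustrative values only; the real table is derived from algorithm.xml.
LEGAL_EFFECTS = {
    "AMP": {"Tweedy": 3},
    "RVB": {"Hall": 3},
}


@dataclass
class BlockSpec:
    block: str
    type: str
    enabled: bool = True
    params: list[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        legal = LEGAL_EFFECTS.get(self.block)
        if legal is None:
            raise ValueError(f"unknown block {self.block!r}")
        # Rule 1: the effect name must be legal for this block.
        if self.type not in legal:
            raise ValueError(f"{self.type!r} is not a legal {self.block} type")
        # Rule 2: no more params than the effect has Type-1 knobs.
        if len(self.params) > legal[self.type]:
            raise ValueError(
                f"{self.type!r} has {legal[self.type]} knobs, got {len(self.params)} params"
            )
        # Rule 3: every value must sit on the 0.0-10.0 scale.
        for i, value in enumerate(self.params):
            if not 0.0 <= value <= 10.0:
                raise ValueError(f"params[{i}]={value} is outside 0.0..10.0")
```

A spec such as `BlockSpec("AMP", "Tweedy", params=[7.0, 1.5])` passes, while an unknown type, a fourth param, or a value of 11.0 raises before anything reaches the device layer.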

An apply_patch(gp, spec) function dispatches blocks in signal-chain order. The write formula (empirically derived on hardware) per block:

CC-enable  →  set_block_type (3 priming SysEx + body + trailing enable)
           →  set_param per knob  →  CC-enable (re-assert)  →  final CC enable/disable

3.5 LLM integration

Two paths:

Interactive (primary, used in all experiments so far) — the assistant acts as the generator in a conversational loop. The user describes a sound in plain language; the assistant writes a PatchSpec JSON that validates against the schema; an apply script loads it and writes it to the device. Iteration cycle is seconds.
Autonomous (scaffolded, deferred) — an LLM client builds a system prompt from the live EFFECTS table and uses Anthropic's tool-use API to force JSON matching the PatchSpec schema. Requires an API key; not exercised in this session due to lack of deployment target.
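For the autonomous path, one way to derive a tool definition from the effect table is sketched below. It only builds a dictionary in the `name` / `description` / `input_schema` shape used for tool definitions and makes no API call. The `EFFECTS_BY_BLOCK` excerpt, the tool name `emit_patch`, and the function name are all hypothetical.

```python
import json

# Placeholder excerpt: block -> legal effect names. Illustrative values only.
EFFECTS_BY_BLOCK = {
    "AMP": ["Tweedy"],
    "RVB": ["Hall"],
}


def build_patch_tool(effects_by_block: dict[str, list[str]]) -> dict:
    """Build a JSON-Schema tool definition that only admits whitelisted effect names."""
    block_schemas = {}
    for block, names in effects_by_block.items():
        block_schemas[block] = {
            "type": "object",
            "properties": {
                "enabled": {"type": "boolean"},
                # The enum is the whitelist: unlisted names fail schema validation.
                "type": {"type": "string", "enum": sorted(names)},
                "params": {
                    "type": "array",
                    "items": {"type": "number", "minimum": 0, "maximum": 10},
                },
            },
            "required": ["enabled", "type"],
            "additionalProperties": False,
        }
    return {
        "name": "emit_patch",
        "description": "Emit one patch as a per-block specification.",
        "input_schema": {
            "type": "object",
            "properties": block_schemas,
            "additionalProperties": False,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_patch_tool(EFFECTS_BY_BLOCK), indent=2))
```

Because the schema is generated from the table, a firmware-specific table change would propagate to the prompt without hand edits. The model's output would still need to pass the PatchSpec validators, since a JSON schema cannot express per-effect knob counts this simply.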

4. Findings

4.1 The LLM can do this without training

20+ patches were produced and validated on hardware across jazz fusion, pop-punk, sci-fi, horror, animal sounds, pastiche (Takanaka / Blink-182 / Sababa 5), and pure sound-design (UFO, cathedral organ, acid 303). Zero dataset. No fine-tuning. The model's priors on music, artists, and tone history map onto the whitelist of effect types and knob ranges well enough to produce musically coherent results on the first or second try.

4.2 Real-preset mining teaches tone-design that the LLM was getting wrong

Parsing one real preset (a Frusciante rock tone from a .prst file) immediately changed the LLM's defaults. Before: knobs clustered around 50, producing the "factory-default sound". After: asymmetric values (a Gain=15 + Master=99 pattern, for example) producing actual character. This is a calibration problem that the LLM solves on its own once it sees one authoritative example; no retraining is needed.

Six patterns were extracted from the Frusciante preset and added to the project's reference library:

Asymmetric knob values are the norm; everything-at-50 sounds like factory default.
Effects are often defined-but-disabled in real presets (player toggles live).
Time-based effect Mix values tend to be LOW (reverb Mix 19, delay Mix 30).
Special-range knobs use real units in the file: delay Time in ms, MOD Rate in Hz, EQ bands in dB, pitch in semitones.
Effect names sometimes carry trailing whitespace ("COMP ").
Firmware drift is visible in effect-model inventory (CAB Marshall4x12 exists in v1.8 but not v1.5).

The implication for the broader claim: a single authoritative reference often moves the LLM further than any volume of prompt engineering. Reference-shifting beats prompt-shifting in this domain.

4.3 LUT[n] ≈ display value n (linear 1:1), within the safe ceiling

The initial assumption — that the navytuner editor's 100-entry PARAM_VALUE LUT was non-monotonic — was wrong. Empirical test (LUT[70] on AMP Tweedy Volume → pedal display shows 70) confirmed direct mapping. The safe ceiling is SAFE_LUT_MAX = 90 (avoiding the known top-of-LUT crash; LUT[1] is also skipped as a likely navytuner copy-paste bug).
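A minimal sketch of how a 0.0–10.0 knob value might be turned into a safe LUT index under this finding follows. The scaling (value × 10), the clamp to `SAFE_LUT_MAX`, and the choice to nudge index 1 up to 2 are assumptions made for illustration; the source states only the 1:1 display mapping, the ceiling of 90, and that LUT[1] is skipped. The function name is hypothetical.

```python
SAFE_LUT_MAX = 90   # ceiling stated in the text
SKIPPED_INDEX = 1   # LUT[1] is skipped per the text
SKIP_REPLACEMENT = 2  # assumption: nudge up to the next index


def lut_index_for(value_0_10: float) -> int:
    """Map a 0.0-10.0 knob value to a LUT index that stays within the safe range."""
    if not 0.0 <= value_0_10 <= 10.0:
        raise ValueError(f"value {value_0_10} is outside 0.0..10.0")
    # Assumption: the 0-10 scale maps linearly onto display values 0-100.
    index = round(value_0_10 * 10)
    # Everything above the safe ceiling is clamped, so 9.1-10.0 all land on 90.
    index = min(index, SAFE_LUT_MAX)
    if index == SKIPPED_INDEX:
        index = SKIP_REPLACEMENT
    return index
```

Under these assumptions a value of 7.0 yields index 70, matching the AMP Tweedy Volume test described above.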

4.4 Signal-chain write order matters in a non-obvious way

set_block_type's trailing-enable SysEx does not flip the block's CC-level enable state. Follow-up param writes on a CC-off block are accepted but silent — the block stores values but produces no sound until a CC-enable hits. The working order is:

CC-enable first (ensures audibility)
  →  set_block_type (3 priming region-unlock SysEx + body + trailing enable)
  →  set_param per knob
  →  CC-enable (re-assert, belt-and-braces)

This was one of the longest debug loops in the three-day cycle — multi-block patches silently produced near-dry sound until the order was inverted.
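The working order can be made concrete with a stub device that records calls instead of sending MIDI. The method names follow the device layer in §3.3, but their signatures, the `RecordingDevice` class, and `write_block` are assumptions for illustration. The sketch also includes the final enable/disable step from §3.4 and skips CAB params as described in §5.

```python
class RecordingDevice:
    """Stand-in for the GP100 class: records calls, sends nothing."""

    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def set_block_enabled(self, block: str, enabled: bool) -> None:
        self.calls.append(("cc_enable", block, enabled))

    def set_block_type(self, block: str, effect: str) -> None:
        self.calls.append(("set_block_type", block, effect))

    def set_param(self, block: str, knob_idx: int, value: float) -> None:
        self.calls.append(("set_param", block, knob_idx, value))


def write_block(dev, block: str, effect: str, params: list[float], enabled: bool = True) -> None:
    """Write one block in the order that was found to be audible on hardware."""
    dev.set_block_enabled(block, True)       # 1. CC-enable first, or writes stay silent
    dev.set_block_type(block, effect)        # 2. type change (priming + body + trailing enable)
    if block != "CAB":                       # CAB params use an unknown format and are skipped
        for knob_idx, value in enumerate(params):
            dev.set_param(block, knob_idx, value)  # 3. one write per knob
    dev.set_block_enabled(block, True)       # 4. re-assert enable
    dev.set_block_enabled(block, enabled)    # 5. final state requested by the patch


if __name__ == "__main__":
    dev = RecordingDevice()
    write_block(dev, "AMP", "Tweedy", [7.0, 3.0])
    for call in dev.calls:
        print(call)
```

A recording stub of this kind also makes the order testable without the pedal attached.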

4.5 Physical limits of a guitar multi-FX for synthesis

Negative results, kept visible in the patch library for calibration honesty:

sax — a guitar multi-FX cannot produce convincing wind-instrument tones. Pluck attack is inherent to the input signal; we can shape harmonics and pitch but cannot synthesize a reed/breath envelope. Documented as an explicit failure mode after multiple attempts.
russian_accordion — similar limitation. Reed-based continuous-tone instruments are out of reach without polyphonic pitch synthesis.
Microtonal / quarter-tone tones — impossible on a standard-fret guitar without hardware modification. The tone profile can be approximated but the pitch content cannot.

5. Crash taxonomy — three distinct out-of-range error classes, each reproducible

The pedal's firmware was observed to crash on three distinct classes of out-of-range write, each with a different error and a different mitigation. Empirically mapped by bisection on the physical unit:

| Error | Cause | Mitigation |
|---|---|---|
| algorprefxconstdata at various lines | Writing to a knob idx that doesn't exist for the active effect type, or a param value outside the firmware's acceptable range for that specific knob | Validate knob count + range against the EFFECTS table before sending |
| algorcabsonstdata line 398 | CAB param writes use a different SysEx format than every other block. The standard knob-write payload is rejected. | apply_patch silently skips CAB params. Type + enable are honoured. |
| audio.c line 1804 getparamaxval | Writing an LUT-encoded value to a bipolar/fractional knob (MOD Rate, delay Time in ms, EQ bands). The value decodes to an internal integer > the knob's max. | Raise NotImplementedError for non-0-99 knobs. The LUT is calibrated for continuous knobs with Dmax=99 only. |

Decision tree implemented by the validators:

flowchart TD
    Start[About to write to the pedal] --> Q1{Is the block CAB?}
    Q1 -->|yes & writing params| CAB_CRASH[🔥 algorcabsonstdata<br/>line 398]
    Q1 -->|yes & type/enable only| Q2
    Q1 -->|no| Q2
    Q2{Knob Type 1<br/>with Dmax 99 or 100?}
    Q2 -->|no: bipolar / Hz / ms / semitones| RATE_CRASH[🔥 audio.c line 1804<br/>getparamaxval]
    Q2 -->|yes| Q3
    Q3{Knob idx valid<br/>for this effect type?<br/>LUT index &le; 90?}
    Q3 -->|no| PREFX_CRASH[🔥 algorprefxconstdata<br/>line 226+]
    Q3 -->|yes| OK[✅ safe write]

A separately useful side-effect of the crash taxonomy: it gives the LLM a structured surface on which to fail safely. The model is told the schema; the schema rejects invalid writes; the pedal is never asked to crash. This is the architectural reason the LLM can be trusted in the loop — its mistakes are caught at the validator layer, not at the hardware layer.
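The decision tree above can be read as a pure classification function over a parameter write. The sketch below is one possible rendering, not the project's validator code: the `Knob` fields, the return labels, and the function name are hypothetical, and an unknown knob idx is classified before the Type-1 check because there is no knob metadata to inspect in that case.

```python
from dataclasses import dataclass

SAFE_LUT_MAX = 90


@dataclass(frozen=True)
class Knob:
    idx: int
    knob_type: int  # 1 = continuous knob, 2 = combox, 3 = switch
    dmax: int


def classify_param_write(block: str, knobs: list[Knob], knob_idx: int, lut_index: int) -> str:
    """Return which crash class a parameter write would hit, or 'safe'."""
    # Q1: CAB params use a different SysEx format; the standard payload is rejected.
    if block == "CAB":
        return "cab_crash"
    knob = next((k for k in knobs if k.idx == knob_idx), None)
    # Part of Q3, checked early: a knob idx that does not exist for this effect type.
    if knob is None:
        return "prefx_crash"
    # Q2: only Type-1 knobs with Dmax 99 or 100 are covered by the LUT.
    if knob.knob_type != 1 or knob.dmax not in (99, 100):
        return "rate_crash"
    # Q3: stay at or below the safe LUT ceiling.
    if not 0 <= lut_index <= SAFE_LUT_MAX:
        return "prefx_crash"
    return "safe"
```

A validator would raise on anything other than `"safe"`, so a model-generated patch that falls into one of the three classes is rejected before any bytes are sent.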

6. Open gaps — all require MIDI-sniffing the official desktop app

| Gap | Why | What would unlock |
|---|---|---|
| CAB knob encoding | Different SysEx command / address than standard | Programmable CAB volume per patch |
| Type 2 Combox / Type 3 Switch writes | LUT payload is continuous-knob only | Control mode comboxes (AC Sim mode, Flex OD mode) and switches (Bright, Sync, Trail) from a patch |
| Bipolar and special-range knob encoding | Internal representation unknown (EQ ±50 dB, MOD Rate 0.1–10 Hz, delay Time 20–4000 ms, pitch 0–24 semitones) | Programmable tempo-synced effects, EQ shaping, precise delay times, pitch intervals |
| Preset state read-back | No RQ1 (Roland-style 0x11) observed | Dump current state, verify patches, A/B against known-good |
| "Save preset" SysEx | Not attempted (risk of corrupting flash with a bad guess) | Persist generated patches to named user slots |

7. Patch library as artefact

Twenty-plus patches were generated and validated live on hardware. A snapshot of the library by category:

Musical tones: ufo, takanaka, takanaka_brazilian_skies, takanaka_tengo_suerte, jazz_fusion, jazz_solo, tele_solo, meow_fusion, shoegaze_wall, cathedral_organ, bells, sababa5_nasnusa, hawaii, blink_182, angine_de_poitrine, brazilian, brazilian_funk, synth_lead (5 variants covering analog pad, digital FM dream, Moog sub-lead, arp sequencer, tape-saturated dream), acid_303.
Character / sound-effect: soviet_scifi, laser_gun, theremin, music_box, lightsaber, female_scream, goblin, robot_guitar, frog, dubstep_wobble, bad_trip, ecstasy, telephone, cowbell, glass_bottle (air-resonance variant), sinister_score (a happy accident that started as "toilet flush"), ufo_abduction, balalaika, taina_3_planet, soviet_wave.
Reference library: frusciante_from_prst (parsed real .prst from a firmware v1.8 desktop-app export).

Each patch carries a notes field documenting design intent (effect-chain rationale, play technique, reference artist / track). The notes field doubles as a dataset for the next iteration of the LLM layer's system prompt.

8. Process observations

Interactive iteration beats autonomous generation for creative work. The assistant regenerating a patch in response to "more reverb", "heavier", "not quite right", "like Takanaka's Brazilian Skies album but brighter" is faster and more musical than a single-shot natural-language-to-patch call. The back-and-forth is the point. This is consistent with the broader observation that creative LLM applications work best as conversation, not RPC.
Reference-based correction is high-leverage. One real preset reshaped the assistant's subsequent defaults more than any prompt-engineering effort. More references would presumably compound, but they are gated on source availability (the 99 factory presets are not freely downloadable; they live in device flash).
Play technique is part of the patch. Many patches only land when the player knows to hit staccato palm-muted, or do slow volume swells, or bend a quarter-tone. The notes field carries this; a future version would formalise a "play hints" field that the LLM emits alongside the patch.

9. Conclusion

The architecture described here is a small instance of a more general pattern: any hardware with a documentable parameter space and a human listening to outputs can be addressed by an LLM, given the work of building a typed, validated bridge between the model's natural-language priors and the device's parameter table. No training, no dataset, no learned mapping is needed. The work is in the schema and the safety net.

The most interesting empirical result is not that the LLM can generate patches, but that one real preset reshapes the model's defaults more than dozens of prompt-engineering attempts. This suggests that for similar reverse-engineering-meets-LLM domains, investment in ground-truth reference acquisition compounds faster than investment in prompt design. We expect this pattern to hold for adjacent hardware classes (synthesisers, drum machines, Eurorack modular systems, even DAW plugin parameter spaces) where parameter tables are similar in shape.

The open gaps are all of the form "more SysEx reverse-engineering would unlock more of the parameter space." They are tractable but require continued protocol-sniffing work against the official desktop application.

Citation

Nix, A. (2026). Agentic Generation of Music-Processor Templates: LLM-Driven Patch Design for a Hardware Guitar Multi-FX, Without Training. Working paper.