Feb 27, 2025

5.3 Assisted Development (Program Synthesis and the Future of Software Development)

Assisted Development

Program synthesis and the future of software development

by Hobson Lane (mesa_python.gitlab.io)

shoutout to Maria Dyshel, Vishvesh Bhat

Agenda

Assisted development
Neurosymbolic reasoning
Program synthesis example
Shy fun

Assisted development

Copilot+VSCode|JetBrains|Neovim
Cursor.sh+VSCode|Neovim|JetBrains - $20/mo - just bought Supermaven CLI
Windsurf+Codium (VSCode FOSS fork) - “agentic IDE”
Zed - FOSS Rust editor by the makers of GitHub’s Atom - focus on speed and eventually AI - 3 years old, almost at feature parity with Sublime and VSCode
Aider+terminal
Cline+VSCode|JetBrains
shy-sh - FOSS Python CLI developer assistant
Gemini, Phind, Perplexity (Type 2 Assisted Development)

Program synthesis

Deduction (or Compiling) - High-level languages (Logic, Haskell, Python)
DSL - including declarative languages (f-strings, Jinja2, Ansible/Puppet/Terraform YAML)
Induction (inference) - ML from a set of input/output examples, a verifier, or reverse engineering an existing system, like George Hotz (tiny corp) jailbreaking an iPhone
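A minimal sketch of induction from input/output examples, assuming a made-up toy DSL of four unary integer functions (`inc`, `double`, `square`, `neg` are illustrative names, not from any library). It enumerates compositions, shortest first, and returns the first program consistent with every example.

```python
from itertools import product

# Hypothetical toy DSL: unary integer functions that can be chained.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply each primitive in the program to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_len=3):
    """Return the shortest program that reproduces every (input, output) pair."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None  # nothing in the DSL fits within max_len steps


examples = [(1, 4), (2, 6), (3, 8)]  # consistent with f(x) = 2 * (x + 1)
print(synthesize(examples))  # ('inc', 'double')
```

The search visits up to 4 + 16 + 64 programs here; the count grows exponentially with program length, which is why the choice of DSL and search order matters.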

Examples of deduction

Python 2to3
regular expressions: r'(for|while|if|else|with|try|except|def)(.*):'
Juan’s HDMI screen capture OCR Python->Rust
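A hedged sketch of how the keyword regex above could be used to find block-opening lines in Python source. The `^\s*`, `\b`, and `\s*$` anchors and the `block_openers` helper are additions for illustration, not part of the slide's pattern.

```python
import re

# The slide's keyword alternation, anchored to a whole line (anchors are an assumption).
BLOCK_OPENER = re.compile(
    r'^\s*(for|while|if|else|with|try|except|def)\b(.*):\s*$'
)


def block_openers(source):
    """Return (keyword, rest-of-header) for each line that opens a block."""
    found = []
    for line in source.splitlines():
        match = BLOCK_OPENER.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


SOURCE = """\
def absolute(x):
    if x > 0:
        return x
    else:
        return -x
"""
print(block_openers(SOURCE))
# [('def', 'absolute(x)'), ('if', 'x > 0'), ('else', '')]
```

A regex like this is a hand-written rule, so it misses `elif`, `class`, `finally`, multi-line headers, and trailing comments unless those are added explicitly.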

Next token prediction

Assisted Development ¹ ²

Autocomplete (Tabnine, Sublime, Augment Code)
Search (Phind.com, Perplexity.ai)
Conversation (ChatGPT, shy-sh)
Neurosymbolic reasoning - Shoulder-surfing AI (Anthropic “computer use”)

Execute a command in Bash?

cmd="ls -l"
eval "$cmd"
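For comparison, a sketch of running the same command string from Python without a shell, so nothing is `eval`-ed. The `ALLOWED` set and the `to_argv`/`run` helpers are hypothetical names for illustration; only `shlex.split` and `subprocess.run` are real standard-library calls.

```python
import shlex
import subprocess

ALLOWED = {"ls", "echo", "pwd"}  # hypothetical allowlist of executables


def to_argv(cmd):
    """Split a command string into argv and reject executables not on the allowlist."""
    argv = shlex.split(cmd)
    if not argv or argv[0] not in ALLOWED:
        raise ValueError(f"refusing to run: {cmd!r}")
    return argv


def run(cmd):
    """Run cmd without a shell, so ';', '|' and '$(...)' stay literal arguments."""
    result = subprocess.run(to_argv(cmd), capture_output=True, text=True, check=False)
    return result.stdout


print(to_argv("ls -l"))              # ['ls', '-l']
print(to_argv("echo hi; rm -rf /"))  # ['echo', 'hi;', 'rm', '-rf', '/']
```

In the second example `rm` never runs: without a shell, everything after `echo` is passed to `echo` as plain text.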

David Crawshaw

“Only 5% of developers use conversational AI to code” ¹ ²

(Type 3 Assisted Development)

Assisted Misalignment

Your favorite LLM was poisoned (fine-tuned) on misaligned and insecure code.³

LLM:

eval $(echo $command_from_some_guy_on_the_internet)

StackOverflow:

Links

TBD proai.org shortURLs and mesa_python.gitlab.io blog post

neurosymbolic reasoning

Examples from the ARC challenge
DSL graph search diagram (exponential complexity)
If you search on tokens with beam search, it’s just a depth-first stochastic search of the wrong graph
Neurosymbolic reasoning searches the graph of valid programs - like AlphaFold, AlphaGo, or AlphaCode
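A small illustration of the "wrong graph" point, assuming a made-up six-token vocabulary. It enumerates every token sequence of a given length and uses Python's own parser to count how many are valid expressions.

```python
import ast
from itertools import product

VOCAB = ["x", "y", "+", "*", "(", ")"]  # hypothetical toy token vocabulary


def is_valid_expression(tokens):
    """True if the space-joined tokens parse as a Python expression."""
    try:
        ast.parse(" ".join(tokens), mode="eval")
        return True
    except SyntaxError:
        return False


def count_valid(length):
    """Return (valid, total) over all token sequences of the given length."""
    sequences = list(product(VOCAB, repeat=length))
    valid = [s for s in sequences if is_valid_expression(s)]
    return len(valid), len(sequences)


for n in range(1, 6):
    valid, total = count_valid(n)
    print(f"length {n}: {valid} valid of {total} token sequences")
```

Most nodes in the token graph are not programs at all, so a search that only expands syntactically valid nodes explores a much smaller graph. That graph still grows exponentially with program length.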

neurosymbolic reasoning
[-] optional slide comparing computer use, Puppet and Ansible agents vs “Operator” ($200/mo) https://git.lolcat.ca/lolcat/4get
slide listing 3 uses of LLMs and shout-out to Tailscale co-founder's blog post
1. slide on code completion in Sublime
2. slide on Phind and Perplexity and compare to MetaGer and https://git.lolcat.ca/lolcat/4get
3. slide on cognitive load of translating concept to conversation
snake game reinforcement learning demo example with Claude
Gemini vs Claude vs Augment Code
slide on Zed

slide on regex to match Python keywords and a compiler optimization vs ML n-gram approach

Backup (notes)

Deduction (compilers or transpilers): Fortran, Haskell, Python decorators & factories, high-level languages, Wikipedia max example
Declarative & domain-specific languages: f-strings, Jinja2, Django templates, and Terraform/Ansible/Puppet YAML; declarative languages generating functional programs
Induction or inference - ML from a set of input/output examples, a verifier, or reverse engineering an existing system, like George Hotz (tiny corp) jailbreaking an iPhone

4. Neurosymbolic reasoning agents - Shoulder-surfing AI ("computer use")

code synthesis slides

Math function synthesis from examples or by running a verifier (each example is expensive to verify when it requires an optimizer or simulator).
A simulation verifier would work for generating game-play or puzzle/maze-solving programs, e.g. a program that can solve any maze: input=maze, output=path.
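A sketch of a maze verifier, assuming a made-up text format (`S` start, `E` exit, `#` wall). `bfs_solver` is only a hand-written stand-in for a synthesized candidate program. The verifier is the part a search loop would call on each candidate.

```python
from collections import deque

MAZE = ["S.#",
        ".##",
        "..E"]  # hypothetical example maze


def find(maze, ch):
    """Return (row, col) of the first cell containing ch."""
    for r, row in enumerate(maze):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(f"{ch!r} not in maze")


def is_open(maze, cell):
    r, c = cell
    return 0 <= r < len(maze) and 0 <= c < len(maze[r]) and maze[r][c] != "#"


def verify(maze, path):
    """A path is valid if it goes from S to E in single steps through open cells."""
    if not path or path[0] != find(maze, "S") or path[-1] != find(maze, "E"):
        return False
    if not all(is_open(maze, cell) for cell in path):
        return False
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))


def bfs_solver(maze):
    """Candidate program: breadth-first search from S to E; [] if unreachable."""
    start, goal = find(maze, "S"), find(maze, "E")
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if is_open(maze, nxt) and nxt not in previous:
                previous[nxt] = cell
                queue.append(nxt)
    return []


print(verify(MAZE, bfs_solver(MAZE)))    # True
print(verify(MAZE, [(0, 0), (2, 2)]))    # False: not a single step
```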
Program synthesis, [automatic] code generation, compilers, templating engines, high-level [templating] programming languages, UML, low-code platforms
Deriving a regex from example matches
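One naive way to derive a regex from example matches, as a sketch only: generalize each ASCII digit run to `\d+`, each ASCII letter run to `[A-Za-z]+`, keep other characters literal, and take the union over examples. The helper names are hypothetical and this is not a standard algorithm from any library.

```python
import re
import string

CLASSES = (r"\d", "[A-Za-z]")


def char_class(ch):
    """Map a character to a regex class, or to itself (escaped) if it is punctuation."""
    if ch in string.digits:
        return r"\d"
    if ch in string.ascii_letters:
        return "[A-Za-z]"
    return re.escape(ch)


def generalize(example):
    """Turn one example string into a pattern with '+' on runs of a class."""
    parts = []
    for ch in example:
        cls = char_class(ch)
        if cls in CLASSES and parts and parts[-1] == cls + "+":
            continue  # still inside the same run
        parts.append(cls + "+" if cls in CLASSES else cls)
    return "".join(parts)


def derive_regex(examples):
    """Union of the generalized patterns, anchored to the whole string."""
    alternatives = sorted({generalize(e) for e in examples})
    return "^(?:" + "|".join(alternatives) + ")$"


pattern = derive_regex(["2025-02-27", "1999-12-31"])
print(bool(re.match(pattern, "2024-01-15")))  # True
print(bool(re.match(pattern, "2024/01/15")))  # False
```

The result overgeneralizes (it also accepts `1-2-3`), which is where negative examples or a verifier would come in.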
Deriving a dialog engine or chatbot from example conversations
Math function example: max() from a logical specification; build on it to create sort-pair from max, then max of a list, then merging one list with one value (binary search), then merge sort, with a verifier for both. Functional programming.
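The build-up above can be sketched as follows. All function names are illustrative, and the two `verify_*` functions are one possible reading of "verifier for both" (max and sort).

```python
from bisect import bisect_left
from collections import Counter
from functools import reduce


def max2(a, b):
    """Spec: the result is one of a, b and is >= both."""
    return a if a >= b else b


def sort_pair(a, b):
    """Order two values using only max2."""
    return (b, a) if max2(a, b) == a else (a, b)


def max_of_list(xs):
    """Fold max2 over a non-empty list."""
    return reduce(max2, xs)


def insert(sorted_xs, value):
    """Merge one value into a sorted list, locating the slot by binary search."""
    i = bisect_left(sorted_xs, value)
    return sorted_xs[:i] + [value] + sorted_xs[i:]


def merge(left, right):
    """Merge two sorted lists by inserting each value of right into left."""
    return reduce(insert, right, left)


def merge_sort(xs):
    if len(xs) <= 1:
        return list(xs)
    if len(xs) == 2:
        return list(sort_pair(*xs))
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))


def verify_max(xs, m):
    """m is the max if it is an element and no element exceeds it."""
    return m in xs and all(m >= x for x in xs)


def verify_sort(xs, ys):
    """ys sorts xs if it is a permutation of xs in non-decreasing order."""
    return Counter(xs) == Counter(ys) and all(a <= b for a, b in zip(ys, ys[1:]))


data = [5, 2, 9, 1, 5, 6]
print(merge_sort(data))                     # [1, 2, 5, 5, 6, 9]
print(verify_sort(data, merge_sort(data)))  # True
print(verify_max(data, max_of_list(data)))  # True
```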
Machine learning, generalization, regularization, Occam's razor
A*, depth-first search, breadth-first search, beam search, but these need a continuous heuristic or metric to say how close you are to the optimum program. LLMs are really good at generating almost-correct programs, so they are great for depth-first search.
If you can break the problem into smaller functions, you can do intermediate-depth search for an exponential speedup.
Current research is focused on improving the accuracy rate on complete programs, which will never succeed because the program space is far larger than the dataset, which does not span the possible programs.
Neural network architectures and weights (DL training) are the ultimate in program synthesis, which could be very modular by focusing on layers that are similar.
Genetic algorithms for program synthesis
Toy problem: a regex to recognize valid PEP 8-compliant Python…
- keyword
- int
- float
- variable/function name
- raw string literal
- string literal
- dict, tuple, list, argument, type hints, function signature
- for loop, list comprehension, generator
A regex is a grammar and a grammar can be generated/enumerated.
A big refinement would be to constrain the LLM token generator with the Python grammar and vocabulary so that it can never generate a syntax error or non-PEP 8 code.
Other heuristics could be encoded in the grammar, like cyclomatic complexity, variable name vocabulary and length limitations, abbreviation control, and Hungarian notation, or NL grammar rules (active verbs for functions, is for bool retvals, to for transformations, no pattern not already available in the standard Python library of builtins, no list comprehensions, just accumulators, all functions with type hints and args on one line).
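A toy sketch of grammar-constrained token generation. The grammar (`expr := NAME (OP NAME)*`) is a three-state machine, and `fake_llm_scores` is a hypothetical stand-in for model logits that deliberately prefers operators. Tokens the grammar does not allow in the current state are masked before picking the best one.

```python
# Which token types may follow each grammar state (toy grammar, an assumption).
ALLOWED_NEXT = {
    "start": {"NAME"},
    "NAME": {"OP", "END"},
    "OP": {"NAME"},
}
VOCAB = {"x": "NAME", "y": "NAME", "+": "OP", "*": "OP", "<end>": "END"}


def fake_llm_scores(prefix):
    """Hypothetical logits: operators score highest, <end> rises with length."""
    return {"+": 3.0, "*": 2.5, "x": 2.0, "y": 1.0, "<end>": 0.5 + len(prefix)}


def constrained_decode(max_tokens=8):
    """Greedy decoding that only considers tokens the grammar allows next."""
    state, out = "start", []
    for _ in range(max_tokens):
        scores = fake_llm_scores(out)
        valid = {t: s for t, s in scores.items() if VOCAB[t] in ALLOWED_NEXT[state]}
        token = max(valid, key=valid.get)
        if token == "<end>":
            break
        out.append(token)
        state = VOCAB[token]
    if state == "OP":
        out.pop()  # truncated mid-expression: drop the dangling operator
    return out


print(max(fake_llm_scores([]), key=fake_llm_scores([]).get))  # '+': unconstrained pick
print(constrained_decode())                                   # ['x', '+', 'x']
```

The unconstrained greedy choice starts with an operator, which is a syntax error. The masked decoder cannot emit it. A real version would need the full Python grammar and the model's actual tokenizer, which is much harder than this state machine suggests.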

1. proai.org/3-types - How I Program with LLMs by David Crawshaw (co-founder, Tailscale)
2. changelog.com/podcast/629 - Jerod interviews David Crawshaw (Tailscale)
3. proai.org/assisted-misalignment - Mastodon toot by @Techmeme