5.3 Assisted Development (Program Synthesis and the Future of Software Development)
Assisted Development
Program synthesis and the future of software development
by Hobson Lane (mesa_python.gitlab.io)
shoutout to Maria Dyshel, Vishvesh Bhat
Agenda
- Assisted development
- Neurosymbolic reasoning
- Program synthesis example
- Shy fun
Assisted development
- Copilot+VSCode|JetBrains|Neovim
- Cursor.sh+VSCode|Neovim|JetBrains - $20/mo - just bought Supermaven CLI
- Windsurf+Codium (VSCode FOSS fork) - “agentic IDE”
- Zed - FOSS, written in Rust by the maker of GitHub’s Atom - focus on speed and eventually AI - 3 years old, almost at feature parity with Sublime and VSCode
- Aider+terminal
- Cline+VSCode|JetBrains
- shy-sh - FOSS Python CLI developer assistant
- Gemini, Phind, Perplexity (Type 2 Assisted Development)
Program synthesis
- Deduction (or compiling) - high-level languages (Logic, Haskell, Python)
- DSL - including declarative languages (f-strings, Jinja2, Ansible/Puppet/Terraform YAML)
- Induction (inference) - ML from a set of input/output examples, a verifier, or reverse engineering an existing system, like George Hotz (tiny corp) jailbreaking an iPhone
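The DSL/templating route above can be sketched with the standard library alone: a tiny declarative spec is expanded through a template into Python source, which is then compiled. The spec format and the names `SPEC`, `TEMPLATE`, and `synthesize` are hypothetical, chosen only for illustration.

```python
import ast
from string import Template

# Hypothetical declarative spec: function name -> arithmetic expression in x.
SPEC = {"double": "x * 2", "square": "x * x"}

# The "program generator" is just a source-code template.
TEMPLATE = Template("def $name(x):\n    return $expr\n")


def synthesize(spec):
    """Expand the template for each spec entry and compile the result."""
    source = "".join(
        TEMPLATE.substitute(name=name, expr=expr) for name, expr in spec.items()
    )
    ast.parse(source)  # fail early if the generated text is not valid Python
    namespace = {}
    exec(source, namespace)  # acceptable here only because SPEC is trusted
    return source, {name: namespace[name] for name in spec}


source, functions = synthesize(SPEC)
print(source)
print(functions["double"](21), functions["square"](5))
```

The same pattern scales up to Jinja2 or Ansible-style YAML: the declarative input says what is wanted, and a deterministic expander decides how to write it.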
Examples of deduction
- Python 2to3
- regular expressions: r'(for|while|if|else|with|try|except|def)(.*):'
- Juan’s HDMI screen capture OCR Python->Rust
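A minimal sketch of the regular expression above applied to Python source, one line at a time. The word boundary `\b` and the line anchors are additions not on the slide (without `\b`, a line like `format(x):` would match as `for`), and `block_headers` is a hypothetical helper name.

```python
import re

# Slide regex plus anchors and a word boundary (both are assumptions added here).
BLOCK_HEADER = re.compile(
    r"^\s*(for|while|if|else|with|try|except|def)\b(.*):\s*$"
)


def block_headers(source):
    """Return (keyword, remainder) for every line that opens a block."""
    found = []
    for line in source.splitlines():
        match = BLOCK_HEADER.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


SAMPLE = """\
def total(values):
    result = 0
    for v in values:
        if v > 0:
            result += v
        else:
            result -= v
    return result
"""

for keyword, remainder in block_headers(SAMPLE):
    print(keyword, "|", remainder)
```

This is deliberately shallow: it ignores `elif`, `class`, `finally`, trailing comments, and multi-line headers, which is where a real parser (or `2to3`-style tooling) takes over from a regex.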
Next token prediction
Assisted Development ^1 ^2
- Autocomplete (Tabnine, Sublime, Augment Code)
- Search (Phind.com, Perplexity.ai)
- Conversation (ChatGPT, shy-sh)
- Neurosymbolic reasoning - Shoulder-surfing AI (Anthropic “computer use”)
Execute a command in Bash?
cmd="ls -l"
eval "$cmd"
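For contrast with `eval`, here is a hedged sketch of running a suggested command from Python without a shell. It assumes a POSIX system where `echo` exists as an executable; the allowlist `ALLOWED` and the function `run_suggested` are hypothetical names, not part of shy-sh or any other tool mentioned here.

```python
import shlex
import subprocess

# Hypothetical allowlist of programs an assistant may run unattended.
ALLOWED = {"ls", "echo", "pwd"}


def run_suggested(command, allowed=ALLOWED):
    """Split the command into argv and run it without invoking a shell."""
    argv = shlex.split(command)
    if not argv or argv[0] not in allowed:
        raise PermissionError(f"refusing to run: {command!r}")
    # No shell=True: metacharacters such as ; | $() are passed as plain text.
    return subprocess.run(argv, capture_output=True, text=True, check=False)


result = run_suggested("echo hello")
print(result.stdout.strip())

try:
    run_suggested("rm -rf /tmp/example")
except PermissionError as error:
    print(error)
```

Because the string is never handed to a shell, a suggestion like `echo hi; rm -rf x` ends up as literal arguments to `echo` instead of two commands. An allowlist is still a blunt tool; a human confirmation step is the usual complement.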
David Crawshaw
“Only 5% of developers use conversational AI to code” ^1 ^2
(Type 3 Assisted Development)
Assisted Misalignment
Your favorite LLM was poisoned (fine-tuned) on misaligned and insecure code. ^3
LLM:
eval $(echo $command_from_some_guy_on_the_internet)
StackOverflow:
Links
TBD proai.org shortURLs and mesa_python.gitlab.io blog post
neurosymbolic reasoning
- examples from the ARC challenge
- DSL graph search diagram (exponential complexity)
- if you search on tokens with beam search, it’s just depth-first stochastic search of the wrong graph
- neurosymbolic reasoning searches the graph of valid programs - like AlphaFold, AlphaGo, or AlphaCode
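Searching the graph of valid programs, as opposed to the graph of tokens, can be sketched with a toy DSL. Everything here is an assumption made for illustration: the four primitives, the tuple-of-names program representation, and the input/output examples. Every node visited is a runnable program, and the examples act as the verifier.

```python
from itertools import product

# Hypothetical DSL: each primitive maps an int to an int.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}


def run(program, x):
    """Apply the primitives named in program, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_length=4):
    """Return the first shortest program consistent with every example."""
    for length in range(1, max_length + 1):
        # len(PRIMITIVES) ** length candidates: the exponential blow-up.
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


examples = [(1, 4), (2, 9), (3, 16)]  # consistent with (x + 1) ** 2
program = synthesize(examples)
print(program)
```

A learned model fits into this picture as the heuristic that orders or prunes candidates, so the search stays inside the space of valid programs while visiting far fewer of them.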
- neurosymbolic reasoning
- [-] optional slide comparing computer use, Puppet and Ansible agents vs “Operator” ($200/mo) https://git.lolcat.ca/lolcat/4get
- slide listing 3 uses of LLMs and shout-out to the Tailscale co-founder's blog post
- 1. slide on code completion in Sublime
- 2. slide on Phind and Perplexity, compared to MetaGer and https://git.lolcat.ca/lolcat/4get
- 3. slide on cognitive load of translating concept to conversation
- snake game reinforcement learning demo example with Claude
- Gemini vs Claude vs Augment Code
- slide on Zed
- slide on regex to match Python keywords, and a compiler optimization vs ML n-gram approach
Backup (notes)
- Deduction, compilers, or transpilers: Fortran, Haskell, Python decorators & factories, high-level languages, Wikipedia max example
- Declarative & domain-specific languages: f-strings, Jinja2, Django templates, and Terraform/Ansible/Puppet YAML; declarative languages generating functional programs
- Induction or inference - ML from a set of input/output examples, a verifier, or reverse engineering an existing system, like George Hotz (tiny corp) jailbreaking an iPhone
4. Neurosymbolic reasoning agents - Shoulder-surfing AI ("computer use")
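The max example mentioned in the deduction note can be sketched as a logical specification plus a verifier. This does not derive the program from the specification, as a deductive synthesizer would; it only checks a hand-written candidate against the spec over a small domain, then builds larger functions on top of it. All function names are hypothetical.

```python
from functools import reduce
from itertools import product


def satisfies_max_spec(x, y, m):
    """Specification: m bounds both inputs and equals one of them."""
    return m >= x and m >= y and (m == x or m == y)


def max2(x, y):
    """Candidate program to be checked against the specification."""
    return x if x >= y else y


def sort_pair(x, y):
    """Ascending pair, built only from max2."""
    return (y, x) if max2(x, y) == x else (x, y)


def max_of_list(values):
    """Maximum of a non-empty list, folding max2 over it."""
    return reduce(max2, values)


def verify(domain=range(-3, 4)):
    """Exhaustively check max2 against the spec on a small domain."""
    return all(
        satisfies_max_spec(x, y, max2(x, y))
        for x, y in product(domain, repeat=2)
    )


print(verify(), sort_pair(3, 1), max_of_list([2, 7, 5]))
```

Exhaustive checking only works because the domain is tiny; for larger domains the verifier would be a proof, a property-based test, or a simulator.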
code synthesis slides
- Math function synthesis from examples or running of a verifier (each example is expensive to verify, using an optimizer or simulator).
- Simulation verifier would work for game play or puzzle/maze solving program generation. Generating a program that can solve any maze, input=maze, output=path
- Program synthesis, [automatic] code generation, compiler, templating engine, high level [templating] programming language, UML, low code platforms
- Deriving a regex from example matches
- Deriving a dialog engine or chatbot from example conversations
- Math function example: max() from a logical specification; build on it to create sort-pair from max, then max of a list, then merging 1 list with 1 value (binary search), then merge sort, with a verifier for both. Functional programming
- Machine learning, generalization, regularization, Occam's razor
- A*, depth-first search, breadth-first search, beam search, but these need a continuous heuristic or metric to say how close you are to the optimum program. LLMs are really good at generating almost-correct programs, so they are great for depth-first search.
- If you can break the problem into smaller functions, you can do intermediate-depth search for an exponential speedup
- Current research is focused on improving the accuracy rate on complete programs, which will never succeed because the program space is far larger than the dataset, which does not span the possible programs
- Neural network architectures and weights (DL training) are the ultimate in program synthesis, which could be very modular by focusing on layers that are similar
- Genetic algorithms for program synthesis
- Toy problem: regex to recognize valid PEP 8 compliant Python…
- keyword
- int
- float
- variable/function name
- raw string literal
- string literal
- dict, tuple, list, argument, type hints, function signature
- for loop, list comprehension, generator
- A regex is a grammar and a grammar can be generated/enumerated.
- A big refinement would be to constrain the LLM token generator with the Python grammar and vocabulary so that it can never generate a syntax error or non-PEP 8 code.
- Other heuristics could be encoded in the grammar, like cyclomatic complexity, variable name vocabulary, length limitations, abbreviation control, and Hungarian notation, or NL grammar rules (active verbs for functions, is for bool return values, to for transformations, no pattern not already available in the standard Python library of builtins, no list comprehensions, just accumulators, all functions with type hints and args on one line).
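The point that a grammar can be enumerated can be sketched with a toy grammar for assignment statements. The grammar below is a hypothetical fragment, nowhere near the real Python grammar; the sketch only shows that every string generated from a grammar parses, which is the property a grammar-constrained token generator would rely on.

```python
import ast

# Hypothetical toy grammar: nonterminal -> list of productions.
# Any symbol that is not a key is treated as a terminal string.
GRAMMAR = {
    "stmt": [["name", " = ", "expr"]],
    "expr": [["atom"], ["atom", " + ", "atom"]],
    "atom": [["name"], ["int"]],
    "name": [["x"], ["total"]],
    "int": [["0"], ["42"]],
}


def expand(symbols):
    """Yield every terminal string derivable from a list of symbols."""
    if not symbols:
        yield ""
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Nonterminal: try each production in place of the head symbol.
        for production in GRAMMAR[head]:
            yield from expand(production + rest)
    else:
        # Terminal: emit it, then expand whatever follows.
        for tail in expand(rest):
            yield head + tail


programs = list(expand(["stmt"]))
for source in programs:
    ast.parse(source)  # raises SyntaxError if the grammar leaks invalid code

print(len(programs), programs[0], programs[-1])
```

This grammar is non-recursive, so the enumeration terminates on its own. A recursive grammar would need a depth bound, and style rules such as naming conventions would become extra constraints on the `name` productions.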
Footnotes
1. proai.org/3-types - How I Program with LLMs by David Crawshaw (Tailscale co-founder)
2. changelog.com/podcast/629 - Jerod interviews David Crawshaw (Tailscale)
3. proai.org/assisted-misalignment - Mastodon toot by @Techmeme
