Amazing applications of GenAI

Here is my list of amazing applications of Generative AI. It’s like a mini blog of all the cool things I’ve seen that made me think: “holy s**t”

Text to video

The latest model is Gen-3 from Runway, but it’s been a week of competition:

It arrived just 4 days after Luma Labs announced Dream Machine:

And we STILL don’t have OpenAI’s Sora model:

tldraw: sketch to running code!

The tldraw demos are amazing: it takes a rough sketch and generates working HTML, with styles and JavaScript, to implement your sketch!

Then with webcam support:

Then recently, as a dig at Apple’s AI calculator:

Whisper: speech to text

Whisper is a modern speech-to-text model that can even run in your browser: fast, cheap transcription.

Text to code

From GitHub Copilot to the futuristic IDE, we have had text to code for a while, but it’s STILL amazing to go from prompt to running code. Here is the launch tweet, but since then they have added so many amazing features. A must-try for any software engineer.

Text to application

An agent-style tool that takes a prompt and turns it into an entire application: design, code, and GitHub pull request!

Vision to pose detection

Modern vision models can do real-time gesture recognition. In the first demo, your webcam captures your hand position and you can use it to draw in the air!

And more:

Vision (screen) to context

At the GPT-4o launch, OpenAI promised a sidekick that can see your screen and answer questions with that context available. As at 18th June we’re still waiting for this to become available, but others have already launched the same idea across many models:

Here is OpenAI’s “coming soon” version:

Stanford on the GenAI Productivity Boost

Compared to a group of workers operating without the tool, those who had help from the chatbot were 14% more productive. Notably, the effect was largest for the least skilled and least experienced workers, who saw productivity gains of up to 35%.

The rapid and ongoing improvement of AI

An ex-OpenAI researcher wrote an analysis of the “stacking” improvements in AI and projected forward to say we could be seeing remarkable “AGI”-like capabilities as soon as 2027!

Interpreting how LLMs “Think”

Anthropic has some really impressive demos of tools they have built to understand and even modify how an LLM thinks:

And the famous demo where they made Claude think it was literally the Golden Gate Bridge!

