Wednesday 13 December 2023

New best story on Hacker News: Show HN: Open-source macOS AI copilot using vision and voice

Show HN: Open-source macOS AI copilot using vision and voice
424 by ralfelfving | 154 comments on Hacker News.
Heeey! I built a macOS copilot that has been useful to me, so I open sourced it in case others would find it useful too. It's pretty simple:

- Use a keyboard shortcut to take a screenshot of your active macOS window and start recording the microphone.
- Speak your question, then press the keyboard shortcut again to send your question + screenshot off to OpenAI Vision.
- The Vision response is presented in-context/overlaid on the active window, and spoken to you as audio.
- The app keeps running in the background, only taking a screenshot/listening when activated by the keyboard shortcut.

It's built with NodeJS/Electron, and uses the OpenAI Whisper, Vision and TTS APIs under the hood (BYO API key). There's a simple demo and a longer walk-through in the GH readme https://ift.tt/EzInCK6, and I also posted a different demo on Twitter: https://twitter.com/ralfelfving/status/1732044723630805212
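For readers curious how the shortcut-driven flow described above might hang together, here is a minimal sketch in TypeScript. It is not taken from the project's source: the names `CopilotState`, `onShortcut` and `buildVisionRequest` are hypothetical, the model name `gpt-4-vision-preview` and the `max_tokens` value are assumptions, and the screenshot is assumed to be a base64-encoded PNG. It only models the two-press toggle and the shape of a Chat Completions request that carries a text question plus an image; capturing the window, transcribing with Whisper and speaking the reply with TTS are left out.

```typescript
// Hypothetical sketch, not the app's actual code.
type CopilotState = "idle" | "recording";

interface ShortcutResult {
  next: CopilotState;
  action: "capture_and_record" | "stop_and_send";
}

// First press: take the screenshot and start recording the microphone.
// Second press: stop recording and send question + screenshot.
function onShortcut(state: CopilotState): ShortcutResult {
  return state === "idle"
    ? { next: "recording", action: "capture_and_record" }
    : { next: "idle", action: "stop_and_send" };
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequest {
  model: string;
  max_tokens: number;
  messages: Array<{ role: "user"; content: ContentPart[] }>;
}

// Builds a request body with the transcribed question and the screenshot
// embedded as a data URL. The default model name is an assumption.
function buildVisionRequest(
  transcript: string,
  screenshotPngBase64: string,
  model: string = "gpt-4-vision-preview"
): VisionRequest {
  return {
    model,
    max_tokens: 500, // assumed cap on the reply length
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: transcript },
          {
            type: "image_url",
            image_url: { url: `data:image/png;base64,${screenshotPngBase64}` },
          },
        ],
      },
    ],
  };
}
```

In a real app the object returned by `buildVisionRequest` would be sent to the Chat Completions endpoint with the user's own API key, and the text of the reply would then be overlaid on the window and passed to a TTS call.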