Tuesday 26 March 2024

New best story on Hacker News: Launch HN: Aqua Voice (YC W24) – Voice-driven text editor

Launch HN: Aqua Voice (YC W24) – Voice-driven text editor
374 by the_king | 127 comments on Hacker News.
Hey HN! We’re Jack and Finn from Aqua Voice ( https://withaqua.com/ ). Aqua is a voice-native document editor that combines reliable dictation and natural language commands, letting you say things like: “make this a list” or “it’s Erin with an E” or “add an inline citation here for page 86 of this book”. Here is a demo: https://youtu.be/qwSAKg1YafM .

Finn, who is big-time dyslexic, has been using dictation software since the sixth grade, when his dad set him up on Dragon Dictation. He used it through school to write papers, and has been keeping his own transcription benchmarks since college. All that time, writing with your voice has remained a cumbersome and brittle experience riddled with pain points.

Dictation software is still terrible. All the solutions basically compete on accuracy (i.e. speech recognition), but none of them deal with the fundamentally brittle nature of the text they generate. They don't try to format text correctly, and they require you to learn a bunch of specialized commands that often aren't worth the effort. They're not even close to a voice replacement for a keyboard. Even post-LLM, you are limited to a set of specific commands, and the most accurate models don't support any commands at all. Outside of these rules, the models have no sense of what is an instruction and what is content. You can't say “and format this like an email” or “make the last bullet point shorter”. Aqua solves this.

This problem is important to Finn and millions of other people who would write with their voice if they could. Initially, we didn't think of it as a startup project. It was just something we wanted for ourselves. We thought maybe we'd write a novel with it - or something. After friends started asking to use the early versions of Aqua, it occurred to us that, if we didn't build it, maybe nobody would.

Aqua Voice is a text editor that you talk to like a person. Depending on the way you say it and the context in which you're operating, Aqua decides whether to transcribe what you said verbatim, execute a command, or subtly modify what you said into what you meant to write. For example, if you were to dictate: "Gryphons have classic forms resembling shield volcanoes," Aqua would output your text verbatim. But if you stumble over your words or start a sentence over a few times, Aqua is smart enough to figure that out and only keep the last version of the sentence.

The vision is not only to provide a more natural dictation experience, but to enable, for the first time, an AI-writing experience that feels natural and collaborative. This requires moving away from using LLMs for one-off chat requests and towards something more like streaming, where you are in constant contact with the model. Voice is the natural medium for this.

Aqua is actually 6 models working together to transcribe, interpret, and rewrite the document according to your intent. Technically, executing a real-time voice application with a language model at its core requires complex coordination between multiple pieces. We use MoE transcription to outperform what was previously thought possible in terms of real-time accuracy. Then we sync up with a language model to determine what should be on the screen as quickly as possible.

The model isn't perfect, but it is ready for early adopters, and we've already been getting feedback from grateful users. For example, a historian with carpal tunnel sent us an email he wrote using Aqua and said that he is now five times as productive as he was previously. We've heard from other people with disabilities that prevent them from typing. We've also seen good adoption from people who are dyslexic or simply prefer talking to typing. It's being used for everything from emails to brainstorming to papers to legal briefings.

While there is much left to do in terms of latency and robustness, the best experiences with Aqua are beginning to feel magical. We would love for you to try it out and give us feedback, which you can do with no account on https://withaqua.com . If you find it useful, it's $10/month after a 1000-token free trial. (We want to bump up the free trial in the future, but we're a small team, and running this thing isn't cheap.) We'd love to hear your ideas and comments with voice-to-text!
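The post describes a core routing decision - verbatim dictation versus an editing command - without detailing how it is made (the authors say several models cooperate on it). Purely to make that control flow concrete, here is a toy Python sketch; the keyword patterns and the two-way split are invented stand-ins, not Aqua's actual approach, which would presumably rely on an LLM and document context rather than regexes.

```python
import re

# Hypothetical command cues; a real system would use a language model plus
# document context instead of hand-written patterns like these.
COMMAND_PATTERNS = [
    r"^make (this|that|the last)\b",
    r"^format (this|that)\b",
    r"^add (a|an)\b.*\b(citation|bullet|heading)\b",
    r"^delete\b",
]

def route_utterance(utterance: str) -> tuple[str, str]:
    """Label a transcribed utterance as a command or plain dictation."""
    text = utterance.strip().lower()
    for pattern in COMMAND_PATTERNS:
        if re.match(pattern, text):
            return ("command", utterance)
    return ("dictate", utterance)

if __name__ == "__main__":
    for u in ["Gryphons have classic forms resembling shield volcanoes.",
              "Make this a list",
              "Add an inline citation here for page 86 of this book"]:
        print(route_utterance(u))
```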

New best story on Hacker News: Baltimore's Key Bridge struck by cargo ship, collapses

Baltimore's Key Bridge struck by cargo ship, collapses
368 by tbihl | 788 comments on Hacker News.


New best story on Hacker News: Show HN: Glossarie – a new, immersive way to learn a language

Show HN: Glossarie – a new, immersive way to learn a language
346 by jonathanb88 | 150 comments on Hacker News.
Hi HN, For over two years I've been working on an app to learn languages (currently French, Italian and Spanish), together with my partner, a language teacher. I think it is finally ready to share with this community!

The idea is to introduce vocabulary and grammar whilst you read eBooks in your own language. I've found that it is easier to remember vocabulary 'in context' and with regular repetition. Plus you don't have to carve out dedicated time for language learning. Other apps require you to build a habit around various exercises or ‘games’, whereas lots of people already read books. From testing with early users so far, it's proving effective for building a basic understanding of a language and quickly getting to the point where you can read and broadly understand text in the target language. It's even better in combination with other apps that help with listening/speaking, like Pimsleur.

There were lots of technical challenges making this. It turned out to be (reassuringly) hard to get accuracy to an acceptable level, requiring a rabbit-hole into machine translation. There was a lot of testing required to optimise the engine that chooses the translations to show and to reduce the friction when reading books. And the backend to support uploading books is a beast in itself. I'd love to share details if there is interest.

Roadmap:
- Accuracy - 100% accuracy is the target, but at present there can be errors. Feedback from users will be important here so that accuracy issues can be generalised and solved at scale. Errors can be reported within the app - please do so if you spot anything!
- Dynamic difficulty - rather than have a progression of difficulty levels, I'd prefer to introduce vocabulary and grammar automatically in response to user progress, balancing against the friction of seeing unfamiliar words. There's a lot 'under the hood' to manage this today, but plenty of room to improve.
- More practice features - to reinforce vocabulary/grammar and support writing, listening and speaking.
- Better eBook support - improving the formatting of eBooks within the app and providing more methods for finding good books to read.

Use of AI:
- LLMs provided a step change in accuracy and have enabled a feature that explains translations and grammar to the user - vastly improving the utility versus a year ago.
- I believe apps like this, which use AI to enhance or scale functionality rather than simply acting as a wrapper over APIs, will be the major beneficiaries as LLMs improve.

Take a look, and let me know your thoughts or questions!
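The post mentions an engine that chooses which translations to show while you read, balancing repetition against the friction of unfamiliar words, but doesn't spell out how it works. The sketch below is a minimal, assumed illustration of that idea in Python; the data model, difficulty scores, and thresholds are all hypothetical and are not Glossarie's actual engine.

```python
from dataclasses import dataclass

@dataclass
class WordStats:
    translation: str      # target-language form, e.g. "maison" for "house"
    times_seen: int       # how often the learner has already met this word
    difficulty: float     # 0.0 (easy) .. 1.0 (hard), however the app defines it

def choose_swaps(sentence: str, vocab: dict[str, WordStats],
                 max_swaps: int = 3, max_difficulty: float = 0.6) -> dict[str, str]:
    """Pick up to max_swaps words in the sentence to show in the target language."""
    candidates = []
    for word in {w.strip(".,;!?").lower() for w in sentence.split()}:
        stats = vocab.get(word)
        if stats and stats.difficulty <= max_difficulty:
            # Favour words the learner has seen least, so repetition is spread out.
            candidates.append((stats.times_seen, word, stats.translation))
    candidates.sort()
    return {word: translation for _, word, translation in candidates[:max_swaps]}

if __name__ == "__main__":
    vocab = {"house": WordStats("maison", 2, 0.2),
             "garden": WordStats("jardin", 0, 0.3),
             "nevertheless": WordStats("néanmoins", 0, 0.9)}
    print(choose_swaps("The house had a small garden; nevertheless, it felt big.", vocab))
```

A real implementation would presumably also account for grammar features and how recently each word was last shown, which is the "dynamic difficulty" direction the roadmap describes.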

New best story on Hacker News: New Aztec Codices Discovered: The Codices of San Andrés Tetepilco

New Aztec Codices Discovered: The Codices of San Andrés Tetepilco
370 by dzdt | 150 comments on Hacker News.


New best story on Hacker News: The Francis Scott Key Bridge in Baltimore, Maryland Has Collapsed

The Francis Scott Key Bridge in Baltimore, Maryland Has Collapsed
517 by repelsteeltje | 374 comments on Hacker News.


Friday 22 March 2024

New best story on Hacker News: Show HN: Memories – FOSS Google Photos alternative built for high performance

Show HN: Memories – FOSS Google Photos alternative built for high performance
695 by radialapps | 201 comments on Hacker News.
Memories is a FOSS Google Photos alternative that you can self-host (it runs as a Nextcloud plugin).

Website: https://ift.tt/2efCWDj
GitHub: https://ift.tt/BrO596z
Demo Server: https://ift.tt/xstMCXi (demo runs in San Francisco on a free-tier cloud VM)

Memories has been built ground-up for high performance and is extremely fast when configured correctly. In our testing environment, it can load a timeline view with 100k photos in under 500ms, including query and rendering time!

Some features to highlight:
* A timeline similar to Google Photos where you can skip to any time in history instantly.
* AI-based tagging that runs locally on your server, identifying and tagging people and objects.
* Albums and external sharing.
* Metadata editing support.
* A world map of your photos, supported both on mobile and the web.
* Did I mention it's extremely fast?

Would love to hear feedback from the HN community! :)

Saturday 9 March 2024

New best story on Hacker News: Show HN: Hatchet – Open-source distributed task queue

Show HN: Hatchet – Open-source distributed task queue
546 by abelanger | 180 comments on Hacker News.
Hello HN, we're Gabe and Alexander from Hatchet ( https://hatchet.run ), and we're working on an open-source, distributed task queue. It's an alternative to tools like Celery for Python and BullMQ for Node.js, primarily focused on reliability and observability. It uses Postgres for the underlying queue.

Why build another managed queue? We wanted to build something with the benefits of full transactional enqueueing - particularly for dependent, DAG-style execution - and felt strongly that Postgres solves 99.9% of queueing use-cases better than most alternatives (Celery uses Redis or RabbitMQ as a broker, BullMQ uses Redis). Since the introduction of SKIP LOCKED and the milestones of recent PG releases (like active-active replication), it's becoming more feasible to horizontally scale Postgres across multiple regions and vertically scale to 10k TPS or more. Many queues (like BullMQ) are built on Redis, where data loss can occur under OOM conditions if you're not careful; using PG helps avoid an entire class of problems.

We also wanted something that was significantly easier to use and debug for application developers. A lot of the time, the burden of building task observability falls on the infra/platform team (for example, asking the infra team to build a Grafana view for their tasks based on exported prom metrics). We're building this type of observability directly into Hatchet.

What do we mean by "distributed"? You can run workers (the instances which run tasks) across multiple VMs, clusters and regions - they are remotely invoked via a long-lived gRPC connection with the Hatchet queue. We've attempted to optimize our latency to get task start times down to 25-50ms, and much more optimization is on the roadmap. We also support a number of extra features that you'd expect, like retries, timeouts, cron schedules and dependent tasks.

A few things we're currently working on: we use RabbitMQ (confusing, yes) for pub/sub between engine components and would prefer to just use Postgres, but didn't want to spend additional time on the exchange logic until we built a stable underlying queue. We are also considering the use of NATS for engine-engine and engine-worker connections.

We'd greatly appreciate any feedback you have and hope you get the chance to try out Hatchet.
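For readers unfamiliar with the SKIP LOCKED pattern the post leans on, here is a minimal Python/psycopg2 sketch of transactional dequeueing from a Postgres-backed task table. The table name, columns, and connection string are assumptions for illustration; this is not Hatchet's actual schema or code.

```python
import psycopg2

# Claim one queued task atomically. FOR UPDATE SKIP LOCKED lets many workers
# poll the same table concurrently without blocking on or double-claiming
# rows that another worker has already locked.
DEQUEUE_SQL = """
UPDATE tasks
   SET status = 'running', started_at = now()
 WHERE id = (
       SELECT id
         FROM tasks
        WHERE status = 'queued'
        ORDER BY enqueued_at
        LIMIT 1
        FOR UPDATE SKIP LOCKED
       )
RETURNING id, payload;
"""

def dequeue_one(conn):
    """Claim at most one queued task; returns (id, payload) or None."""
    with conn:                      # commit (or roll back) the claim atomically
        with conn.cursor() as cur:
            cur.execute(DEQUEUE_SQL)
            return cur.fetchone()

if __name__ == "__main__":
    # DSN is a placeholder for illustration only.
    conn = psycopg2.connect("dbname=queue_demo")
    print("claimed:", dequeue_one(conn))
```

Because selecting the row and marking it running happen in one transaction, two workers can never claim the same task; recovering tasks whose worker dies after the claim is a separate concern, handled by the retries and timeouts the post mentions.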

New best story on Hacker News: Home Lab Beginners guide

Home Lab Beginners guide
592 by ashitlerferad | 366 comments on Hacker News.


New best story on Hacker News: My favourite animation trick: exponential smoothing (2023)

My favourite animation trick: exponential smoothing (2023)
653 by atan2 | 371 comments on Hacker News.


New best story on Hacker News: Sweden Is a NATO Member

Sweden Is a NATO Member
541 by belter | 626 comments on Hacker News.