Notes from a CTO #14: LLMs Awaiting the Promised Land, How Founders Learn Sales
LLMs at Docsumo: a year of experimentation and waiting for breakthroughs, and a book I am reading on how to sell.
1. Thought of the Month
This week, let's discuss our experiments with LLMs over the last year and what has been bothering me.
When LLMs burst onto the scene, ChatGPT-3 blew my mind with its capacity for creativity. I remember sending contracts and invoices and receiving JSON responses for documents that were tricky even for humans. This really raised our hopes of solving some of the pesky problems that had been bothering us for quite some time, such as mapping, generic classification, nested tables, and other tasks. However, the results from the last year have been interesting but not revolutionary.
At Docsumo, we don't expect 100% accuracy. We don't overemphasize accuracy; instead, we focus on straight-through processing (the number of documents that humans don't have to review). Yet constraining LLMs and getting them to return data in a structured format has been challenging, even after fine-tuning everything from small 7B models to large commercial models. Don't get me wrong, LLMs are going to be life-changing for some use cases, but for the many cases where the final output must have a fixed schema and structure, we are still experimenting and waiting for that mind-blowing moment.
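To make the straight-through processing idea concrete, here is a minimal sketch in Python. It is not how Docsumo measures it: the field names, the fixed schema, and the rule that a document goes to human review whenever the model output fails validation are all assumptions for illustration.

```python
import json

# Hypothetical fixed schema: field name -> expected Python type.
REQUIRED_FIELDS = {"invoice_number": str, "total": float, "currency": str}


def parse_structured(raw: str):
    """Return the parsed dict if `raw` is JSON matching the schema, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        value = data.get(name)
        # Accept whole numbers for float fields, but never booleans.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            continue
        if not isinstance(value, expected):
            return None
    return data


def straight_through_rate(raw_outputs):
    """Share of outputs that pass validation, i.e. need no human review (assumed rule)."""
    if not raw_outputs:
        return 0.0
    passed = sum(1 for raw in raw_outputs if parse_structured(raw) is not None)
    return passed / len(raw_outputs)
```

In practice a review rule would also look at confidence scores and business checks; the point of the sketch is only that an output which breaks the schema can never count as straight-through.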
Limitations to overcome:
Fine-tuning 7B and 14B models for specific tasks: One of my hypotheses with the rise of new, smaller models was that they would blow our minds on narrow tasks, but the results have not been that great.
Fine-tuning a large model for better results: We tried this too, but the results show large variation.
Holy grail of unstructured to structured data: When I wrote Unleashing The Power Of Unstructured Data: The Rise Of Large AI Models, I was expecting this to come quickly, but it might take a few years of continuous work.
No reproducibility and hallucinations: Here is something interesting. We have multiple standard datasets that we use for benchmarking models. When LLMs entered the space, we were achieving 60% accuracy, but three months later it had dropped to 35% with the same model, same dataset, and same prompt. For document processing, explainability and repeatability are crucial. We can't have customers retrying a document and receiving different values.
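One way to put a number on the repeatability problem is to run the same document through the same model and prompt several times and check how often each field agrees. The sketch below assumes the runs have already been collected as dictionaries; the `field_agreement` and `is_repeatable` helpers and the field names are hypothetical, not part of our benchmark tooling.

```python
from collections import Counter


def field_agreement(runs):
    """For each field, return the share of runs that agree with the most common value."""
    if not runs:
        return {}
    fields = set()
    for run in runs:
        fields.update(run)
    agreement = {}
    for field in sorted(fields):
        # repr() makes every value hashable; a missing field counts as its own value.
        values = [repr(run.get(field)) for run in runs]
        most_common_count = Counter(values).most_common(1)[0][1]
        agreement[field] = most_common_count / len(runs)
    return agreement


def is_repeatable(runs, threshold=1.0):
    """True if every field reaches the agreement threshold across runs."""
    return all(score >= threshold for score in field_agreement(runs).values())
```

With a threshold of 1.0, a single differing value on any field marks the document as not repeatable, which matches the bar a customer retrying a document would expect.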
We have used LLMs at Docsumo to build many features, such as AI assist, and in a variety of internal tools to improve our processes. LLMs are still new to the scene; we don't know where we are on the curve. A discovery tomorrow could turn the tables. We are working hard to achieve these mind-blowing moments. We hope they arrive sooner rather than later.
2. Podcasts/Essays
Lately, I have been interested in sales. As a tech co-founder, this is a skill that I am not very proficient in. One highly recommended book that I came across on this topic is Founding Sales: The Early-Stage Go-to-Market Handbook by Peter Kazanjy. You can also read it for free on his website.
3. Interesting links
Repos:
patroni: I ran MongoDB inside K8s for two years, as paying $300 per month was a luxury we could not afford. Any library that simplifies deploying and monitoring a DB on K8s is always close to my heart.
piper: Text-to-speech is something I always think about when I read new articles. I used to love Pocket TTS. I have a hobby project that I want to work on, maybe in the coming week.
go-links: Do you often forget the link to an important file and spend a lot of time searching for it? Here's a solution: simply create a link like go/morningmeeting.
DSPy: DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline.
For more, follow me on GitHub: bkrmdahal
Articles:
Language Modeling Reading List (to Start Your Paper Club): Everyone wants to ride the LLM hype, and very few want to understand the details. For that rare breed, here is a good list.
My boss says we don’t need any engineering managers. Is he right?: I ran Docsumo without an EM for 4 years. After we received funding, the first thing we did was hire more leaders. A good EM or any leader can make things a lot easier for everyone, including me and the team. I have seen examples of managers who do a better job than I could ever do, as well as managers who make things worse. It's all about how you handle the transition.
Budgeting with ChatGPT: We were also doing something similar on a larger scale. The results were interesting, although not what we expected. We are currently discussing how to proceed.
Simple is not Easy: From the article, and I agree wholeheartedly: “Simplicity is possibly the single most important thing on the technical side of software development.”
Evaluations are all we need: With so many evaluations flying around in the LLM space, I am skeptical when someone says "we are better than X." The only way I have found to judge an LLM is to use it on your own use case and see whether it works best there.
4. Quotes/Books
Technical Debt is debt you repay with your soul
- Sr. Software Developer
People often ask me about the architecture that Docsumo 📄 has. My response is that it is a combination of the worst aspects of a monolithic architecture and the worst aspects of microservices. I combined both to create a monster!
It's mostly my fault as an industrial engineer trying to head the tech team of a hardcore AI/ML company. Honestly, I wouldn't have hired myself, but somehow, we have managed to survive until now with a team of 80+ members. That being said, our team has done a fabulous job and improved a lot since I left active coding (the team was just waiting for me to leave active coding, maybe even praying😅). We have big plans for this year: to build the best software development system and process. If that sounds like something you'd be interested in being a part of, we're hiring.
Meme from our Slack
That’s it for this edition. I hope you find it useful.
Best,
Bikram Dahal
P.S. If you learned something new today, please share “Notes from a CTO” with your friends and spread the love. ✌🏻