Personal AI and why RAG is my best friend

Urania Logico - AI EN tech


One of the biggest sources of perplexity around AI in my community has been whether or not we’d be able to own the means: how to train models if that takes supercomputers, and how to run them uncensored or on local machines.

I almost went out to buy a new computer, because of course I wanted to resist this disparity and have a machine powerful enough to run some big models. I thought that -maybe- 64GB of RAM and/or an Nvidia RTX card were needed to keep up with the tech. We couldn’t let them win by forfeit on hardware and end up on a SaaS product only because our PCs are too old. Anyway, I will postpone this purchase a little longer, ‘cause I am now exploring a new approach and this emerging scene.

First of all, AI is not a tool, AI is a technique. So it can be neutral. There is a little scene on HuggingFace with amazing devs and a lot of MIT-licensed models and datasets. Of course it all starts from Llama, from Meta and the other companies who released models that required a lot of computational power to train, but models can also be fine-tuned, and there is a surge in tools designed to streamline the AI training process.

Secondly, even stupid models, smaller than 5B parameters, can do some things better than you do. For creative writing you don’t need the biggest model on the globe, you need a creative one, and models like Qwen2.5-Coder:3B are good enough for autocomplete in VSCode and run fast on low-end GPUs. Other fantastic things you can do include hive minds and speculative decoding. Small models with RAG (Retrieval-Augmented Generation) enabled, fed with your books and datasets, basically give you a personal expert in a particular topic. RAG uses a vector database to yoink in additional context during a generation. Even 3B models can be quite capable if you’re providing them the entire context to work on. They don’t need to rely on their own knowledge.
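To make the "yoink in additional context" part concrete, here is a minimal sketch of the retrieval step in Python. It is not how GPT4All or any particular tool does it: the `embed` function is a toy bag-of-words stand-in for a real embedding model, `TinyVectorStore` is a hypothetical in-memory replacement for a vector database, and the three text chunks are made-up sample data. The only point is the shape of the idea: embed your chunks, find the ones closest to the question, and paste them into the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, original chunk)

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question, store, k=2):
    """Put the retrieved chunks in front of the question for the model."""
    context = "\n".join(f"- {chunk}" for chunk in store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up sample chunks, standing in for your books and datasets.
store = TinyVectorStore()
for chunk in [
    "Sourdough starter needs feeding with flour and water every day.",
    "The RTX series are graphics cards made by Nvidia.",
    "Rye flour makes a sourdough starter ferment faster.",
]:
    store.add(chunk)

question = "How do I feed my sourdough starter?"
results = store.search(question, k=2)
prompt = build_prompt(question, store, k=2)
print(prompt)
```

The resulting prompt is what you would hand to the small model. Since the relevant passages are already in it, the model only has to read and rephrase, not remember.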

The best software so far from a UX perspective is GPT4All by Nomic.ai, introduced to me by L L. Embedding the context you provide takes a while on my computer, and migrating the SQL database from Windows to Linux is not available, even after updating the SQL db with the correct paths. You can probably find the esoteric database in the .local/share/nomic.ai directory. Please contact me if you know better solutions or if you just want to share your knowledge.
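If you want to go hunting for that database yourself, here is a small sketch using only the Python standard library. It assumes the database is a SQLite file, which is my guess and not something I have confirmed for GPT4All. It makes no assumption about file names or table names: it just looks for files that start with the standard SQLite header and lists whatever tables they contain, opening each file read-only so nothing gets modified. The function names are my own.

```python
import sqlite3
from pathlib import Path

# Every SQLite 3 database file starts with these 16 bytes.
SQLITE_MAGIC = b"SQLite format 3\x00"


def find_sqlite_files(root):
    """Return files under root whose header says they are SQLite databases."""
    root = Path(root).expanduser()
    found = []
    if not root.is_dir():
        return found
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as handle:
                if handle.read(16) == SQLITE_MAGIC:
                    found.append(path)
        except OSError:
            continue  # unreadable file, skip it
    return found


def list_tables(db_path):
    """List table names, opening the database read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Directory taken from the text above; adjust it if yours differs.
    for db in find_sqlite_files("~/.local/share/nomic.ai"):
        print(db, list_tables(db))
```

Make a copy of the file before trying to edit any paths inside it.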

To organize chats is to organize our minds

The chat-based design in LLM software mirrors the good old command line’s simplicity and directness, offering users a natural language interface to issue commands and retrieve results. Like the command line, it relies on a sequential input-output structure, making interactions linear and easy to follow. However, where the command line demands precision and familiarity with syntax, chat interfaces embrace the flexibility of conversational language, enabling users to express queries more intuitively (unless you are a prompt engineer). Some SaaS products give you chats that allow multimedia content and mix different types of generative models.

Organizing multiple chats in LLM-based software is a key design challenge, as it requires balancing clarity, usability, and quick access to relevant information. Will people be too lazy, and directly ask the AI to build and navigate the hierarchy of their thoughts? An AI capable of structuring thoughts, building hierarchies, and suggesting connections offers immense convenience. Humans naturally offload complex tasks to tools or systems.

That’s why the Personal AI scene is important. AI-driven organization might not always align with the user’s unique cognitive preferences, leading to frustration. People will probably lose the already poor skill of organizing their own thoughts, leading to dependency on companies and their products on a hyperintimate level. If you dare put your hands on Python and models, you can have your personal AI, or at least some hybrid solutions. Fine-tuning, merging models and RAG are some of the strategies, and they say it’s a bit like alchemy.

Remember

A computer can never be held accountable, therefore a computer must never make a management decision

1979, IBM presentation





☽ ❍ ☾

hello[@]gefn[.]net
A086 90CB C185 A113 F963 1EFC 9E24 7733 CE4C 8DB4