Some think we’re in an AI bubble where AI is being over-hyped. Massive data centers are being built specifically for AI hardware (aka NVIDIA). Power consumption is so extreme that Big Tech wants to go nuclear.
I’ve used the foundational models. I have also used LLMs on the edge. I watched the GTC March 2024 Keynote with NVIDIA CEO Jensen Huang.
Here are the things that stuck with me:
- Gen-ai is good at extracting context from unstructured data. This is a game changer.
- There is a ton of investment $$$ flowing into this space. This will impact how we build software and how users interact with computers. This is very much like the Dot Com boom, when the Internet and ecommerce were going to fundamentally change commerce. They did; it was just that the huge investments came too early, before the “other” technical changes needed to make ecommerce ubiquitous: mobile phones and cheap networking.
- Data centers and hyper-scale are needed for some workflows, but you can’t overcome physics and the round-trip latency of data transfer, which makes true interactive multimodal interactions challenging.
- AI at the Edge is where I think the future is headed. We will want to use our phone or tablet with multi-modal gen-ai to assist us with “things.” This requires low latency. For example, a speech-to-text model running on your phone that transcribes a voicemail in real time is immensely useful for avoiding spam calls.
- According to Jensen Huang’s keynote, the future of programming is tokens. You feed tokens around to different “AIs” that are each specialized in a particular domain. This is the basis for Agentic AI.
I’ve seen shifts in tech and the software industry over my career. Historically, it has been about higher and higher levels of abstraction:
- Assembly abstracted machine op codes. C/C++ abstracted assembly.
- Managed languages abstracted unmanaged languages and, in turn, the host CPU and hardware architecture (write once/run anywhere).
- Operating systems abstracted the computer hardware.
- Databases abstracted the file system.
- Sockets abstracted the network.
But in each one of those abstractions, you still wrote code to sequence all that shizzle. It was deterministic logic based on discrete math. Now, in this new gen-ai world you decompose your domain into tokens. And within this domain, you have sub-domains that are specialized/optimized AIs. Each one of these gen-ai domains produces a probabilistic result based on linear algebra and statistics. Basically, it is a very smart guesstimate.
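To make the "smart guesstimate" idea concrete, here is a minimal sketch of how a model picks its next token: raw scores are turned into probabilities with a softmax, and then one token is sampled. The vocabulary and scores are made up for illustration, and the helper names (`softmax`, `sample_token`) are my own, not from any particular library. Real models do this over tens of thousands of tokens.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, temperature=1.0, rng=random):
    """Pick the next token by sampling from the distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical vocabulary and scores, invented for this example.
vocab = ["hotdog", "not hotdog", "pizza"]
logits = [2.0, 1.0, 0.1]

rng = random.Random(42)  # seeded so repeated runs give the same picks
for temperature in (0.2, 1.0, 2.0):
    picks = [sample_token(vocab, logits, temperature, rng) for _ in range(10)]
    print(temperature, picks)
```

A low temperature makes the model pick the top-scoring token almost every time, while a high temperature spreads the picks out. The same input can give different output, which is the big difference from the deterministic code we are used to writing.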
So now, tokens are abstracting the programming languages. And the holy grail everyone is chasing is Artificial General Intelligence (AGI) where the AI can do its own planning/reasoning. This abstracts out the programming language because the computer can figure out its own shizzle.
If you are a software engineer, you will absolutely need to add these tools to your toolkit. And I am not talking about just using a gen-ai API LLM wrapper. You really should dig behind that LLM API wrapper and:
- Learn how machine learning models are created and build your own “hello world” model. You could build the Not Hotdog App and become the next tech unicorn.
- Learn about distillation techniques to reduce model size.
- Learn about how Low-Rank Adaptation of Large Language Models (LoRA) can be used for fine-tuning LLMs.
- Learn/experiment with how quantized LLMs behave.
- Learn how to use Ollama to load your own LLM and how the various settings influence model behavior and performance.
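As a taste of the quantization item above, here is a small sketch of symmetric int8 quantization on a handful of made-up weights. It is a toy under stated assumptions: one scale for the whole list, plain Python lists instead of tensors, and function names (`quantize_int8`, `dequantize`) that are mine. Real toolchains use more elaborate schemes, but the trade-off is the same: fewer bits per weight in exchange for a small rounding error.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0  # one scale for all weights
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in quantized]

# Made-up weights, invented for this example.
weights = [0.82, -1.27, 0.003, 0.45, -0.6]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", quantized)
print("scale:", scale)
print("worst round-trip error:", max_error)
```

Notice that the tiny weight (0.003) collapses to zero. That kind of lost detail is why a quantized model can behave slightly differently from the original, and why it is worth experimenting rather than assuming.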