From Pixels to Paragraphs: Embeddings and Vector Spaces

Hi everyone! I'm Mojtaba Maleki, an AI Researcher and Software Engineer at The IT Solutions Hungary. Born on February 11, 2002, I hold a BSc in Computer Science from the University of Debrecen. I'm passionate about creating smart, efficient systems, especially in the fields of Machine Learning, Natural Language Processing, and Full-Stack Development. Over the years, I've worked on diverse projects, from intelligent document processing to LLM-based assistants and scalable cloud applications. I've also authored four books on Computer Science, earned industry-recognized certifications from Google, Meta, and IBM, and contributed to research projects focused on medical imaging and AI-driven automation. Outside of work, I enjoy learning new things, mentoring peers, and yes, I'm still a great cook. So whether you need help debugging a model or seasoning a stew, I’ve got you covered!
Hey there, curious minds! 👋
If you're anything like me, you're constantly tinkering, breaking things (on purpose... mostly 😅), and learning how all this amazing AI magic works behind the scenes. My latest deep dive? Embeddings, and wow, what a rabbit hole.
Let me take you along for the ride through a course I just finished on Vector Databases, Embeddings and Applications. If you're trying to wrap your head around how machines “understand” images and language, you’ll love this.
🧩 What Even Is an Embedding?
An embedding is like a secret code: a way of turning things like images or text into a vector of numbers. Why? Because numbers are the only language that models truly speak. Once something is embedded, you can do all sorts of cool math on it, like measuring how similar two things are!
Companies like Google, OpenAI, Meta, and basically every modern AI product use embeddings to power search, recommendation systems, question answering, and more. Embeddings are everywhere, even if you don’t see them.
🧪 My First Hands-On: Embedding MNIST with a VAE
Okay, time to get nerdy. I started off with the MNIST dataset, those iconic 28x28 grayscale digits. I built a Variational Autoencoder (VAE) using Keras to compress these images down into just 2 dimensions. That’s right, each handwritten number got squeezed into a 2D vector.
Here’s the magic moment: plotting those vectors. Suddenly, I could literally see how the model perceived similarity. Zeros clustered together, ones sat in their own corner, and so on.
That was my first “aha!” moment. These numbers? They weren’t random. They captured meaning. ✨
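For the curious, here is a minimal sketch of the two ingredients that separate a VAE from a plain autoencoder: sampling the latent vector with the reparameterization trick, and the KL penalty that pulls each embedding toward a standard normal. This is not my actual Keras model. It uses only the Python standard library, the function names are ones I made up for illustration, and the encoder outputs are hypothetical numbers, not values from a trained network.

```python
import math
import random


def sample_latent(mean, log_var, rng):
    """Reparameterization trick: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, sigma^2) || N(0, 1)), summed over the latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mean, log_var))


rng = random.Random(0)  # fixed seed so the sample is repeatable

# Hypothetical encoder outputs for one digit (assumed values, 2 latent dims).
mean = [0.8, -1.2]
log_var = [-0.5, 0.1]

z = sample_latent(mean, log_var, rng)
print("2D embedding sample:", z)
print("KL penalty:", kl_to_standard_normal(mean, log_var))
```

The encoder predicts a mean and a log-variance per latent dimension, and the sampled `z` is the 2D point that ends up on the scatter plot. The KL term is zero only when the mean is 0 and the variance is 1, which is what keeps the clusters from drifting arbitrarily far apart.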
🔍 Measuring Similarity: How Close Are Two Zeros?
Once I had embeddings, I got to play scientist, comparing digits using different distance metrics:
- Euclidean (L2): the straight-line distance.
- Manhattan (L1): step-by-step grid movement.
- Dot Product: the projection of one vector onto another, scaled by the other’s length.
- Cosine Similarity: the cosine of the angle between the vectors, my personal favorite.
I compared two “0” digits and one “1”. Not surprisingly, the two zeros came out as the more similar pair on every metric: smaller distances, and a higher dot product and cosine similarity. But seeing it quantified? That was powerful. It wasn’t just intuition anymore, it was math. And I was wielding it.
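The four metrics above fit in a few lines of plain Python. This is a minimal sketch rather than the course notebook: the three 2D vectors are made-up stand-ins for the embeddings of two zeros and a one, not values from my VAE.

```python
import math


def euclidean(a, b):
    """L2: straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """L1: sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (undefined for zero vectors)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up 2D embeddings for illustration only.
zero_a = [1.0, 2.0]
zero_b = [1.5, 2.5]
one = [-2.0, 0.5]

metrics = [("euclidean", euclidean), ("manhattan", manhattan),
           ("dot", dot), ("cosine", cosine_similarity)]
for name, metric in metrics:
    print(f"{name:10s} 0 vs 0: {metric(zero_a, zero_b):7.3f}   "
          f"0 vs 1: {metric(zero_a, one):7.3f}")
```

One thing to watch when reading the output: Euclidean and Manhattan are distances, so smaller means more alike, while dot product and cosine are similarities, so larger means more alike.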
✍️ Then Came Sentences: Embedding Language
Images were fun, but I couldn’t wait to try this on text. Enter: the sentence-transformers library. With just a few lines of Python, I embedded these three:
- "The team enjoyed the hike through the meadow"
- "The national park had great views"
- "Olive oil drizzled over pizza tastes delicious"
You can probably guess which two were more similar, right? Spoiler: the nature ones were closer in vector space than the pizza line 🍕.
That’s the beauty of sentence embeddings: they capture meaning, not just words. And with cosine distance, I could actually measure how similar two ideas were.
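Here is roughly what that comparison can look like, as a sketch rather than my exact notebook. The helper `pairwise_similarity` and the `demo` function are names I made up, and the model name `all-MiniLM-L6-v2` is an assumption (I am not claiming it is the one the course uses). The sentence-transformers import lives inside `demo()` so the helpers work without the library installed; call `demo()` yourself to download the model and run the real thing.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pairwise_similarity(sentences, encode):
    """Embed the sentences with `encode` and score every pair.

    `encode` is any callable that maps a list of strings to a list of
    vectors. Returns a dict keyed by index pairs (i, j) with i < j.
    """
    vectors = encode(sentences)
    return {(i, j): cosine_similarity(vectors[i], vectors[j])
            for i, j in combinations(range(len(sentences)), 2)}


def demo():
    # Imported here on purpose: needs `pip install sentence-transformers`.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    sentences = [
        "The team enjoyed the hike through the meadow",
        "The national park had great views",
        "Olive oil drizzled over pizza tastes delicious",
    ]
    scores = pairwise_similarity(
        sentences, lambda batch: model.encode(batch).tolist())
    for (i, j), score in scores.items():
        print(f"{score:.3f}  {sentences[i]!r} vs {sentences[j]!r}")
```

Passing the encoder in as a function keeps the scoring logic independent of any one model, so swapping in a different embedding model only changes the line that builds `model`.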
🛠️ Why This Matters (And Why I’m Hooked)
Embeddings are the foundation of so many AI applications: semantic search, chatbots, recommendation systems, LLM memory, you name it. Every time I use ChatGPT, I wonder: what vector magic is going on behind the scenes?
And now, I don’t have to wonder. I get to build that magic.
This course didn’t just teach me about embeddings; it helped me understand how thinking in vector spaces is one of the keys to working with modern AI systems. And honestly? That’s thrilling.
🌱 Small Wins, Big Dreams
It wasn’t all smooth sailing. I got confused by KL divergence (math ain't always friendly), my VAE reconstructions were blurry at first, and it took a bit to wrap my head around vector norms. But every bug I squashed and every scatter plot I drew felt like a step forward.
And for someone with Nobel Prize dreams (yeah, I said it 😎), these baby steps matter. A lot.
I’m just getting started, but embedding theory and practice are now part of my AI toolkit. And it feels good.
📌 TL;DR
- Embeddings turn stuff (images, sentences) into vectors so AI can work with them.
- I built a Variational Autoencoder to embed MNIST digits into 2D space.
- I used sentence-transformers to embed and compare semantic meaning in text.
- Cosine similarity and dot product are my go-to tools for measuring closeness.
- Embeddings power modern search, chatbots, and LLM memory systems.
"Even the greatest AI starts with a print statement."
Stay nerdy,



