
Embeddings: how AI turns everything into numbers

AI doesn't understand words or images. It works with numbers. Embeddings bridge that gap. Here's how.

by Marc Filipan
September 11, 2025
13 min read

The Number Problem

Computers work with numbers. Just numbers. Neural networks? Same. Just math on numbers.

But the world isn't numbers. Words. Images. Sounds. Concepts. How does AI process these?

Embeddings. They convert everything to numbers in a way that preserves meaning. Crucial concept. Underlies all modern AI.

What Embeddings Actually Are

An embedding is a dense vector of numbers representing something.

Word embedding: "cat" becomes [0.2, -0.5, 0.8, ...] (hundreds of numbers).

Image embedding: a photo becomes [0.1, 0.9, -0.3, ...] (thousands of numbers).

The numbers aren't random. They're learned to capture meaning. Similar things get similar embeddings. Different things get different embeddings.

That's the key: similarity in meaning becomes similarity in numbers. Mathematical operations on embeddings reflect semantic relationships.
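
Here's a toy sketch of that idea in Python. The three-dimensional vectors are invented for illustration; real embeddings have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    """1 = same direction, 0 = unrelated, -1 = opposite."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional embeddings; the values are made up, not from a real model.
cat      = np.array([0.2, -0.5, 0.8])
dog      = np.array([0.3, -0.4, 0.7])
airplane = np.array([-0.9, 0.6, 0.1])

print(cosine_similarity(cat, dog))       # ~0.99: similar meanings, similar vectors
print(cosine_similarity(cat, airplane))  # ~-0.38: different meanings, different vectors
```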

Why We Need Embeddings

You could represent words as one-hot vectors. "Cat" = [1,0,0,...,0]. "Dog" = [0,1,0,...,0]. One unique position per word: a single 1, zeros everywhere else.

Problem: no relationship captured. "Cat" and "dog" are as different as "cat" and "airplane." All vectors orthogonal. No semantic meaning.

Embeddings solve this. "Cat" and "dog" get similar embeddings (both animals). "Cat" and "airplane" get different embeddings. Similarity in vector space reflects similarity in meaning.

Now math operations make sense. Arithmetic on embeddings corresponds to reasoning about meaning.
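
A quick toy comparison, reusing the made-up vectors from above: one-hot vectors are all mutually orthogonal, while dense embeddings can encode that "cat" sits closer to "dog" than to "airplane."

```python
import numpy as np

# One-hot: a 3-word vocabulary, one dimension per word.
cat_oh, dog_oh, airplane_oh = np.eye(3)

print(np.dot(cat_oh, dog_oh))        # 0.0 -- no relationship captured
print(np.dot(cat_oh, airplane_oh))   # 0.0 -- every pair looks equally unrelated

# Dense embeddings (same toy vectors as before): relationships live in the geometry.
cat      = np.array([0.2, -0.5, 0.8])
dog      = np.array([0.3, -0.4, 0.7])
airplane = np.array([-0.9, 0.6, 0.1])
print(np.dot(cat, dog) > np.dot(cat, airplane))   # True: "cat" is closer to "dog"
```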

How Embeddings Are Learned

Embeddings aren't hand-crafted. They're learned from data.

Word Embeddings (Word2Vec approach):

Train a neural network on a simple task: predict context words from a target word. Or vice versa.

Example: sentence "The cat sat on the mat." For target word "cat," predict "the," "sat," "on."

Network learns: to predict context well, it needs to represent similar words similarly. "Cat" and "dog" appear in similar contexts. They get similar embeddings.

The embeddings are a byproduct. Not the task goal. But they capture semantic meaning.
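
A minimal sketch of the skip-gram training pairs, with a simplified tokenizer and an assumed window size of 2 (the actual network training is omitted):

```python
def skipgram_pairs(tokens, window=2):
    """Generate the (target, context) pairs the skip-gram task predicts."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the cat sat on the mat".split()
for target, context in skipgram_pairs(sentence):
    if target == "cat":
        print(target, "->", context)      # cat -> the, cat -> sat, cat -> on
# A network trained to predict these contexts ends up giving words that share
# contexts (like "cat" and "dog") similar embeddings.
```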

Modern Approach (Transformers):

Learn embeddings as part of a larger model. Language model predicts next word. Image model classifies objects. Embeddings emerge as internal representations.

These are contextual. Same word gets different embeddings in different contexts. "Bank" (financial) vs "bank" (river) get different representations.
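
A hedged sketch of that contextual effect, assuming the Hugging Face transformers and torch packages with the standard bert-base-uncased checkpoint (any encoder model illustrates the same point):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embedding_of(word, sentence):
    """Contextual embedding of `word` as it appears in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]        # [seq_len, hidden_size]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

financial = embedding_of("bank", "she deposited the check at the bank")
river     = embedding_of("bank", "they sat on the river bank")

# Same word, different contexts -> different vectors (similarity noticeably below 1).
print(torch.cosine_similarity(financial, river, dim=0).item())
```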

The Semantic Space

Embeddings create a geometric space where meaning is geometry.

Similarity = Proximity: Similar concepts cluster. Animals cluster. Vehicles cluster. Abstract concepts cluster. Distance measures similarity.

Relationships = Directions: The famous example: king - man + woman ≈ queen.

Vector arithmetic captures relationships. The direction from "man" to "king" (gender to royalty) is similar to "woman" to "queen."

Analogies become vector operations. Mind-blowing, but it works.
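
Here's a toy sketch of the analogy as vector arithmetic. The 2-D vectors are invented for illustration, and, as in standard analogy evaluation, the input words are excluded from the search.

```python
import numpy as np

# Invented 2-D "embeddings": one axis loosely tracks gender, the other royalty.
vocab = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.1, 0.9]),
    "man":   np.array([0.9, 0.1]),
    "woman": np.array([0.1, 0.1]),
    "apple": np.array([-0.7, -0.5]),
}

target = vocab["king"] - vocab["man"] + vocab["woman"]   # lands near "queen"

def nearest(vec, exclude):
    """Closest vocabulary word to `vec`, skipping the analogy's input words."""
    candidates = (w for w in vocab if w not in exclude)
    return min(candidates, key=lambda w: np.linalg.norm(vocab[w] - vec))

print(nearest(target, exclude={"king", "man", "woman"}))  # queen
```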

Dimensions = Attributes:

Each dimension captures some attribute. One dimension might be "animacy" (living vs non-living). Another might be "size." Another "abstractness."

Hundreds of dimensions capture hundreds of attributes. Combined, they represent meaning.

Different Types of Embeddings

  • Word Embeddings: Words to vectors. Word2Vec, GloVe, FastText. Foundation of NLP.
  • Sentence Embeddings: Entire sentences to vectors. Capture meaning of full sentences, not just words. Used for semantic search.
  • Image Embeddings: Images to vectors. CNN features. Vision transformer outputs. Enable image search, similarity comparison.
  • Multimodal Embeddings: Different modalities to the same space. Text and images get comparable embeddings. CLIP does this. Enables cross-modal search (see the sketch after this list).
  • Graph Embeddings: Nodes in graphs to vectors. Capture network structure. Used in social networks, knowledge graphs.
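
To make the multimodal case concrete, here's a hedged sketch using a CLIP checkpoint through the Hugging Face transformers library. The model name and image path are placeholder assumptions.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat_photo.jpg")                      # placeholder image path
captions = ["a photo of a cat", "a photo of an airplane"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Text and image embeddings share one space, so the image can be scored against captions.
print(outputs.logits_per_image.softmax(dim=-1))          # match probability per caption
```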

How Embeddings Are Used

  • Similarity Search: Find similar items. Nearest neighbors in embedding space. Search engines, recommendation systems (see the sketch after this list).
  • Classification: Use embeddings as features for classification. Semantic features, not raw data. Better generalization.
  • Clustering: Group similar items. K-means on embeddings. Topic modeling, customer segmentation.
  • Transfer Learning: Use embeddings from large model in small task. Pretrained knowledge transfers. Common in vision and NLP.
  • Retrieval-Augmented Generation: Embed queries and documents. Retrieve relevant documents. Provide to language model. Factual AI responses.
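
A minimal similarity-search sketch: random vectors stand in for real document embeddings, and nearest neighbors are found with a plain dot product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random placeholders standing in for 1,000 real 384-dimensional sentence embeddings.
doc_embeddings = rng.normal(size=(1000, 384))
doc_embeddings /= np.linalg.norm(doc_embeddings, axis=1, keepdims=True)

query = rng.normal(size=384)
query /= np.linalg.norm(query)

# With unit-length vectors, cosine similarity reduces to a dot product.
scores = doc_embeddings @ query
top_k = np.argsort(scores)[::-1][:5]
print(top_k, scores[top_k])   # indices and similarities of the 5 nearest documents
```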

Binary Embeddings (The Efficient Alternative)

Traditional embeddings: floating-point vectors. 32 bits per dimension. Large memory footprint.

Binary embeddings: 1 bit per dimension. Each dimension is +1 or -1. 32× less memory.

How They Work:

Learn embeddings normally. Then binarize: positive dimensions become +1, negative become -1.

Similarity: instead of dot product, use Hamming distance or XNOR-popcount. Much faster.
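
A hedged sketch of both steps, sign binarization and agreement counting, on placeholder float embeddings (sign binarization is one common choice, not the only one):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder float embeddings: `close` is a noisy copy of `base`, `far` is unrelated.
base  = rng.normal(size=256)
close = base + 0.1 * rng.normal(size=256)
far   = rng.normal(size=256)

def binarize(v):
    """Sign binarization: positive dimensions -> +1, the rest -> -1."""
    return np.where(v > 0, 1, -1).astype(np.int8)

def agreement(a, b):
    """Fraction of dimensions where the codes agree (1 - normalized Hamming distance)."""
    return float(np.mean(binarize(a) == binarize(b)))

print(agreement(base, close))  # near 1: similar vectors keep most of their signs
print(agreement(base, far))    # around 0.5: unrelated vectors agree only by chance
```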

Trade-offs:

Lose some precision. But for many tasks, it doesn't matter. Retrieval and nearest-neighbor search work fine with binary codes.

Gain: massive speed and memory efficiency. Deploy on edge devices. Process billions of vectors quickly.

Dweve's Approach:

Constraints are binary patterns. Inherently binary embeddings. 65,536-bit hypervectors. Efficient storage, fast operations.

Pattern matching through XNOR and popcount. Similarity through agreement counting. Binary all the way down.
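
Here's a generic sketch of the XNOR-plus-popcount mechanism on bit-packed vectors. It illustrates the idea described above, not Dweve's actual implementation; the vector size and packing scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BITS = 65_536

# Two binary hypervectors; the second has 10% of its bits flipped.
a_bits = rng.integers(0, 2, size=N_BITS, dtype=np.uint8)
b_bits = a_bits.copy()
b_bits[: N_BITS // 10] ^= 1

# Pack 8 bits per byte so each vector fits in N_BITS / 8 bytes.
a_packed = np.packbits(a_bits)
b_packed = np.packbits(b_bits)

# XNOR = NOT XOR: a result bit is 1 wherever the vectors agree.
# Summing those bits (a popcount) gives the agreement count.
agreements = np.unpackbits(~(a_packed ^ b_packed)).sum()
print(agreements / N_BITS)   # ~0.9 agreement for this pair
```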

Dimensionality Matters

How many dimensions? More isn't always better.

Too Few Dimensions: Can't capture complexity. Different concepts collide. Lose important distinctions.

Too Many Dimensions: Computational cost. Memory usage. Overfitting. Curse of dimensionality (distances concentrate, so everything starts to look almost equidistant).
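
A quick numerical illustration of that concentration effect, using random points (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

for dim in (2, 10, 100, 1000):
    points = rng.normal(size=(500, dim))
    # Distances from the first point to all the others.
    dists = np.linalg.norm(points[1:] - points[0], axis=1)
    spread = (dists.max() - dists.min()) / dists.mean()
    print(dim, round(float(spread), 2))   # relative spread shrinks as dimension grows
```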

Typical Sizes:

  • Word embeddings: 100-300 dimensions
  • Sentence embeddings: 384-1024 dimensions
  • Image embeddings: 512-2048 dimensions
  • Binary hypervectors: 1,024-65,536 bits (the high dimensionality is what makes them robust to noise and individual bit errors)

Choice depends on task complexity and computational budget.

What You Need to Remember

  1. Embeddings convert everything to numbers. Words, images, concepts become vectors. Enables AI processing.
  2. Meaning becomes geometry. Similar concepts get similar vectors. Distance measures similarity. Directions capture relationships.
  3. Learned from data, not hand-crafted. Neural networks learn embeddings as part of training. Patterns in data determine representation.
  4. Enable semantic operations. Math on vectors reflects reasoning about meaning. Vector arithmetic does analogies.
  5. Multiple types for different data. Words, sentences, images, graphs. Each has specialized embedding methods.
  6. Binary embeddings offer efficiency. 1 bit per dimension instead of 32. Massive memory and speed gains. Works for many tasks.
  7. Dimensionality is a trade-off. More dimensions capture more complexity. But they cost computational resources. Balance needed.

The Bottom Line

Embeddings are how AI bridges the gap between human concepts and machine computation. Everything meaningful gets converted to vectors in a space where similarity in meaning becomes similarity in geometry.

This isn't just representation. It's the foundation of modern AI. Search, recommendation, generation, understanding. All rely on embeddings.

The vectors aren't arbitrary. They're learned to capture semantic structure. The geometry reflects meaning. Math operations correspond to reasoning.

Binary embeddings show you don't need floating-point precision for semantic meaning. 1-bit representations work. Efficiently. At scale. Deployed anywhere.

Understanding embeddings means understanding how AI sees the world. Not as words or images. As vectors in high-dimensional space where meaning is mathematics.

Want efficient embeddings? Explore Dweve's hypervector approach. 65,536-bit binary patterns. XNOR-based similarity. Semantic meaning in binary space. The kind of representation that works at hardware speed.

Tagged with

#Embeddings #Vector Space #Representation Learning #AI Fundamentals

About the Author

Marc Filipan

CTO & Co-Founder

Building the future of AI with binary neural networks and constraint-based reasoning. Passionate about making AI accessible, efficient, and truly intelligent.
