An ancient problem
The labyrinth of sources
Imagine you want to understand what prāṇāyāma is. You open Patañjali’s Yoga Sūtras and find four verses (2.49-2.52) that mention it almost cryptically: “the regulation of inhalation and exhalation.” You need more.
You search the Haṭha Yoga Pradīpikā, where Svātmārāma dedicates the entire second chapter to describing eight specific techniques. But he uses terms you don’t know: kumbhaka, recaka, pūraka. Each requires its own investigation.
The Upaniṣads, written centuries earlier, mention prāṇa in contexts that seem contradictory: sometimes it’s breath, sometimes vital energy, sometimes the cosmic principle that animates the universe. What’s the relationship between these meanings?
And this is just one concept. Traditional yoga has hundreds.
The guru as living index
In the oral tradition, this problem didn’t exist—or rather, one person solved it: the master.
A guru who had memorized the texts, received commentaries from his own teacher, and practiced for decades functioned as a living index. When a student asked about prāṇāyāma, the master could recite the relevant Patañjali verse from memory, connect it with the practice described in the Pradīpikā, clarify how the word appears in the Praśna Upaniṣad, and explain why the Bihar School teaches the technique in a particular way.
This knowledge was embodied. It didn’t exist in books but in the relationship between master and disciple. Transmission was slow—a lifetime of study—but deep. Each connection came accompanied by context, nuance, practical experience.
The problem: this system doesn’t scale. It depends on finding a qualified master, having access to them for years, and trusting that their memory and understanding are correct.
The written solution: commentaries and concordances
When texts were written down, new tools emerged:
Commentaries (bhāṣya), like Vyāsa’s on the Yoga Sūtras, attempted to make explicit what the guru transmitted orally. Each cryptic verse received pages of explanation, references to other texts, practical examples. But commentaries generated their own problems: different schools produced contradictory commentaries, and soon you needed a commentary on the commentary.
Indexes and concordances appeared when texts became too numerous to memorize. Medieval librarians developed systems to locate passages by topic, by keyword, by cross-reference. The work was manual and expensive—a monk might spend years compiling the index of a single work.
Thesauri formalized relationships between terms. Not just “this word appears here” but “this word is synonymous with that one” and “this concept is part of that other.” Modern library science inherited these techniques and systematized them.
The problem isn’t new
This challenge—connecting dispersed knowledge—appears in every serious textual tradition:
The Jewish Talmud developed an extraordinarily sophisticated system of cross-references. Each page contains the central text surrounded by commentaries from different eras, with notes referring to discussions in other treatises. A Talmud scholar learns to navigate this network of connections.
Medieval Christian scholasticism created sententiae and indexes of loci communes: thematically organized compilations that allowed finding what Augustine had said on a topic, comparing it with Thomas Aquinas, and tracing the idea back to its biblical sources.
The Islamic hadith tradition developed the sciences of the chain of transmission (isnad): each saying of the Prophet comes accompanied by who transmitted it to whom, allowing evaluation of its authenticity and connection with other narrations.
All these traditions faced the same problem: important knowledge is dispersed, and the connections between fragments are as valuable as the fragments themselves.
Modern fragmentation
Today the problem has intensified. Yoga texts are available as never before—translations, critical editions, recordings of masters—but this abundance brings its own difficulty.
A student can find ten different translations of the Yoga Sūtras, each with its interpretation. They can access the original Pradīpikā in Sanskrit and academic commentaries. They can watch videos of masters explaining techniques. But no one tells them how to connect these sources.
The traditional tools fall short: a student has no guru available around the clock, printed indexes are incomplete, and academic thesauri are too technical. Students navigate alone through an ocean of information, drowning in data but thirsty for understanding.
A contemporary response
hatha.es is an attempt to solve this problem by applying technology to ancient principles.
We invent nothing: gurus always connected texts, librarians always created indexes, commentators always explained terms. What we do is automate what previously required decades of manual work, and make it accessible to anyone with an internet connection.
The system we describe below doesn’t claim to replace the master: no system can transmit the embodied experience of practice. It aims to do the librarian’s work: organize, connect, facilitate access. So that when you find a guru, you arrive prepared. And so that while you search, you’re not completely alone.
Knowledge architecture
The concept graph
Every page on the site (whether a verse from the Yoga Sūtras, a prāṇāyāma technique, or a glossary term) contains structured metadata. The most important field is terminos: a list of key concepts appearing in that content.
# Example: Yoga Sūtra 1.2
numero: 2
pada: 1
sanscrito: "योगश्चित्तवृत्तिनिरोधः"
transliteracion: "yogaś-citta-vṛtti-nirodhaḥ"
terminos: ['yoga', 'citta', 'vrtti', 'nirodha']
These terms function as nodes in a graph. When two texts share a term, they’re connected. The system calculates these connections automatically on every update.
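One way to picture this connection step is as an inverted index from term to entries. The following is a minimal sketch, not the site's actual build code: the `Entry` shape, the helper names `buildTermIndex` and `connectionsFor`, and the two `example/…` entries are assumptions made for illustration. Only the terms of Sūtra 1.2 come from the example above.

```typescript
// Hypothetical minimal shape; the real collections carry more fields.
interface Entry {
  id: string;         // e.g. "sutras/1-02"
  terminos: string[]; // normalized concept identifiers
}

// Build an inverted index: term -> ids of the entries that mention it.
function buildTermIndex(entries: Entry[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const entry of entries) {
    for (const term of entry.terminos) {
      const ids = index.get(term) ?? new Set<string>();
      ids.add(entry.id);
      index.set(term, ids);
    }
  }
  return index;
}

// Entries connected to `entry`: every other entry sharing at least one term.
function connectionsFor(entry: Entry, index: Map<string, Set<string>>): string[] {
  const related = new Set<string>();
  for (const term of entry.terminos) {
    const ids = index.get(term);
    if (!ids) continue;
    ids.forEach((id) => {
      if (id !== entry.id) related.add(id);
    });
  }
  return Array.from(related).sort();
}

const entries: Entry[] = [
  { id: 'sutras/1-02', terminos: ['yoga', 'citta', 'vrtti', 'nirodha'] },
  { id: 'example/a', terminos: ['citta', 'samadhi'] }, // hypothetical entry
  { id: 'example/b', terminos: ['asana'] },            // hypothetical entry
];
const index = buildTermIndex(entries);
const linked = connectionsFor(entries[0], index); // shares "citta" with example/a
```

Because the index is rebuilt from the frontmatter alone, adding an entry never requires editing the entries it ends up connected to.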
Controlled vocabulary
The Sanskrit glossary functions as a controlled vocabulary, a standard library-science technique for keeping terminology consistent. Each term has a canonical entry:
- Normalized form: lowercase, no diacritics in the identifier
- Equivalences: the system recognizes that “pranayama” and “prāṇāyāma” are the same concept
- Authoritative definition: each term has a single source of truth
This approach avoids the classic problem of information systems: ambiguity. When Sūtra 2.49 mentions prāṇāyāma and verse 2.1 of the Pradīpikā also does, the system knows they’re talking about the same thing.
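The equivalence between “pranayama” and “prāṇāyāma” can be obtained with Unicode normalization. This is a sketch under the assumption that identifiers are derived by stripping diacritics from the IAST form; the article does not say how the site does it, and `normalizeTerm` is a hypothetical helper name.

```typescript
// Strip diacritics and lowercase to obtain a glossary identifier.
// NFD splits "ā" into "a" plus a combining macron; the regex then removes
// every combining mark in the U+0300 to U+036F block.
function normalizeTerm(term: string): string {
  return term
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase();
}

// Both spellings resolve to the same identifier.
const sameConcept = normalizeTerm('prāṇāyāma') === normalizeTerm('pranayama');
```

The IAST letters used in this article (ā, ṇ, ṛ, ś, ṣ, ṭ, ṃ, ñ) all decompose into a base letter plus a combining mark, so this single rule covers them.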
Bidirectional links
The TextoConexiones component generates two types of links:
- Concepts → Glossary: Each term in the verse links to its definition
- See also → Other verses: Groups references by source text
Yoga Sūtra 3.3 (about samādhi)
├── Concepts: dhyāna · samādhi
│   └── → Glossary
└── See also:
    ├── Yoga Sūtras: 1.39 · 2.11 · 2.29...
    ├── Haṭha Pradīpikā: 4.5 · 4.29...
    └── Bhagavad Gītā: 6.10 · 6.25...
The system calculates this at build-time, not when the user visits the page. Result: zero JavaScript running in the browser for connections.
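The “See also” grouping can be sketched as a plain group-by over the related references. The `Reference` type and the `groupBySource` and `renderSeeAlso` helpers are hypothetical; the TextoConexiones component itself is not shown in this article. The sample references are the ones from the tree above.

```typescript
interface Reference {
  source: string; // display name of the source text
  ref: string;    // verse reference, e.g. "1.39"
}

// Group related references by source text, keeping first-seen order.
function groupBySource(refs: Reference[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const { source, ref } of refs) {
    const list = groups.get(source) ?? [];
    list.push(ref);
    groups.set(source, list);
  }
  return groups;
}

// Render one "See also" line per source text.
function renderSeeAlso(groups: Map<string, string[]>): string[] {
  const lines: string[] = [];
  groups.forEach((list, source) => {
    lines.push(`${source}: ${list.join(' · ')}`);
  });
  return lines;
}

const seeAlso = renderSeeAlso(groupBySource([
  { source: 'Yoga Sūtras', ref: '1.39' },
  { source: 'Haṭha Pradīpikā', ref: '4.5' },
  { source: 'Yoga Sūtras', ref: '2.11' },
  { source: 'Bhagavad Gītā', ref: '6.10' },
]));
```

Running this at build time means the rendered page only contains the finished list of links.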
Technical standards adopted
Content Collections (Astro 5)
Content is organized in typed collections. Each collection has a schema that validates structure:
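import { z } from 'astro:content';

// Schema for one Yoga Sūtra entry; frontmatter that does not match is rejected.
const sutraSchema = z.object({
  numero: z.number(),          // verse number within the pāda
  pada: z.number(),            // chapter (pāda)
  sanscrito: z.string(),       // Devanāgarī text
  transliteracion: z.string(), // transliteration
  terminos: z.array(z.string()).optional(), // key concepts
});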
If someone tries to add a verse without a number, the system rejects the content. This guarantees structural integrity as the project grows.
Unique identifiers
Each piece of content has a predictable identifier:
| Type | Pattern | Example |
|---|---|---|
| Yoga Sūtra | {pada}-{numero} | 1-02 |
| Pradīpikā | {adhyaya}-{numero} | 2-01 |
| Glossary | {term} | pranayama |
| Āsana | {name} | padmasana |
This system is extensible: adding a new classical text only requires defining its schema and identifier pattern.
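The verse patterns in the table can be produced by a small helper. This sketch assumes two-digit zero-padding, inferred only from the examples 1-02 and 2-01; how verse numbers of 100 or more are written is not stated, and `verseId` is a hypothetical name.

```typescript
// Zero-pad to two digits, matching the "1-02" example in the table.
function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// {pada}-{numero} for the Yoga Sūtras, {adhyaya}-{numero} for the Pradīpikā.
function verseId(section: number, numero: number): string {
  return `${section}-${pad2(numero)}`;
}

const sutraId = verseId(1, 2);     // "1-02"
const pradipikaId = verseId(2, 1); // "2-01"
```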
Native internationalization
The site is bilingual (Spanish/English) with:
- Separate routes: /textos/yoga-sutras/1-02 vs /en/texts/yoga-sutras/1-02
- Parallel collections: sutras and sutras-en
- Verifiable parity: the system can detect untranslated content
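A parity check of this kind reduces to a set difference over identifiers. The sketch below assumes that parallel collections use the same ids for the same verse; `missingTranslations` and the sample id lists are hypothetical.

```typescript
// Ids present in the source-language collection but absent from the translation.
function missingTranslations(sourceIds: string[], translatedIds: string[]): string[] {
  const translated = new Set(translatedIds);
  return sourceIds.filter((id) => !translated.has(id));
}

// Hypothetical id lists; in practice they would come from the
// sutras and sutras-en collections.
const esIds = ['1-01', '1-02', '1-03'];
const enIds = ['1-01', '1-03'];
const untranslated = missingTranslations(esIds, enIds); // ["1-02"]
```

An empty result for every pair of collections is what a 100% parity figure would correspond to.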
Comparison with academic standards
Dublin Core and bibliographic metadata
The content schema of hatha.es follows Dublin Core principles, the metadata standard for digital resources:
| Dublin Core | hatha.es | Use |
|---|---|---|
| Title | titulo/nombre | Identification |
| Subject | terminos | Thematic classification |
| Language | collection (-en) | Internationalization |
| Source | fuentes | Traceability |
Thesauri and controlled vocabularies
The glossary functions as a simplified thesaurus:
- Preferred terms: The canonical form in IAST
- Related terms: Via shared terminos field
- Scope notes: The term’s definition
Unlike academic thesauri such as the AAT (Getty), we prioritize accessibility over exhaustiveness. A student should be able to navigate intuitively, not need cataloging training.
Linked Data and the semantic web
The system paves the way for Linked Data:
- Stable and predictable identifiers
- Explicit relationships between entities
- Potential to export to RDF/JSON-LD
We don’t currently publish linked data, but the architecture allows it without restructuring.
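To make the export idea concrete, here is a hedged sketch of what a JSON-LD rendering of one entry could look like, reusing the Dublin Core mapping from the table above. Nothing here is published by the site: the `SutraEntry` shape, the `toJsonLd` helper, the sample title, and the https://hatha.es base URL are assumptions, and the paths follow the routes listed under Native internationalization.

```typescript
interface SutraEntry {
  id: string;         // e.g. "1-02"
  titulo: string;
  terminos: string[];
  lang: 'es' | 'en';
}

// Map an entry to a JSON-LD object using Dublin Core terms.
function toJsonLd(entry: SutraEntry): Record<string, unknown> {
  // Assumed URL layout, one route prefix per language.
  const path = entry.lang === 'en'
    ? `/en/texts/yoga-sutras/${entry.id}`
    : `/textos/yoga-sutras/${entry.id}`;
  return {
    '@context': { dcterms: 'http://purl.org/dc/terms/' },
    '@id': `https://hatha.es${path}`, // assumed base URL
    'dcterms:title': entry.titulo,
    'dcterms:subject': entry.terminos,
    'dcterms:language': entry.lang,
  };
}

const doc = toJsonLd({
  id: '1-02',
  titulo: 'Yoga Sūtra 1.2', // hypothetical title
  terminos: ['yoga', 'citta', 'vrtti', 'nirodha'],
  lang: 'en',
});
```

The stable identifiers do most of the work here: the `@id` is derived from the same pattern that already names the page.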
Explore: the serendipity feed
The problem with linear entry points
Every digital library faces the same tension: you need a starting point, but knowledge is not linear.
Traditional entry points (a table of contents, an index, a search box) are intentional tools. They work when you know what you’re looking for. But some of the most valuable encounters with a text happen when you weren’t looking for anything in particular. You open a book, your eye lands on a verse, and something resonates.
That accidental encounter is difficult to design. Most websites optimize for intention: search, browse, filter. The serendipitous dimension disappears.
Explore is an attempt to recover it.
What it is
Explore presents an infinite scroll of wisdom cards drawn from the full corpus — 1,644 entries spanning all seven classical texts and the Sanskrit glossary. Cards appear in randomized order, reshuffled on every page load. As you scroll, new batches load silently in the background.
Each card shows:
- The source text (colour-coded by tradition)
- The Sanskrit original in Devanāgarī
- The translation
Clicking any card goes directly to the verse page with its full commentary and cross-references.
How it works technically
The feed is powered by a dedicated API endpoint (/api/wisdom.json) that runs server-side on every request:
GET /api/wisdom.json?page=1&size=20&seed=482930&lang=en
Parameters:
- page: pagination (the feed is infinite; when it reaches the end it wraps around)
- size: cards per batch (default 20, max 50)
- seed: a random integer generated on first load; reused for all subsequent pages so that the shuffle is consistent within a session but different across sessions
- lang: es or en; determines both which collections are loaded and which URLs are generated
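The size limit and the wrap-around behaviour can be sketched as follows. This is an illustration, not the endpoint's code: `clampSize` and `pageOf` are hypothetical helpers, pages are assumed to be 1-based (as page=1 in the request above suggests), and falling back to the default for invalid sizes is an assumption.

```typescript
// Clamp the requested batch size to the documented limits (default 20, max 50).
function clampSize(size: number | undefined): number {
  if (size === undefined || !Number.isFinite(size) || size < 1) return 20;
  return Math.min(Math.floor(size), 50);
}

// Return one page of an already-shuffled pool, wrapping around at the end
// so that the feed never runs out of cards.
function pageOf<T>(pool: T[], page: number, size: number): T[] {
  if (pool.length === 0) return [];
  const start = (Math.max(1, Math.floor(page)) - 1) * size;
  const batch: T[] = [];
  for (let i = 0; i < size; i++) {
    batch.push(pool[(start + i) % pool.length]);
  }
  return batch;
}

const pool = ['a', 'b', 'c', 'd', 'e'];
const first = pageOf(pool, 1, 2); // ["a", "b"]
const third = pageOf(pool, 3, 2); // ["e", "a"], wrapping past the end
```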
The card pool is built once per server process and cached in memory:
import { getCollection } from 'astro:content';

// Cards are loaded from language-specific collections
const [sutras, gita, pradipika, vbt, upanishads,
       glosario, gheranda, shivaSamhita] = await Promise.all([
  getCollection('sutras-en'),           // → /en/texts/yoga-sutras/…
  getCollection('bhagavad-gita-en'),
  getCollection('hatha-pradipika-en'),
  getCollection('vijñana-bhairava-en'),
  getCollection('upanishads-en'),
  getCollection('glosario-en'),         // → /en/glossary/…
  getCollection('gheranda-samhita-en'),
  getCollection('shiva-samhita-en'),
]);
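“Built once per server process and cached in memory” is commonly done with a module-level promise cache. The sketch below shows that general pattern; whether the endpoint caches per language, and the names `poolCache`, `getPool` and `Card`, are assumptions.

```typescript
type Card = { id: string };

// Module-level cache: one promise per language, shared by every request
// handled by this server process.
const poolCache = new Map<string, Promise<Card[]>>();

function getPool(
  lang: string,
  load: (lang: string) => Promise<Card[]>,
): Promise<Card[]> {
  let pool = poolCache.get(lang);
  if (!pool) {
    // Start loading once; concurrent callers share the same promise.
    // On failure the entry is dropped so that a later request can retry.
    pool = load(lang).catch((err) => {
      poolCache.delete(lang);
      throw err;
    });
    poolCache.set(lang, pool);
  }
  return pool;
}
```

Caching the promise rather than the resolved array avoids loading the collections twice when two requests arrive before the first load finishes.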
The seeded shuffle uses a linear congruential generator: a deterministic pseudo-random function that, given the same seed, always produces the same sequence and therefore the same card order. This keeps pagination coherent: every page is a slice of the same shuffled list, so page 2 continues where page 1 ended, without duplicates or gaps.
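// Linear congruential generator: the same seed always yields the same sequence.
function seededRandom(seed: number): () => number {
  return function () {
    seed = (seed * 9301 + 49297) % 233280; // advance the internal state
    return seed / 233280;                  // scale the state into [0, 1)
  };
}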
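The article does not show how the generator is applied to the card pool. A common choice is a Fisher-Yates shuffle, sketched here under that assumption; `seededShuffle` is a hypothetical name, the seed is assumed to be a non-negative integer, and the generator is repeated so the example is self-contained.

```typescript
// Same generator as above, repeated so the sketch runs on its own.
function seededRandom(seed: number): () => number {
  return function () {
    seed = (seed * 9301 + 49297) % 233280;
    return seed / 233280;
  };
}

// Fisher-Yates shuffle driven by the seeded generator; returns a new array
// and leaves the input untouched.
function seededShuffle<T>(items: T[], seed: number): T[] {
  const result = items.slice();
  const random = seededRandom(seed);
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1)); // 0 <= j <= i
    const tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
  }
  return result;
}

const ids = ['1-01', '1-02', '1-03', '1-04', '1-05'];
const orderA = seededShuffle(ids, 482930);
const orderB = seededShuffle(ids, 482930); // same seed, same order
```

One property worth knowing: the generator's state is always reduced modulo 233280, so at most 233,280 distinct orderings can ever be produced, however large the seed.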
Card sizing and visual language
Cards come in three sizes that the API assigns based on translation length:
| Size | Condition | Visual weight |
|---|---|---|
| S | < 50 characters | Compact; glossary entries, short aphorisms |
| M | 50–150 characters | Standard verse |
| L | > 150 characters | Dense; long Gītā or Śivasaṃhitā verses |
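The thresholds in the table translate directly into a small classifier. This is a sketch: `cardSize` is a hypothetical name, and it measures length with `String.length` (UTF-16 code units), which may differ from how the API counts characters.

```typescript
type CardSize = 'S' | 'M' | 'L';

// Assign a card size from the translation length, following the table above.
function cardSize(translation: string): CardSize {
  const length = translation.length;
  if (length < 50) return 'S';   // compact
  if (length <= 150) return 'M'; // standard verse
  return 'L';                    // dense
}
```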
The CSS grid arranges cards in a masonry-like layout using grid-auto-rows and span values derived from size. No JavaScript layout engine — the browser handles it natively.
Each textual tradition has its own colour palette, deliberately distinct:
| Tradition | Background | Accent | Mode |
|---|---|---|---|
| Yoga Sūtras | warm parchment | dark gold | light |
| Bhagavad Gītā | deep indigo | lavender | dark |
| Haṭha Pradīpikā | dark umber | sand | dark |
| Vijñāna Bhairava | near black | crimson | dark |
| Upaniṣads | dark forest | sage | dark |
| Gheraṇḍa Saṃhitā | dark olive | golden | dark |
| Śivasaṃhitā | deep violet-black | soft purple | dark |
| Glossary | off-white | warm brown | light |
The palette is intentional: scanning the feed, you can identify the tradition of each card before reading a word.
Corpus coverage
The full pool (as of February 2026) contains 1,644 entries in English, covering:
| Text | Entries |
|---|---|
| Haṭha Yoga Pradīpikā | 332 verses |
| Śivasaṃhitā | 644 verses |
| Gheraṇḍa Saṃhitā | 233 ślokas |
| Vijñāna Bhairava Tantra | 112 dhāraṇās |
| Yoga Sūtras | 195 sūtras |
| Upaniṣads | 61 verses |
| Bhagavad Gītā | 38 verses |
| Glossary | 139 terms |
The Spanish feed (/explora/) has an equivalent pool. Both are fully independent: different collections, different translations, different URLs.
The philosophy behind it
Explore doesn’t try to teach. It doesn’t suggest a path, or sort by relevance, or recommend “related content.” It simply surfaces what’s there, in an order no one chose.
This is closer to how classical texts were often encountered in the oral tradition: not as systematic study programmes, but as exposure — a verse heard in passing, a term that appeared unexpectedly and lodged in memory, a question arising from a context that couldn’t have been predicted.
The guru’s oral transmission had an element of this: the teaching given was not always the teaching planned. Something in the moment, in the student’s question, in the atmosphere, determined what was said. The feed is a faint digital echo of that principle — structured randomness as a door to encounter.
Why the project is alive
Frictionless updates
Adding new content is trivial:
- Create a Markdown file with the correct frontmatter
- The system validates the structure
- An automatic build recalculates all connections
No database to migrate, no indexes to rebuild manually. Knowledge flows.
Distributed correction
Any error is corrected in a single place:
- Misspelled term: Correct in the glossary, all references update
- Improved translation: Edit the file, rebuild
- New academic source: Add reference, context is enriched
Organic growth
The shared terms system means every addition enriches everything else:
- Add a new Upaniṣad → appears in “See also” of related verses
- Add term to glossary → previously orphan links activate
- Add prāṇāyāma technique → connects with Sūtras theory
It’s not a static archive. It’s an organism that grows.
Current metrics
| Metric | Value |
|---|---|
| Total pages | 3,751 |
| Classical texts | 7 |
| Verses / ślokas / dhāraṇās | 1,615 |
| Glossary terms | 139 |
| Explore feed (EN) | 1,644 cards |
| ES/EN parity | 100% |
| Build time | ~11 seconds |
The most connected term (samādhi) appears in over 80 distinct verses. A student can follow the concept through seven textual traditions with one click.
Invitation
This system doesn’t claim to replace deep study or a teacher’s guidance. It aims to facilitate access to texts that for centuries were reserved for a few.
If you find an error, a missing connection, or want to contribute content, the project accepts collaborations. The architecture is designed to grow.
Haṭha Yoga is an intangible heritage. Making it accessible is a service.
Last updated: February 7, 2026