How hatha.es works: architecture of a living project

An ancient problem

The labyrinth of sources

Imagine you want to understand what prāṇāyāma is. You open Patañjali’s Yoga Sūtras and find four verses (2.49-2.52) that mention it almost cryptically: “the regulation of inhalation and exhalation.” You need more.

You search the Haṭha Yoga Pradīpikā, where Svātmārāma dedicates the entire second chapter to describing eight specific techniques. But he uses terms you don’t know: kumbhaka, recaka, pūraka. Each requires its own investigation.

The Upaniṣads, written centuries earlier, mention prāṇa in contexts that seem contradictory: sometimes it’s breath, sometimes vital energy, sometimes the cosmic principle that animates the universe. What’s the relationship between these meanings?

And this is just one concept. Traditional yoga has hundreds.

The guru as living index

In the oral tradition, this problem didn’t exist—or rather, one person solved it: the master.

A guru who had memorized the texts, received commentaries from his own teacher, and practiced for decades functioned as a living index. When a student asked about prāṇāyāma, the master could recite the relevant Patañjali verse from memory, connect it with the practice described in the Pradīpikā, clarify how the word appears in the Praśna Upaniṣad, and explain why the Bihar School teaches the technique in a particular way.

This knowledge was embodied. It didn’t exist in books but in the relationship between master and disciple. Transmission was slow—a lifetime of study—but deep. Each connection came accompanied by context, nuance, practical experience.

The problem: this system doesn’t scale. It depends on finding a qualified master, having access to them for years, and trusting that their memory and understanding are correct.

The written solution: commentaries and concordances

When texts were written down, new tools emerged:

Commentaries (bhāṣya), like Vyāsa’s on the Yoga Sūtras, attempted to make explicit what the guru transmitted orally. Each cryptic verse received pages of explanation, references to other texts, practical examples. But commentaries generated their own problems: different schools produced contradictory commentaries, and soon you needed a commentary on the commentary.

Indexes and concordances appeared when texts became too numerous to memorize. Medieval librarians developed systems to locate passages by topic, by keyword, by cross-reference. The work was manual and expensive—a monk might spend years compiling the index of a single work.

Thesauri formalized relationships between terms. Not just “this word appears here” but “this word is synonymous with that one” and “this concept is part of that other.” Modern library science inherited these techniques and systematized them.

The problem isn’t new

This challenge—connecting dispersed knowledge—appears in every serious textual tradition:

The Jewish Talmud developed an extraordinarily sophisticated system of cross-references. Each page contains the central text surrounded by commentaries from different eras, with notes referring to discussions in other tractates. A Talmud scholar learns to navigate this network of connections.

Medieval Christian scholasticism created sententiae and indexes of loci communes: thematically organized compilations that allowed finding what Augustine had said on a topic, comparing it with Thomas Aquinas, and tracing the idea back to its biblical sources.

The Islamic hadith tradition developed the sciences of the chain of transmission (isnad): each saying of the Prophet comes accompanied by who transmitted it to whom, allowing evaluation of its authenticity and connection with other narrations.

All these traditions faced the same problem: important knowledge is dispersed, and the connections between fragments are as valuable as the fragments themselves.

Modern fragmentation

Today the problem has intensified. Yoga texts are available as never before—translations, critical editions, recordings of masters—but this abundance brings its own difficulty.

A student can find ten different translations of the Yoga Sūtras, each with its interpretation. They can access the original Pradīpikā in Sanskrit and academic commentaries. They can watch videos of masters explaining techniques. But no one tells them how to connect these sources.

Traditional tools fall short here: the student has no guru available around the clock, printed indexes are incomplete, and academic thesauri are too technical. Students navigate alone through an ocean of information, drowning in data but thirsty for understanding.

A contemporary response

hatha.es is an attempt to solve this problem by applying technology to ancient principles.

We invent nothing: gurus always connected texts, librarians always created indexes, commentators always explained terms. What we do is automate what previously required decades of manual work, and make it accessible to anyone with an internet connection.

The system we describe below does not claim to replace the master, since no system can transmit the embodied experience of practice. It aims to do the librarian’s work: organize, connect, facilitate access. So that when you find a guru, you arrive prepared. And so that while you search, you’re not completely alone.

Knowledge architecture

The concept graph

Every page on the site, whether a verse from the Yoga Sūtras, a prāṇāyāma technique, or a glossary term, carries structured metadata. The most important field is terminos: a list of the key concepts that appear in that content.

# Example: Yoga Sūtra 1.2
numero: 2
pada: 1
sanscrito: "योगश्चित्तवृत्तिनिरोधः"
transliteracion: "yogaś-citta-vṛtti-nirodhaḥ"
traduccion: "Yoga is the cessation of the fluctuations of the mind."
terminos: ['yoga', 'citta', 'vrtti', 'nirodha']

These terms function as nodes in a graph. When two texts share a term, they’re connected. The system recalculates these connections automatically on every build.
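As an illustration of the idea, here is a minimal sketch of how connections could be derived from shared terms. It is not the site’s actual code: the `TextEntry` shape, the `id` values, the function names, and the sample term lists are assumptions; only the `terminos` field comes from the metadata described above.

```typescript
// Hypothetical shape: only `terminos` is taken from the article; `id` is assumed.
interface TextEntry {
  id: string; // e.g. "yoga-sutras/1-02"
  terminos: string[];
}

// Inverted index: term -> ids of the texts that list it.
function buildTermIndex(entries: TextEntry[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const entry of entries) {
    for (const term of entry.terminos) {
      let ids = index.get(term);
      if (!ids) {
        ids = new Set<string>();
        index.set(term, ids);
      }
      ids.add(entry.id);
    }
  }
  return index;
}

// Texts connected to `entry`: every other text sharing at least one term.
function connectedTexts(entry: TextEntry, index: Map<string, Set<string>>): string[] {
  const linked = new Set<string>();
  for (const term of entry.terminos) {
    for (const id of index.get(term) ?? []) {
      if (id !== entry.id) linked.add(id);
    }
  }
  return [...linked].sort();
}

// Illustrative sample data, not the site's real term lists.
const sampleEntries: TextEntry[] = [
  { id: "yoga-sutras/1-02", terminos: ["yoga", "citta", "vrtti", "nirodha"] },
  { id: "yoga-sutras/1-12", terminos: ["nirodha", "abhyasa", "vairagya"] },
  { id: "pradipika/2-01", terminos: ["pranayama", "asana"] },
];
const termIndex = buildTermIndex(sampleEntries);
console.log(connectedTexts(sampleEntries[0], termIndex)); // ["yoga-sutras/1-12"]
```

Building the inverted index once and reading neighbours from it avoids comparing every pair of texts, which matters as the corpus grows.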

Controlled vocabulary

The Sanskrit glossary functions as a controlled vocabulary, a standard library-science technique for keeping terminology consistent. Each term has a canonical entry:

Normalized form: lowercase, no diacritics in the identifier
Equivalences: the system recognizes that “pranayama” and “prāṇāyāma” are the same concept
Authoritative definition: each term has a single source of truth

This approach avoids a classic problem of information systems: ambiguity. When both Sūtra 2.49 and verse 2.1 of the Pradīpikā mention prāṇāyāma, the system knows they refer to the same concept.
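One way such equivalences could be resolved is by reducing every spelling to the normalized identifier form. The sketch below assumes Unicode decomposition is enough for IAST input; `toTermId` and `sameConcept` are hypothetical names, not part of the site.

```typescript
// Hypothetical helper: reduce an IAST spelling to the glossary identifier form
// (lowercase, no diacritics).
function toTermId(label: string): string {
  return label
    .normalize("NFD") // split each letter from its combining marks
    .replace(/[\u0300-\u036f]/g, "") // drop the combining diacritical marks
    .toLowerCase()
    .trim();
}

// Two spellings are treated as equivalent when they reduce to the same identifier.
function sameConcept(a: string, b: string): boolean {
  return toTermId(a) === toTermId(b);
}

console.log(toTermId("prāṇāyāma")); // "pranayama"
console.log(sameConcept("pranayama", "Prāṇāyāma")); // true
console.log(toTermId("vṛtti")); // "vrtti"
```

Stripping diacritics also merges sounds that Sanskrit distinguishes (s, ś, and ṣ all become s), so a real glossary would probably need a manual check for identifiers that collide.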

Bidirectional links

The TextoConexiones component generates two types of links:

Concepts → Glossary: Each term in the verse links to its definition
See also → Other verses: Groups references by source text

Yoga Sūtra 3.3 (about dhyāna)
├── Concepts: dhyāna · samādhi
│   └── → Glossary
└── See also:
    ├── Yoga Sūtras: 1.39 · 2.11 · 2.29...
    ├── Haṭha Pradīpikā: 4.5 · 4.29...
    └── Bhagavad Gītā: 6.10 · 6.25...

The system calculates this at build time, not when the user visits the page. The result: the connections require no JavaScript in the browser.
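The “See also” grouping in the tree above can be sketched as a pure function that a build step might call once per page. This is an assumption about the approach, not the TextoConexiones source: the `VerseRef` shape, the function names, and the sample term assignments are hypothetical.

```typescript
// Hypothetical record for a verse; `source` and `ref` are assumed field names.
interface VerseRef {
  source: string; // e.g. "Yoga Sūtras"
  ref: string; // e.g. "2.29"
  terminos: string[];
}

// "See also" for one verse: other verses sharing a term, grouped by source text.
function seeAlso(current: VerseRef, all: VerseRef[]): Map<string, string[]> {
  const wanted = new Set(current.terminos);
  const groups = new Map<string, string[]>();
  for (const verse of all) {
    const isSelf = verse.source === current.source && verse.ref === current.ref;
    if (isSelf || !verse.terminos.some((t) => wanted.has(t))) continue;
    const refs = groups.get(verse.source) ?? [];
    refs.push(verse.ref);
    groups.set(verse.source, refs);
  }
  return groups;
}

// Render one line per source text, with references separated by " · ".
function renderSeeAlso(groups: Map<string, string[]>): string {
  return [...groups].map(([source, refs]) => `${source}: ${refs.join(" · ")}`).join("\n");
}

// Illustrative data: the term lists are assumed, not taken from the site.
const currentVerse: VerseRef = { source: "Yoga Sūtras", ref: "3.3", terminos: ["dhyana", "samadhi"] };
const corpus: VerseRef[] = [
  currentVerse,
  { source: "Yoga Sūtras", ref: "1.39", terminos: ["dhyana"] },
  { source: "Yoga Sūtras", ref: "2.29", terminos: ["dhyana", "samadhi"] },
  { source: "Haṭha Pradīpikā", ref: "4.5", terminos: ["samadhi"] },
  { source: "Haṭha Pradīpikā", ref: "2.1", terminos: ["pranayama"] },
];
console.log(renderSeeAlso(seeAlso(currentVerse, corpus)));
// Yoga Sūtras: 1.39 · 2.29
// Haṭha Pradīpikā: 4.5
```

Because the function depends only on the content, its output can be written into static HTML during the build.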

Technical standards adopted

Content Collections (Astro 5)

Content is organized in typed collections. Each collection has a schema that validates structure:

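import { z } from "astro:content";

// Schema for one sūtra: numero, pada, and both forms of the Sanskrit text are required.
const sutraSchema = z.object({
  numero: z.number(), // verse number within the pada
  pada: z.number(), // chapter
  sanscrito: z.string(), // Devanāgarī text
  transliteracion: z.string(),
  traduccion: z.string().optional(),
  terminos: z.array(z.string()).optional(), // key concepts for the graph
});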

If someone tries to add a verse without a number, schema validation fails and the content is rejected. This preserves structural integrity as the project grows.

Unique identifiers

Each piece of content has a predictable identifier:

Type	Pattern	Example
Yoga Sūtra	`{pada}-{numero}`	`1-02`
Pradīpikā	`{adhyaya}-{numero}`	`2-01`
Glossary	`{term}`	`pranayama`
Āsana	`{name}`	`padmasana`

This system is extensible: adding a new classical text only requires defining its schema and identifier pattern.
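The verse patterns in the table could be produced and parsed as follows. This sketch assumes the verse number is zero-padded to two digits, as the example `1-02` suggests; `verseId` and `parseVerseId` are hypothetical names.

```typescript
// Build an identifier such as "1-02" from a section (pada or adhyaya) and a verse number.
// Assumption: the verse number is zero-padded to at least two digits.
function verseId(section: number, numero: number): string {
  return `${section}-${String(numero).padStart(2, "0")}`;
}

// Inverse: recover the numeric parts from an identifier, or null if it is malformed.
function parseVerseId(id: string): { section: number; numero: number } | null {
  const match = /^(\d+)-(\d{2,})$/.exec(id);
  if (!match) return null;
  return { section: Number(match[1]), numero: Number(match[2]) };
}

console.log(verseId(1, 2)); // "1-02"
console.log(verseId(2, 1)); // "2-01"
console.log(parseVerseId("1-02")); // { section: 1, numero: 2 }
console.log(parseVerseId("1_02")); // null
```

A side benefit of zero-padding is that identifiers within a chapter sort correctly as plain strings, up to verse 99.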

Native internationalization

The site is bilingual (Spanish/English) with:

Separate routes: /textos/yoga-sutras/1-02 vs /en/texts/yoga-sutras/1-02
Parallel collections: sutras and sutras-en
Verifiable parity: the system can detect untranslated content
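Parity between the parallel collections can be checked by comparing identifiers. The sketch below assumes both collections use the same identifiers for corresponding entries; the function names and sample ids are hypothetical.

```typescript
// Identifiers present in the Spanish collection but missing from the English one.
function untranslated(esIds: string[], enIds: string[]): string[] {
  const translated = new Set(enIds);
  return esIds.filter((id) => !translated.has(id)).sort();
}

// Parity as the percentage of Spanish entries that have an English counterpart.
function parityPercent(esIds: string[], enIds: string[]): number {
  if (esIds.length === 0) return 100;
  const missing = untranslated(esIds, enIds).length;
  return Math.round(((esIds.length - missing) / esIds.length) * 100);
}

// Illustrative ids only.
const sutrasEs = ["1-01", "1-02", "1-03", "1-04"];
const sutrasEn = ["1-01", "1-02", "1-04"];
console.log(untranslated(sutrasEs, sutrasEn)); // ["1-03"]
console.log(parityPercent(sutrasEs, sutrasEn)); // 75
```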

Comparison with academic standards

Dublin Core and bibliographic metadata

The content schema of hatha.es follows the principles of Dublin Core, a widely used metadata standard for digital resources:

Dublin Core	hatha.es	Use
Title	titulo/nombre	Identification
Subject	terminos	Thematic classification
Language	collection (-en)	Internationalization
Source	fuentes	Traceability
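The mapping in the table can be read as a small conversion function. This is a sketch only: the `EntryMeta` and `DublinCoreRecord` shapes are hypothetical, and it assumes that English collections carry the `-en` suffix while all others are Spanish.

```typescript
// Frontmatter fields named in the table; the interface itself is an assumed shape.
interface EntryMeta {
  titulo?: string;
  nombre?: string;
  terminos?: string[];
  fuentes?: string[];
}

interface DublinCoreRecord {
  title: string;
  subject: string[];
  language: "es" | "en";
  source: string[];
}

// Map one entry to the four Dublin Core elements listed in the table.
function toDublinCore(collection: string, meta: EntryMeta): DublinCoreRecord {
  return {
    title: meta.titulo ?? meta.nombre ?? "",
    subject: meta.terminos ?? [],
    language: collection.endsWith("-en") ? "en" : "es",
    source: meta.fuentes ?? [],
  };
}

const dcRecord = toDublinCore("sutras-en", {
  titulo: "Yoga Sūtra 1.2",
  terminos: ["yoga", "citta", "vrtti", "nirodha"],
});
console.log(dcRecord.language); // "en"
console.log(dcRecord.subject.length); // 4
```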

Thesauri and controlled vocabularies

The glossary functions as a simplified thesaurus:

Preferred terms: The canonical form in IAST
Related terms: Via shared terminos field
Scope notes: The term’s definition

Unlike academic thesauri such as the Getty Art & Architecture Thesaurus (AAT), we prioritize accessibility over exhaustiveness. A student should be able to navigate intuitively, without cataloging training.
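Related terms “via shared terminos field” can be understood as co-occurrence: two terms are related when at least one text lists both. A minimal sketch under that assumption follows; the `relatedTerms` name and the sample lists are hypothetical.

```typescript
// Count how often each other term appears alongside `term` in a text's term list.
// Assumes a term is not repeated within a single list.
function relatedTerms(term: string, termLists: string[][]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const terminos of termLists) {
    if (!terminos.includes(term)) continue;
    for (const other of terminos) {
      if (other !== term) counts.set(other, (counts.get(other) ?? 0) + 1);
    }
  }
  // Most frequent first; ties broken alphabetically for stable output.
  return new Map([...counts].sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])));
}

// Illustrative term lists only.
const sampleTermLists = [
  ["dhyana", "samadhi"],
  ["dharana", "dhyana", "samadhi"],
  ["dhyana", "citta"],
];
console.log([...relatedTerms("dhyana", sampleTermLists)]);
// [["samadhi", 2], ["citta", 1], ["dharana", 1]]
```

Ranking by count keeps the strongest associations first, which suits a reader who navigates by intuition rather than by formal hierarchy.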

Linked Data and the semantic web

The system paves the way for Linked Data:

Stable and predictable identifiers
Explicit relationships between entities
Potential to export to RDF/JSON-LD

We don’t currently publish linked data, but the architecture allows it without restructuring.
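To make the export idea concrete, here is a hedged sketch of how one glossary entry might be serialized as JSON-LD with the SKOS vocabulary. The site publishes nothing like this today; the `GlossaryTerm` shape, the function name, the choice of SKOS, and the base URL are all assumptions.

```typescript
// Hypothetical glossary entry shape.
interface GlossaryTerm {
  id: string; // normalized identifier, e.g. "pranayama"
  label: string; // preferred form in IAST
  definition: string;
  related: string[]; // identifiers of related terms
}

// Serialize one term as a SKOS concept in JSON-LD.
function toJsonLd(term: GlossaryTerm, baseUrl: string): Record<string, unknown> {
  return {
    "@context": { skos: "http://www.w3.org/2004/02/skos/core#" },
    "@id": `${baseUrl}${term.id}`,
    "@type": "skos:Concept",
    "skos:prefLabel": { "@value": term.label, "@language": "sa-Latn" },
    "skos:definition": term.definition,
    "skos:related": term.related.map((id) => ({ "@id": `${baseUrl}${id}` })),
  };
}

const pranayamaTerm: GlossaryTerm = {
  id: "pranayama",
  label: "prāṇāyāma",
  definition: "The regulation of inhalation and exhalation.",
  related: ["kumbhaka"],
};
// The base URL is a placeholder, not the site's real glossary route.
console.log(JSON.stringify(toJsonLd(pranayamaTerm, "https://example.org/glossary/"), null, 2));
```

The stable identifiers do most of the work here: each `@id` is simply the base URL plus the identifier the glossary already uses.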

Why the project is alive

Frictionless updates

Adding new content is trivial:

Create a Markdown file with the correct frontmatter
The system validates the structure
An automatic build recalculates all connections

No database to migrate, no indexes to rebuild manually. Knowledge flows.

Distributed correction

Any error is corrected in a single place:

Misspelled term: Correct it in the glossary and all references update
Improved translation: Edit the file, rebuild
New academic source: Add reference, context is enriched

Organic growth

The shared terms system means every addition enriches everything else:

Add a new Upaniṣad → appears in “See also” of related verses
Add term to glossary → previously orphan links activate
Add prāṇāyāma technique → connects with Sūtras theory

It’s not a static archive. It’s an organism that grows.
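The “orphan links” mentioned above are terms that texts reference before the glossary defines them. A build step could list them with something like the following sketch; `orphanTerms` and the sample data are hypothetical.

```typescript
// Terms referenced by texts but not yet defined in the glossary.
function orphanTerms(termLists: string[][], glossaryIds: Iterable<string>): string[] {
  const defined = new Set(glossaryIds);
  const orphans = new Set<string>();
  for (const terminos of termLists) {
    for (const term of terminos) {
      if (!defined.has(term)) orphans.add(term);
    }
  }
  return [...orphans].sort();
}

// Illustrative data only.
const usedTermLists = [
  ["yoga", "citta", "vrtti", "nirodha"],
  ["pranayama", "kumbhaka"],
];
const glossaryIds = ["yoga", "citta", "nirodha", "pranayama"];
console.log(orphanTerms(usedTermLists, glossaryIds)); // ["kumbhaka", "vrtti"]

// Adding the missing entries to the glossary leaves no orphans.
console.log(orphanTerms(usedTermLists, [...glossaryIds, "vrtti", "kumbhaka"])); // []
```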

Current metrics

Metric	Value
Total pages	1,142
Glossary terms	90
Connections between texts	~1,050
ES/EN parity	100%
Build time	~15 seconds

The most connected term (samādhi) appears in 55 distinct verses. A student can follow the concept through four textual traditions with one click.

Invitation

This system does not claim to replace deep study or a teacher’s guidance. It aims to facilitate access to texts that for centuries were reserved for a few.

If you find an error, a missing connection, or want to contribute content, the project accepts collaborations. The architecture is designed to grow.

Haṭha Yoga is an intangible heritage. Making it accessible is a service.

Last updated: February 7, 2026