BIBLE DATA EXPLORER

Full-text navigator + Data Science Magic Lab: TF-IDF, key phrases (RAKE), concordance, similarity, collocations (PMI), Zipf's law, clustering, sentiment, and exports.
Default translation: King James Version (KJV) via an open MIT-licensed JSON dataset.

📚 66 books

🔎 Search + Concordance

🧠 Verse Similarity

🧩 Clustering

📈 Zipf + PMI

⭐ Bookmarks

⬇️ Export JSON / CSV

Book

Chapter

Analysis scope

Search term

Token mode

N-grams

Stopwords (comma-separated)

Loading books list… — — Scope: Chapter Selected: —

Sacred Text (click a verse)

Analysis (Data Science)

Total tokens: 0

Unique tokens: 0

Verses: 0

Avg tokens/verse: 0

Lexical diversity (TTR): 0

Hapax (count=1): 0

Median: —

Std dev: —

Stopword rate: —

Readability: —

Verse length chart: tokens per verse (sparkline)

Verse length histogram: distribution of verse lengths

Keywords, Phrases, Concordance, Compare

Top 10 Keywords: frequency after stopword removal

TF-IDF signature terms: chapter vs other chapters in this book

Key phrases (RAKE): multi-word phrases extracted from text

Entities (simple proper-noun extraction): capitalized words, counted

Theme signals: keyword-based thematic scoring

Concordance: where your term appears

Occurrences

—

Verses matched

—

Coverage

—

Compare two passages: Jaccard + cosine similarity + distinctive words

Compare book

Compare chapter

Jaccard (unique tokens)

—

Cosine (term freq)

—

Shared top terms

—

Distinctive in current: highest relative frequency

Distinctive in comparison: highest relative frequency
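The two comparison scores above can be sketched as follows. This is a minimal illustration, assuming Jaccard is computed on sets of unique tokens and cosine on raw term-frequency counts, as the labels suggest; the function names and the tokenizer are hypothetical and not taken from the app.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumption; the app's token mode may differ)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity on unique tokens: |A ∩ B| / |A ∪ B|."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_tf(tokens_a, tokens_b):
    """Cosine similarity between raw term-frequency vectors."""
    fa, fb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(fa[t] * fb[t] for t in fa.keys() & fb.keys())
    norm_a = math.sqrt(sum(c * c for c in fa.values()))
    norm_b = math.sqrt(sum(c * c for c in fb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


passage_a = tokenize("In the beginning God created the heaven and the earth")
passage_b = tokenize("In the beginning was the Word")
print(round(jaccard(passage_a, passage_b), 3), round(cosine_tf(passage_a, passage_b), 3))
```

Jaccard ignores how often a word repeats, while cosine on term frequencies rewards passages that repeat the same words in similar proportions, which is presumably why the panel reports both.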

Data Science Magic Lab

Click any verse in the left panel to select it, then this lab finds the most similar verses (cosine similarity on TF-IDF vectors). Tip: choose Chapter scope for the snappiest experience.
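A rough sketch of the similarity search described above, assuming each verse is already tokenized with stopwords removed. The smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` is an assumption (the page does not state which TF-IDF variant it uses), and `tfidf_vectors`, `cosine`, and `most_similar` are hypothetical names.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dict term -> weight) from token lists."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per verse
    # Assumed smoothed IDF; the app may use a different variant.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(vectors, index, top_n=5):
    """Return (verse_index, score) pairs for the verses closest to vectors[index]."""
    scored = [(cosine(vectors[index], v), j) for j, v in enumerate(vectors) if j != index]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(j, score) for score, j in scored[:top_n]]


verses = [["light", "darkness"], ["light", "day"], ["sheep", "shepherd"]]
vectors = tfidf_vectors(verses)
print(most_similar(vectors, 0))
```

The cost grows with the number of verses in scope, since the selected verse is compared against every other vector, which fits the tip about Chapter scope being the snappiest.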

Similar verses

Similarity scope

Selected verse profile: keywords + sentiment + length

Tokens (no stop)

—

Sentiment (toy)

—

Top TF-IDF

—

Most similar verses: click to jump/select

Explore where a term “lives” in the text. This builds a distribution across chapters/books and lets you jump directly by clicking bars. For full-Bible maps, load all books first.

Term

Map level

Term heat map —

Collocations find word pairs that “stick together”. Here we rank bigrams by PMI (pointwise mutual information), with a minimum count filter.
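For a bigram (x, y), PMI is log2( p(x, y) / (p(x) p(y)) ). The sketch below shows one way to rank bigrams this way with a minimum count filter. It assumes bigrams are counted within verses only (never across verse boundaries), and `pmi_collocations` is a hypothetical name, not the app's own function.

```python
import math
from collections import Counter


def pmi_collocations(verses, min_count=2):
    """Rank adjacent word pairs by PMI, keeping only bigrams seen >= min_count times.

    verses: list of token lists. Returns (bigram, pmi, count) tuples, best first.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in verses:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))  # pairs stay inside one verse
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scored = []
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs get inflated PMI, so filter them out
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scored.append(((a, b), math.log2(p_ab / (p_a * p_b)), count))
    scored.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return scored


verses = [
    ["holy", "spirit", "came"],
    ["the", "holy", "spirit"],
    ["the", "lord", "came"],
    ["the", "lord", "spake"],
]
for bigram, score, count in pmi_collocations(verses):
    print(bigram, round(score, 2), count)
```

The minimum count matters because PMI favors rare words: a pair that occurs once, made of words that each occur once, gets the maximum possible score without being a meaningful collocation.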

Min bigram count

Show

PMI-ranked collocations: score + count

Zipf’s law says a word’s frequency is roughly inversely proportional to its frequency rank (a power law). This plot uses your current analysis scope and estimates the slope of the log-log line.
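One way to estimate the slope and R² is an ordinary least-squares fit of log(freq) against log(rank). The sketch below assumes that approach (the page does not say how the fit is done); `zipf_fit` and its `max_ranks` parameter are hypothetical names that mirror the "Max ranks" control.

```python
import math
from collections import Counter


def zipf_fit(tokens, max_ranks=200):
    """Least-squares fit of log(freq) = intercept + slope * log(rank).

    Returns (slope, intercept, r_squared). A slope near -1 matches classic Zipf.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)[:max_ranks]
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens to fit a line")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R² is undefined when every frequency is identical; report 0.0 in that case.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Synthetic text whose frequencies are exactly 60 / rank.
tokens = []
for rank, freq in enumerate([60, 30, 20, 15, 12], start=1):
    tokens += [f"w{rank}"] * freq
slope, intercept, r2 = zipf_fit(tokens)
print(round(slope, 3), round(r2, 3))
```

On real text the fit is usually weaker at the extremes (the few most frequent words and the long tail of hapaxes), so limiting the number of ranks changes the estimated slope.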

Max ranks

Slope: — R²: —

Zipf plot: log(rank) vs log(freq)

Cluster verses into groups based on TF-IDF vectors. This is a lightweight k-means (cosine distance). Best with Chapter scope.
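A lightweight k-means with cosine distance can be sketched as below. It assumes dense vectors (for example, TF-IDF weights over a fixed vocabulary), unit-normalizes them so that the dot product equals cosine similarity, and picks initial centroids at random with a fixed seed. The initialization and the name `kmeans_cosine` are assumptions, not details taken from the app.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kmeans_cosine(vectors, k, iterations=10, seed=0):
    """Cluster vectors by cosine similarity. Returns (labels, centroids)."""
    points = [normalize(v) for v in vectors]
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment: highest cosine similarity is the smallest cosine distance.
        labels = [max(range(k), key=lambda c: dot(p, centroids[c])) for p in points]
        # Update: mean of members, re-normalized to unit length.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            mean = [sum(column) / len(members) for column in zip(*members)]
            centroids[c] = normalize(mean)
    return labels, centroids


vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, _ = kmeans_cosine(vectors, k=2)
print(labels)
```

Each iteration compares every verse with every centroid, so the running time scales with verses × K × iterations, which is consistent with the advice to use Chapter scope.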

K clusters

Iterations

Cluster summary: click a cluster to filter verses

Build a co‑occurrence network of names/places (proper nouns) from the current scope. Nodes are entities; edges mean they appear together in the same verse (or same chapter window).
Tip: load all books if you want cross‑Bible networks.
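The verse-level version of this network can be sketched as follows. It assumes entities are simply capitalized words, minus a small list of capitalized function words that tend to start verses; the regex, the `IGNORE` set, and the function names are hypothetical, and the chapter-window variant is not shown.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed filter for words that are capitalized only because they start a verse.
IGNORE = frozenset({"And", "The", "Then", "But", "For", "In", "Now", "So", "Thus", "Behold"})


def extract_entities(verse):
    """Return the set of capitalized words in a verse, minus common verse openers."""
    words = re.findall(r"\b[A-Z][A-Za-z]+\b", verse)
    return {w for w in words if w not in IGNORE}


def cooccurrence_network(verses, min_node_freq=1, min_edge_weight=1):
    """Build (node_freq, edges) where edges map (a, b) -> shared verse count."""
    per_verse = [extract_entities(v) for v in verses]
    node_freq = Counter()
    for entities in per_verse:
        node_freq.update(entities)
    keep = {e for e, count in node_freq.items() if count >= min_node_freq}
    edges = Counter()
    for entities in per_verse:
        # Sorting makes (a, b) canonical, so each undirected edge has one key.
        for a, b in combinations(sorted(entities & keep), 2):
            edges[(a, b)] += 1
    strong = {pair: w for pair, w in edges.items() if w >= min_edge_weight}
    return node_freq, strong


verses = [
    "And Abraham said unto Isaac",
    "Abraham and Isaac went both of them together",
    "Then Sarah laughed within herself",
]
nodes, edges = cooccurrence_network(verses, min_edge_weight=2)
print(edges)
```

The two minimum thresholds correspond to the "Min node freq" and "Min edge weight" controls: raising them prunes rare names and weak links before the graph is drawn.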

Window

Min node freq

Min edge weight

Max nodes

Network canvas: drag to pan • scroll to zoom • click node for neighbors

Speaker Lab uses patterns like “X said”, “X spake”, “X answered” to estimate dialogue structure. It also estimates speech vs narrative share per verse and draws a speech‑density timeline.
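The cue patterns mentioned above can be approximated with a single regular expression. This sketch assumes a speaker is a capitalized word immediately followed by "said", "spake", or "answered", and marks a verse as speech-like if it contains at least one such cue; the pattern, the `NOT_SPEAKERS` set, and the function name are hypothetical, and the Strictness control is not modeled.

```python
import re
from collections import Counter

# Capitalized word followed by a speech verb, e.g. "God said", "Moses answered".
CUE = re.compile(r"\b([A-Z][A-Za-z]+)\s+(said|spake|answered)\b")

# Assumed filter for capitalized words that are not names.
NOT_SPEAKERS = frozenset({"And", "Then", "He", "She", "They", "Who", "But"})


def speech_cues(verses):
    """Return (speaker_counts, timeline) where timeline[i] is 1 if verse i has a cue."""
    speakers = Counter()
    timeline = []
    for verse in verses:
        found = [name for name, _verb in CUE.findall(verse) if name not in NOT_SPEAKERS]
        speakers.update(found)
        timeline.append(1 if found else 0)
    return speakers, timeline


verses = [
    "And God said, Let there be light",
    "And the evening and the morning were the first day",
    "And Moses answered and said, But, behold, they will not believe me",
]
speakers, timeline = speech_cues(verses)
print(speakers.most_common(), timeline)
```

A pattern like this misses pronoun speakers ("And he said") and multi-word names, so the counts are best read as rough estimates of who speaks, in line with the page's own wording.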

Top speakers

Strictness

Top speakers: counts of speech cues

Speech density timeline: 1 = speech-like verse, 0 = narrative-like

Book Analytics works best after you load multiple books (press Load all books). It computes per‑book stats, finds the most similar books to your current selection, and clusters books into “families” based on TF‑IDF.

K clusters

Top similar books

Book leaderboard: click to jump to a book

Most similar books: cosine similarity (TF‑IDF)

Book clusters: k‑means over TF‑IDF

Star verses (⭐) to bookmark. Bookmarks are saved locally in your browser. You can export/import your bookmark set as JSON.

Your bookmarks: —

Quick magic moves:

Click a verse → open Verse Similarity and press “Find similar”.
Type a word → watch the Concordance chart update live.
Open Term Explorer → map a term across chapters/books and click bars to jump.
Try PMI Collocations → find the strongest “phrase glue”.
Run Clustering → then click a cluster to filter the text panel.