BIBLE DATA EXPLORER

Full-text navigator + Data Science Magic Lab: TF-IDF, key phrases (RAKE), concordance, similarity, collocations (PMI), Zipf’s law, clustering, sentiment, and exports.
Default translation: King James Version (KJV) via an open MIT-licensed JSON dataset.

📚 66 books
🔎 Search + Concordance
🧠 Verse Similarity
🧩 Clustering
📈 Zipf + PMI
⭐ Bookmarks
⬇️ Export JSON / CSV

Sacred Text (click a verse)

Analysis (Data Science)

Total tokens: 0
Unique tokens: 0
Verses: 0
Avg tokens/verse: 0
Lexical diversity (TTR): 0
Hapax (count=1): 0
Median: —
Std dev: —
Stopword rate: —
Readability: —
Verse length chart: tokens per verse (sparkline)
Verse length histogram: distribution of verse lengths
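
The summary figures above can be sketched in a few lines of Python. This is a minimal illustration, not the app's own code: the `tokenize` regex, the `lexical_stats` name, and the choice of population standard deviation are all assumptions.

```python
import re
import statistics
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def lexical_stats(verses):
    """Corpus-level counts for a list of verse strings (hypothetical helper)."""
    if not verses:
        raise ValueError("need at least one verse")
    per_verse = [tokenize(v) for v in verses]
    lengths = [len(toks) for toks in per_verse]
    counts = Counter(t for toks in per_verse for t in toks)
    total = sum(lengths)
    return {
        "total_tokens": total,
        "unique_tokens": len(counts),
        "verses": len(verses),
        "avg_tokens_per_verse": total / len(verses),
        # Type-token ratio: unique tokens divided by total tokens.
        "ttr": len(counts) / total if total else 0.0,
        # Hapax legomena: tokens that occur exactly once in the scope.
        "hapax": sum(1 for c in counts.values() if c == 1),
        "median": statistics.median(lengths),
        # Population standard deviation; a sample estimate is equally plausible.
        "std_dev": statistics.pstdev(lengths),
    }
```

TTR falls as the scope grows, so comparing a chapter's TTR with a whole book's is not like for like.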

Keywords, Phrases, Concordance, Compare

Top 10 Keywords: frequency after stopword removal
TF-IDF signature terms: chapter vs other chapters in this book
Key phrases (RAKE): multi-word phrases extracted from text
Entities (simple proper-noun extraction): capitalized words, counted
Theme signals: keyword-based thematic scoring
Concordance: where your term appears
Occurrences
Verses matched
Coverage
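
The three concordance figures can be computed as in the sketch below. It assumes that "coverage" means the share of verses in scope that contain the term at least once; the page does not define it, and the `concordance` function name and tokenizer are hypothetical.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def concordance(verses, term):
    """Count whole-word matches of `term` across a list of verse strings."""
    term = term.lower()
    per_verse = [tokenize(v).count(term) for v in verses]
    matched = sum(1 for c in per_verse if c > 0)
    return {
        "occurrences": sum(per_verse),
        "verses_matched": matched,
        # Assumption: coverage = matched verses / verses in scope.
        "coverage": matched / len(verses) if verses else 0.0,
    }
```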
Compare two passages: Jaccard + cosine similarity + distinctive words
Jaccard (unique tokens)
Cosine (term freq)
Shared top terms
Distinctive in current: highest relative frequency
Distinctive in comparison: highest relative frequency
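
For reference, the two passage-comparison scores can be sketched as follows: Jaccard over the sets of unique tokens, and cosine over raw term-frequency counts. The function names are illustrative and the tokenizer is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def jaccard(tokens_a, tokens_b):
    """|A intersect B| / |A union B| over unique tokens."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_tf(tokens_a, tokens_b):
    """Cosine of the angle between raw term-frequency vectors."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Jaccard ignores how often a word repeats, while cosine weights repeated words more heavily, so the two scores can disagree on passages with heavy repetition.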

Data Science Magic Lab

Click any verse in the left panel to select it, then this lab finds the most similar verses (cosine similarity on TF-IDF vectors). Tip: choose Chapter scope for the snappiest experience.
Selected verse profile: keywords + sentiment + length
Tokens (no stop)
Sentiment (toy)
Top TF-IDF
Most similar verses: click to jump/select
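
The similarity search described above can be sketched roughly as below. This is not the app's implementation: the smoothed IDF formula, the tokenizer, and the `most_similar` helper are assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(verses):
    """One sparse {term: weight} dict per verse."""
    docs = [Counter(tokenize(v)) for v in verses]
    n = len(docs)
    df = Counter(t for d in docs for t in d)  # document frequency per term
    # Assumed smoothed IDF: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(verses, index, top_n=5):
    """Return (score, verse_index) pairs, best first, excluding the query."""
    vecs = tfidf_vectors(verses)
    scored = [(cosine(vecs[index], v), i)
              for i, v in enumerate(vecs) if i != index]
    scored.sort(reverse=True)
    return scored[:top_n]
```

Every query compares the selected verse against every other verse in scope, which is one reason a Chapter scope is faster than a whole-Bible one.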
Explore where a term “lives” in the text. This builds a distribution across chapters/books and lets you jump directly by clicking bars. For full-Bible maps, load all books first.
Term heat map
Collocations find word pairs that “stick together”. Here we rank bigrams by PMI (pointwise mutual information), with a minimum count filter.
PMI-ranked collocations: score + count
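
A minimal sketch of PMI ranking with a minimum count filter follows. It assumes bigrams are formed within a verse (never across verse boundaries) and that probabilities are plain relative frequencies; the `pmi_bigrams` name and the default `min_count` are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def pmi_bigrams(verses, min_count=2):
    """Return ((w1, w2), pmi, count) tuples sorted by PMI, highest first."""
    unigrams, bigrams = Counter(), Counter()
    for verse in verses:
        toks = tokenize(verse)
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))  # adjacent pairs within one verse
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    rows = []
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs get inflated PMI, so filter them out
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        rows.append(((a, b), math.log2(p_ab / (p_a * p_b)), count))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows
```

The count filter matters because PMI rewards pairs whose words rarely occur apart, and a pair seen only once can outrank every genuine phrase.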
Zipf’s law says a word’s frequency falls off roughly as a power law of its frequency rank. This plot uses your current analysis scope and estimates the slope and R² of a straight line fitted to the log-log data.
Slope: — R²: —
Zipf plot: log(rank) vs log(freq)
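
One way to estimate the slope and R² is an ordinary least-squares fit of log(freq) on log(rank), as in this sketch. The `zipf_fit` name is hypothetical, and the app may fit the line differently.

```python
import math
from collections import Counter


def zipf_fit(tokens):
    """Least-squares fit of log(freq) on log(rank); returns (slope, r_squared)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    # For simple linear regression, R squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, r_squared
```

A slope near -1 with a high R² is the classic Zipfian pattern; small scopes such as a single chapter tend to fit less cleanly.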
Cluster verses into groups based on TF-IDF vectors. This is a lightweight k-means (cosine distance). Best with Chapter scope.
Cluster summary: click a cluster to filter verses
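
A lightweight k-means with cosine distance can be sketched as below: vectors are scaled to unit length so that a dot product equals cosine similarity, and each centroid is the renormalized sum of its members. The initialization (first k verses), the smoothed IDF, and all function names are assumptions made for a deterministic example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def normalize(vec):
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def tfidf_unit_vectors(verses):
    docs = [Counter(tokenize(v)) for v in verses]
    n = len(docs)
    df = Counter(t for d in docs for t in d)
    return [normalize({t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0)
                       for t, tf in d.items()}) for d in docs]


def dot(u, v):
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def kmeans_cosine(vectors, k, iterations=20):
    """Assign each unit vector to the centroid with the highest cosine."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = [dict(v) for v in vectors[:k]]  # assumed init: first k verses
    labels = [-1] * len(vectors)
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda c: dot(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            summed = {}
            for v, label in zip(vectors, labels):
                if label == c:
                    for t, w in v.items():
                        summed[t] = summed.get(t, 0.0) + w
            if summed:
                centroids[c] = normalize(summed)
    return labels
```

Usage: `kmeans_cosine(tfidf_unit_vectors(verses), k=3)` returns one cluster label per verse. Results depend on the initialization, so a real implementation would likely use random or k-means++ style seeding.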
Build a co‑occurrence network of names/places (proper nouns) from the current scope. Nodes are entities; edges mean they appear together in the same verse (or same chapter window).
Tip: load all books if you want cross‑Bible networks.
Network canvas: drag to pan • scroll to zoom • click node for neighbors
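
The edge list behind such a network can be sketched as follows, using same-verse co-occurrence only. The proper-noun heuristic here (any capitalized word that is not the first word of the verse) is an assumption and will miss names that open a verse; `entities` and `cooccurrence_edges` are hypothetical names.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed skip list of capitalized words that are not names.
SKIP = {"I", "O", "And", "The", "But", "Then", "For"}


def entities(verse):
    """Crude proper-noun guess: capitalized words after the first word."""
    words = re.findall(r"[A-Za-z']+", verse)
    return {w for w in words[1:] if w[0].isupper() and w not in SKIP}


def cooccurrence_edges(verses):
    """Count how many verses mention each unordered pair of entities."""
    edges = Counter()
    for verse in verses:
        # Sorting gives each pair one canonical (a, b) ordering.
        edges.update(combinations(sorted(entities(verse)), 2))
    return edges
```

Edge weights are verse counts, so a pair mentioned together in many verses gets a heavier edge than a pair that shares a single verse.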
Speaker Lab uses patterns like “X said”, “X spake”, “X answered” to estimate dialogue structure. It also estimates speech vs narrative share per verse and draws a speech‑density timeline.
Top speakers: counts of speech cues
Speech density timeline: 1 = speech-like verse, 0 = narrative-like verse
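
The cue patterns named above ("X said", "X spake", "X answered") map naturally onto a regular expression. The sketch below is a simplified guess at the approach: it scores a verse as 1 if any speech verb appears and 0 otherwise, and the skip list and extra verb "saith" are assumptions.

```python
import re
from collections import Counter

# Capitalized word directly before a speech verb, e.g. "God said".
SPEAKER_CUE = re.compile(r"\b([A-Z][a-z]+)\s+(?:said|spake|answered)\b")
# Any speech verb, with or without a named speaker ("saith" is an assumption).
SPEECH_VERB = re.compile(r"\b(?:said|spake|answered|saith)\b")
# Assumed skip list: capitalized words that are not speakers.
NOT_SPEAKERS = {"And", "Then", "But", "He", "She", "They", "Who"}


def speaker_counts(verses):
    """Count speech cues per named speaker."""
    counts = Counter()
    for verse in verses:
        for name in SPEAKER_CUE.findall(verse):
            if name not in NOT_SPEAKERS:
                counts[name] += 1
    return counts


def speech_timeline(verses):
    """1 for a verse containing a speech verb, 0 otherwise."""
    return [1 if SPEECH_VERB.search(v) else 0 for v in verses]
```

Because the pattern only sees a single capitalized word, titles such as "the LORD" or multi-word names are not captured by this sketch.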
Book Analytics works best after you load multiple books (press Load all books). It computes per‑book stats, finds the most similar books to your current selection, and clusters books into “families” based on TF‑IDF.
Book leaderboard: click to jump to a book
Most similar books: cosine similarity (TF‑IDF)
Book clusters: k‑means over TF‑IDF
Star verses (⭐) to bookmark. Bookmarks are saved locally in your browser. You can export/import your bookmark set as JSON.
Your bookmarks
Quick magic moves:
  • Click a verse → open Verse Similarity and press “Find similar”.
  • Type a word → watch the Concordance chart update live.
  • Open Term Explorer → map a term across chapters/books and click bars to jump.
  • Try PMI Collocations → find the strongest “phrase glue”.
  • Run Clustering → then click a cluster to filter the text panel.