Full-text navigator + Data Science Magic Lab: TF-IDF, key phrases (RAKE), concordance, similarity, collocations (PMI), Zipf’s law, clustering, sentiment, and exports.
Default translation: King James Version (KJV) via an open MIT-licensed JSON dataset.
📚 66 books
🔎 Search + Concordance
🧠 Verse Similarity
🧩 Clustering
📈 Zipf + PMI
⭐ Bookmarks
⬇️ Export JSON / CSV
Loading books list…
Scope: Chapter
Selected: —
Sacred Text (click a verse)
Analysis (Data Science)
Total tokens: 0
Unique tokens: 0
Verses: 0
Avg tokens/verse: 0
Lexical diversity (TTR): 0
Hapax (count=1): 0
Median: —
Std dev: —
Stopword rate: —
Readability: —
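The summary figures above can be sketched roughly as follows. The page does not show its tokenizer or exact formulas, so the regex tokenizer, the population standard deviation, and the names `tokenize` and `corpus_stats` are assumptions made for illustration.

```python
import re
from collections import Counter
from statistics import median, pstdev


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def corpus_stats(verses):
    """Compute basic lexical statistics for a list of verse strings."""
    per_verse = [tokenize(v) for v in verses]
    tokens = [t for toks in per_verse for t in toks]
    counts = Counter(tokens)
    lengths = [len(toks) for toks in per_verse]
    total = len(tokens)
    return {
        "total_tokens": total,
        "unique_tokens": len(counts),
        "verses": len(verses),
        "avg_tokens_per_verse": total / len(verses) if verses else 0.0,
        # Type-token ratio: unique tokens divided by total tokens.
        "ttr": len(counts) / total if total else 0.0,
        # Hapax legomena: tokens that occur exactly once.
        "hapax": sum(1 for c in counts.values() if c == 1),
        "median_length": median(lengths) if lengths else 0,
        "std_dev_length": pstdev(lengths) if lengths else 0.0,
    }
```

TTR falls as a text gets longer, so it is most meaningful when comparing scopes of similar size.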
Verse length chart: tokens per verse (sparkline)
Verse length histogram: distribution of verse lengths
Keywords, Phrases, Concordance, Compare
Top 10 Keywords: frequency after stopword removal
TF-IDF signature terms: chapter vs other chapters in this book
Key phrases (RAKE): multi-word phrases extracted from text
Compare two passages: Jaccard + cosine similarity + distinctive words
Jaccard (unique tokens)
—
Cosine (term freq)
—
Shared top terms
—
Distinctive in current: highest relative frequency
Distinctive in comparison: highest relative frequency
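For reference, the two similarity scores in the comparison panel can be sketched like this: Jaccard over the sets of unique tokens, and cosine over raw term-frequency vectors. The tokenizer and function names are assumptions; the page does not show its own implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def jaccard(a_tokens, b_tokens):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    a, b = set(a_tokens), set(b_tokens)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_tf(a_tokens, b_tokens):
    """Cosine of the angle between two term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Jaccard ignores how often a word occurs, while cosine weights repeated words more heavily, so the two scores can disagree on passages that share a few very frequent terms.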
Data Science Magic Lab
Click any verse in the left panel to select it, then this lab finds the most similar verses (cosine similarity on TF-IDF vectors).
Tip: choose Chapter scope for the snappiest experience.
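A minimal sketch of the verse-similarity idea, assuming each verse is treated as one document. The smoothed IDF formula is one common variant chosen here for illustration, and `tfidf_vectors`, `cosine`, and `most_similar` are hypothetical names, not the app's own.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(verses, index, top_n=3):
    """Rank all other verses by cosine similarity to verses[index]."""
    vecs = tfidf_vectors(verses)
    scored = [
        (cosine(vecs[index], vecs[i]), i) for i in range(len(verses)) if i != index
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:top_n]]
```

Every verse is compared against the selected one, so the cost grows with the number of verses in scope, which is consistent with Chapter scope feeling fastest.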
Explore where a term “lives” in the text. This builds a distribution across chapters/books and lets you jump directly by clicking bars.
For full-Bible maps, load all books first.
Term heat map
—
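The distribution behind the term map can be approximated by counting a term per chapter. The sketch below assumes the text is available as a mapping from a chapter label to its verses; that data shape and the name `term_distribution` are hypothetical.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def term_distribution(chapters, term):
    """Count occurrences of `term` in each chapter.

    `chapters` maps a label such as "Genesis 1" to a list of verse strings.
    """
    needle = term.lower()
    return {
        label: sum(tokenize(verse).count(needle) for verse in verses)
        for label, verses in chapters.items()
    }
```

Each count corresponds to one bar; dividing by the chapter's token total would give a relative frequency that is fairer to short chapters.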
Collocations find word pairs that “stick together”. Here we rank bigrams by PMI (pointwise mutual information), with a minimum count filter.
PMI-ranked collocations: score + count
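As a sketch of the ranking described above: PMI for a bigram compares how often the pair occurs with how often it would occur if the two words were independent, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function name, the base-2 logarithm, and the flat token list as input are assumptions.

```python
import math
from collections import Counter


def pmi_collocations(tokens, min_count=2):
    """Return (bigram, pmi, count) tuples sorted by descending PMI."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scored = []
    for (w1, w2), count in bigrams.items():
        # Minimum count filter: rare pairs get inflated PMI scores.
        if count < min_count:
            continue
        p_xy = count / n_bigrams
        p_x = unigrams[w1] / n_unigrams
        p_y = unigrams[w2] / n_unigrams
        scored.append(((w1, w2), math.log2(p_xy / (p_x * p_y)), count))
    scored.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return scored
```

Without the count filter, a pair of words that each occur once and happen to be adjacent would top the list, which is why the threshold matters.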
Zipf’s law says word frequency roughly follows a power law. This plot uses your current analysis scope and estimates the slope of the log-log line.
Slope: —
R²: —
Zipf plot: log(rank) vs log(freq)
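One way to estimate the slope and R² is an ordinary least-squares fit of log(frequency) against log(rank). This is a sketch under that assumption; the page does not say which fitting method it uses, and `zipf_fit` is a hypothetical name.

```python
import math
from collections import Counter


def zipf_fit(tokens):
    """Fit log(freq) = intercept + slope * log(rank); return (slope, r_squared)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens to fit a line")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # If every frequency is equal, the flat line fits exactly.
    r_squared = 1 - ss_res / ss_tot if ss_tot else 1.0
    return slope, r_squared
```

A slope near -1 is the classic Zipf pattern; small scopes such as a single chapter tend to give noisier fits.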
Cluster verses into groups based on TF-IDF vectors. This is a lightweight k-means (cosine distance).
Best with Chapter scope.
Cluster summary: click a cluster to filter verses
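A lightweight k-means with cosine distance can be sketched as spherical k-means: normalize every vector to unit length, assign each to the centroid with the highest dot product, then recompute centroids. The deterministic seeding and all names below are assumptions, and the input is taken to be sparse vectors (dicts of term to weight, such as TF-IDF weights).

```python
import math


def normalize(vec):
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def dot(u, v):
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def kmeans_cosine(vectors, k, iterations=20):
    """Cluster sparse vectors; return one cluster label per vector."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    points = [normalize(v) for v in vectors]
    # Assumption: seed centroids with evenly spaced points, for repeatability.
    step = len(points) / k
    centroids = [points[int(i * step)] for i in range(k)]
    labels = None
    for _ in range(iterations):
        # On unit vectors, the highest dot product is the smallest cosine distance.
        new_labels = [
            max(range(k), key=lambda c: dot(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = {}
            for p in members:
                for t, w in p.items():
                    total[t] = total.get(t, 0.0) + w
            centroids[c] = normalize(total)
    return labels
```

Each iteration costs roughly the number of verses times k, which fits the advice that Chapter scope works best.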
Build a co‑occurrence network of names/places (proper nouns) from the current scope.
Nodes are entities; edges mean they appear together in the same verse (or same chapter window).
Tip: load all books if you want cross‑Bible networks.
Network canvas: drag to pan • scroll to zoom • click node for neighbors
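A rough sketch of the verse-level co-occurrence graph. The page does not say how proper nouns are detected, so the heuristic here (a capitalized word that is not the first word of the verse and not in a small exclusion list) is an assumption, as are the names `entities` and `cooccurrence_edges`.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed exclusion list of capitalized words that are not names or places.
NOT_ENTITIES = {"I", "O", "And", "The", "But", "For", "Then"}


def entities(verse):
    """Crude proper-noun detector: capitalized words after the first word."""
    words = re.findall(r"[A-Za-z]+", verse)
    return sorted(
        {w for w in words[1:] if w[0].isupper() and w not in NOT_ENTITIES}
    )


def cooccurrence_edges(verses):
    """Count how many verses each pair of entities shares."""
    edges = Counter()
    for verse in verses:
        # Sorted input keeps each pair in a single canonical order.
        for a, b in combinations(entities(verse), 2):
            edges[(a, b)] += 1
    return edges
```

The edge counts can serve as edge weights. This heuristic misses a name that opens a verse and mislabels a capitalized word that begins quoted speech mid-verse, so results need a sanity check.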
Speaker Lab uses patterns like “X said”, “X spake”, “X answered” to estimate dialogue structure.
It also estimates speech vs narrative share per verse and draws a speech‑density timeline.
Top speakers: counts of speech cues
Speech density timeline: 1 = speech-like verse, 0 = narrative-like
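The cue patterns named above can be approximated with one regular expression. This sketch covers only "X said", "X spake", and "X answered" with a capitalized single-word speaker, and it reduces the timeline to a 0/1 flag per verse; the real pattern set and scoring are not shown on the page, so treat these as assumptions.

```python
import re
from collections import Counter

# Assumed cue pattern: a capitalized word directly followed by a speech verb.
CUE = re.compile(r"\b([A-Z][a-z]+)\s+(?:said|spake|answered)\b")


def speaker_counts(verses):
    """Count speech cues per speaker name."""
    counts = Counter()
    for verse in verses:
        counts.update(CUE.findall(verse))
    return counts


def speech_timeline(verses):
    """1 if the verse contains a speech cue, else 0."""
    return [1 if CUE.search(verse) else 0 for verse in verses]
```

A verse that opens with "And said" would register "And" as a speaker, so a stoplist of non-names is a sensible addition in practice.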
Book Analytics works best after you load multiple books (press Load all books).
It computes per‑book stats, finds the most similar books to your current selection, and clusters books into “families” based on TF‑IDF.
Book leaderboard: click to jump to a book
Most similar books: cosine similarity (TF‑IDF)
Book clusters: k‑means over TF‑IDF
Star verses (⭐) to bookmark. Bookmarks are saved locally in your browser.
You can export/import your bookmark set as JSON.
Your bookmarks
—
Quick magic moves:
Click a verse → open Verse Similarity and press “Find similar”.
Type a word → watch the Concordance chart update live.
Open Term Explorer → map a term across chapters/books and click bars to jump.
Try PMI Collocations → find the strongest “phrase glue”.
Run Clustering → then click a cluster to filter the text panel.