Methodology
How X Persona turns X list memberships into a persona word cloud.
Data Source — X List Memberships
When you analyze an account, X Persona fetches every public X list that account has been added to by other users. These lists are community-curated — real people on X have grouped that account alongside others they consider similar, relevant, or noteworthy.
The list names are the raw signal. A list called "AI Researchers" or "Web3 Founders" tells you something meaningful about how the community perceives that account. The more lists with similar names, the stronger the signal.
Word Extraction & Scoring
Each list name is processed to extract meaningful keywords:
- Normalize — The name is lowercased and any punctuation (dashes, slashes, dots, etc.) is replaced with spaces.
- Tokenize — The cleaned text is split on whitespace into individual tokens. Digits are preserved, so a token like "web3" or "ai2024" remains intact.
- Filter — Tokens shorter than 3 characters are discarded. Common stop words (articles, conjunctions, pronouns, and list-specific noise like "follow", "followers", "lists") are also removed.
- Count — The remaining meaningful words are tallied across all list names. A word appearing in 30 different list names scores 30.
- Rank — The top 100 words by frequency are kept and used to build the word cloud.
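The five steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not X Persona's actual code: the stop-word set is a small hypothetical sample, the function names are invented for this example, and non-ASCII characters are treated the same way as punctuation.

```python
import re
from collections import Counter

# Hypothetical sample; the full stop-word list is not published here.
STOP_WORDS = {"the", "and", "for", "with", "you", "follow", "followers", "lists"}
MIN_TOKEN_LENGTH = 3  # tokens shorter than this are discarded
TOP_N = 100           # number of words kept for the cloud


def extract_tokens(list_name: str) -> list[str]:
    """Normalize, tokenize, and filter a single list name."""
    # Normalize: lowercase, then turn every run of characters other than
    # a-z and 0-9 into a single space (assumption: ASCII-only tokens).
    cleaned = re.sub(r"[^a-z0-9]+", " ", list_name.lower())
    # Tokenize: split on whitespace; digits stay attached, so "web3" survives.
    tokens = cleaned.split()
    # Filter: drop short tokens and stop words.
    return [t for t in tokens if len(t) >= MIN_TOKEN_LENGTH and t not in STOP_WORDS]


def score_words(list_names: list[str], top_n: int = TOP_N) -> list[tuple[str, int]]:
    """Count each word once per list name and return the top_n by frequency."""
    counts: Counter[str] = Counter()
    for name in list_names:
        # set() so a word repeated inside one name still adds 1 to its score.
        counts.update(set(extract_tokens(name)))
    return counts.most_common(top_n)
```

One consequence of the 3-character minimum is that a two-letter token such as "ai" is discarded, so a list named "AI Researchers" contributes only "researchers" in this sketch.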
Word Cloud Rendering
Word size in the cloud directly reflects frequency — a word that appears in more list names is rendered larger. The scaling is proportional: the highest-scoring word sets the maximum size, and all others are sized relative to it.
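As a rough sketch of proportional scaling, each word's size can be computed as its score divided by the top score, multiplied by a maximum size. The function name and the 64-unit maximum are assumptions made for this example; the actual rendering values are not stated here.

```python
def font_sizes(scores: dict[str, int], max_size: float = 64.0) -> dict[str, float]:
    """Size each word relative to the highest-scoring word.

    max_size is a hypothetical value chosen for illustration.
    """
    if not scores:
        return {}
    top = max(scores.values())
    # The top word gets max_size; every other word gets its share of it.
    return {word: max_size * count / top for word, count in scores.items()}
```

With scores of 30, 15, and 3, the words would be sized at 100%, 50%, and 10% of the maximum. A renderer may also apply a minimum size so that rare words stay legible, but that is a design choice outside what this page describes.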
Colors are assigned randomly from a vibrant palette on each page load, with the constraint that no two adjacent words share the same color.
Click any word to open a drill-down panel showing every list whose name contains that word, along with the list owner and a direct link to view it on X.
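The color rule and the drill-down lookup might look something like the following. Everything here is hypothetical: the palette values, the function names, and the "name", "owner", and "url" fields on each list record. "Adjacent" is also simplified to mean neighbors in render order, which may differ from how the real layout defines it.

```python
import random
import re

# Hypothetical palette; the actual colors are not specified.
PALETTE = ["#e6194b", "#3cb44b", "#4363d8", "#f58231", "#911eb4"]


def assign_colors(words, palette=PALETTE, rng=None):
    """Pick a random color per word, never reusing the previous word's color.

    Assumes the palette holds at least two distinct colors.
    """
    rng = rng or random.Random()
    colors = {}
    previous = None
    for word in words:
        # Exclude the color just used so neighbors in render order differ.
        choice = rng.choice([c for c in palette if c != previous])
        colors[word] = choice
        previous = choice
    return colors


def lists_containing(word, lists):
    """Return the list records whose normalized name contains `word` as a token."""
    matches = []
    for entry in lists:
        tokens = re.sub(r"[^a-z0-9]+", " ", entry["name"].lower()).split()
        if word in tokens:
            matches.append(entry)
    return matches
```

Because `assign_colors` draws a new random choice on every call, running it again on the same words yields a different coloring, which matches the per-page-load behavior described above.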
Caching & Freshness
Every analysis result is cached in a local database to avoid redundant API calls. When you request a profile that was analyzed within the last 7 days, the cached result is served instantly and no API call is made. Older entries are treated as stale and refreshed with a new fetch.
Cached lookups do not count against the rate limit. Only fresh analyses (cache misses or stale refreshes) are subject to the per-hour limit.
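The interaction between the cache and the rate limit can be illustrated with a small in-memory sketch. It assumes a 7-day freshness window (the figure given under Limitations), a hypothetical limit of 5 fresh analyses per hour, and a plain dictionary standing in for the local database. The class and parameter names are invented for this example.

```python
import time

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as stated under Limitations
MAX_FRESH_PER_HOUR = 5                # hypothetical; the real limit is not stated


class RateLimitExceeded(Exception):
    """Raised when too many fresh analyses were run in the past hour."""


class AnalysisCache:
    def __init__(self, fetch, now=time.time):
        self._fetch = fetch     # callable(handle) -> result; stands in for the API call
        self._now = now         # injectable clock, useful for testing
        self._entries = {}      # handle -> (fetched_at, result)
        self._fresh_calls = []  # timestamps of fresh analyses in the past hour

    def get(self, handle):
        now = self._now()
        cached = self._entries.get(handle)
        if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
            return cached[1]    # cache hit: no API call, no rate-limit cost
        # Cache miss or stale entry: this request counts against the limit.
        self._fresh_calls = [t for t in self._fresh_calls if now - t < 3600]
        if len(self._fresh_calls) >= MAX_FRESH_PER_HOUR:
            raise RateLimitExceeded("too many fresh analyses in the past hour")
        result = self._fetch(handle)
        self._fresh_calls.append(now)
        self._entries[handle] = (now, result)
        return result
```

Checking the cache before the limiter is what keeps cached lookups free: a request only reaches the rate-limit check once it is known to need a fresh fetch.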
Limitations & Considerations
- List quality varies — The signal depends entirely on the X community. Accounts with few list memberships will produce sparse or less meaningful clouds.
- Recency bias — List names reflect how users described the account when they created the list, which may not reflect recent activity or changes.
- Language — Only alphabetic and alphanumeric tokens are extracted. List names in non-Latin scripts are not currently processed.
- Public lists only — Private lists are excluded, so the analysis represents a subset of all lists an account belongs to.
- Snapshot in time — Analyses are cached for 7 days. The cloud reflects list memberships as of the last fetch date shown on the profile.