Ingestion Pipeline

When you ingest content, it goes through:
  1. Secret scanning — checks for API keys, passwords, tokens
  2. Deduplication — SHA-256 hash prevents duplicate articles
  3. LLM metadata extraction — auto-tags people, topics, organizations, action items, sentiment
  4. Embedding generation — creates vector embeddings for semantic search
  5. Chunk sync — breaks content into retrievable chunks for RAG
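The deduplication step (stage 2) can be sketched as a content-hash check. This is an illustrative sketch only: the normalization (stripping whitespace), the in-memory `seen` set, and the function names are assumptions, not the actual implementation.

```python
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 of normalized content; identical text yields identical hashes."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()

# In the real pipeline this would be a database lookup; a set stands in here.
seen: set[str] = set()

def is_duplicate(text: str) -> bool:
    """Return True if this content hash has already been ingested (sketch only)."""
    h = content_hash(text)
    if h in seen:
        return True
    seen.add(h)
    return False
```

Because the hash is taken over normalized content, re-ingesting the same article (even with stray surrounding whitespace) is a no-op.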

Supported Sources

Source               Method
Markdown files       bestmate ingest --file or directory
Obsidian vaults      macOS app auto-sync or bestmate deploy --vault
Granola meetings     macOS app auto-sync (every 30 min)
Slack conversations  @bestmate ingest in channels
Text                 bestmate ingest "your text"
Stdin                cat file.txt | bestmate ingest

Visibility Levels

Level           Who can query      Set with
Private         Only the owner     --private (default)
Team            Team members       --team
Organization    Everyone in org    --org
Specific users  Named individuals  --visible-to email@...
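At query time, these levels might resolve along the lines of the sketch below. The `Article` shape, field names, and the rule that owners can always query their own content are illustrative assumptions; only the four levels themselves come from the table above.

```python
from dataclasses import dataclass, field

@dataclass
class Article:
    owner: str
    visibility: str                 # "private" | "team" | "org" | "users" (assumed names)
    visible_to: set = field(default_factory=set)  # explicit emails for "users"

def can_query(article: Article, user: str, same_team: bool, same_org: bool) -> bool:
    """Illustrative visibility check for a querying user."""
    if user == article.owner:       # assumed: owners can always query their own articles
        return True
    if article.visibility == "team":
        return same_team
    if article.visibility == "org":
        return same_org
    if article.visibility == "users":
        return user in article.visible_to
    return False                    # "private": owner only
```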

Managing Articles

bestmate kb list                    # List articles
bestmate kb search "topic"          # Search
bestmate kb get <id>                # View full content
bestmate kb update <id> --title "X" # Update metadata
bestmate kb delete <id>             # Delete + chunks

RAG Retrieval

When someone queries a twin:
  1. Query is embedded and compared against chunk embeddings
  2. Top chunks are scored with boosts for twin match, expertise areas, and recency
  3. Stale content (older than 90 days) gets a -15% score penalty (decisions are exempt)
  4. Top 5 chunks + legacy article matches are combined as context
  5. LLM generates an answer with confidence score
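The scoring in steps 2–4 can be sketched as follows. Only the -15% stale penalty, the 90-day threshold, the decision exemption, and the top-5 cutoff come from the steps above; the boost magnitudes, the additive-boost shape, and the recency rule are assumptions for illustration.

```python
from datetime import timedelta

TWIN_BOOST = 0.10        # assumed magnitudes; the real weights are not documented here
EXPERTISE_BOOST = 0.05
RECENCY_BOOST = 0.05
STALE_PENALTY = 0.15     # -15% for content older than 90 days (from the text)
STALE_AFTER = timedelta(days=90)
TOP_K = 5

def score_chunk(similarity: float, *, twin_match: bool, expertise_match: bool,
                age: timedelta, is_decision: bool) -> float:
    """Apply boosts and the staleness penalty to a raw similarity score."""
    score = similarity
    if twin_match:
        score += TWIN_BOOST
    if expertise_match:
        score += EXPERTISE_BOOST
    if age <= STALE_AFTER:
        score += RECENCY_BOOST
    elif not is_decision:            # decisions are exempt from the stale penalty
        score *= 1 - STALE_PENALTY
    return score

def top_chunks(scored: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Keep the five highest-scoring chunks for the context window."""
    return sorted(scored, key=lambda c: c[1], reverse=True)[:TOP_K]
```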