Smarter Document Chunking
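Before text can be embedded, it must be split into pieces (chunks). A split that lands mid-sentence can strip a chunk of the context it needs to be understood. We use either Recursive Character Text Splitting, which tries coarse separators first (paragraphs, then lines, then words) and only falls back to finer ones when a piece is still too large, or Semantic Chunking, which places a boundary wherever the cosine similarity between the embeddings of adjacent sentences drops, signalling a change of topic. Both aim to keep each chunk self-contained.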
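As a rough illustration of the semantic approach, the sketch below starts a new chunk whenever the cosine similarity between two adjacent sentences falls below a threshold. It is a minimal sketch under stated assumptions, not a production splitter: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the names `semantic_chunks` and `threshold` (and its default of 0.2) are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter
from typing import List


def embed(sentence: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.
    (Assumption for illustration; a real pipeline would call an embedding model.)"""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(text: str, threshold: float = 0.2) -> List[str]:
    """Group sentences into chunks, breaking where adjacent sentences
    are less similar than `threshold` (a hypothetical tuning parameter)."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine_similarity(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])      # similarity dropped: start a new chunk
        else:
            chunks[-1].append(cur)    # same topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]
```

With a real embedding model in place of `embed`, the threshold would need tuning on your own documents; a value that works for word-count vectors is unlikely to transfer directly.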
Overlap Factor: Keeping a 10% to 20% overlap between adjacent chunks helps ensure that details straddling a chunk boundary still appear intact in at least one chunk, so they are not lost at retrieval time.
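The overlap idea can be sketched as a sliding window whose step is shorter than the chunk length. This is a minimal sketch that assumes sizes are measured in characters (the text above does not say whether the overlap is counted in characters or tokens); the function name `chunk_with_overlap` and its default values are hypothetical.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int = 500,
                       overlap_ratio: float = 0.15) -> List[str]:
    """Split `text` into fixed-size character windows, where each window
    repeats the last `overlap_ratio` share of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # how far the window advances each time

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

For example, with `chunk_size=10` and `overlap_ratio=0.2`, each chunk begins with the last two characters of the one before it, so a detail that falls on a boundary is still seen whole in one of the two chunks.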