MidnightTokens developer portal
Unit Study Document

Advanced Chunking & Multimodal RAG

7 min read • Visual explainer included

Smarter Document Chunking

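Before text can be embedded, it must be split into pieces (chunks). A split that lands mid-sentence destroys the meaning of both halves. To avoid that, we use Recursive Character Text Splitting (which tries large separators such as paragraph breaks first and falls back to smaller ones only when a piece is still too long) or Semantic Chunking (which finds natural topic shifts by comparing the embeddings of consecutive sentences with cosine similarity), so that each chunk stays self-contained.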
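
As a rough illustration of the semantic chunking idea, here is a minimal Python sketch. It assumes a hypothetical `toy_embed` function (a bag-of-words counter standing in for a real embedding model), a regex sentence splitter, and an arbitrary `threshold` of 0.2; none of these come from the course material, and a real pipeline would swap in an actual embedding model and tune the threshold.

```python
import math
import re
from collections import Counter


def toy_embed(sentence: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(text: str, threshold: float = 0.2) -> list[str]:
    """Start a new chunk whenever consecutive sentences fall below the similarity threshold."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption for this sketch).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, curr in zip(sentences, sentences[1:]):
        if cosine_similarity(toy_embed(prev), toy_embed(curr)) < threshold:
            chunks.append([curr])      # semantic shift: open a new chunk
        else:
            chunks[-1].append(curr)    # same topic: extend the current chunk
    return [" ".join(c) for c in chunks]
```

Calling `semantic_chunks` on a short text that moves from one topic to another should place the break between the two topics, because the sentences on either side of the shift share no vocabulary in this toy embedding.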

Overlap Factor: Keeping a 10% to 20% overlap between adjacent chunks prevents details that fall on a chunk boundary from being lost at retrieval time, because text near the boundary appears in both neighboring chunks.
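
The overlap window can be sketched as a simple sliding window. The function below is an illustrative assumption, not the portal's implementation: `chunk_with_overlap`, `chunk_size`, and `overlap_ratio` are hypothetical names, and sizes are measured in characters for simplicity, whereas production systems often count tokens.

```python
def chunk_with_overlap(text: str, chunk_size: int = 500, overlap_ratio: float = 0.15) -> list[str]:
    """Split text into fixed-size character windows that share an overlapping margin."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)  # characters shared by neighboring chunks
    step = chunk_size - overlap                # how far the window advances each time

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With `chunk_size=10` and `overlap_ratio=0.2`, each chunk repeats the last two characters of the previous one, so a detail that straddles a boundary is still fully contained in at least one chunk as long as it is shorter than the overlap.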
Fast Drill

Active Recalls

Card 1 of 1
Question

What is semantic chunking?

Answer

Splitting text into chunks at semantic shifts, detected by calculating the cosine distance between the embedding vectors of consecutive sentences.

Knowledge Check

Quiz Practice

Question 1 of 1
What is the danger of setting chunk size too large?
