
Not at scale. Currently we do some metadata extraction, but it's pretty simple. LLM-based pre-processing of each chunk like this can get quite expensive, especially with billions of them. Summarizing each document before ingestion could cost thousands of dollars when you have billions.

We have been experimenting with semantic chunking (https://www.neum.ai/post/contextually-splitting-documents) and semantic selectors (https://www.neum.ai/post/semantic-selectors-for-structured-d...), but from a scale perspective. For example, if we have 1 million docs but know they are generally similar in format/template, then we can bypass using an LLM to analyze them one by one and instead create scripts that extract the right info.
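To illustrate the idea, here's a minimal sketch of that pattern: analyze one representative document up front (e.g. with an LLM) to derive a set of extraction rules, then reuse those rules across the whole corpus with no per-document LLM call. The field names and regex patterns below are invented for the example, not Neum's actual selectors.

```python
import re

# Hypothetical selectors derived once from a sample document that
# shares the corpus's common template. Field names are illustrative.
INVOICE_PATTERNS = {
    "invoice_id": re.compile(r"Invoice #:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total:\s*\$([\d.,]+)"),
}

def extract_metadata(doc: str) -> dict:
    """Apply precompiled selectors to one doc: cheap, no LLM call."""
    out = {}
    for field, pattern in INVOICE_PATTERNS.items():
        match = pattern.search(doc)
        if match:
            out[field] = match.group(1)
    return out

sample = "Invoice #: A-1042\nDate: 2023-11-01\nTotal: $1,250.00\n"
print(extract_metadata(sample))
# → {'invoice_id': 'A-1042', 'date': '2023-11-01', 'total': '1,250.00'}
```

The trade-off: one up-front analysis cost per template instead of one LLM call per document, which is what makes the approach viable at millions of docs.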

We think there are clever approaches like this that can help improve RAG while still being scalable.
