LlamaIndex Integration
Trace your LlamaIndex queries end to end - from the initial query through retrieval, node processing, and response synthesis. Glassbrain integrates via the LlamaIndex callback system to capture the full execution flow of your RAG pipelines.
Pro plan and above. The LlamaIndex integration requires the Pro plan or higher. Upgrade your plan in the dashboard to enable it.
Installation
Install the Glassbrain SDK alongside LlamaIndex for your language.
JavaScript / TypeScript
npm install @glassbrain/js llamaindex

Python
pip install glassbrain llama-index

Quick Start
Register the Glassbrain callback handler with LlamaIndex to start tracing. Once registered, all query engine and index operations are traced automatically.
JavaScript / TypeScript
import {
  VectorStoreIndex,
  SimpleDirectoryReader,
  Settings,
} from "llamaindex";
import { GlassbrainCallbackHandler } from "@glassbrain/js/llamaindex";

// Create and register the Glassbrain callback handler
const glassbrainHandler = new GlassbrainCallbackHandler({
  projectKey: process.env.GLASSBRAIN_PROJECT_KEY,
});
Settings.callbackManager.addHandler(glassbrainHandler);

// Load documents and build an index as usual
const documents = await new SimpleDirectoryReader().loadData("./data");
const index = await VectorStoreIndex.fromDocuments(documents);

// Query the index - the full pipeline is traced automatically
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query("What is the main topic?");
console.log(response.toString());

Python
import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.callbacks import CallbackManager
from glassbrain.llamaindex import GlassbrainCallbackHandler

# Create the Glassbrain callback handler
glassbrain_handler = GlassbrainCallbackHandler(
    project_key=os.environ["GLASSBRAIN_PROJECT_KEY"]
)

# Register it with LlamaIndex
Settings.callback_manager = CallbackManager([glassbrain_handler])

# Load documents and build an index as usual
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query the index - the full pipeline is traced automatically
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")
print(response)

How It Works
The GlassbrainCallbackHandler implements the LlamaIndex callback interface. When you run a query, LlamaIndex fires callback events at each stage of execution: query start, retrieval, node postprocessing, synthesis, LLM calls, and query end. Glassbrain captures these events and organizes them into a hierarchical trace.
The callback handler traces the following LlamaIndex event types:
QUERY - Top-level query engine execution
RETRIEVE - Document and node retrieval
SYNTHESIZE - Response synthesis from retrieved nodes
LLM - Individual LLM calls
EMBEDDING - Embedding generation for queries and documents
NODE_PARSING - Document-to-node parsing
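The event-to-trace assembly described above can be illustrated with a small sketch. This is plain Python with no LlamaIndex dependency; the event IDs and the exact record shape are hypothetical, chosen only to show how parent IDs turn a flat event stream into a hierarchy.

```python
# Sketch: rebuilding a hierarchical trace from flat callback events.
# Each event carries its own ID and its parent's ID, which is all
# that is needed to reconstruct the tree a query produced.

def build_trace(events):
    """events: list of dicts with 'id', 'parent_id', and 'type' keys."""
    spans = {e["id"]: {**e, "children": []} for e in events}
    roots = []
    for span in spans.values():
        parent = spans.get(span["parent_id"])
        if parent:
            parent["children"].append(span)
        else:
            roots.append(span)
    return roots

# A flat stream like the one LlamaIndex callbacks emit (IDs invented):
events = [
    {"id": "q1", "parent_id": None, "type": "QUERY"},
    {"id": "r1", "parent_id": "q1", "type": "RETRIEVE"},
    {"id": "s1", "parent_id": "q1", "type": "SYNTHESIZE"},
    {"id": "l1", "parent_id": "s1", "type": "LLM"},
]
trace = build_trace(events)
# trace[0] is the QUERY span, with RETRIEVE and SYNTHESIZE as
# children and the LLM call nested under SYNTHESIZE.
```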
What Gets Traced
Each stage of the LlamaIndex pipeline produces a span with data specific to that stage. Here are the key span types.
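Because every span records a duration_ms and parent/child links, exported spans are easy to analyze offline. As a quick sketch (plain Python, not an SDK feature), summing a query span's child durations against the parent's total shows where query time goes; the field names follow the span examples in this section.

```python
# Sketch: a latency breakdown from span records.
# Any parent duration not accounted for by children is "overhead".

def latency_breakdown(parent, children):
    child_total = sum(c["duration_ms"] for c in children)
    report = {c["name"]: c["duration_ms"] for c in children}
    report["overhead_ms"] = parent["duration_ms"] - child_total
    return report

query_span = {"name": "VectorIndexQuery", "duration_ms": 4200}
child_spans = [
    {"name": "VectorIndexRetriever", "duration_ms": 320},
    {"name": "CompactAndRefine", "duration_ms": 3100},
]
breakdown = latency_breakdown(query_span, child_spans)
# 4200 - (320 + 3100) leaves 780 ms unaccounted for
```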
{
"span_id": "sp_query_001",
"trace_id": "tr_li_456",
"type": "query",
"name": "VectorIndexQuery",
"timestamp": "2026-04-03T12:00:00.000Z",
"duration_ms": 4200,
"status": "success",
"input": {
"query_str": "What is the main topic?"
},
"output": {
"response": "The main topic discussed in the documents is...",
"source_nodes": ["node_001", "node_002", "node_003"]
},
"children": ["sp_retrieve_001", "sp_synthesize_001"]
}

{
"span_id": "sp_retrieve_001",
"parent_span_id": "sp_query_001",
"type": "retrieve",
"name": "VectorIndexRetriever",
"duration_ms": 320,
"input": {
"query_str": "What is the main topic?",
"similarity_top_k": 3
},
"output": {
"nodes": [
{
"node_id": "node_001",
"text": "The document covers...",
"score": 0.92,
"metadata": { "file_name": "report.pdf", "page": 1 }
},
{
"node_id": "node_002",
"text": "Additional context...",
"score": 0.87,
"metadata": { "file_name": "report.pdf", "page": 3 }
}
]
}
}

{
"span_id": "sp_synthesize_001",
"parent_span_id": "sp_query_001",
"type": "synthesize",
"name": "CompactAndRefine",
"duration_ms": 3100,
"input": {
"query_str": "What is the main topic?",
"nodes_count": 3
},
"output": {
"response": "The main topic discussed in the documents is..."
},
"children": ["sp_llm_001"]
}

{
"span_id": "sp_node_001",
"parent_span_id": "sp_query_001",
"type": "node_parsing",
"name": "SentenceSplitter",
"duration_ms": 45,
"input": {
"documents_count": 1
},
"output": {
"nodes_count": 12,
"avg_node_length": 512
}
}

RAG Pipeline Tracing
For complex RAG pipelines with custom retrievers, rerankers, and response synthesizers, Glassbrain captures every component in the pipeline. Here is an example with a more advanced setup.
import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.callbacks import CallbackManager
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.response_synthesizers import get_response_synthesizer
from glassbrain.llamaindex import GlassbrainCallbackHandler

# Set up Glassbrain tracing
glassbrain_handler = GlassbrainCallbackHandler(
    project_key=os.environ["GLASSBRAIN_PROJECT_KEY"]
)
Settings.callback_manager = CallbackManager([glassbrain_handler])

# Build the index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create a query engine with custom components
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[
        SimilarityPostprocessor(similarity_cutoff=0.7)
    ],
    response_synthesizer=get_response_synthesizer(
        response_mode="compact"
    ),
)

# Every component in this pipeline is traced:
# Query -> Retrieve (5 nodes) -> Postprocess (filter by score)
#       -> Synthesize -> LLM call -> Response
response = query_engine.query(
    "Summarize the key findings from the research papers"
)
print(response)

Advanced Configuration
Customize the callback handler with additional options.
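One option worth understanding before the full listing is sample_rate. Per-trace sampling is conventionally a coin flip per trace; this sketch shows the assumed semantics, not Glassbrain's actual implementation.

```python
import random

# Sketch: conventional per-trace sampling semantics (assumed, not
# taken from the Glassbrain SDK). A rate of 1.0 keeps every trace,
# 0.0 drops them all, 0.1 keeps roughly one in ten.

def should_sample(sample_rate: float, rng=random.random) -> bool:
    return rng() < sample_rate
```

The decision is made once per trace, not per span, so a sampled query keeps its entire retrieve/synthesize/LLM subtree intact.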
glassbrain_handler = GlassbrainCallbackHandler(
    project_key=os.environ["GLASSBRAIN_PROJECT_KEY"],

    # Add custom metadata to every trace
    metadata={
        "environment": "production",
        "pipeline": "document-qa",
        "index_version": "v3",
    },

    # Control what gets captured
    capture_input=True,          # Set to False to skip logging queries
    capture_output=True,         # Set to False to skip logging responses
    capture_node_content=True,   # Set to False to skip node text content

    # Sampling rate (0.0 to 1.0)
    sample_rate=1.0,

    # Maximum node content length to capture (chars)
    max_node_content_length=10000,
)

Troubleshooting
No traces appearing after running queries
Verify that the callback handler is registered with Settings.callback_manager before you create your index or query engine. Components created before the handler is registered will not report events to it. Also confirm that your project key is valid.
Feature not available error
The LlamaIndex integration requires the Pro plan or above. Check your current plan in the Glassbrain dashboard under Account Settings. If you recently upgraded, allow a few minutes for the change to propagate.
Retrieved nodes show empty content
Check that capture_node_content is set to True (the default). If node content is still missing, verify that your nodes have the text attribute populated. Some custom node types may store content in a different field.
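Note that even when node content is captured, it is subject to max_node_content_length. A sketch of the assumed capture behavior (illustrative only, not the SDK's code):

```python
# Sketch: how capture_node_content and max_node_content_length are
# assumed to interact when node text is recorded on a span.

def capture_text(node_text, max_len=10000, capture=True):
    if not capture:
        return None  # capture_node_content=False: no text on the span
    if len(node_text) > max_len:
        return node_text[:max_len] + "...[truncated]"
    return node_text
```

If node text in your traces ends with a truncation marker rather than being empty, raise max_node_content_length instead of debugging the node itself.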
Embedding spans are not captured
Embedding events are only fired during index construction and when the query engine generates a query embedding. If you built the index before registering the handler, index-time embeddings will not be traced. Re-register the handler and rebuild the index, or focus on query-time traces where the query embedding will be captured.