Hey HN, I built RAG Logger, a lightweight open-source logging tool specifically designed for Retrieval-Augmented Generation (RAG) applications.

LangSmith is excellent, but my usage is quite minimal, and I would prefer a locally hosted version that is easy to customize.

Key features:
Detailed step-by-step pipeline tracking
Performance monitoring (embedding, retrieval, LLM generation)
Structured JSON logs with timing and metadata
Zero external dependencies
Easy integration with existing RAG systems

The tool helps debug RAG applications by tracking query understanding, embedding generation, document retrieval, and LLM responses. Each step is timed and logged with relevant metadata.
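To give a rough idea of the output, each step ends up as a JSON entry shaped something like this (field names here are illustrative and simplified, not the exact schema):

    step_entry = {
        "step": "retrieval",
        "start_time": "2025-01-01T12:00:00Z",
        "duration_ms": 84,
        "metadata": {
            "query": "how do I rotate API keys?",
            "top_k": 5,
            "doc_ids": ["doc_12", "doc_47"],
        },
    }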
Really awesome to see more people working on this! I’m one of the founders of Opik (https://github.com/comet-ml/opik), which does similar things but also has a UI and supports massive scale. Curious to hear if you have any feedback!
Please remember to write "Show HN:" when submitting your own content.
You appear to have forgotten the license for your open-source alternative.
How is this a replacement for LangSmith? I browsed the source and I could only find what appear to be a few small helper functions for emitting structured logs.
I’m less familiar with LangSmith, but browsing their site suggests they offer observability into LLM interactions in addition to other parts of the workflow lifecycle. This just seems to handle logging, and you have to pass all the data yourself; it’s not instrumenting an LLM client, for example.
> in addition to other parts of the workflow lifecycle
FWIW this is primarily based on the LangChain framework, so it's fairly turnkey, but it has no integration with the rest of your application. You can use the @traceable decorator in Python to decorate a custom function in code too, but this doesn't integrate with frameworks like OpenTelemetry, which makes it hard to see everything that happens.
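For reference, the decorator usage looks roughly like this (assuming the langsmith SDK is installed and the LANGSMITH_API_KEY / tracing env vars are set; search_index is a stand-in for your own code):

    from langsmith import traceable

    @traceable(run_type="retriever")
    def retrieve_docs(query: str) -> list[str]:
        # Inputs, outputs, and timing get captured as a run in LangSmith;
        # search_index is a placeholder for your actual retrieval logic.
        return search_index(query)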
So for example, if your LLM feature is plugged into another feature area in the rest of your product, you need to do a lot more work to capture things like which user was involved, or, if you did some post-processing on a response later down the road, what steps were taken to produce a better response, etc. It's quite useful for chat apps right now, but most enterprise RAG use cases will likely want to instrument with OpenTelemetry directly.
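A rough sketch of what direct instrumentation looks like with the OpenTelemetry Python API (retrieve and call_llm are placeholders for your own code):

    from opentelemetry import trace

    tracer = trace.get_tracer("rag-pipeline")

    def answer(user_id: str, query: str) -> str:
        # One span per request, carrying app-level context so the LLM call
        # can be joined with the rest of the product's telemetry.
        with tracer.start_as_current_span("rag.answer") as span:
            span.set_attribute("app.user_id", user_id)
            docs = retrieve(query)        # placeholder: your retriever
            span.set_attribute("rag.doc_count", len(docs))
            return call_llm(query, docs)  # placeholder: your LLM client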
Why can't you use OpenTelemetry for something like this?
You can, which is why tools like Traceloop do this.
Although it's worth noting that long context doesn't always play well with o11y systems, since they usually put limits on the size of a log body or trace attribute.
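The usual workaround is truncating the payload before attaching it (or storing the full text elsewhere and linking it from the span). Something like this, where prompt is a placeholder for your full prompt text and the cap is arbitrary:

    from opentelemetry import trace

    MAX_ATTR_LEN = 4096  # arbitrary; check your backend's actual limit

    def safe_attr(value: str, limit: int = MAX_ATTR_LEN) -> str:
        # Truncate oversized payloads; stash the full text in blob storage
        # and link it from the span if you need every byte.
        return value if len(value) <= limit else value[:limit] + "...[truncated]"

    tracer = trace.get_tracer("llm-o11y")
    with tracer.start_as_current_span("llm.call") as span:
        span.set_attribute("llm.prompt", safe_attr(prompt))  # prompt: placeholder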
I've just published to GitHub my own LLM logging and debugging tool with local storage: https://github.com/zby/llm_recorder It is aimed more at debugging than at production observability like your package.
I think I am ready to push it to PyPI now.
It replaces the LLM client and logs everything that goes through it.
It is very simplistic in comparison with the remote loggers, but you can use all the local tools, like grep or your favourite editor. The feature I needed from it is replaying past interactions; I use it for debugging execution paths that happen only sometimes. Can Langfuse do that?
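The core mechanism is just a thin wrapper around the client. A sketch of the idea (not the actual llm_recorder code; complete() stands in for whatever your client exposes):

    import hashlib
    import json
    import pathlib

    class RecordingClient:
        # Wrap the real client, write every exchange to a JSON file,
        # and replay saved responses on demand.
        def __init__(self, client, log_dir="llm_logs", replay=False):
            self.client = client  # any object with a complete(prompt) method
            self.replay = replay
            self.dir = pathlib.Path(log_dir)
            self.dir.mkdir(exist_ok=True)

        def complete(self, prompt: str) -> str:
            key = hashlib.sha256(prompt.encode()).hexdigest()[:16]
            path = self.dir / f"{key}.json"
            if self.replay and path.exists():
                return json.loads(path.read_text())["response"]
            response = self.client.complete(prompt)
            path.write_text(json.dumps({"prompt": prompt, "response": response}, indent=2))
            return response

Since the logs are plain JSON files on disk, grep and your editor work on them directly.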
Is anyone using Prometheus / Grafana for LLM metrics? Seems like there’s a lot of existing leverage there. What makes LLM metrics different from other performance metrics? Why not use a single system to collect and analyze both?
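For the counters and latencies at least, plain prometheus_client seems to cover it; call_llm and the usage fields below are placeholders for whatever your client returns:

    from prometheus_client import Counter, Histogram, start_http_server

    LLM_LATENCY = Histogram(
        "llm_request_seconds", "LLM call latency",
        buckets=(0.5, 1, 2, 5, 10, 30),  # LLM calls are slow; default buckets top out at 10s
    )
    LLM_TOKENS = Counter("llm_tokens_total", "Tokens used", ["direction"])

    start_http_server(8000)  # exposes /metrics for Prometheus to scrape

    with LLM_LATENCY.time():
        reply = call_llm(prompt)  # placeholder: your client call
    LLM_TOKENS.labels("prompt").inc(reply.usage.prompt_tokens)          # OpenAI-style
    LLM_TOKENS.labels("completion").inc(reply.usage.completion_tokens)  # usage fields assumed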
How do you differentiate yourself from Langfuse?
How does this compare to something like Langfuse?