Llm feedback #1036


Merged: 12 commits into main from llm-feedback (Apr 14, 2025)
Conversation

@jakedahn (Contributor) commented on Apr 14, 2025

This PR contains all of the API and UI plumbing to handle user feedback submissions for a specific OpenTelemetry span ID (which points to an LLM request).

NOTE: This is currently a work in progress. We need to figure out how to get the actual LLM telemetry span ID from the active charm, and I'm not sure how to do that part.

The UI gives us a feedback button; when you click it, it pops open thumbs-up and thumbs-down buttons.

[Screenshot: feedback button expanded into thumbs-up and thumbs-down buttons]

When you click a thumbs button, it gives you a text field to fill out some feedback, which we then send to Phoenix.
[Screenshot: feedback dialog with text field]

This shows up in Phoenix as an annotation.
[Screenshot: the feedback annotation in Phoenix]
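
For reference, a minimal sketch of what the submission plumbing can look like (the /api/ai/llm/feedback endpoint is the one this PR adds; the helper name and payload field names are hypothetical):

// Hypothetical helper; the payload field names are illustrative, not from the PR.
type FeedbackSentiment = "positive" | "negative";

async function submitLlmFeedback(
  spanId: string,
  sentiment: FeedbackSentiment,
  comment: string,
): Promise<void> {
  // The toolshed endpoint relays this to Phoenix as a span annotation.
  const res = await fetch("/api/ai/llm/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ spanId, sentiment, comment }),
  });
  if (!res.ok) {
    throw new Error(`Feedback submission failed: ${res.status}`);
  }
}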


// FIXME(jake): This is for demo purposes... ideally we could just get the llm
// span from the persisted blobby blob of the charm recipe, but none of that is hooked up yet.
const CURRENT_SPAN_ID = "48fc49c695cdc4f3";
@jakedahn (author) commented:

To make this all real, we need to somehow grab the LLM span ID from the active charm; I have no idea how to wire this up.
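
For the record, the merged commits describe how this gap was eventually closed: the LLM client captures the x-ct-llm-trace-id response header and stores it via setLastTraceSpanID. A minimal sketch of that approach (the setter/getter names come from the commit message; the request shape here is assumed):

// Sketch only: setLastTraceSpanID/getLastTraceSpanID are named in the
// commit message; the fetch shape below is an assumption.
let lastTraceSpanID: string | undefined;

export function setLastTraceSpanID(id: string): void {
  lastTraceSpanID = id;
}

export function getLastTraceSpanID(): string | undefined {
  return lastTraceSpanID;
}

// In the LLM client, capture the trace ID after each request.
async function requestLlm(payload: unknown): Promise<Response> {
  const res = await fetch("/api/ai/llm", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  const traceId = res.headers.get("x-ct-llm-trace-id");
  if (traceId) setLastTraceSpanID(traceId);
  return res;
}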

Comment on lines 29 to +36
spanFilter: (span) => {
-  return isOpenInferenceSpan(span);
+  const includeSpanCriteria = [
+    isOpenInferenceSpan(span),
+    span.attributes["http.route"] == "/api/ai/llm", // Include the root span, which is in the hono app
+    span.instrumentationLibrary.name.includes("@vercel/otel"), // Include the actual LLM API fetch span
+  ];
+  return includeSpanCriteria.some((c) => c);
@jakedahn (author) commented:

This filter update makes all requests appear as children of a root span, which is a quality-of-life improvement in Phoenix.
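
A minimal sketch of how a predicate like this can gate span export, assuming the @opentelemetry/sdk-trace-base APIs and the repo's isOpenInferenceSpan helper (the filtering processor class is illustrative, not the PR's actual wiring):

import { BatchSpanProcessor, ReadableSpan } from "@opentelemetry/sdk-trace-base";

// Assumed to exist in the repo's instrumentation code.
declare function isOpenInferenceSpan(span: ReadableSpan): boolean;

// The predicate from the diff above: keep OpenInference spans, the hono
// root span for /api/ai/llm, and the @vercel/otel fetch span.
function shouldExport(span: ReadableSpan): boolean {
  const includeSpanCriteria = [
    isOpenInferenceSpan(span),
    span.attributes["http.route"] === "/api/ai/llm",
    span.instrumentationLibrary.name.includes("@vercel/otel"),
  ];
  return includeSpanCriteria.some((c) => c);
}

// Forward only matching spans to the underlying batch exporter.
class FilteringSpanProcessor extends BatchSpanProcessor {
  override onEnd(span: ReadableSpan): void {
    if (shouldExport(span)) super.onEnd(span);
  }
}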

@anotherjesse (Contributor) left a comment:

awesome - a great first pass

@anotherjesse merged commit abd6b22 into main on Apr 14, 2025
5 checks passed
@anotherjesse deleted the llm-feedback branch on April 14, 2025 at 23:29
anotherjesse pushed a commit that referenced this pull request Apr 15, 2025
User Feedback UI
 - Implemented FeedbackActions and FeedbackDialog components in Jumble to allow users to submit positive or negative feedback on generated content.
 - Added interactive thumbs-up/down buttons with feedback form submission workflow.

Trace Span ID Management
 - Added functions (setLastTraceSpanID, getLastTraceSpanID) to manage trace IDs in the builder environment.
 - Updated builder exports to expose new trace ID functions.

Integration of Trace ID in LLM Client
 - Modified LLMClient to capture x-ct-llm-trace-id from response headers and store it via setLastTraceSpanID.

Backend Feedback Handling
 - Extended toolshed API (/api/ai/llm/feedback) to accept user feedback and relay it to Phoenix observability backend.

Instrumentation and Span Filtering
 - Enhanced OpenTelemetry instrumentation to include specific spans relevant to LLM inference requests for better observability.
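
Since the commit message stays at the summary level, here is a rough sketch of what the /api/ai/llm/feedback relay in the hono app could look like; the Phoenix annotation endpoint path and payload shape are assumptions, not taken from this PR:

import { Hono } from "hono";

// Assumption: toolshed runs on Deno and Phoenix listens on its default port.
const PHOENIX_URL = Deno.env.get("PHOENIX_URL") ?? "http://localhost:6006";

const app = new Hono();

app.post("/api/ai/llm/feedback", async (c) => {
  const { spanId, sentiment, comment } = await c.req.json();
  // Assumed Phoenix REST shape for span annotations; verify against the
  // Phoenix version in use before relying on this.
  const res = await fetch(`${PHOENIX_URL}/v1/span_annotations`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      data: [{
        span_id: spanId,
        name: "user feedback",
        annotator_kind: "HUMAN",
        result: { label: sentiment, explanation: comment },
      }],
    }),
  });
  return c.json({ ok: res.ok }, res.ok ? 200 : 502);
});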
ubik2 added a commit that referenced this pull request Apr 20, 2025
* chore: commit wip with working serverside traverse overhaul

* chore: finished updating tests

* chore: commit wip while debugging lack of schema

* fix: wrong charm title (#1023)

* add gemini-flash-lite (#1035)

support flash-lite

* chore: Do not attempt to serve static directories (#1037)

* Only re-render CharmList when connection status actually changes (#1038)

* Only re-render CharmList when connection status actually changes

* Update types

* track loads differently based on their schema
import as FactModule instead of Fact to avoid collision
expose the schema associated with the facts from QueryView
make schemaContext optional in SchemaPathSelector, since I don't want to traverse commit logs
Change default storage to cached

* Llm feedback (#1036)


* fix schema query with commit+json

* chore: commit wip

* chore: env variable to toggle schema sub

* changed to have subscribing queries not delete themselves with the task/return
changed so the server sends each response to each subscriber, even though there may be duplicates. a schema subscription on the client needs to know its updated list of entities
added lots of debugging to try to track down this stall issue

* we should be able to inspect both the graph/query and the subscribe for the combo

* reverted incomplete change to docIsSyncing that caused me to stall on load

* fix blunder with pulling value from fact
cleaned up logging and comments

* change todo comment for linter

* remove incorrect comment in test

* fix the other half of the value blunder

* Remove unused import

* Added start-ci task for running integration tests that makes BroadcastChannel available.

* Change cache to only use the BroadcastChannel if it's available.

* better import of isObj

* just inline the isObj test

* Remove the BroadcastChannel, since we're not really using it.

---------

Co-authored-by: Irakli Gozalishvili <contact@gozala.io>
Co-authored-by: Jesse Andrews <anotherjesse@gmail.com>
Co-authored-by: Jordan Santell <jsantell@users.noreply.github.com>
Co-authored-by: Ben Follington <5009316+bfollington@users.noreply.github.com>
Co-authored-by: Jake Dahn <jake@common.tools>
ubik2 added a commit that referenced this pull request Apr 23, 2025
* Remove remote provider
Needs more work, since we fail tests which try to use indexdb in deno.

* correct merge error

---------

Co-authored-by: Irakli Gozalishvili <contact@gozala.io>
Co-authored-by: Jesse Andrews <anotherjesse@gmail.com>
Co-authored-by: Jordan Santell <jsantell@users.noreply.github.com>
Co-authored-by: Ben Follington <5009316+bfollington@users.noreply.github.com>
Co-authored-by: Jake Dahn <jake@common.tools>