Skip to content

chore: update seeder for testing #1123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 48 additions & 0 deletions .github/workflows/deno.yml
Original file line number Diff line number Diff line change
Expand Up @@ -367,6 +367,54 @@ jobs:
FRONTEND_URL=http://localhost:8000/ \
deno task integration

seeder-integration-test:
name: "Seeder Integration Tests"
runs-on: ubuntu-latest
needs: ["build-binaries"]
environment: ci
services:
redis:
image: redis
ports:
- 6379:6379
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- name: 📥 Checkout repository
uses: actions/checkout@v4

- name: 🦕 Setup Deno 2.2.2
uses: denoland/setup-deno@v2
with:
deno-version: "2.2.2"

- name: 📦 Cache Deno dependencies
uses: actions/cache@v3
with:
path: |
~/.deno
~/.cache/deno
key: ${{ runner.os }}-deno-${{ hashFiles('**/deno.json') }}

- name: 📥 Download built binaries
uses: actions/download-artifact@v4

- name: 🚀 Start Toolshed server for testing
run: |
chmod +x ./common-binaries/toolshed
CTTS_AI_LLM_ANTHROPIC_API_KEY=fake \
CACHE_DIR=${GITHUB_WORKSPACE}/jumble/integration/cache \
./common-binaries/toolshed &

- name: 🧪 Run seeder integration tests
working-directory: seeder
run: |
TOOLSHED_API_URL=http://localhost:8000/ \
deno task start --tag smol --no-verify --no-report --name ${{ github.run_id }}

# Automatic deployment to staging (toolshed)
deploy-toolshed:
name: "Deploy to Toolshed (Staging)"
Expand Down
21 changes: 14 additions & 7 deletions .github/workflows/evals.yml
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ jobs:
key: llm-cache-${{ github.run_id }}
restore-keys: llm-cache

- name: ✏️ List LLM Cache Load
- name: ✏️ List LLM Cache Load
run: |
mkdir -p ./llm_cache_dir/llm-api-cache
ls -1 ./llm_cache_dir/llm-api-cache > ./cache-log-pre
Expand Down Expand Up @@ -138,17 +138,24 @@ jobs:
FLAGS="--tag 10x10"
fi

TOOLSHED_API_URL=http://localhost:8000/ \
OPENAI_API_KEY=${{ secrets.CTTS_AI_LLM_OPENAI_API_KEY }} \
deno task start --name ${{ env.target_sha }} $FLAGS
# Wrap in an if/else, we don't want the job to fail yet
# until we can confidently expect passing results.
if
TOOLSHED_API_URL=http://localhost:8000/ \
OPENAI_API_KEY=${{ secrets.CTTS_AI_LLM_OPENAI_API_KEY }} \
deno task start --name ${{ env.target_sha }} $FLAGS ; then
echo "All charms have been successfully verified!"
else
echo "Some charms have failed verification."
fi

- name: ✏️ List LLM Cache Save
- name: ✏️ List LLM Cache Save
run: |
mkdir -p ./llm_cache_dir/llm-api-cache
ls -1 ./llm_cache_dir/llm-api-cache > ./cache-log-post
echo "$(cat ./cache-log-post | wc -l) items found in cache."
cat ./cache-log-post

- name: 📦 LLM Cache Status
run: |
if diff ./cache-log-pre ./cache-log-post > ./cache-analysis-diff ; then
Expand All @@ -158,7 +165,7 @@ jobs:
ADDITIONS=$(cat ./cache-analysis-diff | wc -l)
echo "LLM cache has added $((ADDITIONS / 2)) entries."
fi

- name: 📤 Upload reports
uses: actions/upload-artifact@v4
with:
Expand Down
2 changes: 1 addition & 1 deletion deno.jsonc
Original file line number Diff line number Diff line change
Expand Up @@ -112,4 +112,4 @@
"zod-to-json-schema": "npm:zod-to-json-schema@^3.24.1",
"zod": "npm:zod@^3.24.1"
}
}
}

This file was deleted.

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"model": "anthropic:claude-3-7-sonnet-20250219",
"system": "\nYou are creating a simple MVP based on the user's goal, using an existing data schema. Focus on the simplest implementation that works with the provided schema.\n\n<task>\nGiven a user's feature request and an existing data schema, you will:\n1. Create a short title (2-5 words) that names the artifact\n2. Create a one-sentence description in the format \"A <artifact> to <goal>\"\n3. Create a concise specification (3-5 sentences max) that works with the existing schema\n4. Generate a brief implementation plan (3 steps max)\n5. Design a minimal JSON schema that represents the core data model\n</task>\n\n<output_structure>\nYour response must be structured as follows:\n\n<thinking>Freeform reasoning for you to consider the solution.</thinking>\n\n<title>\n[Short title for the artifact, 2-5 words]\n</title>\n\n<description>\n[One-sentence description in the format \"A <artifact> to <goal>\"]\n</description>\n\n<spec>\n[Concise specification that captures only the essential requirements]\n</spec>\n\n<plan>\n[Brief 3-step implementation plan using the existing schema]\n</plan>\n\n<result_schema>\n[Minimal JSON Schema in valid JSON format that represents data created by the artifact]\n</result_schema>\n</output_structure>\n\n<schema_guidelines>\n1. Keep it minimal:\n - Include only essential fields (5-7 properties max)\n - Focus on the core functionality\n - If user requested complex features, simplify for this first version\n\n2. Each property should have:\n - A descriptive \"title\" field\n - A brief \"description\" field\n - A sensible default value where appropriate\n\n3. Example of a simple schema:\n```json\n{\n \"type\": \"object\",\n \"title\": \"Note List\",\n \"description\": \"A simple note list for the user\",\n \"properties\": {\n \"notes\": {\n \"type\": \"array\",\n \"title\": \"Notes\",\n \"description\": \"List of user notes\",\n \"default\": [],\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"title\": {\n \"type\": \"string\",\n \"title\": \"Title\",\n \"description\": \"Title of the note\",\n \"default\": \"New Note\"\n },\n \"content\": {\n \"type\": \"string\",\n \"title\": \"Content\",\n \"description\": \"Content of the note\"\n },\n \"created\": {\n \"type\": \"string\",\n \"format\": \"date-time\",\n \"title\": \"Created Date\",\n \"description\": \"When the note was created\"\n }\n },\n \"required\": [\"title\", \"content\"]\n }\n }\n },\n \"required\": [\"title\", \"content\"]\n}\n```\n\nGUIDELINES:\n- Aim for the simplest possible solution that works with the existing schema\n- The specification should take into account the existing schema structure\n- Focus on what can be achieved quickly with the existing data model\n- Avoid suggesting modifications to the schema if possible\n</schema_guidelines>\n\n<high_level>\nIMPORTANT:\n- Focus on the simplest working version\n- Aim for fewer fields rather than more\n- But still capture all the important state the user is creating\n- The user can always iterate and improve the solution later\n</high_level>\n",
"stream": false,
"messages": [
{
"role": "user",
"content": "\n<user_input>\n\n<goal>2048 game</goal>\n\n<description>Create a 🎮 2048 game charm with sliding 🔢 number tiles, score tracking, and game state management. The charm will need a grid layout, tile merging logic, and keyboard/swipe controls. Could add a 🏆 high score feature and 🔄 restart button.</description>\n\n</user_input>\n\n<existing_schema>\n```json\n{\n \"type\": \"object\",\n \"properties\": {}\n}\n```\n</existing_schema>\n\nBased on this goal and the existing schema, please provide a title, description, any additional schema, detailed specification, and implementation plan.\n"
},
{
"role": "assistant",
"content": "<thinking>\nThe user wants to create a 2048 game charm. The 2048 game is a sliding tile puzzle where players combine tiles with the same numbers to create a tile with the number 2048. \n\nKey components needed:\n1. A grid (typically 4x4) to hold the number tiles\n2. Logic for sliding tiles in four directions (up, down, left, right)\n3. Merging logic when two tiles with the same number collide\n4. Score tracking\n5. Game state management (in progress, won, lost)\n6. Optional: High score tracking and restart functionality\n\nThe existing schema is empty, so I'll need to design a complete schema that captures:\n- The game board (grid of numbers)\n- Current score\n- Game state\n- High score (as mentioned in optional features)\n\nFor the implementation plan, I'll focus on the core game mechanics first, then the UI, and finally the additional features.\n</thinking>\n\n<title>\n2048 Sliding Puzzle Game\n</title>\n\n<description>\nA game charm to play the classic 2048 sliding tile puzzle with score tracking.\n</description>\n\n<spec>\nThe 2048 game will feature a 4x4 grid where numbered tiles can be slid in four directions (up, down, left, right). When two tiles with the same number collide, they merge into a single tile with their sum. The game tracks the current score, game state (playing, won, lost), and highest tile achieved. A high score feature will persist the best score across game sessions.\n</spec>\n\n<plan>\n1. Implement the core game board with grid representation and tile movement/merging logic\n2. Create the UI with responsive grid layout and keyboard/swipe controls\n3. Add game state management, scoring system, and restart functionality\n</plan>\n\n<result_schema>\n{\n \"type\": \"object\",\n \"title\": \"2048 Game State\",\n \"description\": \"Represents the current state of a 2048 game\",\n \"properties\": {\n \"board\": {\n \"type\": \"array\",\n \"title\": \"Game Board\",\n \"description\": \"4x4 grid of number tiles (0 represents empty)\",\n \"default\": [\n [0, 0, 0, 0],\n [0, 0, 0, 0],\n [0, 0, 0, 0],\n [0, 0, 0, 0]\n ],\n \"items\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"integer\",\n \"minimum\": 0\n }\n }\n },\n \"score\": {\n \"type\": \"integer\",\n \"title\": \"Current Score\",\n \"description\": \"Player's current score in this game\",\n \"default\": 0,\n \"minimum\": 0\n },\n \"highScore\": {\n \"type\": \"integer\",\n \"title\": \"High Score\",\n \"description\": \"Highest score achieved across games\",\n \"default\": 0,\n \"minimum\": 0\n },\n \"gameState\": {\n \"type\": \"string\",\n \"title\": \"Game State\",\n \"description\": \"Current state of the game\",\n \"enum\": [\"playing\", \"won\", \"lost\"],\n \"default\": \"playing\"\n },\n \"highestTile\": {\n \"type\": \"integer\",\n \"title\": \"Highest Tile\",\n \"description\": \"Value of the highest tile on the board\",\n \"default\": 2,\n \"minimum\": 2\n }\n },\n \"required\": [\"board\", \"score\", \"gameState\"]\n}\n</result_schema>"
}
],
"maxTokens": 64000
}
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
},
{
"role": "assistant",
"content": "<thinking>\nLet's analyze what we have:\n\n1. The user wants to display data from a `simpleValue2` charm.\n2. The existing schema shows that `simpleValue2` contains an object with a property called `values`, which is an array of strings.\n\nThe user wants a simple way to view this data. I'll create a simple viewer component that displays the current values in the array.\n\nFor the schema, I'll keep it minimal since we're just displaying existing data. The result schema will represent what the viewer component will display - the values from the array and potentially some metadata about the display.\n</thinking>\n\n<title>\nSimpleValue2 Viewer\n</title>\n\n<description>\nA component to display the current values stored in the simpleValue2 charm.\n</description>\n\n<spec>\nThe SimpleValue2 Viewer will display all string values currently stored in the simpleValue2.values array. It will present the values in a clean, readable list format. The viewer will automatically update when the underlying data changes.\n</spec>\n\n<plan>\n1. Access the simpleValue2 data from the existing schema\n2. Create a display component that renders the values array as a list\n3. Implement auto-refresh functionality to keep the display in sync with data changes\n</plan>\n\n<result_schema>\n{\n \"type\": \"object\",\n \"title\": \"SimpleValue2 Viewer\",\n \"description\": \"Display configuration for simpleValue2 data\",\n \"properties\": {\n \"displayData\": {\n \"type\": \"object\",\n \"title\": \"Display Data\",\n \"description\": \"The data being displayed from simpleValue2\",\n \"properties\": {\n \"values\": {\n \"type\": \"array\",\n \"title\": \"Values\",\n \"description\": \"The string values from simpleValue2\",\n \"items\": {\n \"type\": \"string\"\n },\n \"default\": []\n },\n \"lastUpdated\": {\n \"type\": \"string\",\n \"format\": \"date-time\",\n \"title\": \"Last Updated\",\n \"description\": \"When the display was last refreshed\"\n }\n },\n \"required\": [\"values\"]\n },\n \"displaySettings\": {\n \"type\": \"object\",\n \"title\": \"Display Settings\",\n \"description\": \"Configuration for how the data is displayed\",\n \"properties\": {\n \"showTimestamp\": {\n \"type\": \"boolean\",\n \"title\": \"Show Timestamp\",\n \"description\": \"Whether to show when the data was last updated\",\n \"default\": true\n },\n \"refreshInterval\": {\n \"type\": \"integer\",\n \"title\": \"Refresh Interval\",\n \"description\": \"How often to refresh the display (in seconds)\",\n \"default\": 30\n }\n }\n }\n },\n \"required\": [\"displayData\"]\n}\n</result_schema>"
"content": "<thinking>\nLet's analyze what we have:\n\n1. The user wants to display data from a `simpleValue2` charm.\n2. The existing schema shows that `simplevalue2` is an object with a property called `values`, which is an array of strings.\n3. The user wants to see the current value and any associated metadata.\n\nFrom the schema, it seems like the main data is in the `values` array. There doesn't appear to be explicit metadata in the schema, just the array of string values.\n\nFor this MVP, I'll create a simple viewer that displays the array of strings from the `simplevalue2.values` property. Since there's no explicit metadata in the schema, I'll focus on displaying the values themselves in a clean, readable format.\n\nThe artifact will be a viewer component that reads from the existing schema and presents the data in a user-friendly way.\n</thinking>\n\n<title>\nSimpleValue2 Viewer\n</title>\n\n<description>\nA viewer component to display the contents of the simpleValue2 charm.\n</description>\n\n<spec>\nThe SimpleValue2 Viewer will read and display all string values stored in the simplevalue2.values array. It will present these values in a clean, readable list format. The viewer will automatically update when the underlying data changes.\n</spec>\n\n<plan>\n1. Create a component that accesses the simplevalue2.values array from the existing schema\n2. Implement a display layout that renders each string value as a list item\n3. Add basic styling and empty state handling for when the values array is empty\n</plan>\n\n<result_schema>\n{\n \"type\": \"object\",\n \"title\": \"SimpleValue2 Viewer\",\n \"description\": \"Display component for simpleValue2 data\",\n \"properties\": {\n \"displayData\": {\n \"type\": \"object\",\n \"title\": \"Display Data\",\n \"description\": \"Formatted data for display\",\n \"properties\": {\n \"values\": {\n \"type\": \"array\",\n \"title\": \"Values\",\n \"description\": \"List of string values from simpleValue2\",\n \"default\": [],\n \"items\": {\n \"type\": \"string\"\n }\n },\n \"count\": {\n \"type\": \"integer\",\n \"title\": \"Count\",\n \"description\": \"Number of values in the array\",\n \"default\": 0\n },\n \"lastUpdated\": {\n \"type\": \"string\",\n \"format\": \"date-time\",\n \"title\": \"Last Updated\",\n \"description\": \"When the viewer last refreshed the data\"\n }\n },\n \"required\": [\"values\", \"count\"]\n }\n },\n \"required\": [\"displayData\"]\n}\n</result_schema>"
}
],
"maxTokens": 64000
Expand Down
Loading