Commit e4dd4b5

smoketest version of ralph, basic files (#1951)
* smoketest version of ralph, basic files
* temporarily use git branch
* ask ralph to use playwright
* telling it how to deploy locally
* how to use smoketest doc
* cleanup formatting for TASKS.md
* added script to run several ralph smoketests and stop them
* clean up to point back to main, ready for landing
* added note to self to clean up sleep infinity later
1 parent 465424f commit e4dd4b5

8 files changed, 252 additions and 35 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -23,3 +23,4 @@ claude-research.key
 local-dev-toolshed.log
 local-dev-shell.log
 tools/ralph/logs/
+tools/ralph/smoketest/

tools/ralph/Dockerfile

Lines changed: 5 additions & 3 deletions
@@ -76,6 +76,8 @@ RUN npm install -g @anthropic-ai/claude-code && \
 # --no-sandbox is required because Docker containers restrict namespace creation
 RUN claude mcp add --scope user playwright npx "@playwright/mcp@latest" -- --headless --isolated --no-sandbox
 
-# Start Common Tool servers in background and keep container alive
-# the sleep ensures that restarting the server doesnt cause the container to exit
-CMD ["/bin/sh", "-c", "/app/start-servers.sh & sleep infinity"]
+# Start servers in background, run smoketest, then exit
+# If you want to run ralph normally without the smoketest, then replace with:
+# CMD ["/bin/sh", "-c", "/app/start-servers.sh & sleep infinity"]
+# TODO(@ellyxir): remove sleep infinity, smoketest should exit when its done
+CMD ["/bin/sh", "-c", "/app/start-servers.sh & /app/labs/tools/ralph/bin/ralph-smoketest.sh; sleep infinity"]
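The TODO in this hunk asks for the container to exit once the smoketest is done instead of falling through to `sleep infinity`. One way that could look is a small wrapper that starts the servers in the background, runs the smoketest in the foreground, and returns the smoketest's exit status. This is only a sketch: `run_then_exit` is a hypothetical helper name, and it assumes that stopping the server process it started is enough (if `start-servers.sh` spawns detached children, they would need separate handling).

```shell
# Sketch: run a server in the background, run one foreground task, then stop
# the server and return the task's exit status (no trailing `sleep infinity`).
# `run_then_exit` is a hypothetical helper, not part of the commit.
run_then_exit() {
  local server_cmd="$1" task_cmd="$2" server_pid status=0

  "$server_cmd" &
  server_pid=$!

  # Capture the task's status without tripping `set -e` in the caller.
  "$task_cmd" || status=$?

  # Stop the background server; ignore errors if it already exited.
  kill "$server_pid" 2>/dev/null || true
  wait "$server_pid" 2>/dev/null || true

  return "$status"
}

# Possible use inside the container (paths taken from the Dockerfile above):
#   run_then_exit /app/start-servers.sh /app/labs/tools/ralph/bin/ralph-smoketest.sh
#   exit $?
```

Returning the task's status would also let `docker wait` or `docker inspect` report whether the smoketest script itself failed.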

tools/ralph/README.md

Lines changed: 67 additions & 1 deletion
@@ -6,7 +6,73 @@ Ability to run [Ralph](https://ghuntley.com/ralph/)
 
 Claude CLI and Codex are installed
 
-## How to run Ralph
+## How to use Smoketest Ralph
+
+Smoketest Ralph runs a single task from `TASKS.md` and exits when complete. Each
+Ralph instance is assigned a specific task via the `RALPH_ID` environment
+variable.
+
+### Running smoketests
+
+**Option 1: Use the automated script (recommended)**
+
+Run multiple smoketests in parallel (tasks 1-3):
+
+```bash
+./tools/ralph/bin/run_smoketest.sh
+```
+
+This script will:
+
+- Stop/remove any existing ralph containers
+- Clean up old results
+- Start 3 smoketest containers in parallel (RALPH_ID 1, 2, and 3)
+- Write results to `./tools/ralph/smoketest/<ID>/`
+
+Monitor progress with:
+
+```bash
+docker logs ralph_1 # or ralph_2, ralph_3
+```
+
+To stop all running smoketests:
+
+```bash
+./tools/ralph/bin/stop_smoketest.sh
+```
+
+**Option 2: Run a single smoketest manually**
+
+1. Build the Docker image (if not using pre-built):
+
+   ```bash
+   docker build -t ellyxir/ralph tools/ralph/
+   ```
+
+2. Run the container with your RALPH_ID and mounted credentials:
+
+   ```bash
+   cd ~/labs
+   docker run -e RALPH_ID=3 -d -v ~/.claude.json:/home/ralph/.claude.json \
+     -v ~/.claude/.credentials.json:/home/ralph/.claude/.credentials.json \
+     -v ./tools/ralph/smoketest:/app/smoketest --name ralph ellyxir/ralph
+   ```
+
+Note: The container keeps running after the smoketest ends; stop it manually.
+
+### Retrieving results
+
+Results are available on the host machine in
+`./tools/ralph/smoketest/${RALPH_ID}/`:
+
+- `SCORE.txt` - Contains SUCCESS, PARTIAL, or FAILURE
+- `RESULTS.md` - Summary of work including test results
+- Pattern files created during the task
+
+No need to copy files from the container - the bind mount makes results
+immediately available on your host.
+
+## How to run Ralph (not smoketest)
 
 ### Using pre-built image from Docker Hub (recommended)
 
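The README's "Retrieving results" section describes one directory per `RALPH_ID`, each with a `SCORE.txt`. A quick way to read those scores in one pass might look like the sketch below. `summarize_scores` is a hypothetical helper, and the `PENDING` label for a directory that has no `SCORE.txt` yet is an assumption of this example, not something the commit defines.

```shell
# Sketch: print "<id> <score>" for each result directory under a smoketest root.
# `summarize_scores` and the PENDING label are illustrative, not from the commit.
summarize_scores() {
  local root="${1:-./tools/ralph/smoketest}"
  local dir id score

  for dir in "$root"/*/; do
    # If the glob matched nothing, the literal pattern is not a directory.
    [ -d "$dir" ] || continue
    id="$(basename "$dir")"
    if [ -f "$dir/SCORE.txt" ]; then
      # Strip whitespace so a trailing newline does not affect the output.
      score="$(tr -d '[:space:]' < "$dir/SCORE.txt")"
    else
      score="PENDING"
    fi
    printf '%s %s\n' "$id" "$score"
  done
}

# Possible use from the labs checkout:
#   summarize_scores ./tools/ralph/smoketest
```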

tools/ralph/SMOKETEST_PROMPT.md

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# Smoketest Ralph General Prompt
+
+Goal: implement the unchecked item from `./tools/ralph/TASKS.md` that matches
+your assigned RALPH_ID.
+
+1. Open `./tools/ralph/TASKS.md` and find the task numbered with your RALPH_ID.
+
+2. If your assigned task is already checked `[x]`, exit with a message saying
+   the task is already complete.
+
+3. Use Claude Skills "recipe-dev" to work on the task that corresponds to your
+   RALPH_ID number.
+
+4. Format the changed files with `deno fmt`.
+
+5. Once tests pass, deploy it locally; see `./docs/common/RECIPE_DEV_DEPLOY.md`
+   for how to deploy. It should deploy to http://localhost:8080 and not to
+   toolshed. Servers are already running.
+
+6. Once it is deployed locally, use a subagent to test your work with the
+   Playwright MCP server.
+
+7. Check off the completed items in `TASKS.md`.
+
+8. Stage your changes with git and commit them with a message.
+
+9. Copy the files you created for the task to `/app/smoketest/${RALPH_ID}/`.
+
+10. Create a summary of your work in the same directory. Be sure to include
+    Playwright results and what you tested. File location:
+    `/app/smoketest/${RALPH_ID}/RESULTS.md`
+
+11. Create `/app/smoketest/${RALPH_ID}/SCORE.txt` containing one of the following
+    values based on your results: SUCCESS, PARTIAL, or FAILURE.
+
+12. Add feedback on the documentation to `./tools/ralph/LEARNINGS.md`.
+
+13. Exit.
+
+Please begin.
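Steps 1 and 2 of the prompt depend on finding the task whose number matches `RALPH_ID` and seeing whether it is already checked. The same lookup can be sketched in shell, which may be handy for a pre-flight check before a container is started. `task_status` and its `done`/`todo`/`missing` outputs are hypothetical names chosen for this example; the `N. [ ]` and `N. [x]` line format is the one the new `TASKS.md` in this commit uses.

```shell
# Sketch: report whether task N in TASKS.md is checked.
# Prints "done", "todo", or "missing". `task_status` is a hypothetical helper.
task_status() {
  local id="$1" file="${2:-./tools/ralph/TASKS.md}" line

  while IFS= read -r line; do
    case "$line" in
      # Quoted parts of a case pattern match literally, so "1." never matches "10.".
      "$id. [x]"*|"$id. [X]"*) echo "done"; return 0 ;;
      "$id. [ ]"*) echo "todo"; return 0 ;;
    esac
  done < "$file"

  echo "missing"
}

# Possible use:
#   [ "$(task_status "$RALPH_ID")" = "todo" ] || echo "nothing to do for $RALPH_ID"
```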

tools/ralph/TASKS.md

Lines changed: 52 additions & 31 deletions
@@ -1,33 +1,54 @@
 # Ralph Task List
 
-A running checklist of tasks. New items include brief implementation notes for a
-future Ralph pass.
-
-Tasks marked with [UI] mean they should add/remove/modify the UI of the pattern.
-UI tasks should wire up the functionality and call the appropriate handlers.
-
-- [] Create a counter
-  - [] Add [UI] buttons for incrementing counter
-    - **Test with Playwright**: click increment 3 times, verify counter shows 3,
-      click decrement once, verify counter shows 2
-    - **State**: The displayed count matches the pattern's `count` output field
-- [] Create multiple counters
-  - [] Add [UI] buttons to create multiple counters
-    - **Test with Playwright**: create 3 counters, test each one
-    - **State**: Each counter maintains its own value in the pattern's
-      `counters` array
-- [] Create a shopping list
-  - [] Create [UI] for shopping list
-    - **Test with Playwright**: add "milk" and "bread", make sure you see both,
-      remove "bread", verify only "milk" remains
-    - **State**: The list shows all items from pattern's `items` array with
-      correct `completed` status
-- [] Lunch voter - list of destinations (just a string) (dedup)
-  - [] [UI] for adding list of destination (just a string) and displaying it
-    - **UI must**: Show an editable list with add/remove buttons for
-      destinations
-    - **Test with Playwright**: Deploy pattern, add at least 2 destinations via
-      UI, verify they appear in the list, remove one destination, verify it's
-      removed from both UI and charm output
-    - **State**: The displayed list matches the pattern's `destinations` output
-      field
+1. [ ] Counter
+   - Components: Count display + Increment/decrement buttons + Reset button
+   - Data: Current count value
+   - Features: Increment, decrement, reset to zero
+
+2. [ ] Shopping List with sort-by-category and budget tracking. These should be
+       3 different patterns (shopping list, category list, and budget tracker)
+       and a final pattern that combines them together and acts as a launcher.
+   - Components: Shopping list (item input + checkboxes + clear button) +
+     Category list (category input + item assignment) + Budget tracker (price
+     input + total display) + Launcher (tabs/buttons to switch between views)
+   - Data: Shopping items with name, category, price, checked status; Categories
+     with names
+   - Features: Add/remove items, assign categories, track prices, sort by
+     category, view budget totals, check off purchased items
+
+3. [ ] Calendar
+   - Components: Month view with day cells + Event list displayed in calendar +
+     Day editor (opens when clicking a day)
+   - Data: Events with date, time, description
+   - Features: View one month at a time, click day to edit its event list,
+     events shown in calendar UI
+
+4. [ ] Fitness Workout Planner
+   - Components: Exercise routine builder + Set/rep counter + Progress chart
+   - Data: Exercises with sets, reps, weight
+   - Features: Track personal records, show strength gains over time
+
+5. [ ] Lunch Voter
+   - Components: Restaurant list + Voting buttons + Vote tally display +
+     Add/remove restaurant form
+   - Data: Restaurants with vote counts
+   - Features: Add/remove restaurants, vote for favorites, see most popular
+     choice
+
+6. [ ] Study Schedule with Focus Timer
+   - Components: Study task list + Time block scheduler + Pomodoro timer + Break
+     reminders
+   - Data: Study topics, estimated duration, completion status
+   - Features: Schedule study sessions, track time spent, enforce breaks
+
+7. [ ] Travel Itinerary with Budget Tracker
+   - Components: Activity scheduler + Day-by-day timeline + Expense tracker +
+     Budget dashboard
+   - Data: Activities with time, location, cost
+   - Features: Plan entire trip, track expenses by category, budget warnings
+
+8. [ ] Contact Manager with Birthday Reminders
+   - Components: Contact list + Upcoming birthdays view + Gift idea notes +
+     Calendar integration
+   - Data: Contacts with birthdays, gift history
+   - Features: Birthday notifications, gift suggestions, relationship notes
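The numbered `N. [ ]` format above is what makes a task addressable by `RALPH_ID`, and the smoketest prompt asks for the finished task to be checked off in this file. A minimal sketch of that edit follows. `mark_task_done` is a hypothetical helper (in the commit, the agent edits the file itself), and it assumes the checkbox sits at the start of the line exactly as shown above.

```shell
# Sketch: flip "N. [ ]" to "N. [x]" for one task in a TASKS.md-style file.
# `mark_task_done` is a hypothetical helper, not part of the commit.
mark_task_done() {
  local id="$1" file="${2:-./tools/ralph/TASKS.md}" tmp

  # Accept only a plain number so the ID is safe to embed in the sed pattern.
  case "$id" in
    ''|*[!0-9]*) return 2 ;;
  esac

  tmp="$(mktemp)" || return 1
  # A bare "]" is literal outside a bracket expression; other lines pass through.
  sed "s/^${id}\. \[ ]/${id}. [x]/" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }

  # Write back in place so the original file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Possible use:
#   mark_task_done "$RALPH_ID" ./tools/ralph/TASKS.md
```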

tools/ralph/bin/ralph-smoketest.sh

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+#!/usr/bin/env bash
+
+# Get the directory where this script is located
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+# Get the ralph directory (parent of bin)
+RALPH_DIR="$(dirname "$SCRIPT_DIR")"
+# Get the labs directory (two levels up from the ralph directory)
+LABS_DIR="$(dirname "$(dirname "$RALPH_DIR")")"
+
+# Change to labs directory for relative paths to work
+cd "$LABS_DIR"
+
+# Check RALPH_ID is set
+if [ -z "$RALPH_ID" ]; then
+  echo "Error: RALPH_ID environment variable is not set"
+  exit 1
+fi
+
+# Ensure logs directory exists
+mkdir -p ./tools/ralph/logs
+
+# Rotate logs keeping last 5
+for i in 4 3 2 1; do
+  [ -f ./tools/ralph/logs/ralph-claude.log.$i ] && mv ./tools/ralph/logs/ralph-claude.log.$i ./tools/ralph/logs/ralph-claude.log.$((i+1))
+done
+[ -f ./tools/ralph/logs/ralph-claude.log ] && mv ./tools/ralph/logs/ralph-claude.log ./tools/ralph/logs/ralph-claude.log.1
+
+# llm command to summarize changes
+LLM="./tools/ralph/bin/llm.sh"
+
+{ printf "Your RALPH_ID is %s.\n\n" "$RALPH_ID"; cat ./tools/ralph/SMOKETEST_PROMPT.md; } | \
+  claude --print --dangerously-skip-permissions \
+  --verbose --output-format=stream-json 2>&1 | \
+  tee -a ./tools/ralph/logs/ralph-claude.log
+
+# Auto-stash changes if any exist
+if [[ -n "$(git status --porcelain)" ]]; then
+  git add -A
+
+  # Generate commit message from staged changes
+  commit_msg=$(git diff --staged | $LLM "Summarize these changes into a short one-line description, output just that one line")
+
+  git stash push -m "$commit_msg"
+fi
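The log rotation in this script shifts `ralph-claude.log.4` through `.1` up by one and then moves the current log to `.1`. The same scheme can be written as a reusable function, which may make it easier to read and to try out on a scratch directory. `rotate_logs` and its `keep` parameter are hypothetical names; with `keep` left at 5 the sketch is intended to match what the script does.

```shell
# Sketch: numbered log rotation, equivalent in intent to the loop in the script.
# `rotate_logs` is a hypothetical helper; `keep` is the highest suffix retained.
rotate_logs() {
  local log="$1" keep="${2:-5}" i

  # Shift log.(keep-1) -> log.keep, ..., log.1 -> log.2 (oldest first).
  i=$((keep - 1))
  while [ "$i" -ge 1 ]; do
    if [ -f "$log.$i" ]; then
      mv "$log.$i" "$log.$((i + 1))"
    fi
    i=$((i - 1))
  done

  # Finally move the current log into slot 1.
  if [ -f "$log" ]; then
    mv "$log" "$log.1"
  fi
}

# Possible use, mirroring the script:
#   rotate_logs ./tools/ralph/logs/ralph-claude.log
```

Using `if` rather than `[ -f ... ] && mv ...` keeps the function's exit status at zero when a slot is simply empty.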

tools/ralph/bin/run_smoketest.sh

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+
+# Get the directory where this script is located
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+# Get the ralph directory (parent of bin)
+RALPH_DIR="$(dirname "$SCRIPT_DIR")"
+# Get the labs directory (two levels up from the ralph directory)
+LABS="$(dirname "$(dirname "$RALPH_DIR")")"
+
+# Change to labs directory
+cd "$LABS"
+
+# Stop and remove any existing ralph containers first
+for ID in 1 2 3; do
+  docker stop ralph_$ID 2>/dev/null || true
+  docker rm ralph_$ID 2>/dev/null || true
+done
+
+# Remove existing results after containers are stopped
+rm -rf "$LABS/tools/ralph/smoketest"/[0-9]*
+
+# Run smoketests for IDs 1 through 3
+for ID in 1 2 3; do
+  echo "Starting smoketest for RALPH_ID=$ID"
+  docker run --rm -e RALPH_ID=$ID -d \
+    -v ~/.claude.json:/home/ralph/.claude.json \
+    -v ~/.claude/.credentials.json:/home/ralph/.claude/.credentials.json \
+    -v "$LABS/tools/ralph/smoketest:/app/smoketest" \
+    --name ralph_$ID \
+    ellyxir/ralph
+done
+
+echo "All smoketests started. Use 'docker logs ralph_<ID>' to monitor progress."
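The script starts the containers detached and returns immediately, so something else has to decide when the run is finished. Since each Ralph is asked to write `SCORE.txt` into its result directory, one option is to poll for those files with a timeout. The sketch below assumes that convention. `wait_for_scores` and the `POLL_INTERVAL` variable are hypothetical names, not part of the commit.

```shell
# Sketch: wait until every listed RALPH_ID has a SCORE.txt, or time out.
# Usage: wait_for_scores ROOT TIMEOUT_SECONDS ID...
# Returns 0 when all scores exist, 1 on timeout. Names are illustrative.
wait_for_scores() {
  local root="$1" timeout="$2" waited=0 missing id
  shift 2

  while :; do
    missing=0
    for id in "$@"; do
      if [ ! -f "$root/$id/SCORE.txt" ]; then
        missing=$((missing + 1))
      fi
    done

    if [ "$missing" -eq 0 ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi

    sleep "${POLL_INTERVAL:-5}"
    waited=$((waited + ${POLL_INTERVAL:-5}))
  done
}

# Possible use after run_smoketest.sh, allowing up to an hour:
#   wait_for_scores ./tools/ralph/smoketest 3600 1 2 3 || echo "timed out"
```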

tools/ralph/bin/stop_smoketest.sh

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+#!/usr/bin/env bash
+
+# Stop and remove all ralph smoketest containers
+for ID in 1 2 3; do
+  echo "Stopping ralph_$ID..."
+  docker stop ralph_$ID 2>/dev/null || true
+  docker rm ralph_$ID 2>/dev/null || true
+done
+
+echo "All smoketest containers stopped and removed."
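The same stop-and-remove loop appears in both `run_smoketest.sh` and `stop_smoketest.sh`, with IDs 1 to 3 hard-coded in each. If the set of IDs ever needs to vary, the loop could be factored into a function that takes IDs as arguments and falls back to 1, 2, 3. This is a sketch only. `stop_ralph_containers` is a hypothetical name, and it uses just the `docker stop` and `docker rm` calls already present in the scripts.

```shell
# Sketch: stop and remove ralph_<ID> containers for the given IDs.
# Defaults to IDs 1 2 3 when called without arguments.
# `stop_ralph_containers` is a hypothetical helper, not part of the commit.
stop_ralph_containers() {
  local id

  # Fall back to the IDs the committed scripts use.
  if [ "$#" -eq 0 ]; then
    set -- 1 2 3
  fi

  for id in "$@"; do
    echo "Stopping ralph_$id..."
    # Ignore failures: the container may not exist or may already be gone.
    docker stop "ralph_$id" >/dev/null 2>&1 || true
    docker rm "ralph_$id" >/dev/null 2>&1 || true
  done
}

# Possible use:
#   stop_ralph_containers        # ralph_1, ralph_2, ralph_3
#   stop_ralph_containers 4 5    # ralph_4, ralph_5
```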
