Analytics Prompt Tracking – Sypha Integration

This document captures how prompt_text is populated and sent to the analytics FastAPI backend, so we can safely re-apply this behavior during future merges/rebases.


High-level Flow

  1. User input arrives

    • New task: Controller.initTask() → Task.startTask(task, images, files)
    • Follow-up / feedback: Task.handleWebviewAskResponse(askResponse, text, images, files)
    • Resume from history: Task.resumeTaskFromHistory(...)
  2. Prompt capture on Task side

    • All user-visible text is normalized into Task.taskState.originalUserPrompt.
  3. API request started

    • Task.recursivelyMakeSyphaRequests(userContent, ...) starts an analytics message:
      • Builds a per-turn messageId (timestamp).
      • Derives userPromptText mainly from taskState.originalUserPrompt.
      • Calls AnalyticsService.startMessage(messageId, userPromptText, systemPrompt, provider, modelId).
  4. Analytics service

    • AnalyticsService.startMessage sanitizes the prompt and stores it in currentMessage.userPrompt.
    • All analytics payloads (completeMessage, progressive updates, failure paths) use currentMessage.userPrompt for prompt_text.
  5. FastAPI send

    • AnalyticsService builds AnalyticsPayload and calls sendAnalyticsPayloadToFastApi(...).
    • LocalAnalyticsClient performs structured and raw POSTs with robust logging; a minimal sketch of this send path appears right after this list.
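
For orientation, the send in step 5 amounts to a plain JSON POST. The following is a minimal sketch: the endpoint URL, the trimmed-down payload interface, and the function name are illustrative assumptions; only prompt_text, token_size_user_prompt, and the tool-call counters are confirmed by the payload builders described in section 4.

// Sketch of the FastAPI send path (URL and exact payload shape are assumptions;
// AnalyticsService builds the real AnalyticsPayload).
interface AnalyticsPayloadSketch {
  prompt_text: string
  token_size_user_prompt: number
  total_tool_calls: number
  successful_tool_calls: number
  failed_tool_calls: number
  // ...additional fields populated by AnalyticsService
}

async function sendAnalyticsPayloadSketch(payload: AnalyticsPayloadSketch): Promise<void> {
  try {
    const response = await fetch("http://localhost:8000/analytics", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    })
    if (!response.ok) {
      console.error("[LocalAnalyticsClient] POST failed with status", response.status)
    }
  } catch (error) {
    // In this sketch, failures are only logged so analytics never interrupts the task
    console.error("[LocalAnalyticsClient] POST error:", error)
  }
}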

1. Prompt Capture – Task Layer

1.1 New Task (first user message)

File: Task (src/core/task/index.ts)

public async startTask(task?: string, images?: string[], files?: string[]): Promise<void> {
  // ...
  this.messageStateHandler.setSyphaMessages([])
  this.messageStateHandler.setApiConversationHistory([])

  // ✅ Capture initial user task for analytics
  if (task && task.trim()) {
    this.taskState.originalUserPrompt = task.trim()
    console.log("[Task][startTask] Set originalUserPrompt from task parameter:", task.trim().substring(0, 100))
  }

  await this.postStateToWebview()
  await this.say("text", task, images, files)
  // ...
}

Key idea: For the very first message, we directly set originalUserPrompt from the task string, before any hooks or context processing can alter it.

1.2 Follow-up messages from webview / CLI

File: Task.handleWebviewAskResponse(...)

async handleWebviewAskResponse(askResponse: SyphaAskResponse, text?: string, images?: string[], files?: string[]) {
  // Mode-based slash command injection for follow-ups (if enabled)
  let modifiedText = text
  if (text && askResponse === "messageResponse") {
    // inject /design, /analyze, etc. based on selected mode
    // ...
  }

  this.taskState.askResponse = askResponse
  this.taskState.askResponseText = modifiedText
  this.taskState.askResponseImages = images
  this.taskState.askResponseFiles = files

  // ✅ Capture ALL ask responses that have text
  // Covers: messageResponse, followup, plan_mode_respond, and any future types with text
  if (typeof modifiedText === "string" && modifiedText.trim()) {
    this.taskState.originalUserPrompt = modifiedText
    // Force a fresh analytics message for this user turn
    this.taskState.currentAnalyticsMessageId = undefined
    const preview = modifiedText.length > 200 ? `${modifiedText.substring(0, 200)}…` : modifiedText
    console.log("[Task][original prompt][handleWebviewAskResponse] captured", askResponse, ":", preview)
  }
}

Key ideas:

  • All follow-up text (including plan-mode and feedback answers) flows into originalUserPrompt.
  • We clear currentAnalyticsMessageId so each user turn gets its own analytics message.

1.3 Resume from history

File: Task.resumeTaskFromHistory(...)

This path attempts to reconstruct the original prompt from saved syphaMessages and API history, and sets:

this.taskState.originalUserPrompt = extractedPrompt.trim()
this.taskState.currentAnalyticsMessageId = undefined

So resumed tasks also have correct prompt_text.
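
For reference, the reconstruction can be sketched as a scan over the saved messages. This is a minimal sketch: it assumes saved messages expose type and text fields (as getSyphaMessages() does elsewhere in this document); the helper name and the exact selection rule are illustrative, not the actual implementation.

// Sketch: pick the most recent message with meaningful text and prefer tagged content.
function extractPromptFromHistorySketch(messages: Array<{ type: string; text?: string }>): string | undefined {
  // Walk from the end so the most recent user turn wins
  for (let i = messages.length - 1; i >= 0; i--) {
    const text = messages[i].text
    if (typeof text === "string" && text.trim()) {
      // Prefer the content of an explicit tag if one is present
      const tagged = text.match(/<(?:user_message|task|feedback|answer)>([\s\S]*?)<\//)
      return (tagged ? tagged[1] : text).trim()
    }
  }
  return undefined
}

// Usage in resumeTaskFromHistory (sketch):
// const extractedPrompt = extractPromptFromHistorySketch(this.messageStateHandler.getSyphaMessages())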

1.4 Extraction during loadContext

File: Task.loadContext(...)

Before building the system prompt and making the API call, we further refine originalUserPrompt:

const existingPrompt = this.taskState.originalUserPrompt
let originalUserPrompt = existingPrompt ?? ""
let foundExplicitPrompt = false

// Look for explicit tags in current userContent:
// <user_message>, <task>, <feedback>, <answer>
// Fallback: meaningful plain text block (skipping scaffolding)
// ...

// If nothing new found, preserve existing prompt
if (!originalUserPrompt && existingPrompt && existingPrompt.trim()) {
  originalUserPrompt = existingPrompt.trim()
  console.log("[Task][original prompt] Preserving existing prompt ..., length:", existingPrompt.trim().length)
}

// Store for use in startMessage() later
if (originalUserPrompt && originalUserPrompt.trim()) {
  this.taskState.originalUserPrompt = originalUserPrompt.trim()
  console.log("[Task][original prompt] Final stored prompt length:", originalUserPrompt.trim().length)
}

Key ideas:

  • Prefer explicit tags (<task>, <user_message>, <feedback>, <answer>).
  • Fallback to a “clean” text block if available.
  • Never overwrite a good prompt with empty string.
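
The tag preference order above can be sketched as follows, assuming the text portions of userContent have already been flattened into a single string; the helper name, the tag order, and the scaffolding pattern are illustrative.

// Sketch of loadContext's extraction rules: explicit tags first, then a cleaned fallback.
const PROMPT_TAGS = ["user_message", "task", "feedback", "answer"] as const

function extractPromptFromContentSketch(flattenedContent: string): string | undefined {
  // 1. Prefer explicit tags
  for (const tag of PROMPT_TAGS) {
    const match = flattenedContent.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`))
    if (match && match[1].trim()) {
      return match[1].trim()
    }
  }
  // 2. Fallback: whatever remains after stripping known scaffolding
  const withoutScaffolding = flattenedContent
    .replace(/<environment_details>[\s\S]*?<\/environment_details>/g, "")
    .trim()
  return withoutScaffolding || undefined
}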

2. Starting Analytics – recursivelyMakeSyphaRequests

File: Task.recursivelyMakeSyphaRequests(...)

When a user turn triggers an API request, we start analytics tracking once for that turn:

if (this.taskState.currentAnalyticsMessageId) {
  console.log("[Task] Reusing existing analytics message ID:", this.taskState.currentAnalyticsMessageId, "for recursive API request")
} else {
  console.log("[Task] Starting analytics tracking for message...")
  const messageId = Date.now()
  this.taskState.currentAnalyticsMessageId = messageId

  // ✅ Source of truth for analytics prompt
  let userPromptText = this.taskState.originalUserPrompt || ""

  // Fallback: last user "ask" message if somehow empty
  if (!userPromptText || !userPromptText.trim()) {
    const userMessages = this.messageStateHandler.getSyphaMessages().filter((m) => m.type === "ask")
    if (userMessages.length > 0) {
      const lastUserMsg = userMessages[userMessages.length - 1]
      if (lastUserMsg.text && typeof lastUserMsg.text === "string") {
        userPromptText = lastUserMsg.text
      }
    }
  }

  console.log("[Task] Extracted user prompt for analytics:", {
    fromOriginalUserPrompt: !!this.taskState.originalUserPrompt,
    promptLength: userPromptText?.length || 0,
    promptPreview: userPromptText?.substring(0, 100) || "<empty>",
  })

  if (userPromptText && userPromptText.trim()) {
    this.taskState.originalUserPrompt = userPromptText.trim()
  }

  this.analyticsService.startMessage(messageId, userPromptText || "", "", providerId, model.id)
  console.log("[Task] Analytics startMessage() completed")
}

Key ideas:

  • We do not re-parse userContent here anymore; we trust originalUserPrompt.
  • Fall back to the last "ask" message only if originalUserPrompt is somehow empty.

3. Analytics Service – Sanitization & Fallback

File: src/services/analytics/AnalyticsService.ts

3.1 sanitizeUserPrompt(raw: string)

Responsibilities:

  • Strip scaffolding (<environment_details>, VS Code file lists, “Todo List” sections, etc.).
  • Prefer <user_message> / <task> tags when present.
  • If sanitization removes everything, fall back to the raw prompt (truncated to 500 chars).

This ensures prompts are user-focused but never accidentally erased.
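
A minimal sketch of that contract is below, assuming regex-based stripping; the exact patterns the real sanitizeUserPrompt removes may differ.

// Sketch: prefer tagged user text, strip scaffolding, never return empty for non-empty input.
function sanitizeUserPromptSketch(raw: string): string {
  if (!raw) {
    return ""
  }
  // Prefer explicit user-authored tags when present
  const tagged = raw.match(/<(?:user_message|task)>([\s\S]*?)<\/(?:user_message|task)>/)
  if (tagged && tagged[1].trim()) {
    return tagged[1].trim()
  }
  // Strip known scaffolding blocks (patterns are illustrative)
  const stripped = raw
    .replace(/<environment_details>[\s\S]*?<\/environment_details>/g, "")
    .replace(/Todo List[\s\S]*$/, "")
    .trim()
  // Fallback: if sanitization removed everything, return the raw prompt truncated to 500 chars
  return stripped || raw.substring(0, 500)
}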

3.2 startMessage(...) – final prompt selection

console.log("[AnalyticsService][original prompt][L324 raw]:", userPrompt?.substring(0, 200) || "<empty>")
const sanitizedPrompt = this.sanitizeUserPrompt(userPrompt || "")
console.log(
  "[AnalyticsService][original prompt][L326 sanitized]:",
  sanitizedPrompt?.substring(0, 200) || "<empty>",
  "length:",
  sanitizedPrompt?.length || 0,
)

// ✅ Fallback: if sanitization stripped everything, use original (truncated)
const finalPrompt = sanitizedPrompt && sanitizedPrompt.trim()
  ? sanitizedPrompt
  : (userPrompt?.substring(0, 500) || "")

console.log(
  "[AnalyticsService][original prompt][final]:",
  finalPrompt?.substring(0, 200) || "<empty>",
  "length:",
  finalPrompt?.length || 0,
)

if (!this.sessionFirstUserPrompt && finalPrompt) {
  this.sessionFirstUserPrompt = finalPrompt
}

this.currentMessage = {
  id,
  start: Date.now(),
  userPrompt: finalPrompt,
  systemPrompt,
  llmProvider,
  llmModel,
  // ...
}

Key ideas:

  • currentMessage.userPrompt is always either sanitized user text or a truncated raw fallback.
  • This value is the single source of truth for prompt_text in all payloads.

4. Where prompt_text is Used

All analytics payloads use currentMessage.userPrompt:

  • Progressive updates (sendProgressiveAnalyticsUpdate):

    prompt_text: this.currentMessage.userPrompt || "",
    token_size_user_prompt: this.estimateTokens(this.currentMessage.userPrompt),
    // Tool call metrics are per-message
    total_tool_calls: this.currentMessage.toolCalls.total,
    successful_tool_calls: this.currentMessage.toolCalls.successful,
    failed_tool_calls: this.currentMessage.toolCalls.failed,

  • Per-API-request analytics (sendApiRequestAnalytics):

    prompt_text: this.currentMessage.userPrompt || "",
    token_size_user_prompt: this.estimateTokens(this.currentMessage.userPrompt),
    total_tool_calls: this.currentMessage.toolCalls.total,
    successful_tool_calls: this.currentMessage.toolCalls.successful,
    failed_tool_calls: this.currentMessage.toolCalls.failed,

  • Final message completion (completeMessage):

    const promptForPayload = currentMessage.userPrompt || ""
    // ...
    prompt_text: promptForPayload,
    token_size_user_prompt: this.estimateTokens(currentMessage.userPrompt),
    total_tool_calls: currentMessage.toolCalls.total,
    successful_tool_calls: currentMessage.toolCalls.successful,
    failed_tool_calls: currentMessage.toolCalls.failed,

  • Failure path (handleMessageFailure):

    prompt_text: this.currentMessage.userPrompt || "",
    token_size_user_prompt: this.estimateTokens(this.currentMessage.userPrompt),
    total_tool_calls: this.currentMessage.toolCalls.total,
    successful_tool_calls: this.currentMessage.toolCalls.successful,
    failed_tool_calls: this.currentMessage.toolCalls.failed,
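
estimateTokens is referenced in each payload above. A common heuristic is roughly four characters per token; the sketch below assumes that heuristic, while the real implementation may use a proper tokenizer.

// Sketch: rough token estimate used for token_size_user_prompt (assumed heuristic).
function estimateTokensSketch(text: string | undefined): number {
  if (!text) {
    return 0
  }
  return Math.ceil(text.length / 4)
}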

5. Behavior Guarantees

With this design:

  • First user message

    • prompt_text = initial task text (with optional mode slash-command), sanitized with safe fallback.
  • Follow-up user messages

    • prompt_text = user’s answer or feedback text (again with optional mode command), sanitized.
    • Each follow-up gets a unique message_id and analytics record.
    • total_tool_calls / successful_tool_calls / failed_tool_calls reflect only the tool usage for that specific message, not the whole session.
  • Resumed tasks

    • prompt_text reconstructed from saved conversation history or API logs.
  • Edge cases

    • If sanitization ever returns an empty string, we fall back to the raw prompt (first 500 chars).
    • If extraction in loadContext fails, we explicitly preserve existing originalUserPrompt.

6. Porting / Merge Checklist

When merging this behavior into another branch or consumer:

  1. Task layer

    • Ensure startTask() sets taskState.originalUserPrompt from the initial task string.
    • Ensure handleWebviewAskResponse():
      • Sets originalUserPrompt for all responses with text.
      • Clears currentAnalyticsMessageId.
    • Ensure loadContext():
      • Extracts from <task>, <user_message>, <feedback>, <answer>.
      • Preserves existingPrompt when extraction fails.
    • Ensure recursivelyMakeSyphaRequests():
      • Uses taskState.originalUserPrompt as the primary source for analytics.
  2. Analytics service

    • Port sanitizeUserPrompt() with the “fallback to raw (500 chars)” behavior.
    • Port startMessage()’s finalPrompt logic and set currentMessage.userPrompt to it.
    • Verify all payload builders (completeMessage, sendApiRequestAnalytics, sendProgressiveAnalyticsUpdate, handleMessageFailure):
      • Use currentMessage.userPrompt for prompt_text.
      • Use currentMessage.toolCalls.{total,successful,failed} for tool call metrics (per-message, not session-cumulative).
  3. Testing

    • New task with simple text.
    • New task with long / structured text (e.g., includes <task> or environment dumps).
    • Follow-up answers (followup, plan_mode_respond).
    • Resume from history.
    • Cancellation / failure paths.

If all of the above produce non-empty, user-centric prompt_text in the FastAPI backend, the integration is correct.
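
As a quick sanity check for step 3 of the checklist, the sanitization contract can be exercised in isolation. The test below is a sketch using vitest-style assertions and the sanitizeUserPromptSketch helper from section 3.1 of this document; it is not part of the existing test suite.

import { describe, expect, it } from "vitest"

describe("analytics prompt capture", () => {
  it("keeps a non-empty, user-centric prompt for a tagged message", () => {
    const raw = "<task>Fix the login bug</task>\n<environment_details>...</environment_details>"
    const sanitized = sanitizeUserPromptSketch(raw)
    expect(sanitized).toBe("Fix the login bug")
    expect(sanitized.trim().length).toBeGreaterThan(0)
  })
})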
