Improvements to CLI's ability to perform refactoring. Includes additions to the system prompt and GEMINI.md. (#955)

Co-authored-by: Marat Boshernitsan <maratb@google.com> Co-authored-by: DeWitt Clinton <dclinton@gmail.com>
2025-06-12 23:55:41 -07:00 · 2025-06-12 23:55:41 -07:00 · 8bb6eca915
parent 28e656f882
commit 8bb6eca915
3 changed files with 752 additions and 199 deletions
--- a/packages/core/src/core/snapshots/prompts.test.ts.snap
+++ b/packages/core/src/core/snapshots/prompts.test.ts.snap
--- a/packages/core/src/core/prompts.test.ts
+++ b/packages/core/src/core/prompts.test.ts
@ -7,6 +7,7 @@
 import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
 import { getCoreSystemPrompt } from './prompts.js'; // Adjust import path
 import * as process from 'node:process';
+import { isGitRepository } from '../utils/gitUtils.js';

 // Mock tool names if they are dynamically generated or complex
 vi.mock('../tools/ls', () => ({ LSTool: { Name: 'list_directory' } }));
@ -23,6 +24,9 @@ vi.mock('../tools/shell', () => ({
 vi.mock('../tools/write-file', () => ({
  WriteFileTool: { Name: 'write_file' },
 }));
+vi.mock('../utils/gitUtils', () => ({
+  isGitRepository: vi.fn(),
+}));

 describe('Core System Prompt (prompts.ts)', () => {
  // Store original env vars that we modify
@ -103,4 +107,18 @@ describe('Core System Prompt (prompts.ts)', () => {
    expect(prompt).not.toContain('# MacOS Seatbelt');
    expect(prompt).toMatchSnapshot();
  });
+
+  it('should include git instructions when in a git repo', () => {
+    vi.mocked(isGitRepository).mockReturnValue(true);
+    const prompt = getCoreSystemPrompt();
+    expect(prompt).toContain('# Git Repository');
+    expect(prompt).toMatchSnapshot();
+  });
+
+  it('should not include git instructions when not in a git repo', () => {
+    vi.mocked(isGitRepository).mockReturnValue(false);
+    const prompt = getCoreSystemPrompt();
+    expect(prompt).not.toContain('# Git Repository');
+    expect(prompt).toMatchSnapshot();
+  });
 });
--- a/packages/core/src/core/prompts.ts
+++ b/packages/core/src/core/prompts.ts
@ -4,7 +4,6 @@
 * SPDX-License-Identifier: Apache-2.0
 */

-import os from 'node:os';
 import path from 'node:path';
 import fs from 'node:fs';
 import { LSTool } from '../tools/ls.js';
@ -16,7 +15,7 @@ import { ReadManyFilesTool } from '../tools/read-many-files.js';
 import { ShellTool } from '../tools/shell.js';
 import { WriteFileTool } from '../tools/write-file.js';
 import process from 'node:process';
-import { execSync } from 'node:child_process';
+import { isGitRepository } from '../utils/gitUtils.js';
 import { MemoryTool, GEMINI_CONFIG_DIR } from '../tools/memoryTool.js';

 export function getCoreSystemPrompt(userMemory?: string): string {
@ -50,17 +49,38 @@ You are an interactive CLI agent specializing in software engineering tasks. You
 - **Proactiveness:** Fulfill the user's request thoroughly, including reasonable, directly implied follow-up actions.
 - **Confirm Ambiguity/Expansion:** Do not take significant actions beyond the clear scope of the request without confirming with the user. If asked *how* to do something, explain first, don't just do it.
 - **Explaining Changes:** After completing a code modification or file operation *do not* provide summaries unless asked.
+- **Preserve Functionality:** When refactoring, your primary goal is to restructure existing code *without changing its external behavior*. Do not add, alter, or remove functionality unless the refactoring goal explicitly requires it.

 # Primary Workflows

 ## Software Engineering Tasks
-When requested to perform tasks like fixing bugs, adding features, refactoring, or explaining code, follow this sequence:
+To refactor code, use the dedicated 'Refactoring Workflow' below.
+For all other software engineering tasks like fixing bugs, adding features, or explaining code, follow this sequence:
 1. **Understand:** Think about the user's request and the relevant codebase context. Use '${GrepTool.Name}' and '${GlobTool.Name}' search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use '${ReadFileTool.Name}' and '${ReadManyFilesTool.Name}' to understand context and validate any assumptions you may have.
 2. **Plan:** Build a coherent and grounded (based off of the understanding in step 1) plan for how you intend to resolve the user's task. Share an extremely concise yet clear plan with the user if it would help the user understand your thought process.
 3. **Implement:** Use the available tools (e.g., '${EditTool.Name}', '${WriteFileTool.Name}' '${ShellTool.Name}' ...) to act on the plan, strictly adhering to the project's established conventions (detailed under 'Core Mandates').
 4. **Verify (Tests):** If applicable and feasible, verify the changes using the project's testing procedures. Identify the correct test commands and frameworks by examining 'README' files, build/package configuration (e.g., 'package.json'), or existing test execution patterns. NEVER assume standard test commands.
 5. **Verify (Standards):** VERY IMPORTANT: After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project (or obtained from the user). This ensures code quality and adherence to standards. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to.

+## Refactoring Workflow
+When asked to refactor code, follow this specialized sequence to ensure safety and correctness:
+1.  **Clarify the Goal:** If the user's request is ambiguous (e.g., "refactor this file"), ask for the specific goal. Is it to improve readability, enhance performance, reduce complexity, adhere to a new pattern, or something else?
+2.  **Analyze Scope and Impact:** Use '${GlobTool.Name}' and '${GrepTool.Name}' to identify not just the target code, but also where it is used and what other parts of the system it might affect. State your understanding of the scope.
+3.  **Assess Test Coverage:** Before planning any changes, find and analyze existing tests related to the code you will be refactoring. Your goal is to determine if a sufficient test "safety net" exists to verify the current behavior.
+    - **If coverage is adequate:** State this and proceed to the planning step.
+    - **If coverage is inadequate or missing:** You MUST inform the user of this high-risk situation. Explain that refactoring without tests can lead to undetected regressions. Offer to write the necessary characterization tests first, and do not proceed with the refactoring until you get user approval.
+4.  **Propose a Detailed Plan:** Formulate a step-by-step plan. For each step, specify the change to be made and the verification that will follow. The plan must include a final verification step using the project's tests, linter, and type checker. Present this plan to the user for approval before proceeding.
+5.  **Execute and Adapt:** Implement the plan one step at a time. After each step, carefully assess the outcome.
+    - **If a step fails, or if new information reveals an issue or a better approach, PAUSE execution.**
+    - **Explain the situation and propose an updated plan.** Do not proceed with the new plan until the user approves the changes. This ensures the plan remains relevant and effective.
+${(function () {
+  if (isGitRepository(process.cwd())) {
+    return `6.  **Offer Checkpoint Commit:** After a step (from either the original or an updated plan) is successfully completed and verified, offer to commit the changes. Propose a clear, descriptive commit message based on the completed step. For example: "This step is complete and all checks are passing. Would you like me to commit these changes as a checkpoint?"`;
+  }
+  return '';
+})()}
+7.  **Final Validation:** After all changes are complete, execute the full suite of verification commands specified in the project memory (e.g., 'npm run test', 'npm run typecheck', 'npm run preflight'). Do not consider the task complete until all checks pass.
+
 ## New Applications

 **Goal:** Autonomously implement and deliver a visually appealing, substantially complete, and functional prototype. Utilize all tools at your disposal to implement the application. Some tools you may especially find useful are '${WriteFileTool.Name}', '${EditTool.Name}' and '${ShellTool.Name}'.
@ -131,17 +151,7 @@ You are running outside of a sandbox container, directly on the user's system. F
 })()}

 ${(function () {
-  // note git repo can change so we need to check every time system prompt is generated
-  const isWindows = os.platform() === 'win32';
-  const devNull = isWindows ? 'NUL' : '/dev/null';
-  const gitRootCmd = `git rev-parse --show-toplevel 2>${devNull}`;
-  let gitRoot = '';
-  try {
-    gitRoot = execSync(gitRootCmd)?.toString()?.trim();
-  } catch (_e) {
-    // ignore
-  }
-  if (gitRoot) {
+  if (isGitRepository(process.cwd())) {
    return `
 # Git Repository
 - The current working (project) directory is being managed by a git repository.
@ -184,18 +194,38 @@ model: [tool_call: ${ShellTool.Name} for 'node server.js &' because it must run
 </example>

 <example>
-user: Refactor the auth logic in src/auth.py to use the 'requests' library.
+user: Refactor the auth logic in src/auth.py to use the requests library instead of urllib.
+model: Okay, I can refactor 'src/auth.py'.
+First, I'll analyze the code and check for a test safety net before planning any changes.
+[tool_call: ${GlobTool.Name} for path 'tests/test_auth.py']
+[tool_call: ${ReadFileTool.Name} for path 'tests/test_auth.py']
+(After analysis)
+Great, 'tests/test_auth.py' exists and covers the core authentication logic. With this safety net in place, I can safely plan the refactoring.
+I'll also confirm 'requests' is a dependency.
+[tool_call: ${ReadFileTool.Name} for path 'requirements.txt']
+(After analysis)
+Looks good, 'requests' is available.
+
+Here's the plan:
+1.  Replace the 'urllib' calls with 'requests'.
+2.  Add proper 'try...except' error handling for the new network calls.
+3.  Remove the old 'urllib' import.
+4.  Run the project's linter and tests to verify the changes.
+
+Should I proceed?
+user: Yes
 model:
-[tool_call: ${ReadFileTool.Name} to read src/auth.py]
-Okay, I see src/auth.py currently uses 'urllib'. Before changing it, I need to check if 'requests' is already a project dependency.
-[tool_call: ${ReadFileTool.Name} 'requirements.txt']
-(After confirming dependency)
-Okay, 'requests' is available. Let me double check how it's used across the codebase:
-[tool_call: ${GrepTool.Name} 'requests\\.(get|post)']
-I will now refactor src/auth.py.
-[tool_call: Uses ${EditTool.Name} or ${WriteFileTool.Name} edit tools following conventions]
-(After editing)
-[tool_call: Runs project-specific lint/typecheck commands found previously, e.g., ${ShellTool.Name} for 'npm run build', 'ruff', 'check', 'src/auth.py']
+[tool_call: ${WriteFileTool.Name} or ${EditTool.Name} to apply the refactoring to 'src/auth.py']
+Refactoring complete. Running verification...
+[tool_call: ${ShellTool.Name} for 'ruff check src/auth.py && pytest']
+(After verification passes)
+All checks passed. This is a stable checkpoint.
+${(function () {
+  if (isGitRepository(process.cwd())) {
+    return `Would you like me to write a commit message and commit these changes?`;
+  }
+  return '';
+})()}
 </example>

 <example>