Private On-Device AI Agent for Chrome – Gemma Gem

Free Chrome extension that runs Gemma 4 on-device. No data leaves your machine. Page reading, form filling, JS execution, and 128K context.

Gemma Gem is a free, open-source Chrome extension that runs Google’s Gemma 4 model on your device through WebGPU. It is a privacy-preserving alternative to Google Chrome’s ‘Ask Gemini’ feature.

It works as an in-page AI agent capable of reading page content, clicking buttons, filling forms, scrolling, executing JavaScript, and answering questions about any site the user visits.

Features

  • Runs the Gemma 4 model locally with no account or API key required.
  • Two model variants: the E2B at approximately 500MB and the E4B at approximately 1.5GB.
  • Supports a 128K token context window on both model variants with q4f16 quantization.
  • Reads page text and HTML by CSS selector or full-page scope.
  • Captures the visible page as a PNG screenshot.
  • Clicks page elements by CSS selector.
  • Types into input fields by CSS selector.
  • Scrolls the page up or down by pixel amount.
  • Executes arbitrary JavaScript in the page context with full DOM access.
  • Toggles native Gemma 4 thinking mode on or off.
  • Caps agent tool call loops per request.
  • Persists the selected model and per-site disable preferences across sessions.
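The cap on tool call loops can be pictured with a small sketch. This is not the extension's actual code: the `Model`, `ToolRunner`, and `runAgent` names and the shape of a model turn are hypothetical, chosen only to show how a loop might stop once the configured limit is reached.

```typescript
// Hypothetical shapes: the extension's real types are not shown in this article.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn =
  | { kind: "answer"; text: string }
  | { kind: "tool"; call: ToolCall };

// A model takes the conversation so far and either answers or asks for a tool.
type Model = (history: string[]) => Promise<ModelTurn>;
// A tool runner executes one tool call and returns its result as text.
type ToolRunner = (call: ToolCall) => Promise<string>;

interface AgentResult {
  text: string;
  toolCalls: number; // how many tool calls were executed
  capped: boolean;   // true when the loop stopped because of the limit
}

async function runAgent(
  prompt: string,
  model: Model,
  runTool: ToolRunner,
  maxIterations: number,
): Promise<AgentResult> {
  const history: string[] = [`user: ${prompt}`];

  for (let i = 0; i < maxIterations; i++) {
    const turn = await model(history);
    if (turn.kind === "answer") {
      return { text: turn.text, toolCalls: i, capped: false };
    }
    // Feed the tool result back so the next model turn can use it.
    const result = await runTool(turn.call);
    history.push(`tool ${turn.call.name}: ${result}`);
  }

  // The model kept requesting tools; stop instead of looping forever.
  return {
    text: "Stopped: tool call limit reached.",
    toolCalls: maxIterations,
    capped: true,
  };
}
```

A model that keeps requesting tools is cut off after `maxIterations` tool calls, while a model that answers early returns immediately with the number of tool calls it actually used.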

Use Cases

  • Extract structured data from a long research article and ask follow-up questions.
  • Automate repetitive form filling on internal tools by describing target fields and values to the agent.
  • Inspect the DOM structure of a web page by asking the agent to read specific CSS selectors and report their content.
  • Run ad hoc JavaScript directly in the page context to probe or manipulate page state during development or debugging.
  • Summarize full-length documentation pages on demand.

How to Use It

1. Clone the repository and install dependencies:

pnpm install

2. Run the development build:

pnpm build

3. Open chrome://extensions in Chrome, enable developer mode, click “Load unpacked,” and select the .output/chrome-mv3-dev/ directory.

4. Once the extension is installed and activated, you will see a gem icon appear in the bottom-right corner of any page. Click it to open the chat overlay. Ask questions about the page or issue action commands once the model finishes loading.

5. Access all settings via the gear icon in the chat header.

Setting              | Options                     | Notes
Model                | E2B (~500MB) / E4B (~1.5GB) | Selection persists across sessions
Thinking             | On / Off                    | Toggles native Gemma 4 thinking mode
Max iterations       | Integer                     | Caps tool call loops per request
Clear context        | Action                      | Resets conversation history for the current page
Disable on this site | Toggle                      | Disables extension per hostname, persisted
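As a rough illustration of how the two persisted settings (model selection and per-hostname disable) could be stored, here is a minimal sketch. Everything in it is an assumption: the `KeyValueStore` interface, the `"prefs"` key, the E2B default, and the function names are hypothetical, and an in-memory store stands in for whatever browser storage the extension really uses.

```typescript
type ModelVariant = "E2B" | "E4B";

interface PersistedPrefs {
  model: ModelVariant;
  disabledHosts: string[]; // hostnames where the extension is turned off
}

// Assumed default; the article does not say which model is selected initially.
const DEFAULT_PREFS: PersistedPrefs = { model: "E2B", disabledHosts: [] };

// Hypothetical storage abstraction so the sketch runs outside a browser.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  async get(key: string): Promise<string | undefined> {
    return this.data.get(key);
  }
  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

async function loadPrefs(store: KeyValueStore): Promise<PersistedPrefs> {
  const raw = await store.get("prefs");
  if (raw === undefined) return { ...DEFAULT_PREFS, disabledHosts: [] };
  return { ...DEFAULT_PREFS, ...(JSON.parse(raw) as Partial<PersistedPrefs>) };
}

async function savePrefs(store: KeyValueStore, prefs: PersistedPrefs): Promise<void> {
  await store.set("prefs", JSON.stringify(prefs));
}

// The disable toggle is keyed by hostname, so every page on a host shares it.
function isDisabledFor(prefs: PersistedPrefs, pageUrl: string): boolean {
  return prefs.disabledHosts.includes(new URL(pageUrl).hostname);
}

async function setSiteDisabled(
  store: KeyValueStore,
  pageUrl: string,
  disabled: boolean,
): Promise<PersistedPrefs> {
  const prefs = await loadPrefs(store);
  const host = new URL(pageUrl).hostname;
  const others = prefs.disabledHosts.filter((h) => h !== host);
  const next = { ...prefs, disabledHosts: disabled ? [...others, host] : others };
  await savePrefs(store, next);
  return next;
}
```

Keying on `new URL(pageUrl).hostname` means disabling the extension on one page of a site also disables it on every other path of that hostname, which matches the "per hostname" note in the table.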

6. Available agent tools:

Tool              | Description                                           | Execution context
read_page_content | Reads text or HTML of the page or a CSS selector      | Content script
take_screenshot   | Captures the visible page as a PNG                    | Service worker
click_element     | Clicks an element by CSS selector                     | Content script
type_text         | Types into an input field by CSS selector             | Content script
scroll_page       | Scrolls up or down by pixel amount                    | Content script
run_javascript    | Executes JS in the page context with full DOM access  | Service worker
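The split between content-script and service-worker tools can be written down as a lookup table. The sketch below copies the tool names and contexts from the table above; the `routeToolCall` and `toolsIn` helpers are hypothetical and only illustrate how a dispatcher might decide where a requested tool should run.

```typescript
type ExecutionContext = "content-script" | "service-worker";

type ToolName =
  | "read_page_content"
  | "take_screenshot"
  | "click_element"
  | "type_text"
  | "scroll_page"
  | "run_javascript";

// Tool names and contexts as listed in the table above.
const TOOL_CONTEXT: Record<ToolName, ExecutionContext> = {
  read_page_content: "content-script",
  take_screenshot: "service-worker",
  click_element: "content-script",
  type_text: "content-script",
  scroll_page: "content-script",
  run_javascript: "service-worker",
};

// Own-property check so inherited keys like "toString" are not treated as tools.
function isToolName(name: string): name is ToolName {
  return Object.prototype.hasOwnProperty.call(TOOL_CONTEXT, name);
}

// Hypothetical helper: decide where a tool call requested by the model should run.
function routeToolCall(name: string): ExecutionContext {
  if (!isToolName(name)) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return TOOL_CONTEXT[name];
}

// Hypothetical helper: list the tools that run in a given context.
function toolsIn(context: ExecutionContext): ToolName[] {
  return (Object.keys(TOOL_CONTEXT) as ToolName[]).filter(
    (name) => TOOL_CONTEXT[name] === context,
  );
}
```

Rejecting unknown names up front keeps a malformed tool request from the model from reaching either execution context.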

Pros

  • Zero data leaves the device.
  • No API key or subscription is required.
  • The 128K context window handles full-length articles and long documentation pages in a single request.

Cons

  • Chrome and WebGPU are mandatory.
  • First run needs a large local model download.
  • Local inference speed depends on available hardware.

Related Resources

  • Gemma 4: Model cards and technical documentation for the Gemma 4 model family.
  • WebGPU Browser Support Table: Check GPU and browser support status before attempting installation.
  • PokeClaw: A free on-device AI agent for Android phones, based on Gemma 4.

FAQs

Q: What is the difference between the E2B and E4B models?
A: E2B is a 2-billion-parameter variant requiring approximately 500MB of disk space. E4B is a 4-billion-parameter variant requiring approximately 1.5GB. The E4B model generates higher-quality output on tasks that need more contextual reasoning.

Q: Can Gemma Gem automate actions on any website?
A: It can click elements, type into inputs, scroll, and execute JavaScript on any page where the extension is active. Sites with strict Content Security Policies may restrict JavaScript injection.

Q: Can I use Gemma Gem on a device without a discrete GPU?
A: Yes. WebGPU can use integrated graphics, and on some systems the browser may fall back to a software (CPU) adapter. Expect slower inference on lower-end hardware. At least 8GB of system RAM is recommended for smooth operation with the larger E4B model.
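To find out whether a given machine exposes WebGPU at all, a page can probe `navigator.gpu` and request an adapter. The sketch below is illustrative rather than part of Gemma Gem: `checkWebGPU`, `GpuStatus`, and the small `NavigatorLike` and `GpuLike` interfaces are hypothetical names, defined structurally so the code type-checks and runs without WebGPU typings or a browser.

```typescript
// Minimal structural types so the sketch does not depend on WebGPU typings.
interface GpuLike {
  // In browsers, navigator.gpu.requestAdapter() resolves to an adapter or null.
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type GpuStatus = "ready" | "no-adapter" | "unsupported";

async function checkWebGPU(nav: NavigatorLike): Promise<GpuStatus> {
  // Browsers without WebGPU do not define navigator.gpu.
  if (!nav.gpu) return "unsupported";

  // WebGPU exists, but the browser may still fail to find a usable adapter.
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "ready" : "no-adapter";
}
```

In a browser, passing the page's `navigator` object (cast to `NavigatorLike` if the TypeScript setup lacks WebGPU types) returns `"ready"` only when an adapter is available. The result says nothing about how fast inference will be on that adapter.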

Q: Is the model permanently stored offline after the first download?
A: The model is cached by the browser and remains available for offline use as long as the browser cache is not cleared. Subsequent sessions do not require an internet connection.
