Thomas G. Lopes
2026-02-24 16:54:28 +00:00
parent 7a4870294b
commit 7edfb90d5f
21 changed files with 4491 additions and 0 deletions
@@ -0,0 +1,8 @@
# Headless Chrome profile (copy of user's Chrome profile)
.headless-profile/
# Node modules
node_modules/
# Debug files
debug-*.png
@@ -0,0 +1,196 @@
---
name: browser-tools
description: Interactive browser automation via Chrome DevTools Protocol. Use when you need to interact with web pages, test frontends, or when user interaction with a visible browser is required.
---
# Browser Tools
Chrome DevTools Protocol tools for agent-assisted web automation. These tools connect to Chrome running on `:9222` with remote debugging enabled.
## Setup
Run once before first use:
```bash
cd {baseDir}/browser-tools
npm install
```
## Start Chrome
```bash
{baseDir}/browser-start.js # Fresh profile
{baseDir}/browser-start.js --profile # Copy user's profile (cookies, logins)
```
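Launches Chrome with remote debugging on `:9222`, or exits early if an instance is already listening there. Use `--profile` to copy the user's default Chrome profile so that cookies and logins carry over. Note that `browser-start.js` hard-codes the macOS locations of the Chrome binary and profile directory.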
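Before starting Chrome, it can be useful to check whether a debuggable instance is already reachable. The sketch below is not part of these tools. It assumes Node 18+ (global `fetch`) and probes Chrome's `/json/version` HTTP endpoint; the helper name `isDebuggerUp` is hypothetical.

```javascript
// Hypothetical helper: resolves to true if a DevTools endpoint answers at baseUrl.
async function isDebuggerUp(baseUrl = "http://localhost:9222") {
  try {
    // Chrome serves browser metadata at /json/version when remote debugging is on.
    const res = await fetch(`${baseUrl}/json/version`, {
      signal: AbortSignal.timeout(2000), // give up after 2 seconds
    });
    return res.ok;
  } catch {
    // Connection refused, timeout, or any other failure: treat as "not running".
    return false;
  }
}

isDebuggerUp().then((up) => {
  console.log(up ? "Chrome is reachable on :9222" : "Chrome is not reachable; run browser-start.js");
});
```

Swallowing every error keeps the check to a simple yes/no answer, at the cost of hiding why the connection failed.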
## Navigate
```bash
{baseDir}/browser-nav.js https://example.com
{baseDir}/browser-nav.js https://example.com --new
```
Navigates the current tab to a URL. Use `--new` to open the URL in a new tab instead of reusing the current one, or `--reload` to reload the page after navigating.
## Evaluate JavaScript
```bash
{baseDir}/browser-eval.js 'document.title'
{baseDir}/browser-eval.js 'document.querySelectorAll("a").length'
```
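Executes JavaScript in the active tab. The code is wrapped as `return (<code>)` inside an async function, so it must be a single expression (which may use `await`); wrap multi-statement code in an IIFE. Use this to extract data, inspect page state, or perform DOM operations programmatically.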
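The wrapping used by `browser-eval.js` in this commit can be tried outside the browser. The following is a minimal sketch of that same technique; `evaluateExpression` is a hypothetical name, and the real script runs the wrapper inside the page via `p.evaluate`.

```javascript
// Same wrapping technique as browser-eval.js, runnable in plain Node.
const AsyncFunction = (async () => {}).constructor;

// Hypothetical helper: evaluates `code` as a single expression in an async context.
function evaluateExpression(code) {
  try {
    return new AsyncFunction(`return (${code})`)();
  } catch (err) {
    // Statements such as `const x = 1; x` fail to parse inside `return (...)`.
    return Promise.reject(err);
  }
}

evaluateExpression("1 + 2").then((v) => console.log(v)); // 3
evaluateExpression("await Promise.resolve('ok')").then((v) => console.log(v)); // ok
evaluateExpression("const x = 1; x").catch((e) => console.log(e.name)); // SyntaxError
evaluateExpression("(() => { const x = 1; return x + 1; })()").then((v) => console.log(v)); // 2
```

The third call shows why multi-statement code needs an IIFE: only the wrapped form in the fourth call parses as an expression.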
## Screenshot
```bash
{baseDir}/browser-screenshot.js
```
Captures the current viewport as a PNG in the system temp directory and prints the file path. Use this to visually inspect page state or verify UI changes.
## Pick Elements
```bash
{baseDir}/browser-pick.js "Click the submit button"
```
**IMPORTANT**: Use this tool when the user wants to select specific DOM elements on the page. It launches an interactive picker: a plain click selects one element and finishes, while Cmd/Ctrl+Click adds elements to a multi-selection that the user confirms with Enter (Esc cancels). For each selected element the tool prints its tag, id, class, text, an HTML snippet, and its chain of parent elements.
Common use cases:
- User says "I want to click that button" → Use this tool to let them select it
- User says "extract data from these items" → Use this tool to let them select the elements
- When you need specific selectors but the page structure is complex or ambiguous
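In this commit `browser-pick.js` prints `tag`, `id`, and `class` fields for each picked element. A selector can be derived from them, as in the rough sketch below. It assumes the printed fields have already been parsed into an object with `null` for missing values; `selectorFromInfo` is a hypothetical helper, not something the tool provides.

```javascript
// Hypothetical helper: derive a naive CSS selector from one picked element's fields.
// No escaping is done, so ids or classes with special characters need extra care.
function selectorFromInfo(info) {
  if (info.id) return `#${info.id}`;
  const classes = (info.class || "").trim().split(/\s+/).filter(Boolean);
  return info.tag + classes.map((c) => `.${c}`).join("");
}

console.log(selectorFromInfo({ tag: "button", id: "submit", class: "btn" })); // #submit
console.log(selectorFromInfo({ tag: "a", id: null, class: "nav-link active" })); // a.nav-link.active
console.log(selectorFromInfo({ tag: "div", id: null, class: null })); // div
```

A selector built this way may match more than one element, so it is worth confirming with `document.querySelectorAll(selector).length` before acting on it.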
## Cookies
```bash
{baseDir}/browser-cookies.js
```
Display all cookies for the current tab including domain, path, httpOnly, and secure flags. Use this to debug authentication issues or inspect session state.
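When the cookie listing needs further processing, the text output can be turned back into objects. The sketch below assumes the layout printed by `browser-cookies.js` in this commit (a `name: value` line, indented attribute lines, then a blank line); `parseCookieOutput` is a hypothetical helper.

```javascript
// Hypothetical helper: parse browser-cookies.js output into an array of objects.
function parseCookieOutput(text) {
  const cookies = [];
  let current = null;
  for (const line of text.split("\n")) {
    if (line.trim() === "") {
      current = null; // a blank line ends the current cookie
      continue;
    }
    const body = line.trimStart();
    const sep = body.indexOf(": ");
    if (sep === -1) continue; // not a "key: value" line
    const key = body.slice(0, sep);
    const value = body.slice(sep + 2);
    if (current && body !== line) {
      // Indented lines are attributes of the current cookie.
      const isFlag = key === "httpOnly" || key === "secure";
      current[key] = isFlag ? value === "true" : value;
    } else {
      current = { name: key, value };
      cookies.push(current);
    }
  }
  return cookies;
}

const sample = [
  "session: abc123",
  "  domain: .example.com",
  "  path: /",
  "  httpOnly: true",
  "  secure: false",
  "",
].join("\n");
console.log(parseCookieOutput(sample));
```

Splitting on the first `": "` keeps cookie values that themselves contain colons intact.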
## Extract Page Content
```bash
{baseDir}/browser-content.js https://example.com
```
Navigates to a URL and extracts its readable content as markdown. Uses Mozilla Readability for article extraction and Turndown for HTML-to-markdown conversion. It also works on JavaScript-rendered pages: the script waits for network activity to settle, for at most 10 seconds, before reading the DOM.
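`browser-content.js` in this commit bounds its wait for page load by racing the navigation against a 10-second timer. The pattern can be sketched as follows. This is an illustration rather than the script's exact code: `raceWithTimeout` and its `fallback` parameter are hypothetical names, and unlike the script this version clears the timer once the race settles.

```javascript
// Hypothetical helper: resolve with the task's result, or with `fallback` after `ms`.
function raceWithTimeout(task, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Stand-ins for a fast and a slow page load.
const fast = new Promise((resolve) => setTimeout(() => resolve("loaded"), 10));
const slow = new Promise((resolve) => setTimeout(() => resolve("loaded"), 500));

raceWithTimeout(fast, 100, "timed out").then((r) => console.log(r)); // loaded
raceWithTimeout(slow, 100, "timed out").then((r) => console.log(r)); // timed out
```

Resolving with a fallback instead of rejecting matches the script's behavior of carrying on with whatever has loaded so far.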
## When to Use
- Testing frontend code in a real browser
- Interacting with pages that require JavaScript
- When user needs to visually see or interact with a page
- Debugging authentication or session issues
- Scraping dynamic content that requires JS execution
---
## Efficiency Guide
### DOM Inspection Over Screenshots
**Don't** take screenshots to see page state. **Do** parse the DOM directly:
```javascript
// Get page structure
document.body.innerHTML.slice(0, 5000)
// Find interactive elements
Array.from(document.querySelectorAll('button, input, [role="button"]')).map(e => ({
  id: e.id,
  text: e.textContent.trim(),
  class: e.className
}))
```
### Complex Scripts in Single Calls
Wrap everything in an IIFE to run multi-statement code:
```javascript
(function() {
  // Multiple operations (optional chaining avoids a TypeError if an element is missing)
  const data = document.querySelector('#target')?.textContent ?? null;
  const buttons = document.querySelectorAll('button');
  // Interactions
  buttons[0]?.click();
  // Return results
  return JSON.stringify({ data, buttonCount: buttons.length });
})()
```
### Batch Interactions
**Don't** make separate calls for each click. **Do** batch them:
```javascript
(function() {
  const actions = ["btn1", "btn2", "btn3"];
  // Skip ids that are not on the page instead of throwing
  actions.forEach(id => document.getElementById(id)?.click());
  return "Done";
})()
```
### Typing/Input Sequences
```javascript
(function() {
  const text = "HELLO";
  for (const char of text) {
    document.getElementById("key-" + char).click();
  }
  document.getElementById("submit").click();
  return "Submitted: " + text;
})()
```
### Reading App/Game State
Extract structured state in one call:
```javascript
(function() {
  const state = {
    score: document.querySelector('.score')?.textContent,
    status: document.querySelector('.status')?.className,
    items: Array.from(document.querySelectorAll('.item')).map(el => ({
      text: el.textContent,
      active: el.classList.contains('active')
    }))
  };
  return JSON.stringify(state, null, 2);
})()
```
### Waiting for Updates
If DOM updates after actions, add a small delay with bash:
```bash
sleep 0.5 && {baseDir}/browser-eval.js '...'
```
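A fixed `sleep` can be too short or needlessly long. Because evaluated code runs in an async context, polling for a condition inside the page is an alternative. The sketch below shows the idea with a hypothetical `waitFor` helper; its name, options, and default values are assumptions, and the counter is a stand-in for a real DOM check.

```javascript
// Hypothetical helper: poll `predicate` until it returns a truthy value or time runs out.
async function waitFor(predicate, { timeout = 2000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const value = await predicate();
    if (value) return value;
    if (Date.now() >= deadline) return null; // timed out
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// Stand-in condition that becomes true on the third check.
let checks = 0;
waitFor(() => ++checks >= 3, { interval: 10 }).then((ok) => console.log(ok, checks)); // true 3
```

In the page, the predicate would be a DOM check such as `() => document.querySelector('.result')?.textContent`, with the helper and the call wrapped together in a single async IIFE.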
### Investigate Before Interacting
Always start by understanding the page structure:
```javascript
(function() {
  return {
    title: document.title,
    forms: document.forms.length,
    buttons: document.querySelectorAll('button').length,
    inputs: document.querySelectorAll('input').length,
    mainContent: document.body.innerHTML.slice(0, 3000)
  };
})()
```
Then target specific elements based on what you find.
@@ -0,0 +1,103 @@
#!/usr/bin/env node
import puppeteer from "puppeteer-core";
import { Readability } from "@mozilla/readability";
import { JSDOM } from "jsdom";
import TurndownService from "turndown";
import { gfm } from "turndown-plugin-gfm";
// Global timeout - exit if script takes too long
const TIMEOUT = 30000;
const timeoutId = setTimeout(() => {
console.error("✗ Timeout after 30s");
process.exit(1);
}, TIMEOUT).unref();
const url = process.argv[2];
if (!url) {
console.log("Usage: browser-content.js <url>");
console.log("\nExtracts readable content from a URL as markdown.");
console.log("\nExamples:");
console.log(" browser-content.js https://example.com");
console.log(" browser-content.js https://en.wikipedia.org/wiki/Rust_(programming_language)");
process.exit(1);
}
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
await Promise.race([
p.goto(url, { waitUntil: "networkidle2" }),
new Promise((r) => setTimeout(r, 10000)),
]).catch(() => {});
// Get HTML via CDP (works even with TrustedScriptURL restrictions)
const client = await p.createCDPSession();
const { root } = await client.send("DOM.getDocument", { depth: -1, pierce: true });
const { outerHTML } = await client.send("DOM.getOuterHTML", { nodeId: root.nodeId });
await client.detach();
const finalUrl = p.url();
// Extract with Readability
const doc = new JSDOM(outerHTML, { url: finalUrl });
const reader = new Readability(doc.window.document);
const article = reader.parse();
// Convert to markdown
function htmlToMarkdown(html) {
const turndown = new TurndownService({ headingStyle: "atx", codeBlockStyle: "fenced" });
turndown.use(gfm);
turndown.addRule("removeEmptyLinks", {
filter: (node) => node.nodeName === "A" && !node.textContent?.trim(),
replacement: () => "",
});
return turndown
.turndown(html)
.replace(/\[\\?\[\s*\\?\]\]\([^)]*\)/g, "")
.replace(/ +/g, " ")
.replace(/\s+,/g, ",")
.replace(/\s+\./g, ".")
.replace(/\n{3,}/g, "\n\n")
.trim();
}
let content;
if (article && article.content) {
content = htmlToMarkdown(article.content);
} else {
// Fallback
const fallbackDoc = new JSDOM(outerHTML, { url: finalUrl });
const fallbackBody = fallbackDoc.window.document;
fallbackBody.querySelectorAll("script, style, noscript, nav, header, footer, aside").forEach((el) => el.remove());
const main = fallbackBody.querySelector("main, article, [role='main'], .content, #content") || fallbackBody.body;
const fallbackHtml = main?.innerHTML || "";
if (fallbackHtml.trim().length > 100) {
content = htmlToMarkdown(fallbackHtml);
} else {
content = "(Could not extract content)";
}
}
console.log(`URL: ${finalUrl}`);
if (article?.title) console.log(`Title: ${article.title}`);
console.log("");
console.log(content);
process.exit(0);
@@ -0,0 +1,35 @@
#!/usr/bin/env node
import puppeteer from "puppeteer-core";
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
const cookies = await p.cookies();
for (const cookie of cookies) {
console.log(`${cookie.name}: ${cookie.value}`);
console.log(` domain: ${cookie.domain}`);
console.log(` path: ${cookie.path}`);
console.log(` httpOnly: ${cookie.httpOnly}`);
console.log(` secure: ${cookie.secure}`);
console.log("");
}
await b.disconnect();
@@ -0,0 +1,53 @@
#!/usr/bin/env node
import puppeteer from "puppeteer-core";
const code = process.argv.slice(2).join(" ");
if (!code) {
console.log("Usage: browser-eval.js 'code'");
console.log("\nExamples:");
console.log(' browser-eval.js "document.title"');
console.log(' browser-eval.js "document.querySelectorAll(\'a\').length"');
process.exit(1);
}
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
const result = await p.evaluate((c) => {
const AsyncFunction = (async () => {}).constructor;
return new AsyncFunction(`return (${c})`)();
}, code);
// Print objects as "key: value" lines and everything else (including null) as-is,
// so arrays of primitives or nulls no longer break Object.entries()
const printValue = (value) => {
  if (typeof value === "object" && value !== null) {
    for (const [key, v] of Object.entries(value)) {
      console.log(`${key}: ${v}`);
    }
  } else {
    console.log(value);
  }
};
if (Array.isArray(result)) {
  result.forEach((item, i) => {
    if (i > 0) console.log("");
    printValue(item);
  });
} else {
  printValue(result);
}
await b.disconnect();
@@ -0,0 +1,108 @@
#!/usr/bin/env node
/**
* Hacker News Scraper
*
* Fetches and parses submissions from Hacker News front page.
* Usage: node browser-hn-scraper.js [--limit <number>]
*/
import * as cheerio from 'cheerio';
/**
* Scrapes Hacker News front page
* @param {number} limit - Maximum number of submissions to return (default: 30)
* @returns {Promise<Array>} Array of submission objects
*/
async function scrapeHackerNews(limit = 30) {
const url = 'https://news.ycombinator.com';
try {
const response = await fetch(url);
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const html = await response.text();
const $ = cheerio.load(html);
const submissions = [];
// Each submission has class 'athing'
$('.athing').each((index, element) => {
if (submissions.length >= limit) return false; // Stop when limit reached
const $element = $(element);
const id = $element.attr('id');
// Get title and URL from titleline
const $titleLine = $element.find('.titleline > a').first();
const title = $titleLine.text().trim();
const url = $titleLine.attr('href');
// Get the next row which contains metadata (points, author, comments)
const $metadataRow = $element.next();
const $subtext = $metadataRow.find('.subtext');
// Get points
const $score = $subtext.find(`#score_${id}`);
const pointsText = $score.text();
const points = pointsText ? parseInt(pointsText.match(/\d+/)?.[0] || '0', 10) : 0;
// Get author
const author = $subtext.find('.hnuser').text().trim();
// Get time
const time = $subtext.find('.age').attr('title') || $subtext.find('.age').text().trim();
// Get comments count
const $commentsLink = $subtext.find('a').last();
const commentsText = $commentsLink.text();
let commentsCount = 0;
if (commentsText.includes('comment')) {
const match = commentsText.match(/(\d+)/);
commentsCount = match ? parseInt(match[0]) : 0;
}
submissions.push({
id,
title,
url,
points,
author,
time,
comments: commentsCount,
hnUrl: `https://news.ycombinator.com/item?id=${id}`
});
});
return submissions;
} catch (error) {
console.error('Error scraping Hacker News:', error.message);
throw error;
}
}
// CLI interface
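// Compare as file URLs so script paths with spaces or special characters still match
const { pathToFileURL } = await import("node:url");
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {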
const args = process.argv.slice(2);
let limit = 30;
// Parse --limit argument
const limitIndex = args.indexOf('--limit');
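if (limitIndex !== -1 && args[limitIndex + 1]) {
const parsed = Number.parseInt(args[limitIndex + 1], 10);
// Keep the default when the value is not a positive integer
if (Number.isInteger(parsed) && parsed > 0) limit = parsed;
}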
scrapeHackerNews(limit)
.then(submissions => {
console.log(JSON.stringify(submissions, null, 2));
console.error(`\n✓ Scraped ${submissions.length} submissions`);
})
.catch(error => {
console.error('Failed to scrape:', error.message);
process.exit(1);
});
}
export { scrapeHackerNews };
@@ -0,0 +1,44 @@
#!/usr/bin/env node
import puppeteer from "puppeteer-core";
const args = process.argv.slice(2);
const newTab = args.includes("--new");
const reload = args.includes("--reload");
const url = args.find(a => !a.startsWith("--"));
if (!url) {
console.log("Usage: browser-nav.js <url> [--new] [--reload]");
console.log("\nExamples:");
console.log(" browser-nav.js https://example.com # Navigate current tab");
console.log(" browser-nav.js https://example.com --new # Open in new tab");
console.log(" browser-nav.js https://example.com --reload # Navigate and force reload");
process.exit(1);
}
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
if (newTab) {
const p = await b.newPage();
await p.goto(url, { waitUntil: "domcontentloaded" });
console.log("✓ Opened:", url);
} else {
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
await p.goto(url, { waitUntil: "domcontentloaded" });
if (reload) {
await p.reload({ waitUntil: "domcontentloaded" });
}
console.log("✓ Navigated to:", url);
}
await b.disconnect();
@@ -0,0 +1,162 @@
#!/usr/bin/env node
import puppeteer from "puppeteer-core";
const message = process.argv.slice(2).join(" ");
if (!message) {
console.log("Usage: browser-pick.js 'message'");
console.log("\nExample:");
console.log(' browser-pick.js "Click the submit button"');
process.exit(1);
}
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
// Inject pick() helper into current page
await p.evaluate(() => {
if (!window.pick) {
window.pick = async (message) => {
if (!message) {
throw new Error("pick() requires a message parameter");
}
return new Promise((resolve) => {
const selections = [];
const selectedElements = new Set();
const overlay = document.createElement("div");
overlay.style.cssText =
"position:fixed;top:0;left:0;width:100%;height:100%;z-index:2147483647;pointer-events:none";
const highlight = document.createElement("div");
highlight.style.cssText =
"position:absolute;border:2px solid #3b82f6;background:rgba(59,130,246,0.1);transition:all 0.1s";
overlay.appendChild(highlight);
const banner = document.createElement("div");
banner.style.cssText =
"position:fixed;bottom:20px;left:50%;transform:translateX(-50%);background:#1f2937;color:white;padding:12px 24px;border-radius:8px;font:14px sans-serif;box-shadow:0 4px 12px rgba(0,0,0,0.3);pointer-events:auto;z-index:2147483647";
const updateBanner = () => {
banner.textContent = `${message} (${selections.length} selected, Cmd/Ctrl+click to add, Enter to finish, ESC to cancel)`;
};
updateBanner();
document.body.append(banner, overlay);
const cleanup = () => {
document.removeEventListener("mousemove", onMove, true);
document.removeEventListener("click", onClick, true);
document.removeEventListener("keydown", onKey, true);
overlay.remove();
banner.remove();
selectedElements.forEach((el) => {
el.style.outline = "";
});
};
const onMove = (e) => {
const el = document.elementFromPoint(e.clientX, e.clientY);
if (!el || overlay.contains(el) || banner.contains(el)) return;
const r = el.getBoundingClientRect();
highlight.style.cssText = `position:absolute;border:2px solid #3b82f6;background:rgba(59,130,246,0.1);top:${r.top}px;left:${r.left}px;width:${r.width}px;height:${r.height}px`;
};
const buildElementInfo = (el) => {
const parents = [];
let current = el.parentElement;
while (current && current !== document.body) {
const parentInfo = current.tagName.toLowerCase();
const id = current.id ? `#${current.id}` : "";
// Read the attribute: className is not a string on SVG elements
const classAttr = current.getAttribute("class")?.trim();
const cls = classAttr ? `.${classAttr.split(/\s+/).join(".")}` : "";
parents.push(parentInfo + id + cls);
current = current.parentElement;
}
return {
tag: el.tagName.toLowerCase(),
id: el.id || null,
class: el.getAttribute("class") || null,
text: el.textContent?.trim().slice(0, 200) || null,
html: el.outerHTML.slice(0, 500),
parents: parents.join(" > "),
};
};
const onClick = (e) => {
if (banner.contains(e.target)) return;
e.preventDefault();
e.stopPropagation();
const el = document.elementFromPoint(e.clientX, e.clientY);
if (!el || overlay.contains(el) || banner.contains(el)) return;
if (e.metaKey || e.ctrlKey) {
if (!selectedElements.has(el)) {
selectedElements.add(el);
el.style.outline = "3px solid #10b981";
selections.push(buildElementInfo(el));
updateBanner();
}
} else {
cleanup();
const info = buildElementInfo(el);
resolve(selections.length > 0 ? selections : info);
}
};
const onKey = (e) => {
if (e.key === "Escape") {
e.preventDefault();
cleanup();
resolve(null);
} else if (e.key === "Enter" && selections.length > 0) {
e.preventDefault();
cleanup();
resolve(selections);
}
};
document.addEventListener("mousemove", onMove, true);
document.addEventListener("click", onClick, true);
document.addEventListener("keydown", onKey, true);
});
};
}
});
const result = await p.evaluate((msg) => window.pick(msg), message);
if (Array.isArray(result)) {
for (let i = 0; i < result.length; i++) {
if (i > 0) console.log("");
for (const [key, value] of Object.entries(result[i])) {
console.log(`${key}: ${value}`);
}
}
} else if (typeof result === "object" && result !== null) {
for (const [key, value] of Object.entries(result)) {
console.log(`${key}: ${value}`);
}
} else {
console.log(result);
}
await b.disconnect();
@@ -0,0 +1,34 @@
#!/usr/bin/env node
import { tmpdir } from "node:os";
import { join } from "node:path";
import puppeteer from "puppeteer-core";
const b = await Promise.race([
puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
}),
new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), 5000)),
]).catch((e) => {
console.error("✗ Could not connect to browser:", e.message);
console.error(" Run: browser-start.js");
process.exit(1);
});
const p = (await b.pages()).at(-1);
if (!p) {
console.error("✗ No active tab found");
process.exit(1);
}
const timestamp = new Date().toISOString().replace(/[:.]/g, "-");
const filename = `screenshot-${timestamp}.png`;
const filepath = join(tmpdir(), filename);
await p.screenshot({ path: filepath });
console.log(filepath);
await b.disconnect();
@@ -0,0 +1,86 @@
#!/usr/bin/env node
import { spawn, execSync } from "node:child_process";
import puppeteer from "puppeteer-core";
const useProfile = process.argv[2] === "--profile";
if (process.argv[2] && process.argv[2] !== "--profile") {
console.log("Usage: browser-start.js [--profile]");
console.log("\nOptions:");
console.log(" --profile Copy your default Chrome profile (cookies, logins)");
process.exit(1);
}
const SCRAPING_DIR = `${process.env.HOME}/.cache/browser-tools`;
// Check if already running on :9222
try {
const browser = await puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
});
await browser.disconnect();
console.log("✓ Chrome already running on :9222");
process.exit(0);
} catch {}
// Setup profile directory
execSync(`mkdir -p "${SCRAPING_DIR}"`, { stdio: "ignore" });
// Remove SingletonLock to allow new instance
try {
execSync(`rm -f "${SCRAPING_DIR}/SingletonLock" "${SCRAPING_DIR}/SingletonSocket" "${SCRAPING_DIR}/SingletonCookie"`, { stdio: "ignore" });
} catch {}
if (useProfile) {
console.log("Syncing profile...");
execSync(
`rsync -a --delete \
--exclude='SingletonLock' \
--exclude='SingletonSocket' \
--exclude='SingletonCookie' \
--exclude='*/Sessions/*' \
--exclude='*/Current Session' \
--exclude='*/Current Tabs' \
--exclude='*/Last Session' \
--exclude='*/Last Tabs' \
"${process.env.HOME}/Library/Application Support/Google/Chrome/" "${SCRAPING_DIR}/"`,
{ stdio: "pipe" },
);
}
// Start Chrome with flags to force new instance
spawn(
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
[
"--remote-debugging-port=9222",
`--user-data-dir=${SCRAPING_DIR}`,
"--no-first-run",
"--no-default-browser-check",
],
{ detached: true, stdio: "ignore" },
).unref();
// Wait for Chrome to be ready
let connected = false;
for (let i = 0; i < 30; i++) {
try {
const browser = await puppeteer.connect({
browserURL: "http://localhost:9222",
defaultViewport: null,
});
await browser.disconnect();
connected = true;
break;
} catch {
await new Promise((r) => setTimeout(r, 500));
}
}
if (!connected) {
console.error("✗ Failed to connect to Chrome");
process.exit(1);
}
console.log(`✓ Chrome started on :9222${useProfile ? " with your profile" : ""}`);
File diff suppressed because it is too large
@@ -0,0 +1,19 @@
{
"name": "browser-tools",
"version": "1.0.0",
"type": "module",
"description": "Minimal CDP tools for collaborative site exploration",
"author": "Mario Zechner",
"license": "MIT",
"dependencies": {
"@mozilla/readability": "^0.6.0",
"cheerio": "^1.1.2",
"jsdom": "^27.0.1",
"puppeteer": "^24.31.0",
"puppeteer-core": "^23.11.1",
"puppeteer-extra": "^3.3.6",
"puppeteer-extra-plugin-stealth": "^2.11.2",
"turndown": "^7.2.2",
"turndown-plugin-gfm": "^1.0.2"
}
}