WhatsApp Web Interaction Rules
Philosophy
Interact like a human using the web app. Navigate naturally, read what’s rendered, use the socket event stream for real-time data, and never call Store APIs.
This rule exists because WhatsApp can detect programmatic use of its internal APIs. Everything we do should be indistinguishable from a browser extension or accessibility tool passively observing the page.
The Seven Rules
1. Navigate Like a Human
Click chats, open panels, scroll. One action at a time, at human pace (2–3 seconds between actions). No bulk operations.
// CORRECT — Click a chat row using real browser mouse events
await page.mouse.click(rect.x + rect.width / 2, rect.y + rect.height / 2);
await new Promise(r => setTimeout(r, 2500)); // Wait like a human would
// WRONG — Programmatic bulk navigation
for (const chatId of allChats) {
await page.evaluate(`Store.Chat.get("${chatId}")`); // Store API
}
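The pacing above can be sketched as a small helper. This is a minimal sketch, assuming page is a Puppeteer Page and rect is a bounding box already obtained for the row; the names humanPauseMs and clickRowLikeHuman are hypothetical, and the injectable pause parameter exists only so the wait can be stubbed.

```javascript
// Pick a pause between 2000 and 3000 ms so actions are not evenly spaced.
function humanPauseMs(random = Math.random) {
  return 2000 + Math.floor(random() * 1001);
}

// Default sleep: resolves after ms milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Click the centre of one chat row, then wait before the caller's next action.
// `page` is assumed to be a Puppeteer Page; `rect` is { x, y, width, height }.
async function clickRowLikeHuman(page, rect, pause = sleep) {
  const x = rect.x + rect.width / 2;
  const y = rect.y + rect.height / 2;
  await page.mouse.click(x, y);
  const waited = humanPauseMs();
  await pause(waited);
  return { x, y, waited };
}
```

Randomizing the pause within the 2–3 second window is a design choice of this sketch; a fixed 2500 ms wait, as in the snippet above, also satisfies the rule.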
2. Read What’s Already Rendered
Read from the actual DOM elements visible in the browser: fiber props, img src, text content. Passive observation of what the browser has already loaded and rendered is fine.
// CORRECT — Read rendered HTML elements
document.querySelectorAll('[role="row"]') // Chat list rows
document.querySelector('span[title]') // Contact name
element.textContent // Displayed text
element.getAttribute('src') // Image URL
// CORRECT — Read React fiber props (passive, read-only)
const fiberKey = Object.keys(el).find(k => k.startsWith('__reactFiber'));
const fiber = el[fiberKey];
const chatId = fiber.memoizedProps.chat.id._serialized;
// FORBIDDEN — Store API calls
window.Store.Contact.get(wid)
window.Store.Chat.get(wid)
window.Store.ProfilePic.requestProfilePicFromServer(wid)
client.getContactById(id)
client.getChatById(id)
chat.fetchMessages({ limit: N })
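The passive fiber read shown above throws if any link in the props chain is missing. A defensive variant might look like the sketch below. The function name readChatIdFromRow is hypothetical, and the props path is copied from the snippet above, so it may differ between WhatsApp Web builds.

```javascript
// Return the serialized chat id from a row element's React fiber, or null.
// Read-only: nothing on the fiber is called or mutated.
function readChatIdFromRow(el) {
  if (!el) return null;
  const fiberKey = Object.keys(el).find((k) => k.startsWith('__reactFiber'));
  if (!fiberKey) return null; // no fiber attached to this element
  const id = el[fiberKey]?.memoizedProps?.chat?.id?._serialized;
  return typeof id === 'string' ? id : null;
}
```

Returning null instead of throwing lets the caller fall back to the rendered title or text content of the same row.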
3. Use the Socket Event Stream for Real-Time Data
Incoming messages, delivery confirmations, and reactions arrive via wa:message, wa:message_create, and wa:message_ack, which correspond to the whatsapp-web.js events message, message_create, and message_ack used below. No DOM polling is needed for incoming data: whatsapp-web.js emits these events from the WebSocket connection.
// CORRECT — Listen for real-time events
client.on('message', (msg) => { /* process incoming */ });
client.on('message_create', (msg) => { /* every created message, including our own outgoing ones */ });
client.on('message_ack', (msg, ack) => { /* delivery status */ });
// WRONG — Poll DOM for new messages
setInterval(async () => {
const msgs = await page.evaluate('document.querySelectorAll("[data-id]").length');
}, 1000);
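One way to keep the event handling in a single place is to forward all three events to one callback. This is a sketch only, assuming client is a whatsapp-web.js Client (any object with an on method behaves the same); attachMessageListeners and the shape of the forwarded object are hypothetical.

```javascript
// Register the three listeners once and forward each event to a single handler.
// `client` is assumed to be a whatsapp-web.js Client.
function attachMessageListeners(client, onEvent) {
  client.on('message', (msg) => onEvent({ type: 'message', msg }));
  client.on('message_create', (msg) => onEvent({ type: 'message_create', msg }));
  client.on('message_ack', (msg, ack) => onEvent({ type: 'message_ack', msg, ack }));
}
```

Because the data is pushed over the socket, nothing here touches the DOM or runs on a timer.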
4. Never Call Store APIs
No getProfilePicThumb, requestProfilePicFromServer, Chat.get(), Contact.get(), or any window.Store.* method. These are internal WhatsApp JavaScript functions that have detectable call patterns, stack traces, and timing signatures.
Forbidden methods (comprehensive list):
window.Store.Contact.get(wid)
window.Store.Chat.get(wid)
window.Store.Chat.getModelsArray()
window.Store.ProfilePic.get(wid)
window.Store.ProfilePic.requestProfilePicFromServer(wid)
window.Store.WidFactory.createWid(id)
window.Store.StatusUtils.getStatus(wid)
window.Store.Lid.unLid(wid)
contact.getProfilePicThumb()
client.getContactById(id)
client.getChatById(id)
chat.fetchMessages({ limit: N })
client.getProfilePicUrl(id)
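The list above can also be enforced mechanically. The sketch below is a hypothetical source check, not part of the original rules: it scans source text line by line for the forbidden names. The patterns are derived from the list above and are an assumption about how those calls appear in code.

```javascript
// Patterns taken from the forbidden list above; extend as the list grows.
const FORBIDDEN_PATTERNS = [
  /\bwindow\.Store\./,
  /\bStore\.(Contact|Chat|ProfilePic|WidFactory|StatusUtils|Lid)\./,
  /\.getProfilePicThumb\(/,
  /\.getContactById\(/,
  /\.getChatById\(/,
  /\.fetchMessages\(/,
  /\.getProfilePicUrl\(/,
];

// Return one { line, text } entry for every source line that matches a pattern.
function findForbiddenCalls(source) {
  return source.split('\n').flatMap((text, i) =>
    FORBIDDEN_PATTERNS.some((re) => re.test(text))
      ? [{ line: i + 1, text: text.trim() }]
      : []
  );
}
```

A check like this only catches literal names; calls reached through aliases or computed property access would still need review.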
5. Binary Media via downloadMedia() is Acceptable
For data with no rendered equivalent (document attachments, voice notes stored as binary), msg.downloadMedia() is acceptable. Always prefer extracting from rendered <img>, <video>, <audio> elements first.
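The preference order can be sketched as follows, assuming msg is a whatsapp-web.js Message and renderedSrc is whatever src was read from a rendered element (or null). The function name resolveMedia and the shape of its return value are hypothetical.

```javascript
// Prefer the URL of an already-rendered element; download only as a fallback.
// `msg` is assumed to be a whatsapp-web.js Message; `renderedSrc` is the src
// read from a rendered <img>, <video>, or <audio> element, if there is one.
async function resolveMedia(msg, renderedSrc) {
  if (renderedSrc) return { source: 'rendered', src: renderedSrc };
  if (!msg.hasMedia) return null;
  const media = await msg.downloadMedia(); // resolves to mimetype + base64 data
  if (!media) return null; // guard against an empty download result
  return { source: 'download', mimetype: media.mimetype, data: media.data };
}
```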
6. No Batch Operations
Scanning 50 fibers in one evaluate() call, iterating all contacts, or bulk-extracting data is NOT human behavior. One contact at a time, as you’d see them by clicking each chat.
// WRONG — Batch extraction
const allContacts = await page.evaluate(() => {
const rows = document.querySelectorAll('[role="row"]');
return Array.from(rows).map(row => {
// Walk 50 fibers, extract 9 fields each
});
});
// CORRECT — Read from current view, one chat at a time
// Chat list scan is OK because those rows are all VISIBLE
// But deep fiber extraction should happen per-chat when the user opens it
Exception: Reading the chat list grid ([role="grid"] rows) in one pass is acceptable because all those rows are simultaneously visible on screen — a human sees them all at once too. The rule targets deep data extraction that would require navigating to each item.
7. Use the App’s Own Panels
WhatsApp provides media/docs/links panels, contact info drawers, starred messages, and other views. Open them naturally (click) and read what renders, rather than inventing extraction methods.
// CORRECT — Open contact info to read phone number
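await page.mouse.click(headerX, headerY); // Click contact header
await new Promise(r => setTimeout(r, 2500)); // Wait for the drawer to render
// DOM reads must run inside the page, so wrap them in page.evaluate()
const phone = await page.evaluate(() => {
  const drawer = document.querySelector('[data-animate-drawer-right="true"]');
  const span = drawer ? drawer.querySelector('span[title*="+"]') : null;
  return span ? span.getAttribute('title') : null; // null if not rendered
});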
// WRONG — Invent a fiber-walking method to extract phone without opening the panel
What “Reading the DOM” Means
When WhatsApp Web displays a chat list, it renders real HTML:
<div role="row" style="z-index: 249; height: 76px;">
<img class="_ao3e" src="https://pps.whatsapp.net/v/..." />
<span title="Shane">Shane</span>
<span>Hey, how are you?</span>
</div>
Reading that HTML is reading the DOM. The data is already there because WhatsApp Web put it there for the user to see. This is indistinguishable from a screen reader, browser extension, or password manager.
How to Extract Specific Data
Avatars
- Find the chat list item for the contact in the rendered HTML
- Find the <img> element inside it (the avatar)
- Read its src attribute; this will be a blob: URL or CDN URL
- For blob: URLs: use fetch(blobUrl).then(r => r.blob()) to get the image data
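The avatar steps above can be sketched as two small helpers. This is a sketch under assumptions: avatarSrcKind and readBlobAvatar are hypothetical names, any https: URL is treated as a CDN URL, and readBlobAvatar is meant to run in the page context where the blob: URL is valid.

```javascript
// Classify an avatar src as 'blob', 'cdn', or 'unknown' by its URL scheme.
function avatarSrcKind(src) {
  try {
    const { protocol } = new URL(src);
    if (protocol === 'blob:') return 'blob';
    if (protocol === 'https:') return 'cdn';
  } catch {
    // not an absolute URL (empty string, relative path, placeholder)
  }
  return 'unknown';
}

// In the page context, turn a blob: URL into image data as the steps describe.
// `fetchImpl` defaults to the global fetch and is injectable for testing.
async function readBlobAvatar(src, fetchImpl = globalThis.fetch) {
  if (avatarSrcKind(src) !== 'blob') return null;
  const response = await fetchImpl(src);
  return response.blob();
}
```

A blob: URL is scoped to the document that created it, so the fetch has to happen inside the page (for example within page.evaluate) rather than in the Node process.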
Contact Names / Phone Numbers
- Find the rendered elements in the chat list or contact info panel
- Read textContent or title attributes from the displayed elements
Messages
- Open a chat by clicking on it in the chat list
- Read the rendered message elements from the conversation panel
- Scroll up to load older messages (natural scroll behavior)
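Scrolling for older messages might look like the sketch below, assuming page is a Puppeteer Page and the mouse pointer is already over the conversation panel. The names scrollUpOnce and loadOlderMessages are hypothetical, and the 600-pixel wheel distance is an arbitrary illustrative value.

```javascript
// Scroll the conversation up by one wheel gesture, then pause.
// `page` is assumed to be a Puppeteer Page whose mouse is already over the
// conversation panel; `pause` is injectable so the wait can be stubbed.
async function scrollUpOnce(page, pause = (ms) => new Promise((r) => setTimeout(r, ms))) {
  await page.mouse.wheel({ deltaY: -600 }); // negative deltaY scrolls up
  await pause(2500); // same pace as Rule 1
}

// Load older messages a few gestures at a time, never in one large jump.
async function loadOlderMessages(page, gestures, pause) {
  for (let i = 0; i < gestures; i += 1) {
    await scrollUpOnce(page, pause);
  }
}
```

After each gesture, the newly rendered message elements can be read from the conversation panel in the same way as the ones that were visible before.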
Contact Details (business status, about, etc.)
- Open the contact info panel (click contact name/avatar at top of chat)
- Read the rendered fields from the info panel HTML
Why This Matters
- window.Store calls have detectable patterns: call frequency, stack traces, timing
- WhatsApp actively monitors for automation
- Reading rendered HTML is indistinguishable from a browser extension, accessibility tool, or password manager
- WhatsApp Web ALWAYS renders this data for visible contacts — so DOM reads should have near-100% success
- If data is visible on screen, it is in the DOM, and we can read it safely