The robots.txt file tells search engine crawlers which parts of your site they are allowed to access and which parts they should skip. Every website should have one, and SEO Forge creates and manages it for you automatically as a virtual file — no physical file needed on your server, and you can edit the rules directly from the WordPress dashboard. A well-configured robots.txt file helps Google focus its crawl budget on your important content and avoid wasting time on admin pages, login screens, and other areas that should not be indexed.
Why Robots.txt Matters
Without a robots.txt file, search engine crawlers may try to fetch any URL they discover on your site, including /wp-admin/, database query pages, and internal file paths. These pages are usually not indexed (because they require login), but crawling them still uses up your crawl budget. A robots.txt file says “skip these areas entirely.”
Step-by-Step: Editing Your Robots.txt
- Go to SEO Forge > Settings > Robots.txt in the WordPress sidebar.
- You will see a text editor with your current rules.
- The default rules block `/wp-admin/` while allowing AJAX requests.
- Edit the rules as needed: add `Disallow:` lines for any paths you want to block.
- Your sitemap URL (`Sitemap: yoursite.com/sitemap_index.xml`) is added automatically at the bottom.
- Click Save Changes.
- Verify by visiting `yoursite.com/robots.txt` in your browser.
Common Robots.txt Rules
| Rule | What It Does | Should You Use It? |
|---|---|---|
| `User-agent: *` | Applies the following rules to all search engine crawlers | Yes: always start with this |
| `Disallow: /wp-admin/` | Blocks crawlers from the WordPress admin area | Yes: default, always keep |
| `Allow: /wp-admin/admin-ajax.php` | Allows AJAX requests, needed by some themes and plugins | Yes: default, always keep |
| `Disallow: /cart/` | Blocks crawlers from shopping cart pages | Yes: cart pages have no search value |
| `Disallow: /checkout/` | Blocks crawlers from checkout pages | Yes: checkout pages should not be in search |
| `Disallow: /my-account/` | Blocks crawlers from user account pages | Yes: private user pages |
| `Disallow: /search/` | Blocks internal search results pages from being crawled | Optional: can prevent thin content indexing |
| `Sitemap: yoursite.com/sitemap_index.xml` | Tells crawlers where to find your sitemap | Yes: SEO Forge adds this automatically |
Real-World Example
Imagine you run a WooCommerce store. Your robots.txt might look like this:

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    Disallow: /cart/
    Disallow: /checkout/
    Disallow: /my-account/
    Sitemap: https://yourstore.com/sitemap_index.xml

This configuration tells Google: crawl all my pages, products, and blog posts, but skip the admin area, shopping cart, checkout, and user account pages.
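To see why `Allow: /wp-admin/admin-ajax.php` still works even though `/wp-admin/` is disallowed, it can help to evaluate the rules the way the Robots Exclusion Protocol (RFC 9309) describes: the longest matching path wins, and `Allow` wins a tie. The Python sketch below is a simplified illustration only. The names `parse_rules` and `is_allowed` are hypothetical, it assumes a single `User-agent: *` group, and it does plain prefix matching without the `*` and `$` wildcards. It is not how SEO Forge or any particular crawler is implemented.

```python
EXAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Sitemap: https://yourstore.com/sitemap_index.xml
"""


def parse_rules(text):
    """Collect (kind, path) pairs from Allow/Disallow lines.

    Assumption: the file has one group that applies to every crawler.
    """
    rules = []
    for line in text.splitlines():
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules


def is_allowed(path, rules):
    """Longest matching prefix wins; Allow wins when lengths are equal."""
    best_kind, best_len = "allow", -1  # no matching rule means allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_prefers_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_prefers_allow:
            best_kind, best_len = kind, len(prefix)
    return best_kind == "allow"


rules = parse_rules(EXAMPLE)
for path in ["/wp-admin/options.php", "/wp-admin/admin-ajax.php",
             "/cart/", "/product/blue-shirt/"]:
    print(path, "->", "allowed" if is_allowed(path, rules) else "blocked")
```

Under these rules the admin screen and the cart come out blocked, while `admin-ajax.php` and the product page come out allowed. Note that Python's built-in `urllib.robotparser` applies rules in file order rather than by longest match, so it can give a different answer for `admin-ajax.php` with this same file.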
How to Verify Your Robots.txt
Visit yoursite.com/robots.txt in your browser. You will see the exact file that search engines see when they visit your site. The Robots.txt card shows a small status pill next to the editor label that tells you at a glance whether SEO Forge actually controls what visitors see:
- No pill: happy path. The editor is live and what you save here is exactly what Google fetches at `/robots.txt`.
- 🔒 Read-only pill (orange). A physical `robots.txt` file exists in your site root and your web server is serving it directly, bypassing SEO Forge. The editor is locked so you do not accidentally save rules that never reach visitors. Hover the pill for the exact cause. To regain control, delete the file via FTP or your hosting control panel. If the physical file is one that SEO Forge previously managed, SEO Forge repairs stale sitemap URLs to the current site URL instead of leaving an old domain in place.
- ⚠ Server override detected pill (yellow). No physical file is present, but when SEO Forge checks `yoursite.com/robots.txt` over HTTP, the response is different from what the editor is about to save. Usually this means a CDN (Cloudflare), a reverse proxy, or a hosting control panel (hPanel, cPanel) has its own robots rule set up. Expand the “Live /robots.txt differs from these Settings” panel below the editor to see the actual body visitors receive, then disable that override at the host.
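If you want to run a similar comparison yourself, outside the dashboard, the idea can be sketched in a few lines of Python: fetch the live file, normalize line endings and trailing whitespace, and diff it against the rules you expect. This is an illustrative sketch, not SEO Forge's own check. The function names (`normalize`, `diff_robots`, `fetch_live`) are made up for this example, and the URL you pass to `fetch_live` is a placeholder for your own site.

```python
import difflib
from urllib.request import urlopen


def normalize(text):
    """Split into lines, ignoring CRLF/LF differences and trailing blanks."""
    lines = [line.rstrip() for line in text.replace("\r\n", "\n").split("\n")]
    while lines and not lines[-1]:
        lines.pop()  # drop empty lines at the end of the file
    return lines


def diff_robots(expected, live):
    """Return unified-diff lines; an empty list means the two files match."""
    return list(difflib.unified_diff(
        normalize(expected), normalize(live),
        fromfile="editor", tofile="live", lineterm="",
    ))


def fetch_live(url):
    """Download the robots.txt that visitors actually receive."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


expected = "User-agent: *\nDisallow: /wp-admin/\n"
served = "User-agent: *\r\nDisallow: /\r\n"  # e.g. a host-level override
for line in diff_robots(expected, served):
    print(line)
```

In real use you would replace `served` with `fetch_live("https://yoursite.com/robots.txt")`. Lines starting with `-` exist only in your expected rules, and lines starting with `+` exist only in the live response.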
> Tip: Do not use robots.txt to hide sensitive pages. Robots.txt is a public file that anyone can read. If a page needs to be truly private, use password protection or authentication — not robots.txt.
Common Mistakes
- Blocking your entire site accidentally. A rule like `Disallow: /` blocks everything. Always double-check your rules before saving.
- Trying to use robots.txt as a noindex replacement. Robots.txt prevents crawling, not indexing. If other sites link to a blocked page, Google may still index the URL (without content). Use noindex for removing pages from search results, and leave those pages crawlable so Google can actually see the noindex directive.
- Blocking CSS and JavaScript files. Google needs to render your pages to understand them. Never block `/wp-content/themes/` or `/wp-content/plugins/`; this prevents Google from seeing your site properly.
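Two of the mistakes above can be caught mechanically before you save. The following Python sketch scans a set of rules for a site-wide `Disallow: /` and for `Disallow` paths that would cover the theme or plugin directories. The function name `find_risky_rules` and the warning texts are hypothetical, and the check is deliberately simple: it ignores which `User-agent` group a rule belongs to and does not understand wildcards.

```python
RISKY_PREFIXES = ("/wp-content/themes/", "/wp-content/plugins/")


def find_risky_rules(robots_txt):
    """Return (line_number, warning) pairs for rules worth a second look."""
    warnings = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        value = value.strip()
        if not value:
            continue  # an empty Disallow blocks nothing
        if value == "/":
            warnings.append((number, "blocks the entire site"))
        elif any(p.startswith(value) or value.startswith(p)
                 for p in RISKY_PREFIXES):
            # Covers both "/wp-content/" (a parent of the risky paths)
            # and anything inside the theme or plugin directories.
            warnings.append((number, "may block CSS or JavaScript files"))
    return warnings


sample = """\
User-agent: *
Disallow: /
Disallow: /wp-content/plugins/
Disallow: /cart/
"""
for number, warning in find_risky_rules(sample):
    print(f"line {number}: {warning}")
```

A check like this only flags candidates. A rule such as `Disallow: /` can be intentional on a staging site, so treat the output as a prompt to review rather than an error.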