🕵🏼

Google Dorking (8 min)

🍎

Before you begin this tutorial, make sure you've completed the Quick Start Guide.

In this tutorial, we'll create a button on the AngelList company profile page that searches for PDF files within the company's URL. To do this, we'll trigger a Google search that uses Google's advanced search operators. In the Open Source Intelligence (OSINT) community, this is called Google Dorking.

Here are three examples of Google Dork searches:

  • inurl:resume "peter parker" (finds sites with "resume" in the URL and "peter parker" in the text)
  • related:tesla.com (finds sites related to tesla.com)
  • site:hubspot.com filetype:PDF (finds PDFs hosted on the HubSpot website)
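
A Google Dork is just a search query, so it can be turned into a link by URL-encoding it into the q parameter of a Google search URL. The sketch below shows that step in Python for the three examples above; the helper name dork_url is made up for illustration, and this is not necessarily how PixieBrix builds the link internally.

```python
from urllib.parse import urlencode

GOOGLE_SEARCH = "https://www.google.com/search"


def dork_url(query: str) -> str:
    """Return a Google search URL whose q parameter holds the dork query."""
    # urlencode percent-encodes ":" and quotes, and turns spaces into "+".
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


examples = [
    'inurl:resume "peter parker"',
    "related:tesla.com",
    "site:hubspot.com filetype:PDF",
]

for query in examples:
    print(dork_url(query))
```

Opening any of the printed links in a browser runs the corresponding search.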

1. Place a Button

Open the Page Editor

In this tutorial, we'll use the HubSpot AngelList profile page to develop our workflow.

Navigate your browser to the HubSpot profile page, and open the PixieBrix Page Editor.

Place a Button

  • Click Add in the top left of the Page Editor and choose Button.
  • Then hover your cursor over any of the AngelList buttons in the button group that includes [Overview, People, Culture, Funding, Jobs].
  • Then click, and PixieBrix will add a new custom button to the button group.
image

Customize the Caption

To customize the caption, click the Page Editor Menu Item tab and replace the word "Action" with "PDFs". The button should look like this:

image

🚨

Read Before Continuing

  • 2A: If you ARE NOT logged into AngelList, company Overview pages load as static webpages. In this case, we'll use jQuery to scrape data, as outlined in Section 2A.
  • 2B: If you ARE logged into AngelList, the framework that was used to develop the company Overview page will load. In this case, we'll scrape data directly from the frontend framework, as outlined in Section 2B.

2A: Select Data with jQuery

Select the Page Element

Because you're NOT logged into AngelList, the HubSpot page loaded as a static webpage, so we'll use jQuery to scrape data.

In the Data tab, click Add a Property. Then click the pointer icon below Value and hover your mouse over hubspot.com in the ABOUT HUBSPOT section until the shaded blue area looks like the image below. Then click to select this page element:

image

Label the Property

Scroll to the bottom of the Page Editor Data tab and you will see the following Raw Data:

image

In the Selectors section of the Page Editor Data tab, change "property1" to "companyUrl".

The Raw Data section will now look like this:

image

Click the page icon next to companyUrl to copy it to your clipboard.

2B: Select Data with the Website's Framework

Because you ARE logged into AngelList, the framework that was used to develop the website will load, so we'll scrape data directly from the website's frontend framework.

Identify the Framework

Start by clicking the Data tab in the Page Editor. The Framework field identifies the framework used to develop the webpage, which is "React" for AngelList.

If PixieBrix did not automatically detect "React", choose it in the Framework dropdown for this tutorial.

Select the Page Element

In the Data tab, click the pointer icon next to Selector. Then hover your mouse over the ABOUT HUBSPOT section until the shaded blue area looks like the image below. Then click to select this page element:

image

Find the Property Path to Data

Click the Selector dropdown and choose the following div tag:

image

Next, go to the search bar at the bottom of the Page Editor, and search for "hubspot.com".

You should find:

image

Click the page icon next to companyUrl to copy its property path to your clipboard:

startup.companyUrl
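
A property path such as startup.companyUrl reads as "take the startup object, then its companyUrl field". The sketch below illustrates that lookup in Python. The get_path helper and the component_props dictionary are hypothetical stand-ins; the real data attached to the AngelList component is much larger and may be shaped differently.

```python
from collections.abc import Mapping
from typing import Any


def get_path(data: Mapping, path: str, default: Any = None) -> Any:
    """Follow a dot-separated property path through nested mappings."""
    current: Any = data
    for key in path.split("."):
        # Stop early if the path leads somewhere that does not exist.
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical, heavily trimmed stand-in for the data behind the page element.
component_props = {"startup": {"name": "HubSpot", "companyUrl": "hubspot.com"}}

print(get_path(component_props, "startup.companyUrl"))
```

With the sample data, the lookup yields the same value you found with the Page Editor search bar.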

3. Perform an Action

Create Your Search

  • Go to the Effect tab of the Page Editor and click Add a Block.
  • Search for "Google Search in new tab".
  • The Google Dork for searching PDFs on a website is:
    • site:<domain>
    • filetype:PDF
  • 2A: If you used jQuery to scrape data, we can express this as:
site:{{{companyUrl}}} filetype:PDF
  • 2B: If you used React to scrape data, we can express this as:
site:{{{startup.companyUrl}}} filetype:PDF

This expression replaces <domain> with the value you selected, surrounded by "{{{" and "}}}".
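
To make the substitution concrete, here is a minimal sketch of how a triple-brace tag could be filled in. It is a toy renderer written for this tutorial, not PixieBrix's template engine: the fill function is hypothetical, it only understands {{{path}}} tags, and the sample data is assumed.

```python
import re
from typing import Any, Mapping

# Matches {{{name}}} or {{{outer.inner}}}, allowing spaces inside the braces.
TRIPLE_TAG = re.compile(r"\{\{\{\s*([\w.]+)\s*\}\}\}")


def fill(template: str, context: Mapping[str, Any]) -> str:
    """Replace each {{{path}}} tag with the value found at that path."""

    def lookup(match: re.Match) -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)  # inserted as-is, with no escaping

    return TRIPLE_TAG.sub(lookup, template)


# 2A: flat data, as produced by the jQuery selection
query_2a = fill("site:{{{companyUrl}}} filetype:PDF", {"companyUrl": "hubspot.com"})

# 2B: nested data, as read from the React component
query_2b = fill(
    "site:{{{startup.companyUrl}}} filetype:PDF",
    {"startup": {"companyUrl": "hubspot.com"}},
)

print(query_2a)
print(query_2b)
```

Both variants produce the same query, site:hubspot.com filetype:PDF; only the path used to reach the value differs.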

🚨

We surround the variable name with three sets of braces in this example so that PixieBrix fills in the exact text. Otherwise, PixieBrix's default behavior with double braces "{{" and "}}" is to escape special characters in the value (e.g., "/") so that the text renders correctly on a webpage. Escaped text is not what we want inside a search query.
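
The difference is easiest to see with a value that contains a special character. The sketch below compares the two outcomes in Python. The escape table is an assumption modeled on the HTML escaping used by Mustache-style template engines (the exact table PixieBrix applies is not stated here), and the company_url value is hypothetical.

```python
# Assumed escape table, modeled on Mustache-style HTML escaping.
# The exact characters PixieBrix escapes may differ.
HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
    "/": "&#x2F;",
}


def escape_html(text: str) -> str:
    """Escape characters the way a double-brace tag would (assumed table)."""
    return "".join(HTML_ESCAPES.get(ch, ch) for ch in text)


company_url = "hubspot.com/resources"  # hypothetical value containing "/"

raw = f"site:{company_url} filetype:PDF"                   # like {{{companyUrl}}}
escaped = f"site:{escape_html(company_url)} filetype:PDF"  # like {{companyUrl}}

print(raw)      # site:hubspot.com/resources filetype:PDF
print(escaped)  # site:hubspot.com&#x2F;resources filetype:PDF
```

Under these assumptions, the escaped version would send a query containing HTML entities to Google, which is why the tutorial uses triple braces.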

Test Your Search

Press the PDFs button to test your search. Try it from different company pages to see how the search dynamically changes. Once you're happy with the button, click Save to use this search in the future.

Continued Learning

You can modify this search in many ways by changing the Google Dork expression. Check out these sites for inspiration: