In this tutorial, we’re going to read data from a table automatically on click.
This can be useful to quickly export data to a CSV file or to manipulate the content of the data parsed from the table for further examination.
1. Open the Page Editor
Let’s navigate to this page https://en.wikipedia.org/wiki/Table_(information) and open the PixieBrix Page Editor.
2. Add Context Menu brick
The first step after opening the Page Editor is to add the Context Menu brick.
Click the green Add button from the Extension and select the Context Menu as shown below.
Now, whenever we right-click on the page, a PixieBrix entry will appear in the context menu, and that’s how we can activate the extension.
We’re going to configure the Context Menu brick as follows:
- Extension Name: Read Data from Table on Wiki
- Title: Read Data from Wiki (this is the title of the context menu entry)
Leave the other fields as they are. The whole brick should look like this:
3. Add the Table Reader brick
Now for the fun part of this tutorial: we’re going to select the table on the page that we want to use as the source for the data we’ll save to a CSV file.
Let’s add a new brick called Table Reader. This brick allows us to read data from an HTML table.
An HTML table is built from <table>, <tr>, <td> elements, etc. 👉 Read more about HTML tables
By default, the output of this brick is @data. Leave it as it is for now; we’re going to use it in a later step.
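To get a feel for what reading a table means, here is a minimal Python sketch that turns an HTML table into a list of records, one per row, keyed by the header cells. It is only an illustration under my own assumptions: the TableReader class, the read_table helper and the sample rows are hypothetical, and the real brick's output shape may differ.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collect table rows as lists of cell text.

    Assumes a single, simple table without nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def read_table(html):
    """Return one dict per body row, using the first row as the field names."""
    parser = TableReader()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample data, not taken from the Wikipedia page.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Ada</td><td>36</td></tr>
  <tr><td>Alan</td><td>41</td></tr>
</table>
"""

records = read_table(SAMPLE)
print(records)
```

Each record maps a column header to the text of the matching cell, which is a convenient shape for exporting to CSV later on.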
Next, we need to pick a selector for this brick. You can use the blue arrow button to pick the element visually, or go the manual way and inspect the page by hand.
I chose the latter and figured out that if I use
I can select the full table on the page.
Final brick Configuration
At this point, using the selector mentioned above, your brick should be configured to look like the one below.
The :nth-child() pseudo-class matches elements based on their position among their siblings.
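If the position rule feels abstract, the small Python sketch below mimics it for a single parent element. The nth_child helper and the sample document are hypothetical and exist only for illustration; the point is that the position is 1-based and counts every sibling element, not just those of the requested type.

```python
import xml.etree.ElementTree as ET


def nth_child(parent, tag, n):
    """Mimic the CSS selector `parent > tag:nth-child(n)` for one parent.

    Positions are 1-based and count all element children of the parent.
    Returns the matching element, or None if the n-th child has another tag.
    """
    children = list(parent)
    if 1 <= n <= len(children) and children[n - 1].tag == tag:
        return children[n - 1]
    return None


# A made-up fragment: one paragraph followed by two tables.
DOC = ET.fromstring(
    "<div>"
    "<p>intro</p>"
    "<table id='first'/>"
    "<table id='second'/>"
    "</div>"
)

# The first <table> is the second child of the <div>, because <p> comes first.
print(nth_child(DOC, "table", 2).get("id"))
# The first child is a <p>, so table:nth-child(1) matches nothing here.
print(nth_child(DOC, "table", 1))
```

This is why a selector that looks right at first glance can miss the table: a heading or paragraph in front of it shifts the position.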
4. Add the Export as CSV brick
In this last step we’re going to export the data we automatically scraped in the previous step to a .csv file.
Let’s go ahead and add the brick named Export as CSV by clicking the Add a brick button in the extension overview panel.
In this brick we’re going to provide a filename and the data that we want to use to create our CSV file.
For filename, I am going to leave the default name of exported, but for data I want to provide the data that I scraped earlier, which was stored in the variable called @table, specifically
Once you’ve changed this parameter it should look like this screenshot below.
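Conceptually, this step turns a list of row records into CSV text. Here is a rough Python sketch of that conversion using the standard csv module. The records_to_csv function and the sample rows are hypothetical, and this is not how the brick itself is implemented.

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of dicts to CSV text, using the first record's keys as the header."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up rows; the second name contains a comma to show automatic quoting.
rows = [
    {"Name": "Ada", "Age": "36"},
    {"Name": "Alan, Jr.", "Age": "41"},
]

csv_text = records_to_csv(rows)
print(csv_text)
```

Values that contain a comma are quoted automatically, so the columns stay aligned when the file is opened in a spreadsheet.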
That’s it, we’re done! Now let’s go ahead and test this extension...
1) Navigate to https://en.wikipedia.org/wiki/Table_(information)
2) Right-click anywhere on the page
3) Select the contextual entry we created earlier “Read Data from Wiki”
4) A file should be downloaded to your drive containing the data from the table you selected
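Once the file is on your drive, you can load it for further examination. The sketch below is one way to do that in Python. The load_rows and column_values helpers are hypothetical, and the path you pass in depends on where your browser saved the download.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts, one per row, keyed by the header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def column_values(rows, column):
    """Return the non-empty values of one column, in file order."""
    return [row[column] for row in rows if row.get(column)]
```

For example, column_values(load_rows(path), "Age") would list every non-empty value in a column named Age, assuming the exported table has such a column.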
In this tutorial, we went over how to scrape data automatically from a page, specifically tabular data.
This allows you to collect data automatically and then download it, copy it to the clipboard, manipulate it further and much more.