
Defining Readers

DOM (via jQuery)

The jQuery reader closely follows the jQuery API for selecting elements and extracting information from them.

Example

reader:
  type: jquery
  selectors:
     # simple selector that extracts text
     propertyName: "#propertyHeader .propertyName"

     # Set the renderedText flag to retain spacing. <br> elements are also
     # replaced with line breaks. This is useful, e.g., for parsing addresses,
     # where the line break separates the street name from the city name.
     anotherProperty:
       selector: "#propertyAddress"
       renderedText: true
    
     # The contents flag can be either "text" or "comment"
     propertySize:
       selector: ".specList:has(h3:contains('Property Information')) ul li:contains('Units')"
       contents: text

     # You can extract an attribute of the HTML element instead
     # of its text
     propertyCompany:
       selector: "#propertyHeader img.logo"
       # grab the alt attribute of the image, because it contains the name of the company
       attr: alt

     # nested information can be extracted using a selector and a find entry
     agent:
       selector: "#contactSection"
       find:
         name: .agentFullName
         phone: .phoneNumber
    
     availability:
       selector: tr.rentalGridRow
        
       # Set multi to true to get an array of values
       multi: true

       find:
          maxrent:
            # you can also extract the data attribute of an element
            data: maxrent
          beds:
            data: beds
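
To make the structure of the result easier to picture, here is a rough sketch of the shape the example above might produce. It is an illustration only: every value is a placeholder rather than real page data, the value types are not specified here, and the data entries are assumed to read the data-maxrent and data-beds attributes.

```yaml
# Illustrative output shape only; all values are placeholders
propertyName: "<text of #propertyHeader .propertyName>"
anotherProperty: "<address text, with line breaks retained>"
propertySize: "<text content of the matching list item>"
propertyCompany: "<value of the logo's alt attribute>"

# selector + find produces a nested object
agent:
  name: "<text of .agentFullName>"
  phone: "<text of .phoneNumber>"

# multi: true produces one entry per matching tr.rentalGridRow
availability:
  - maxrent: "<maxrent data attribute of the first row>"
    beds: "<beds data attribute of the first row>"
  - maxrent: "<maxrent data attribute of the second row>"
    beds: "<beds data attribute of the second row>"
```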

React

Supports React 16+

Example

reader:
  type: react
  # jQuery selector for the React component
  selector: ".details-page-container #ds-container .ds-home-details-chip"

  # Optionally, the prop to read from the component
  rootProp: property

  # Optional amount of time, in milliseconds, to wait for the component
  # to become available on the page
  waitMillis:

  # Optional number of HTML elements to traverse upward to try to find
  # the element with a corresponding React component
  traverseUp:
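
The two optional settings are left blank in the example above. The sketch below fills them in purely for illustration; the values 1000 and 2 are assumptions chosen for this example, not documented defaults or recommendations.

```yaml
reader:
  type: react
  selector: ".details-page-container #ds-container .ds-home-details-chip"
  rootProp: property

  # Illustrative value: wait up to one second for the component to appear
  waitMillis: 1000

  # Illustrative value: check up to two ancestor elements for one that
  # has a corresponding React component
  traverseUp: 2
```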

Ember.js

The Ember.js reader reads the state of an Ember.js component.

reader:
   type: emberjs
   
   # The jQuery selector for the Ember component
   selector: ".pv-contact-info"

Composite Readers

A reader for a foundation (kind=extensionPoint) can be a single reader id or a combination of readers. Providing a mapping of readers assigns the output of each reader to the corresponding property:

reader:
  apartment: apartments.com/property-reader
  document: "@pixiebrix/document-context"
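
As a sketch of what the mapping form yields, the combined output would have one top-level property per entry. The inner property names and values below are hypothetical placeholders; the actual fields depend on what each reader returns.

```yaml
# Illustrative output shape only
apartment:
  # output of apartments.com/property-reader (hypothetical field)
  propertyName: "<value read from the page>"
document:
  # output of @pixiebrix/document-context (hypothetical field)
  url: "<value read from the document>"
```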

Providing an array of readers uses all of them; readers that appear later override properties read by earlier readers.

reader:
  - apartments.com/property-reader
  - "@pixiebrix/document-context"
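
The override rule can be illustrated with two made-up reader outputs that share one property. All names and values here are hypothetical, and only top-level properties are shown; how nested values are combined is not covered by this sketch.

```yaml
# Hypothetical output of the first reader in the array:
#   name: "from-first"
#   url: "from-first"
#
# Hypothetical output of the second reader in the array:
#   url: "from-second"
#   timestamp: "from-second"
#
# Combined result: the later reader wins on the shared "url" property
name: "from-first"
url: "from-second"
timestamp: "from-second"
```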

The two styles can also be combined: an array of readers can be assigned to a property, and a mapping can appear as an entry in an array:

reader:
  - apartments.com/property-reader
  - document: "@pixiebrix/document-context"
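
Under the rules described above, the combined form would plausibly place the first reader's properties at the top level and the second reader's output under document. This is an inference from those rules rather than a documented result, and the field names are hypothetical.

```yaml
# Illustrative output shape only
propertyName: "<from apartments.com/property-reader>"
document:
  url: "<from @pixiebrix/document-context>"
```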