PyQuery: Python’s jQuery

In this tutorial, you’ll have a look at PyQuery, a Python library which allows you to make jQuery queries on XML documents. Syntactically it’s quite similar to jQuery, and if you are familiar with jQuery it should be easier to follow.


Getting Started With PyQuery

To get started with PyQuery, install the Python package using pip.

pip install pyquery

Once you have installed PyQuery, import it into the Python program.

from pyquery import PyQuery as pq

Let’s start with a basic example of how it works. Consider the following HTML:

<div id="divMain">
  <span>
    Hello World
  </span>
</div>

Pass the input XML to the PyQuery object and you should be able to apply jQuery style queries to it.

divMain = pq("<div id='divMain'><span>Hello World</span></div>")

Treat divMain like a jQuery object and print the HTML content of the div.

print(divMain('#divMain').html())

Save the changes and try running the program. You should have the following output:

<span>Hello World</span>

To access the Hello World text inside the span, the code would be:

print(divMain('#divMain').text())

Save the changes and try running the code. You should have the following output:

Hello World

Attributes Manipulation Using PyQuery

Now let’s have a look at how to read the attributes using the PyQuery library. Assume that you have an HTML element as shown:

<ul id="ulMain">
    <li>Roy</li>
    <li>Hari</li>
</ul>

Use the PyQuery library to read the above HTML content.

ulMain = pq("<ul id='ulMain'><li>Roy</li><li>Hari</li></ul>")

Let’s try to access the ID attribute of the ulMain ul.

print(ulMain.attr('id'))

Save the above changes and try running the Python program. You should have ID ulMain printed on the terminal.

You saw how to access the ID attribute of the ul ulMain. Let’s try to set a class name for the same ulMain element. To specify a class name attribute, you need to specify the attribute name and its corresponding value as shown:

ulMain.attr('class','testclass')

Print the ulMain object and you should have the element with the class attribute.

<ul id="ulMain" class="testclass">
    <li>Roy</li>
    <li>Hari</li>
</ul>

You can also add and remove classes directly without using the attr method. To add a class to the element, you can make use of the method addClass.

ulMain.addClass('test')

To remove an existing class from the element, you can make use of the removeClass method.

ulMain.removeClass('test')

Handling CSS Using PyQuery

Apart from attributes, the element would have some CSS properties. To add or modify the CSS properties, you can use the css method and specify the properties. Let’s say that you want to specify the height in the style of the ul ulMain. The following code would add the required style properties to the element.

ulMain.css('height','100px')

Save the above changes and try executing the Python program. You should have the ul ulMain printed along with the new style.

<ul id="ulMain" style="height: 100px">
    <li>Roy</li>
    <li>Hari</li>
</ul>

To add multiple styles to the element, you can specify them as shown:

ulMain.css({'height':'100px','width':'100px'})

Run the program and you should have the styles added to the ulMain.

<ul id="ulMain" style="height: 100px; width: 100px">
    <li>Roy</li>
    <li>Hari</li>
</ul>

Creating & Appending Elements

When you build markup dynamically, you create new elements and append them to an existing parent element. Let’s have a look at how to create and append elements using PyQuery.

Assume you have a main container div called divMain.

divMain = pq("<div id='divMain'></div>")

Let’s create a new span element using PyQuery.

span = pq('<span>Hello World</span>')

Add some CSS properties to the span.

span.css({'color':'blue','font-weight':'bold'})

PyQuery provides a method to add elements to existing elements. You can use the append method to append the span element to the div divMain. Here is the code:

divMain.append(span)
print(divMain)

Save the changes and run the program. You should be able to see the divMain printed with the newly created span appended to it.

<div id="divMain">
  <span style="color: blue; font-weight: bold">
    Hello World
  </span>
</div>

You used the append method to append the span to the div. The alternative appendTo method works in the other direction: you call it on the new element and pass the parent that should receive it. In the above case you can use the method like so:

span.appendTo(divMain)

Finding Elements Using PyQuery

PyQuery provides methods to find children, the next elements, the closest elements, etc. It also provides methods to filter and find elements inside a particular node.

Assume that you have a particular piece of HTML as shown:

<div id="divMain">
    <div id="content">
        <ul>
            <li>Jade</li>
            <li>Jim</li>
        </ul>
    </div>
    <div id="list">
        <span>Hello World</span>
    </div>
</div>

Add the following HTML to the PyQuery object:

divMain = pq(
    "<div id='divMain'>"
    "<div id='content'>"
    "<ul>"
    "<li>Jade</li>"
    "<li>Jim</li>"
    "</ul>"
    "</div>"
    "<div id='list'>"
    "<span>Hello World</span>"
    "</div>"
    "</div>"
)

Let’s use PyQuery to find the children of the div divMain. Add the following line of code to print the children of divMain.

print(divMain.children())

On running the program, you should have the following output:

<div id="content">
    <ul>
        <li>Jade</li>
        <li>Jim</li>
    </ul>
</div>
<div id="list"><span>Hello World</span></div>

To find the nearest ancestor that matches a selector, use the closest method. It starts at the element itself and walks up the tree. To find the closest div element to the span, the code would be:

print(divMain('span').closest('div'))

The above command would return the following output:

<div id="list"><span>Hello World</span></div>

The find method searches the descendants of an element. For example, to find a span inside divMain, call the method as shown:

print(divMain.find('span'))

The above command would return the span.

<span>Hello World</span>

Insert an Element Inside HTML

While append does the work of adding elements to the existing elements, sometimes you need to insert elements after or before certain elements. PyQuery provides a method to insert elements before or after other elements.

Let’s define a new paragraph element to be inserted after the span inside the divMain div.

p = pq('<p>Welcome</p>')

Call the insertAfter method on the paragraph p element to insert it after the span.

p.insertAfter(divMain('span'))
print(divMain)

Save the above changes and run the Python program. You should have the following output:

<div id="divMain">
    <div id="content">
        <ul>
            <li>Jade</li>
            <li>Jim</li>
        </ul>
    </div>
    <div id="list">
        <span>Hello World</span>
        <p>Welcome</p>
    </div>
</div>

Similarly, the insertBefore method inserts the element before the target element. Modify the code as shown below to use the insertBefore method:

p.insertBefore(divMain('span'))
print(divMain)

Save the changes and run the program. You should be able to see the following output on the terminal:

<div id="divMain">
    <div id="content">
        <ul>
            <li>Jade</li>
            <li>Jim</li>
        </ul>
    </div>
    <div id="list">
        <p>Welcome</p>
        <span>Hello World</span>
    </div>
</div>

Wrapping It Up

In this tutorial, you saw how to get started with PyQuery, a Python library which allows you to make jQuery queries on XML documents. You saw how to manipulate the attributes and CSS styles of the HTML elements.

You learnt how to create and append elements to existing elements and insert new elements before and after elements. What you saw in this tutorial is just the tip of the iceberg, and there is a lot more that this library has to offer.

For more detailed information on using this library, I would recommend reading the official documentation.