Python API Example with Wallabag Web Application for Extracting Entries and Quotes

python and wallabag

In the previous post Python API Example with Wallabag Web Application we explored how to connect via Web API to Wallabag and make entry to Wallabag web application. For this we setup API, obtained token via python script and then created entry (added link).

In this post we will extract entries through Web API with python script. From entry we will extract needed information such as id of entry. Then for this id we will look how to extract annotations and quotes.

Wallabag is read it later web application like Pocket or Instapaper. Quotes are some texts that we highlight within Wallabag. Annotations are our notes that we can save together with annotations. For one entry we can have several quotes / annotations. Wallabag is open source software so you can download it and install it locally or remotely on web server.

If you did not setup API you need first setup API to run code below. See previous post how to do this.
The beginning of script should be also same as before – as we need first provide our credentials and obtain token.

Obtaining Entries

After obtaining token we move to actual downloading data. We can obtain entries using below code:

p = {'archive': 0 , 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.txt'.format(HOST), p)

p is holding parameters that allow to limit our output.
The return data is json structure with a lot of information including entries. It does not include all entries. It divides entries in set of 30 per page and it provides link to next page. So we can extract next page link and then extract entries again.

Each entry has link, id and some other information.

Obtaining Annotations / Quotes

To extract annotations, quotes we can use this code:

p = {'access_token': access}
link = '{}/api/annotations/' + str(data['_embedded']['items'][3]['id']) + '.txt'
print (link)
r = requests.get(link.format(HOST), p)
data=json.loads(r.text)

Full Python Source Code

Below is full script example:

# Extract entries using wallabag API and Python
# Extract quotes and annotations for specific entry
# Save information to files
import requests
import json

# only these 5 variables have to be set
#HOST = 'https://wallabag.example.org'
USERNAME = 'xxxxxx'
PASSWORD = 'xxxxxx'
CLIENTID = 'xxxxxxxxxxxx'
SECRET = 'xxxxxxxxxxx'
HOST = 'https://intelligentonlinetools.com/wallabag/web'    


gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID, 'client_secret': SECRET, 'grant_type': 'password'}
print (gettoken)

r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
print (r.content)


access = r.json().get('access_token')

p = {'archive': 0 , 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.txt'.format(HOST), p)

data=json.loads(r.text)
print (type(data))


with open('data1.json', 'w') as f:  # writing JSON object
      json.dump(data, f)


for key, value in data.items():
     print (key, value)
     
#Below how to access needed information at page level like next link
#and at entry level like id, url for specific 3rd entry (counting from 0)      
print (data['_links']['next']) 
print (data['pages'])
print (data['page']) 
print (data['_embedded']['items'][3]['id'])  
print (data['_embedded']['items'][3]['url'])  
print (data['_embedded']['items'][3]['annotations'])


p = {'access_token': access}

link = '{}/api/annotations/' + str(data['_embedded']['items'][3]['id']) + '.txt'
print (link)
r = requests.get(link.format(HOST), p)
data=json.loads(r.text)
with open('data2.json', 'w') as f:  # writing JSON object
      json.dump(data, f)

#Below how to access first and second annotation / quote
#assuming they exist 
print (data['rows'][0]['quote']) 
print (data['rows'][0]['text']) 
print (data['rows'][1]['quote'])    
print (data['rows'][1]['text'])

Conclusion

In this post we learned how to use Wallabag API to download entries, annotations and quotes. To do this we first downloaded entries and ids. Then we downloaded annotations and quotes for specific entry id. Additionally we learned some json python and json examples to get needed information from retrieved data.

Feel free to provide feedback or ask related questions.

Python API Example with Wallabag Web Application

python and wallabag

Wallabag

Many a times it happens that we need to create API to post data to some web application using python framework.
To over come this problem of sending data to application from outside of it, using API, I am going to show how you can do this for Wallabag Web Application. Wallabag is Read It Later type of application, where you can save website links, and then read later.

Thus we will look here how to write API script that can send information to Wallabag web based application.

To do this we need access to Wallabag. It is open open source project (MIT license) so you can download and install as self hosted service.

Collecting Information

Once we installed Wallabag or got access to it, we will collect information needed for authorization.
Go to Wallabag application, then API Client Management Tab and create the client.
Note client id, client secret.
See below screenshot for references.

Python API Example Script

Now we can go to python IDE and write the script as below. Here https://mysite.com is base url, wallabag is an installation folder where we installed application.


import requests


# below 5 variables have to be set

USERNAME = 'xxxxxxxx'
PASSWORD = 'xxxxxxxx'
CLIENTID = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
HOST = 'https://mysite.com/wallabag/web'    


gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID, 'client_secret': SECRET, 'grant_type': 'password'}
print (gettoken)
r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
print (r.content)

access = r.json().get('access_token')


url = 'https://visited_site.com' # URL of the article
# should the article be already read? 0 or 1 for archive
# should the article be added as favorited? 0 or 1 for starred

article = {'url': url, 'archive': 0 , 'starred': 0, 'access_token': access}
r = requests.post('{}/api/entries.json'.format(HOST), article)


"""
output:
{'username': 'xxxxxxxxx', 'password': 'xxxxxxxxxx', 'grant_type': 'password', 'client_id': 'xxxxxxxxxxxxxxx', 'client_secret': 'xxxxxxxxxxx'}
b'{"access_token":"xxxxxxxxxxxxxx","expires_in":3600,"token_type":"bearer","scope":null,"refresh_token":"xxxxxxxxxxxxxx"}'
"""



Troubleshooting

I found useful to include print (r.content) in case something goes wrong. It can help see what is returned by sever.
Also it helped me looking at log which is located at /var/logs/prod.log under yoursite.com/wallabag. In case something is going wrong it might have some clue in the log.

Conclusion

We looked at python api example of how to integrate python script with Wallabag web application and send data using Wallabag API and python requests library.

References

wallabag api example
Wallabag