In the previous post, Python API Example with Wallabag Web Application, we explored how to connect to Wallabag through its Web API and add an entry to the Wallabag web application. To do this we set up the API, obtained a token with a Python script and then created an entry (added a link).
In this post we will extract entries through the Web API with a Python script. From each entry we will pull the information we need, such as the entry id. Then, for this id, we will look at how to extract annotations and quotes.
Wallabag is a read-it-later web application like Pocket or Instapaper. Quotes are pieces of text that we highlight within Wallabag. Annotations are our own notes that we can save together with quotes. One entry can have several quotes / annotations. Wallabag is open source software, so you can download it and install it locally or remotely on a web server.
If you have not set up the API yet, you need to do that first in order to run the code below. See the previous post for how to do this.
The beginning of the script is the same as before, since we first need to provide our credentials and obtain a token.
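As a reminder, the token step can be sketched with two small helpers. The helper names token_payload and read_access_token are hypothetical and chosen only for illustration; the field names and the /oauth/v2/token path are the ones used in the full script at the end of this post.

```python
def token_payload(username, password, client_id, client_secret):
    """Build the form fields for a password-grant token request."""
    return {
        'username': username,
        'password': password,
        'client_id': client_id,
        'client_secret': client_secret,
        'grant_type': 'password',
    }


def read_access_token(token_response):
    """Return the access token from the decoded JSON reply, or raise if it is absent."""
    access = token_response.get('access_token')
    if not access:
        raise ValueError('no access_token in response: {}'.format(token_response))
    return access


# Possible use with requests (HOST and the credentials are placeholders):
#
# r = requests.post('{}/oauth/v2/token'.format(HOST),
#                   token_payload(USERNAME, PASSWORD, CLIENTID, SECRET))
# access = read_access_token(r.json())
```

Raising an error when the token is missing makes a wrong password or client id visible right away, instead of failing later on the first data request.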
Obtaining Entries
After obtaining the token we move on to actually downloading data. We can obtain entries using the code below:
p = {'archive': 0, 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.json'.format(HOST), p)
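p holds the request parameters that let us limit the output. Here archive set to 0 and starred set to 0 restrict the result to entries that are neither archived nor starred.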
The returned data is a JSON structure with a lot of information, including the entries. It does not include all entries at once: the entries are split into pages of 30, and each response provides a link to the next page. So we can extract the next page link and then request entries again.
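The paging can be sketched as a loop. This is only a minimal sketch under a few assumptions: fetch_page is a hypothetical callable that requests one page and returns the decoded JSON, the API is assumed to accept a page request parameter, and instead of following the next link the loop counts pages using the pages field that the response carries (the full script at the end prints it).

```python
def collect_entries(fetch_page):
    """Gather the entries from every page into one list.

    fetch_page(n) is a hypothetical callable that returns the decoded
    JSON for page n (pages are counted from 1).
    """
    entries = []
    page = 1
    while True:
        data = fetch_page(page)
        entries.extend(data['_embedded']['items'])
        # 'pages' is the total number of pages reported in the response
        if page >= data.get('pages', 1):
            break
        page += 1
    return entries
```

In practice fetch_page could wrap the requests.get call shown above, with an extra 'page' item added to p (an assumption about the request parameters, so check it against your Wallabag version).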
Each entry has a link (url), an id and some other information.
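To see which ids are available, we can walk over the entries of one page. The sketch below assumes data is the decoded JSON of an entries response, with the entries under data['_embedded']['items'] as in the full script at the end of this post. The function name entry_summaries and the sample data are made up for illustration.

```python
def entry_summaries(data):
    """Return (id, url) pairs for the entries on one page of results."""
    items = data.get('_embedded', {}).get('items', [])
    return [(item['id'], item.get('url')) for item in items]


# Made-up sample in the shape of an entries response
sample = {
    '_embedded': {
        'items': [
            {'id': 11, 'url': 'https://example.org/a', 'annotations': []},
            {'id': 12, 'url': 'https://example.org/b', 'annotations': []},
        ]
    }
}

for entry_id, url in entry_summaries(sample):
    print(entry_id, url)
```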
Obtaining Annotations / Quotes
To extract annotations and quotes we can use this code:
p = {'access_token': access}
# id of the entry at index 3 (counting from 0) on the current page
entry_id = data['_embedded']['items'][3]['id']
link = '{}/api/annotations/{}.json'.format(HOST, entry_id)
print(link)
r = requests.get(link, p)
data = json.loads(r.text)
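The annotations response can hold any number of rows, so looping over them is safer than indexing a fixed position. A minimal sketch, assuming the response has a rows list whose items carry quote and text fields (these are the fields the full script at the end reads). The helper name and the sample data are hypothetical.

```python
def quotes_and_notes(annotation_data):
    """Return a list of (quote, note) pairs from an annotations response."""
    pairs = []
    for row in annotation_data.get('rows', []):
        # 'quote' is the highlighted text, 'text' is our own note
        pairs.append((row.get('quote'), row.get('text')))
    return pairs


# Made-up sample in the shape of an annotations response
sample_annotations = {
    'rows': [
        {'quote': 'highlighted sentence', 'text': 'my note'},
        {'quote': 'another highlight', 'text': ''},
    ]
}

for quote, note in quotes_and_notes(sample_annotations):
    print('QUOTE:', quote)
    print('NOTE:', note)
```

An entry without annotations simply gives an empty list, so no index error is raised.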
Full Python Source Code
Below is the full script example:
# Extract entries using wallabag API and Python
# Extract quotes and annotations for specific entry
# Save information to files
import requests
import json

# only these 5 variables have to be set
#HOST = 'https://wallabag.example.org'
USERNAME = 'xxxxxx'
PASSWORD = 'xxxxxx'
CLIENTID = 'xxxxxxxxxxxx'
SECRET = 'xxxxxxxxxxx'
HOST = 'https://intelligentonlinetools.com/wallabag/web'

gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID,
            'client_secret': SECRET, 'grant_type': 'password'}
# debug output only: this prints the password and secret
print(gettoken)

r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
print(r.content)
access = r.json().get('access_token')

p = {'archive': 0, 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.json'.format(HOST), p)
data = json.loads(r.text)
print(type(data))

with open('data1.json', 'w') as f:
    # writing JSON object
    json.dump(data, f)

for key, value in data.items():
    print(key, value)

# Below how to access needed information at page level like next link
# and at entry level like id, url for the entry at index 3 (counting from 0)
print(data['_links']['next'])
print(data['pages'])
print(data['page'])
print(data['_embedded']['items'][3]['id'])
print(data['_embedded']['items'][3]['url'])
print(data['_embedded']['items'][3]['annotations'])

p = {'access_token': access}
entry_id = data['_embedded']['items'][3]['id']
link = '{}/api/annotations/{}.json'.format(HOST, entry_id)
print(link)
r = requests.get(link, p)
data = json.loads(r.text)

with open('data2.json', 'w') as f:
    # writing JSON object
    json.dump(data, f)

# Below how to access first and second annotation / quote
# assuming they exist
print(data['rows'][0]['quote'])
print(data['rows'][0]['text'])
print(data['rows'][1]['quote'])
print(data['rows'][1]['text'])
Conclusion
In this post we learned how to use the Wallabag API to download entries, annotations and quotes. To do this we first downloaded entries and their ids. Then we downloaded annotations and quotes for a specific entry id. Along the way we also saw some Python examples of working with JSON to get the needed information out of the retrieved data.
Feel free to provide feedback or ask related questions.