Wallabag – Productivity App for Read It Later Saved Articles

With so much information on the web, many of you probably use read-it-later web applications such as Pocket, Instapaper or Wallabag. I recently discovered and started using Wallabag. In this post I will share some thoughts on how Wallabag can help you stay more productive.

Wallabag is a much better option than default web browser bookmarks or files and folders full of links and saved pages. While Google lets you find any link again later, it takes time to locate a link you have already seen, and you do not want to research the same topic over and over. It is more effective to search once, save the links, and later work from the saved links.

Below are a few ideas on how to use Wallabag in order to get the most out of it and be more productive.

1. Use tags to label the most interesting or most useful pages (links). Tags can describe the content, or they can capture the next action you want to take with the page. This way you can easily locate the links you need later.

2. Once you read a page, mark it as read so it goes to the archive. If you only skimmed a page and are not going to read it later, archive it as well. That way you keep a manageable number of unread links that you can act on.

3. Do not collect many links without any action attached. Try to act on your links and archive processed or unneeded links as soon as possible.

4. If you want to add notes that are not connected with any link, use outside storage such as Google Drive, which gives you a link you can enter into Wallabag.

5. Use as much of Wallabag's functionality as possible. Features like tagging rules, export, import and RSS can be very useful.

6. With the Wallabag API you can go even further if you like web development or hacking. For example, it would be nice to extract your notes or a summary of the content you read each month or quarter. As a starting point, here is an API example I wrote (see also the small sketch right after it):
Python API Example with Wallabag Web Application for Extracting Entries and Quotes
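
To give a rough idea, the sketch below fetches entries through the API and prints the title and URL of everything saved in the current month. It only reuses the token and entries calls from the posts below; the entries.json endpoint and the title, url and created_at fields on each item are assumptions about the response format, so check them against your own instance.

import datetime
import requests

# placeholders, same as in the API posts below
HOST = 'https://mysite.com/wallabag/web'
USERNAME = 'xxxxxx'
PASSWORD = 'xxxxxx'
CLIENTID = 'xxxxxx'
SECRET = 'xxxxxx'

# obtain an access token
gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID,
            'client_secret': SECRET, 'grant_type': 'password'}
access = requests.post('{}/oauth/v2/token'.format(HOST), gettoken).json().get('access_token')

# fetch a page of entries and print what was saved this month
r = requests.get('{}/api/entries.json'.format(HOST), {'access_token': access})
items = r.json()['_embedded']['items']
this_month = datetime.date.today().strftime('%Y-%m')
for item in items:
    # created_at is assumed to be an ISO date string such as '2018-05-01T...'
    if str(item.get('created_at', '')).startswith(this_month):
        print (item.get('title'), item.get('url'))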

I use my own self-hosted instance; you are welcome to try it and/or play with the API.

If you have ideas, comments or suggestions, I would love to hear them.

References
Wallabag on GitHub
Wallabag: An open source alternative to Pocket

Python API Example with Wallabag Web Application for Extracting Entries and Quotes


In the previous post, Python API Example with Wallabag Web Application, we explored how to connect to Wallabag via the web API and create an entry in the application. For that we set up the API, obtained a token with a Python script and then created an entry (added a link).

In this post we will extract entries through the web API with a Python script. From each entry we will extract the information we need, such as the entry id. Then, for that id, we will look at how to extract annotations and quotes.

Wallabag is a read-it-later web application like Pocket or Instapaper. Quotes are pieces of text that we highlight within Wallabag, and annotations are the notes we can save together with those quotes. One entry can have several quotes / annotations. Wallabag is open source software, so you can download it and install it locally or remotely on a web server.

If you have not set up the API yet, you need to do that first in order to run the code below; see the previous post for how to do this.
The beginning of the script is the same as before: we first provide our credentials and obtain a token.
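
For reference, the token step looks like this (USERNAME, PASSWORD, CLIENTID, SECRET and HOST are the same variables as in the full script further down):

import requests

gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID,
            'client_secret': SECRET, 'grant_type': 'password'}
r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
access = r.json().get('access_token')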

Obtaining Entries

After obtaining the token we move on to actually downloading data. We can obtain entries using the code below:

p = {'archive': 0, 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.txt'.format(HOST), p)

p holds parameters that let us limit the output.
The returned data is a JSON structure with a lot of information, including the entries. It does not include all entries at once: they are divided into pages of 30, and each page provides a link to the next page. So we can extract the next-page link and then request the entries again.

Each entry has a link, an id and some other information.
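
If you want all entries rather than just the first page, a minimal sketch of following the next-page links could look like the loop below. It reuses requests, json, HOST, access and p from the script in this post and assumes the next-page URL sits under data['_links']['next']['href']; if your instance nests it differently, adjust accordingly.

# collect entries from all pages by following the 'next' link
all_items = []
r = requests.get('{}/api/entries.txt'.format(HOST), p)
data = json.loads(r.text)
while True:
    all_items.extend(data['_embedded']['items'])
    next_page = data.get('_links', {}).get('next')
    if not next_page:
        break
    # the next-page URL is assumed to be under 'href'
    r = requests.get(next_page['href'], {'access_token': access})
    data = json.loads(r.text)
print (len(all_items), 'entries collected')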

Obtaining Annotations / Quotes

To extract annotations and quotes we can use this code:

p = {'access_token': access}
entry_id = data['_embedded']['items'][3]['id']
link = '{}/api/annotations/{}.txt'.format(HOST, entry_id)
print (link)
r = requests.get(link, p)
data = json.loads(r.text)
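
Instead of addressing individual rows by index, we can also loop over all annotations for the entry. This assumes the response keeps them in a 'rows' list with 'quote' and 'text' fields, as used in the full script below:

# print every highlighted quote and its note for this entry
for row in data.get('rows', []):
    print (row.get('quote'))
    print (row.get('text'))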

Full Python Source Code

Below is the full example script:

# Extract entries using wallabag API and Python
# Extract quotes and annotations for specific entry
# Save information to files
import requests
import json

# only these 5 variables have to be set
#HOST = 'https://wallabag.example.org'
USERNAME = 'xxxxxx'
PASSWORD = 'xxxxxx'
CLIENTID = 'xxxxxxxxxxxx'
SECRET = 'xxxxxxxxxxx'
HOST = 'https://intelligentonlinetools.com/wallabag/web'    


gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID, 'client_secret': SECRET, 'grant_type': 'password'}
print (gettoken)

r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
print (r.content)


access = r.json().get('access_token')

p = {'archive': 0, 'starred': 0, 'access_token': access}
r = requests.get('{}/api/entries.txt'.format(HOST), p)

data = json.loads(r.text)
print (type(data))


with open('data1.json', 'w') as f:  # writing JSON object
    json.dump(data, f)


for key, value in data.items():
    print (key, value)
     
# Below: how to access information at the page level (like the next-page link)
# and at the entry level (id, url, annotations) for the entry at index 3 (counting from 0)
print (data['_links']['next']) 
print (data['pages'])
print (data['page']) 
print (data['_embedded']['items'][3]['id'])  
print (data['_embedded']['items'][3]['url'])  
print (data['_embedded']['items'][3]['annotations'])


p = {'access_token': access}

entry_id = data['_embedded']['items'][3]['id']
link = '{}/api/annotations/{}.txt'.format(HOST, entry_id)
print (link)
r = requests.get(link, p)
data = json.loads(r.text)
with open('data2.json', 'w') as f:  # writing JSON object
    json.dump(data, f)

# Below: how to access the first and second annotation / quote,
# assuming they exist
print (data['rows'][0]['quote']) 
print (data['rows'][0]['text']) 
print (data['rows'][1]['quote'])    
print (data['rows'][1]['text'])

Conclusion

In this post we learned how to use the Wallabag API to download entries, annotations and quotes. To do this we first downloaded the entries and their ids, and then downloaded the annotations and quotes for a specific entry id. Along the way we also saw some Python and JSON examples for getting the needed information out of the retrieved data.

Feel free to provide feedback or ask related questions.

Python API Example with Wallabag Web Application


It often happens that we need to post data to some web application from a Python script, from outside the application itself, using its API. In this post I am going to show how you can do this for the Wallabag web application. Wallabag is a read-it-later type of application, where you can save website links and then read them later.

So here we will look at how to write a Python script that sends information to the Wallabag web application through its API.

To do this we need access to Wallabag. It is an open source project (MIT license), so you can download it and install it as a self-hosted service.

Collecting Information

Once we have installed Wallabag or obtained access to an instance, we collect the information needed for authorization.
Go to the Wallabag application, open the API clients management tab and create a client.
Note the client id and the client secret.

Python API Example Script

Now we can open a Python IDE and write the script below. Here https://mysite.com is the base URL and wallabag is the folder where the application is installed.


import requests


# the 5 variables below have to be set

USERNAME = 'xxxxxxxx'
PASSWORD = 'xxxxxxxx'
CLIENTID = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
HOST = 'https://mysite.com/wallabag/web'    


gettoken = {'username': USERNAME, 'password': PASSWORD, 'client_id': CLIENTID, 'client_secret': SECRET, 'grant_type': 'password'}
print (gettoken)
r = requests.post('{}/oauth/v2/token'.format(HOST), gettoken)
print (r.content)

access = r.json().get('access_token')


url = 'https://visited_site.com'  # URL of the article to save
# archive: should the article already be marked as read? 0 or 1
# starred: should the article be saved as a favorite? 0 or 1

article = {'url': url, 'archive': 0, 'starred': 0, 'access_token': access}
r = requests.post('{}/api/entries.json'.format(HOST), article)


"""
output:
{'username': 'xxxxxxxxx', 'password': 'xxxxxxxxxx', 'grant_type': 'password', 'client_id': 'xxxxxxxxxxxxxxx', 'client_secret': 'xxxxxxxxxxx'}
b'{"access_token":"xxxxxxxxxxxxxx","expires_in":3600,"token_type":"bearer","scope":null,"refresh_token":"xxxxxxxxxxxxxx"}'
"""



Troubleshooting

I found it useful to include print (r.content) in case something goes wrong; it helps to see what the server returned.
It also helped me to look at the log located at var/logs/prod.log under the Wallabag installation directory (yoursite.com/wallabag in this example). If something is going wrong, there may be a clue in the log.
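
For example, a simple way to make failures visible is to check the HTTP status before using the response; this is a generic requests pattern, nothing specific to Wallabag:

r = requests.post('{}/api/entries.json'.format(HOST), article)
if not r.ok:
    # show the status code and the raw body returned by the server
    print (r.status_code, r.content)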

Conclusion

We looked at a Python API example showing how to integrate a Python script with the Wallabag web application and send data to it using the Wallabag API and the Python requests library.

References

Wallabag API example
Wallabag