>>
Posted by salman on 1/1/2022

I modified NeDB for freezr so it can use async storage mediums like AWS S3, or personal storage spaces like Dropbox. The code is on github and npmjs.

Each new storage system can have a js file that emulates the 16 or so functions required to integrate that storage system into nedb-asyncfs. A number of examples (like dbfs_aws.js) are provided under the env folder on github. Then, to initiate the db, you call the file as such:

const CustomFS = require('../path/to/dbfs_EXAMPLE.js')

const db = new Datastore({ dbFileName, customFS: new CustomFS(fsParams)})

where dbFileName is the name of the db, and fsParams are the specific credentials that the storage system requires. For example, for aws, fsParams could equal:

{
accessKeyId:'11aws_access_key11',
secretAccessKey: '22_secret22'
}

To make this work, I moved all the file system operations out of storage.js and persistence.js to dbfs_EXAMPLE.js (defaulting to dbfs_local.js which replicates the original nedb functionality), and made two main (interrelated) conceptual changes to the NeDB code:

1. appendFile - This is a critical part of NeDB, but the function doesn't exist on cloud storage APIs, so the only way to 'append' a new record would be to download the whole db file, add the new record to it, and re-write the whole thing to storage. Doing that on every db update is obviously hugely inefficient. So I did something a little different: instead of appending a new record to the end of the db file (eg 'testdb.db'), for every new record I create a small file with that one record and write it to a folder (called '~testdb.db', following the NeDB naming convention of using ~). This makes the write operation acceptably fast, and I think it provides good redundancy. Afterwards, when the db file is crashsafe-written, all the small record-files in the folder are removed. Similarly, loading a database entails reading the main db file plus all the little files in the ~testdb.db folder, and then appending all the records to the main file in the order in which they were written.

2. doNotPersistOnLoad - It also turns out that persisting a database takes a long time, so it is quite annoying to persist every time you load the db, since it slows down the loading process considerably. So I added a doNotPersistOnLoad option. By default the behaviour is the same as in NeDB, but in practice you would only want to manage persisting the db at the application level, eg it makes more sense to have the application call 'persistence.compactDatafile()' when the server is less busy.
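The two changes above can be sketched together. This is only an illustration of the idea, not the actual nedb-asyncfs code: MemoryStore, RecordFileDb and the record file naming are hypothetical names made up for this example, and the in-memory store stands in for a cloud API that can only write, read, list and delete whole files. Unlike NeDB, the sketch does not merge records that share an _id.

```javascript
// Hypothetical in-memory stand-in for a cloud store with no append operation.
class MemoryStore {
  constructor () { this.files = new Map() }
  async writeFile (path, contents) { this.files.set(path, contents) }
  async readFile (path) { return this.files.has(path) ? this.files.get(path) : '' }
  async list (prefix) { return [...this.files.keys()].filter(p => p.startsWith(prefix)) }
  async remove (path) { this.files.delete(path) }
}

// Hypothetical datastore that "appends" by writing one small file per record.
class RecordFileDb {
  constructor (store, dbFileName, { doNotPersistOnLoad = false } = {}) {
    this.store = store
    this.dbFileName = dbFileName
    this.folder = '~' + dbFileName + '/' // eg '~testdb.db/'
    this.doNotPersistOnLoad = doNotPersistOnLoad
    this.counter = 0
  }

  // Write the record to its own file; the zero-padded timestamp and
  // sequence number make file names sort in the order they were written.
  async append (doc) {
    const stamp = String(Date.now()).padStart(15, '0')
    const seq = String(this.counter++).padStart(6, '0')
    await this.store.writeFile(this.folder + stamp + '_' + seq + '.json', JSON.stringify(doc))
  }

  // Read the main file, then add the record files in write order.
  // Only rewrite the main file if persisting on load is allowed.
  async load () {
    const main = await this.store.readFile(this.dbFileName)
    const lines = main.split('\n').filter(line => line.length > 0)
    const recordFiles = (await this.store.list(this.folder)).sort()
    for (const path of recordFiles) lines.push(await this.store.readFile(path))
    if (!this.doNotPersistOnLoad) await this.compact(lines, recordFiles)
    return lines.map(line => JSON.parse(line))
  }

  // Rewrite the main file, then remove the record files it now contains.
  async compact (lines, recordFiles) {
    await this.store.writeFile(this.dbFileName, lines.join('\n') + '\n')
    for (const path of recordFiles) await this.store.remove(path)
  }
}
```

With doNotPersistOnLoad set to true, load() leaves the record files where they are, so the slow rewrite of the main file can be deferred until the application decides the server is quiet enough.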

Of course, latency is an issue in general. For example, I had to add a bunch of setTimeouts to the tests for them to work, mostly because deleting files (especially multiple files) can take a bit of time, so reading the db right after deleting the 'record files' doesn't work. I also increased the timeout on the tests. Still, with the few exceptions below, all the tests passed for S3, Google Drive and Dropbox. Some notes on the testing:

  • 'testThrowInCallback' and 'testRightOrder' fail and I couldn't figure out what the issue is with them. They even fail when dbfs_local is used. I commented out those tests and noted 'TEST REMOVED'
  • ‘TTL indexes can expire multiple documents and only what needs to be expired’ was also removed => TOO MANY TIMING ISSUES
  • I also removed (and marked) 3 tests in persistence.test.js as the tests didn't make sense for async I believe.
  • I also added a few fs tests to test different file systems.
  • To run tests with new file systems, you can add the dbfs_example.js file under the env folder, add a file called '.example_credentials.js' with the required credentials and finally adjust the params.js file to detect and use those credentials.

I made one other general change to the functionality: I don't think empty lines should be viewed as errors. In the regular NeDB, empty lines are considered errors but the corruptItems count starts at -1. I thought it was better to not count empty lines as errors, but start the corruptItems count at 0. (See persistence.js.) So I added a line to persistence.js to ignore lines that are just '\n'.
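As a rough sketch of that counting rule (parseDatafile is a hypothetical name for this example, not the function in persistence.js), empty lines are skipped and only lines that fail to parse are counted as corrupt, starting from 0:

```javascript
// Hypothetical, simplified parser for a newline-delimited JSON datafile.
function parseDatafile (rawData) {
  const docs = []
  let corruptItems = 0 // starts at 0 because empty lines are not counted
  for (const line of rawData.split('\n')) {
    if (line.trim() === '') continue // ignore empty lines instead of treating them as errors
    try {
      docs.push(JSON.parse(line))
    } catch (err) {
      corruptItems += 1 // a non-empty line that is not valid JSON
    }
  }
  return { docs, corruptItems }
}
```

A file that ends with a trailing newline, or that has a blank line in the middle, then reports no corrupt items as long as every non-empty line parses.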

Finally, nedb-asyncfs also updates the dependencies. underscore is updated to the latest version, as the version used by nedb had some vulnerabilities. I also moved binary-search-tree inside the nedb code base, which is admittedly ugly but works. (binary-search-tree was created by Louis Chatriot for nedb, and the alternative would have been to also fork and publish that as well.)


...
labels:
>>
Posted by salman on 7/2/2020

vulog is a chrome extension that allows you to (1) bookmark web pages, highlight text on those pages, and take notes, (2) save your browsing history,  and (3) see the cookies tracking you on various web sites (and delete them). 

I wrote the first version of vulog 3 years ago to keep a log of all my web pages. It seemed to me that all the large tech companies were keeping track of my browsing history, and the only person who didn't have a full log was me! I wanted my browsing history sitting on my own personal server so that I can retain it for myself and do what I want with it.

At the time, I had also added some basic bookmarking functions on vulog, but I have been wanting to extend those features and make them much more useful:

  1. Keyboard only - Most extensions are accessed via a button next to the browser url bar. I wanted to make it faster and easier to add bookmarks and notes by using the keyboard alone. So now you can do that by pressing 'ctrl s', or 'cmd s' on a mac. (Who uses 'ctrl s' to save copies of web pages these days anyways?)
  2. Highlighting - I wanted to be able to highlight text and save those highlights. This can now be done by right clicking on highlighted text (thanks to Jérôme).
  3. Inbox - I wanted to have a special bookmark called 'inbox' and to add items to that inbox by right clicking on any link.

So these are all now implemented in the new vulog here:

https://chrome.google.com/webstore/detail/vulog-logger-bookmarker-h/peoooghegmfpgpafglhhibeeeeggmfhb

The code is all on github.

This post is supposed to be a live document with the following sections:

  1. Known Issues
  2. Instructions
  3. Privacy (CEPS)
  4. Future developments
  5. Acknowledgements

1. Known Issues

Here are some known problems and deficiencies with vulog :

  • ctrl/cmd s doesn't work on all sites, especially those that make extensive use of javascript or which have menus with high z-indices. ;)
  • Highlighting - On some web pages, vulog can't find the text you have highlighted. It should work on most simple sites but not on interactive ones where content is always changing. But you can always see your highlights by pressing the extension button.
  • The notes and tags functionality has a bug in the current version, thanks to my clumsy fingers changing a function call name just before submitting it to the app store. But you can always take notes. This is fixed in the new version.


2. Instructions

Current tab

Click on the vulog button to see the main "Current" tab, and tag a page or bookmark it using these buttons:

- The 'bookmark' and 'star' buttons are for regular bookmarking.

- The 'Inbox' button is for items you want to read later. You can also right click on any web link on web pages you visit and add it to your vulog inbox right from the web page.

- Links marked with 'archive' do not show in default search results when you do a search from the Marks tab.  For example, once you have read a page from your inbox,  you might want to remove the 'inbox' mark, and add it to your 'archive'.

- The 'bullhorn' button makes the link public. Note that you need a CEPS compatible server to store your data and to publish it, if you want to use this feature. (See below.)

Marks tab

In the Marks tab, you can search for items you have bookmarked.

Click on the bookmark icons to filter your results. For example, clicking on inbox turns the icon green and only shows items that have been marked 'inbox'. Clicking it again will turn the button red, and you will only see items that have NOT been marked 'inbox'. You will notice that the 'archive' mark is red by default, so that archived items do not appear in the default search results.

History tab

Search your history. The general search box searches for words used in your tags and notes and highlights, as well as meta data associated with the page.

Right Clicking on web pages

On any web page, you can right click on text you have selected to highlight it, and you can right click on any link to add it to your inbox.

Ctrl/Cmd S on web pages

When you are on any web page, you can press ctrl-S (or cmd-S on a mac) and a small menu appears in the top right corner of the web page, allowing you to bookmark it. While the menu is open, pressing ctrl/cmd-I adds to inbox, ctrl/cmd-A archives, ctrl/cmd-B adds a bookmark, and pressing ctrl/cmd-S again adds a star. You can remove marks by clicking on them with your mouse. The Escape key gets rid of the menu, which disappears automatically after a few seconds in any case.

Data storage

Your bookmarks and browser history are kept in Chrome's local storage, which has limited space. After some weeks (or months, depending on usage), vulog automatically deletes older items.

3. Privacy (CEPS)

vulog doesn't send any of your data to any outside servers, and you can always delete your data from the 'More' tab. If you want to store your data on your own server, you will need to set up a Personal Data Store. vulog was built to be able to accept CEPS-compatible data stores. (See here for more details on CEPS - Common End Points for Personal Servers and data stores.)

Having your data sit on your personal data store also means that you can publish your bookmarks and highlights and notes. Press the bullhorn button to publish the link from your server. 

4. Future Developments

I expect to use vulog as an example app for the development of the CEPS sharing protocol.

5. Acknowledgements

Highlighting functionality was largely copied from Jérôme Parent-Lévesque. (See here.)

Rendering function (dgelements.js) was inspired by David Gilbertson (who never expected someone would be crazy enough to actually implement his idea I think.)



...
labels:
>>
Posted by salman on 3/15/2020

CEPS provides a way for applications to work with multiple data stores. For developers, this means that you can create a new app knowing that it can run on various compliant datastore systems. For Personal Data Store (PDS) system providers, it means that you can have that many more apps to offer to users of your data store. If CEPS is adopted widely, the personal data store ecosystem can only be enriched.

Today, a number of different personal data store systems are pursuing similar ends – to grant users full control over their personal data, effectively freeing them from the current web services model where third party web sites and applications are retaining all our personal data. Yet, today, each of these PDSs has its own proprietary technology and methods to allow third parties to build apps running on those data-stores. 

This is a paradox that can only slow down the adoption of PDSs:

  • As a user, why should I jump off the rock of the current proprietary web services model to land in another hard place where apps are still proprietary (even if I get more control over my data on those PDSs)? If I am assured that I have full portability to new data stores, I will have more confidence to join the ecosystem.
  • As a developer, why should I build a new app that runs solely on one type of data store? If my app could easily work with any one of multiple data stores, I would be much more inclined to build apps.

In this light, CEPS is the start of an effort to create some economies of scale in this nascent industry. 

In its current incarnation, CEPS has a minimum viable set of functions to run basic apps on PDSs. It allows the app to authenticate itself on the PDS, and then write records, read and query them, and update or delete the app’s own records.

Here is how it works in practice. In the video, you see a desktop app – in this case a note taking app called Notery, but it could have also been a mobile phone app. The app connects to my PDS, which is in the cloud, and uses it as its store of data. Any mobile app or desktop application that you can think of could use the same model. They don’t need to send your data to some server you have no control over – using CEPS, they can store your data on your own data store.

This second clip is similar. It is an app called Tallyzoo, with which you can record and count various things. It also connects to my server and keeps data there. This is significant for two main reasons.

First, Tallyzoo wasn’t written by me. It is easy to connect some app to some server if the same person is writing both. But in this case, the app was written by Christoph from OwnYourData without any knowledge of my server. The only thing that Christoph knew was that my server would accept CEPS commands. And that’s all he needed to allow me to use Tallyzoo and store my Tallyzoo data on MY personal data store.

Second, the Tallyzoo app is a server based app – it is a web service. It is like all the great web sites we visit every day. It runs on a third party server and I am like any other user visiting a web site. The only difference is that Tallyzoo doesn’t keep my data on its own servers – it keeps the data on MY server. This is really significant in that it points to a model for all web sites to store our data on our data stores rather than on their servers.

This is a simple difference, and CEPS is a tiny and simple specification. Yet the example above points to a world wide web which could be radically different from the one we interact with today. It shows that indeed, there is no reason for any web site – any third party company – to keep any of our data on their servers.

This may be a world worth striving for.


...
labels: ceps freezr
>>
Posted by salman on 8/20/2019
The convenience that many are looking from the ‘data ownership’ comes from the traditional economic understanding of property rights as residual rights. It means that we can contractually or otherwise
>>
Posted by salman on 6/23/2019
A description of the level playing field created by the personal server paradigm.
...
labels:
>>
Posted by salman on 6/10/2019
We need to be careful of the faults of decentralised systems, yet reassured by the strength of the principles underlying them.
...
labels:
>>
Posted by salman on 6/10/2019
A theoretical framework for dis-aggregating the web services stack and separating front end apps from back-end servers (databases, files, permissioning).
...
labels:
>>
Posted by salman on 6/10/2019
tl;dr Blockchains are centralised in some fundamental way – the web is fundamentally decentralised.  (Part of a series of posts.)
...
labels:
>>
Posted by salman on 6/9/2019
Defining data freedom and the .json manifest that goes with it.
...
labels: