I modified NeDB for freezr so it can use async storage mediums, like AWS S3 and or personal storage spaces like dropbox. The code is on github, and npmjs.
Each new storage system can have a js file that emulates the 16 or so functions required to integrate that storage system into nedb-asyncfs. A number of examples (like dbfs_aws.js) are provided under the env folder on github. Then, to initiate the db, you call the file as such:
const CustomFS = require('../path/to/dbfs_EXAMPLE.js')
const db = new Datastore({ dbFileName, customFS: new CustomFS(fsParams)})
where dbFileName is the name of the db, and fsParams are the specific credentials that the storage system requires. For example, for aws, fsParams could equal:
To make this work, I moved all the file system operations out of storage.js and persistence.js to dbfs_EXAMPLE.js (defaulting to dbfs_local.js which replicates the original nedb functionality), and made two main (interrelated) conceptual changes to the NeDB code:
1. appendfile - This is a critical part of NeDB but the function doesn't exist on cloud storage APIs, so the only way to 'append' a new record would be to download the whole db file and then add the new record to the file and then re-write the whole thing to storage. Doing that on every db update is obviously hugely inefficient. So instead, I did something a little different: Instead of appending a new record to the end of the db file (eg 'testdb.db'), for every new record, I create a small file with that one record and write it to a folder (called '~testdb.db', following the NeDB naming convention of using ~). This makes the write operation acceptably fast, and I think it provides good redundancy. Afterwards, when a db file is crashsafe-written, all the small record-files in the folder are removed. Similarly, loading a database entails reading the main dbname file plus all the little files in the ~testdb.db folder, and then appending all the records to the main file in the order of the time they were written.
2. doNotPersistOnLoad - it also turns out that persisting a database takes a long time, so it is quite annoying to persist every time you load the db, since it slows down the loading process considerably... So I added a donotperistOnLoad option. By default the behaviour is like NeDB now, but in practice you would only want to manage persisting the db at the application level... eg it makes more sense to have the application call 'persistence.compactDatafile()' when the server is less busy.
Of course, latency is an issue in general, and for example, I had to add a bunch of setTimeOuts to the tests for them to work... mostly because deleting files (specially multiple files, can take a bit of time, so reading the db right after deleting the 'record files' doesnt work. and I also increased the timeout on the tests. Still, with a few exceptions below, all the tests passed for s3, google Drive and dropbox. Some notes on the testing:
I made one other general change to the functionality: I don't think empty lines should be viewed as errors. In the regular NeDB, empty lines are considered errors but the corruptItems count starts at -1. I thought it was better to not count empty lines as errors, but start the corruptItems count at 0. (See persistence.js) So I added a line to persisence to ignore lines that are just '/n'
Finally, nedb-asyncfs also updates the dependencies. underscore is updated to the latest version, as the latest under nedb had some vulnerabilities. I also moved binary-search-tree inside the nedb code base, which is admittedly ugly but works. (binary-search-tree was created by Louis Chariot for nedb, and the alternative would have been to also fork and publish that as well.)
vulog is a chrome extension that allows you to (1) bookmark web pages, highlight text on those pages, and take notes, (2) save your browsing history, and (3) see the cookies tracking you on various web sites (and delete them).
I wrote the first version of vulog 3 years ago to keep a log of all my web pages. It seemed to me that all the large tech companies were keeping track of my browsing history, and the only person who didn't have a full log was me! I wanted my browsing history sitting on my own personal server so that I can retain it for myself and do what I want with it.
At the time, I had also added some basic bookmarking functions on vulog, but I have been wanting to extend those features and make them much more useful:
So these are all now implemented in the new vulog here:
https://chrome.google.com/webstore/detail/vulog-logger-bookmarker-h/peoooghegmfpgpafglhhibeeeeggmfhb
The code is all on github.
This post is supposed to be a live document with the following sections:
Here are some known problems and deficiencies with vulog :
Current tab
Click on the vulog button to see the main "Current" tab, and tag a page or bookmark it using these buttons:
- The 'bookmark' and 'star are buttons for regular bookmarking.
- The 'Inbox' button is for items you want to read later. You can also right click on any web link on web pages you visit and add it to your vulog inbox right from the web page.
- Links marked with 'archive' do not show in default search results when you do a search from the Marks tab. For example, once you have read a page from your inbox, you might want to remove the 'inbox' mark, and add it to your 'archive'.
- The 'bullfrog' button makes the link public. Note that you need a CEPS compatible server to store your data and to publish it, if you want to use this feature. (See Below)
Marks tab
In the Marks tab, you can search for items you have bookmarked.
Click on the bookmark icons to filter your results. (eg clicking on inbox turns the icon green and only shows items that have been marked 'inbox'. Clicking it again will turn the button red, and you will only see items that have NOT been marked 'inbox'. You will notice that the 'archive' mark is red by default, so that archived items do not appear in the default search results.
In the marks tab, you can search for the items you have bookmarked.
When clicking on bookmark buttons, you will filter your results. (eg clicking on inbox turns the icon green and only shows items that have been marked 'inbox'. Clicking it again will turn the button red, and you will only see items that have NOT been marked as inbox. You will notice that the 'archive' mark is red by default, so that archived items do not appear in the default search results.
History tab
Search your history. The general search box searches for words used in your tags and notes and highlights, as well as meta data associated with the page.
Right Clicking on web pages
On any web page, you can right click on text you have selected to highlight it, and you can right click on a any link to add it to your inbox
Cntrl/Cmd S on web pages
When you are on any web page, you can press cntrl-S (or cmd-S for mac) and a small menu appears on the top right corner of the web page, to allow you to bookmark it. While the menu is open, pressing cntrl/cmd-I adds to inbox, cntrl/cmd-A archives, cntrl/cmd-B adds a bookmark, and pressing cntrl/cmd-S again adds a star. You can remove marks by clicking on them with your mouse. The Escape key gets rid of the menu, which disappears automatically after a few seconds in any case.
Data storage
Your bookmarks and browser history is kept in the chrome's local storage, which has limited space. After some weeks (or months depending on usage), vulog automatically deletes older items.
vulog doesn't send any of your data to any outside servers, and you can always delete your data from the 'More' tab. If you want to store your data on your own server, you will need to set up a Personal Data Store. vulog was built to be able to accept CEPS-compatible data stores. (See here for more details on CEPS - Common End Points for Personal Servers and data stores. )
Having your data sit on your personal data store also means that you can publish your bookmarks and highlights and notes. Press the bullhorn button to publish the link from your server.
I expect to use vulog as an example app for the development of the CEPS sharing protocol.
Highlighting functionality was largely copied from Jérôme Parent-Lévesque. (See here.)
Rendering function (dgelements.js) was inspired by David Gilbertson (who never expected someone would be crazy enough to actually implement his idea I think.)
CEPS provides a way for applications to work with multiple data stores. For developers, this means that you can create a new app knowing that it can run on various compliant datastore systems. For Personal Data Store (PDS) system providers, it means that you can have that many more apps to offer to users of your data store. If CEPS is adopted widely, the personal data store ecosystem can only be enriched.
Today, a number of different personal data store systems are pursuing similar ends – to grant users full control over their personal data, effectively freeing them from the current web services model where third party web sites and applications are retaining all our personal data. Yet, today, each of these PDSs has its own proprietary technology and methods to allow third parties to build apps running on those data-stores.
This is a paradox that can only slow down the adoption of PDSs:
In this light, CEPS is the start of an effort to create some economies of scale in this nascent industry.
In its current inception, CEPS has a minimum viable set of functions to run basic apps on PDSs. It allows the app to authenticate itself on the PDS, and then write records, read and query them, and update or delete the app’s own records.
Here is how it works in practice. In the video, you see a desktop app – in this case a note taking app called Notery, but it could have also been a mobile phone app. The app connects to my PDS which is in the cloud, uses it as its store of data. Any mobile app or desktop application that you can think of could use the same model. They don’t need to send your data to some server you have no control over – using CEPS, they can store your data on your own data store.
This second clip is similar. It is an app called Tallyzoo, with which you can record and count various things. It also connects to my server and keeps data there. This is significant for two main reasons.
First, Tallyzoo wasn’t written by me. It is easy to connect some app to some server if the same person is writing both. But in this case, the app was written by Christoph from OwnYourData without any knowledge of my server. The only thing that Christoph knew was that my server would accept CEPS commands. And that’s all he needed to allow me to use Tallyzoo and store my Tallyzoo data on MY personal data store.
Second, the Tally Zoo app is a server based app – it is a web service. It is like all the great web sites we visit every day. It runs on a third party server and I am like any other user visiting a web site. The only difference is that Tallyzoo doesn’t keep my data on its own servers – it keeps the data on MY server. This is really significant in that it points to a model for all web sites to store our data on our data stores rather than on their servers.
This is a simple difference, and CEPS is a tiny little and simple specification. Yet the example above points to a world wide web which could be radically different from the one we interact with today. It shows that indeed, there is no reason for any web site – any third party company – to keep any of our data on their servers.
This may be a world worth striving for.