Creating a Virtual Filesystem with Python (and why you need one)
If you are writing an application of any size, it will most likely require a number of files to run – files which could be stored in a variety of possible locations. Furthermore, you will probably want to be able to change the location of those files when debugging and testing. You may even want to store those files somewhere other than the user's hard drive.
Any engineer worth his salt will recognise that the file locations should be stored in some kind of configuration file and the code to read the files in question should be factored out so that it isn't just scattered at points where data is read or written. In this post I'll present a way of doing just that by creating a virtual filesystem with PyFilesystem.
You'll need the most recent version of PyFilesystem from SVN to run this code.
We're going to create a virtual filesystem for a fictitious application that requires per-application and per-user resources, as well as a location for cache and log files. I'll also demonstrate how to mount files located on a web server. Here's the code:
from fs.opener import fsopendir
app_fs = fsopendir('mount://fs.ini', create_dir=True)
That's all there is to it: two lines of code (one if you don't count the import). Obviously there is quite a bit going on under the hood here, which I'll explain below, but let's see what this code gives you…
The app_fs object is an interface to a single filesystem that contains all the file locations our application will use. For example, the path /user/app.ini references a per-user file, whereas /resources/logo.png references a per-application file. The actual physical location of the data is irrelevant because, as far as your application is concerned, the paths never change. This abstraction is useful because the real path for such files varies according to the platform the code is running on; Windows, Mac and Linux all have different conventions, and if you put your files in the wrong place, your app will likely break on one platform or another.
Here's how a per-user configuration file might be opened:
from ConfigParser import ConfigParser

# The 'safeopen' method works like 'open', but will return an
# empty file-like object if the path does not exist
with app_fs.safeopen('/user/app.ini') as ini_file:
    cfg = ConfigParser()
    cfg.readfp(ini_file)
    # ... do something with cfg
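To make the fallback behaviour concrete, here is a minimal standard-library sketch of the same idea, written for Python 3 (where the module is named configparser and read_file replaces readfp). The helper safe_open_text is hypothetical and only imitates what safeopen is described as doing above; it is not PyFilesystem code.

```python
import io
import os
import tempfile
from configparser import ConfigParser


def safe_open_text(path):
    """Hypothetical stand-in for 'safeopen': open 'path' for reading, or
    return an empty file-like object if it cannot be opened."""
    try:
        return open(path, "r", encoding="utf-8")
    except OSError:
        return io.StringIO("")


def load_config(path):
    """Read an INI file; a missing file simply yields an empty config."""
    cfg = ConfigParser()
    with safe_open_text(path) as ini_file:
        cfg.read_file(ini_file)
    return cfg


# A path that does not exist produces a config with no sections.
missing = os.path.join(tempfile.gettempdir(), "no-such-dir-for-demo", "app.ini")
print(load_config(missing).sections())
```

The calling code never needs a special case for a first run; it just sees a configuration with nothing in it and can fall back to defaults.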
The files in our virtual filesystem don't even have to reside on the local filesystem. For instance, /live/ may actually reference a location on the web, where the version of the current release and a short ‘message of the day’ are stored.
Here's how the version number and MOTD might be read:
def get_application_version():
    """Get the version number of the most up to date version of the
    application, as a tuple of three integers"""
    with app_fs.safeopen('live/version.txt') as version_file:
        version_text = version_file.read().rstrip()
    if not version_text:
        # Empty file or unable to read
        return None
    return tuple(int(v) for v in version_text.split('.', 3))


def get_motd():
    """Get a welcome message"""
    with app_fs.safeopen("live/motd.txt") as motd_file:
        return motd_file.read().rstrip()
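One way the version tuple might be used is to decide whether to prompt for an upgrade. The sketch below is an assumption about usage, not part of the original application: parse_version repeats the parsing logic above on a plain string so it can be tried without a filesystem, and update_available and the version numbers are hypothetical.

```python
def parse_version(version_text):
    """Turn text such as '1.4.2' into (1, 4, 2); return None if empty."""
    version_text = version_text.strip()
    if not version_text:
        return None
    return tuple(int(part) for part in version_text.split("."))


def update_available(installed, latest):
    """True if 'latest' is known and newer than 'installed'.

    Tuples compare element by element, so (1, 10, 0) > (1, 9, 3).
    """
    return latest is not None and latest > installed


installed_version = (1, 9, 3)  # hypothetical version of the running app
print(update_available(installed_version, parse_version("1.10.0\n")))
print(update_available(installed_version, parse_version("")))
```

Comparing tuples of integers rather than the raw strings avoids the classic mistake of ranking "1.10.0" below "1.9.3".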
You'll notice that even though the actual data is retrieved over HTTP, the code would be no different if the files were stored locally.
So how is all this behaviour created from a single line of code? The line fsopendir("mount://fs.ini", create_dir=True) opens a MountFS from the information contained within an INI file (create_dir=True will create specified directories if they don't exist). Here's an example of an INI file that could be used during development:
[fs]
user=./user
resources=./resources
logs=./logs
cache=./user/cache
live=./live
The INI file is used to construct a MountFS, where the keys in the [fs] section are the top level directory names and the values are the real locations of the files. In the above example, /user/ maps on to a directory called user relative to the current directory, but it could be changed to an absolute path or to a location on a server (e.g. FTP, SFTP, HTTP, DAV), or even to a directory within a zip file.
You can change the section used by a mount opener by specifying it after a # symbol, e.g. mount://fs.ini#mysection
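The mapping itself is easy to picture. The following standard-library sketch (Python 3) reads the [fs] section and translates a virtual path into the configured location by simple prefix substitution. It is only an illustration of the idea under that assumption: load_mounts and resolve are hypothetical helpers, and a real MountFS does considerably more, such as opening the right kind of filesystem for each value.

```python
from configparser import ConfigParser

DEV_INI = """
[fs]
user=./user
resources=./resources
logs=./logs
cache=./user/cache
live=./live
"""


def load_mounts(ini_text, section="fs"):
    """Return {top-level directory name: real location} from INI text."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    return dict(parser[section])


def resolve(mounts, virtual_path):
    """Map a virtual path such as '/user/app.ini' onto its real location."""
    top, _, rest = virtual_path.strip("/").partition("/")
    if top not in mounts:
        raise KeyError("no mount point for %r" % virtual_path)
    base = mounts[top].rstrip("/")
    return base + "/" + rest if rest else base


mounts = load_mounts(DEV_INI)
print(resolve(mounts, "/user/app.ini"))
print(resolve(mounts, "/resources/logo.png"))
```

Feeding load_mounts a different INI text, for example one where user points at an absolute path, changes what resolve returns without touching the calling code.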
There are a few changes to this INI file we will need to make when our application is ready for release. User data, site data, logs and cache all have canonical locations that are derived from the name of the application (and the author on Windows). PyFilesystem contains handy openers for these special locations. For example, appuser://examplesoft:myapp detects the appropriate per-user data location for an application called “myapp” developed by “examplesoft”. The same goes for the other application directories, e.g.:
[fs]
user=appuser://examplesoft:myapp
resources=appsite://examplesoft:myapp
logs=applog://examplesoft:myapp
cache=appcache://examplesoft:myapp
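For a rough idea of what such an opener has to work out, the sketch below computes a conventional per-user data directory for each platform. The rules shown are common platform conventions and an assumption on my part, not PyFilesystem's actual implementation, and user_data_dir with all of its parameters is a hypothetical helper. The platform, home directory and environment are passed in so the function can be tried on any machine.

```python
import ntpath
import posixpath


def user_data_dir(appname, author, platform, home, env=None):
    """Hypothetical helper: conventional per-user data directory.

    platform uses sys.platform-style values ('win32', 'darwin', other).
    """
    env = env or {}
    if platform == "win32":
        # Windows conventionally nests the application under its author.
        base = env.get("LOCALAPPDATA") or ntpath.join(home, "AppData", "Local")
        return ntpath.join(base, author, appname)
    if platform == "darwin":
        return posixpath.join(home, "Library", "Application Support", appname)
    # Linux and other Unix-likes: XDG base directory convention.
    base = env.get("XDG_DATA_HOME") or posixpath.join(home, ".local", "share")
    return posixpath.join(base, appname)


for platform, home in [
    ("win32", r"C:\Users\will"),
    ("darwin", "/Users/will"),
    ("linux", "/home/will"),
]:
    print(platform, user_data_dir("myapp", "examplesoft", platform, home))
```

Getting these rules right for every platform is exactly the kind of detail that is better left to an opener than repeated in application code.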
The /live/ path is different in that it needs to point to a web server:
live=http://www.willmcgugan.com/static/cfg/
Of course, you don't need to use the canonical locations. For instance, let's say you want to store all your static resources in a zip file. No problem:
resources=zip://./resources.zip
Or you want to keep your user data on an SFTP (Secure FTP) server:
user=sftp://username:password@example.org/home/will/
Perhaps you don't want to preserve the cache across sessions, for security reasons. The temp opener creates files in a temp directory and deletes them on close:
cache=temp://
Although, if you are really paranoid you can store the cache files in memory without ever writing them to disk:
cache=mem://
Setting /user/ to mem:// is a useful way of simulating a fresh install when debugging.
I hope that covers why you might need (or at least want) a virtual file system in your application. I've glossed over some of the details and other features of PyFilesystem. If you would like more information, see my previous posts, check out the documentation or join the PyFilesystem discussion group.
It would be nice to see a null/loopback file system, such as this one for fusepy:
http://code.google.com/p/fusepy/source/browse/trunk/loopback.py
Maybe it's in the docs somewhere, but I did not see it at least.
Not sure what a loopback file system would do in this context; the MountFS has that capability. A null filesystem might be useful though…
Indeed, I am looking for a way to create a file system that would allow browsing the attachments of zotero collections in the Finder (or Explorer).
Ideally, this would consist of creating a read-only virtual system to which I would add links to the relevant files.
I am completely new to VFS programming (although I am a convinced user)… So I have several questions:
- what kind of file system would be more appropriate? FUSE?…
- is it possible to create a read-only VFS where only links to files are specified, or do I have to store files in a temp location? If yes, a minimal example would be of great help!
- how to mount a VFS in the finder/explorer?
Thank you in advance for your help, Thomas.
This helped me to understand that FUSE is not a file system ;) and how I can use it to expose a FS in the finder/explorer…
Nonetheless, the second question stands open in my mind:
would you eventually share your code? I am a beginner in VFS and try to make the same kind of thing with my blog.
The VFS has a really huge potential.
Cheers
Nice write-up! For completeness it might be worth mentioning the idea of using environmental variables to switch the location(s) used by your file system between dev/debug & live/production.