My system to archive images (photos of people)
keywords: gauche scheme, postgresql, imagemagick, gtk+
download:
From my download area, get:
- gauche-mmc
A collection of my modules, which my larger (evidently not stand-alone) packages depend on.
- gauche-imagemagick
ImageMagick: get a recent version. gauche-imagemagick contains a hacky module for reference counting (with reservations) in Scheme.
(Gauche uses a conservative GC and images are huge, so I tried to track the references explicitly.)
- gauche-pg
- gauche-string
Some C functions.
- gauche-foto
The system itself.
summary: I wrote this system to keep photos both on the file system (FS) and in a database (DB).
The aim is to put as much information as possible into a fast DB (using triggers, for example) and make the DB invoke operations on the FS part,
while at the same time being able to access the photos as files categorized in a tree of directories, with flexible file names.
Synchronization between the two is optimized: expensive hashes are computed only when cheaper checks cannot tell files apart.
Why I don't trust putting the photos themselves in the DB
- How would I access them? gnome-vfs? A Linux user-space file system?
- What if something goes wrong? I would still back up to plain files.
Techniques I adopted:
- MD5 hash to recognize files
- image hash to recognize images: in case I end up with the same image in 2 different image formats
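A minimal Python sketch of the first technique (the system itself is written in Gauche Scheme; this is not its code). The function name `file_md5` and the chunk size are assumptions made for illustration. It hashes the raw bytes of a file, so two files with identical contents get the same digest regardless of their names:

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file's bytes.

    The file is read in chunks so that a huge image never has to fit
    in memory at once. `chunk_size` is an arbitrary choice.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A file hash changes as soon as the file is re-encoded, which is why a separate hash over the decoded image is listed as the second technique.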
main components
- md5 server: computing MD5 over NFS is inefficient, because the whole file has to cross the network. The NFS server can cheaply compute the MD5 itself and send us the result.
- postgresql DB
- using imagemagick
How to synchronize 2 sets of objects (a set of files and a set of records/tuples in the DB)
Assumption: we have 2 sets A and B, which once had a bijection between them. In fact, 2 bijections were/are maintained: by contents
(file contents vs. one part of the tuple) and by name (filename vs. another part of the tuple).
Problem to solve:
We have added to A, removed from A, and permuted A (its filenames). Now apply the same modifications to B.
First, build a mapping between the 2 sets (by contents).
This can be done lazily (think of topology): you want to distinguish between objects in set A only as much as
necessary. An image dimension lookup (in a file) costs less than computing the file's MD5 hash. Also, if we keep the image hash in the DB (as well as the MD5) and a file has the
corresponding MD5, it also has that image hash, so the image hash need not be recomputed.
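The lazy idea can be sketched in Python as successive refinement: partition the objects by the cheapest key first, and apply a costlier key only to groups that still contain more than one member. This is an illustration under assumptions, not the author's Scheme code; the name `refine` and the shape of the key functions are hypothetical.

```python
from collections import defaultdict


def refine(items, keys):
    """Partition `items` using key functions ordered from cheap to costly.

    A group that already holds a single item is left alone, so the
    costlier keys are never computed for it.
    """
    groups = [list(items)]
    for key in keys:
        refined = []
        for group in groups:
            if len(group) < 2:
                refined.append(group)  # already distinguished, skip this key
                continue
            buckets = defaultdict(list)
            for item in group:
                buckets[key(item)].append(item)
            refined.extend(buckets.values())
        groups = refined
    return groups
```

With keys such as `[image_dimensions, file_md5]` (both hypothetical helpers), the MD5 would be computed only for files whose dimensions collide with those of another file.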
Once we have a mapping (files -> tuples, or vice versa) and some (canonical) inverse, their composition gives us a permutation of an extended set.
The extended set is the union of set A and the image of the canonical mapping from B into the "domain" of set A (filenames).
To do the final movement, we decompose the permutation into cycles, and the cycles into transpositions (with the help of a temporary place).
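A minimal Python sketch of the final movement, assuming the permutation is given as a dict from current name to desired name. It is not the author's code, and it swaps explicit transpositions for a simpler equivalent: each cycle is broken by parking one member in a temporary name, after which the rest of the cycle is an ordinary chain of renames. The name `plan_moves` and the temporary name `__tmp__` are hypothetical; the mapping is assumed to be injective and the temporary name unused.

```python
def plan_moves(target, temp="__tmp__"):
    """Return (src, dst) renames that are safe to execute in order.

    `target` maps each current name to its desired name. No rename in
    the returned list overwrites a name that is still occupied.
    """
    # Names that already sit in the right place need no move.
    pending = {src: dst for src, dst in target.items() if src != dst}
    moves = []
    while pending:
        # A move is ready when its destination is not waiting to be moved.
        ready = [src for src, dst in pending.items() if dst not in pending]
        if not ready:
            # Only cycles remain: park one member in the temporary place,
            # which turns the cycle into a chain ending at `temp`.
            src, dst = next(iter(pending.items()))
            del pending[src]
            moves.append((src, temp))
            pending[temp] = dst
            continue
        for src in ready:
            moves.append((src, pending.pop(src)))
    return moves
```

For a swap such as `{"a": "b", "b": "a"}` this yields three renames: `a` to the temporary name, `b` to `a`, and the temporary name to `b`.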
Keeping backups
Once again, we construct mappings, then decompose into transpositions, additions & deletions.
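As an illustration of the mapping step for backups, here is a hedged Python sketch that compares two name -> content-hash indexes and splits the difference into renames, additions and deletions. The function name `diff_by_content` and the index format are assumptions, and content hashes are assumed to be unique within each index (no duplicate files).

```python
def diff_by_content(source, backup):
    """Compare two {name: content_hash} indexes.

    Returns (renames, additions, deletions):
      renames   - {backup_name: source_name} for contents present in both
                  under different names
      additions - source names whose contents are missing from the backup
      deletions - backup names whose contents are gone from the source
    """
    src_by_hash = {h: name for name, h in source.items()}
    bak_by_hash = {h: name for name, h in backup.items()}
    renames = {
        bak_by_hash[h]: src_by_hash[h]
        for h in bak_by_hash
        if h in src_by_hash and bak_by_hash[h] != src_by_hash[h]
    }
    additions = sorted(src_by_hash[h] for h in src_by_hash if h not in bak_by_hash)
    deletions = sorted(bak_by_hash[h] for h in bak_by_hash if h not in src_by_hash)
    return renames, additions, deletions
```

The renames form a permutation of names and still have to be ordered so that nothing is overwritten; doing the deletions first and the additions last keeps the names involved free.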
---
implementation details:
FS layout
{root}/ ..../{person-number}/{category}/{number}{file name}.{mime extension}
DB layout
what we keep about each file:
- stat mtime, size
- md5
- image stat & hash
commands:
common options
For walking the FS tree, we have this set of arguments:
-r root
to limit the person-id to an interval, specify minimum and/or maximum:
--from -f {number}
--to -t {number}
or, in terms of thousands:
-F {thousand}
-T {thousand}
For executing a query:
-q "select ....;"
examples:
programs:
- refresh_thumbnails.scm
Checks whether 'derived' files need an update. Derived files are the various types of thumbnails.
- sync.scm
Synchronizes FS & DB for a given person.
- import.scm
Imports new files into the FS+DB archive. This is done either from an external tree (similar to the one described in
FS layout), from suitably named files, or from any file when we specify the coordinates as command options.
http://www.graphicsmagick.org/
>> Open Source Initiative and is compatible with the GPL. GraphicsMagick is originally
derived from ImageMagick 5.5.2. Since the branch from ImageMagick, many improvements
have been made (see news) by many authors using an open development model.
The RAW Flaw
http://www.luminous-landscape.com/essays/raw-flaw.shtml
Nikon D70 under Linux
How to compare (non-identical) images:
usage: compare 2 videos .avi and .mpeg
making a fast image (viewer) browser see image-viewers