Résumé in HTML.
Note that this page is a detailed description of the work I've done. My resume
is more of an executive summary.
Electronics For Imaging
From 4/04 to present I've been at Electronics For Imaging as a Staff
Software Engineer in the Variable Data Printing Group.
- I'm currently developing a desktop color application using wxWidgets, designed to interact with cloud storage for remote monitoring of color quality.
- Worked with a team to provide inline spectrophotometer support in our product, to work with hardware from various vendors.
- I was in charge of system integration of the next-generation RIP engine, the PDF-based RIP from Adobe.
- PPML development to support standards changes for PODi compliance, making EFI the first to implement the standard.
- Extended multi-server print engine support for PPML printing at speeds of 2000 pages per minute. Additionally added batch printing for PDF jobs in this high-end system.
- Improved performance of the PPML driver and brought it into alignment with the PODi standard.
- Developed high-speed analysis of PPML jobs as a prescan tool.
I developed page filtering software that enables a high-end scanner and printer combination to copy pages
as a copier would.
Using arbitrary scan and print directions, I reorient the scanned data
to match the scanner and printer pair used, and perform a series of
pre-print operations including: NxM (repeat an image on a page, like a
business card), N-up (take several scanned images and put them all on
a single page), posters (enlarge a photo to the point where it fills
several printed pages, which can be arranged in a grid to appear as
if a larger page was printed), booklets, and other similar layouts.
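To make the placement math concrete, here is a minimal sketch of an N-up/NxM layout computation (my illustration, not the production code; the center-in-cell policy and uniform scaling are assumptions):

    def nup_layout(page_w, page_h, img_w, img_h, cols, rows):
        """Compute a scale factor and per-cell origins for placing images
        on a cols x rows grid of one output page (N-up).  For NxM the
        same image is simply repeated in every cell."""
        cell_w = page_w / cols
        cell_h = page_h / rows
        # Uniform scale so an image fits within one cell, preserving aspect.
        scale = min(cell_w / img_w, cell_h / img_h)
        placements = []
        for r in range(rows):
            for c in range(cols):
                # Center the scaled image inside its cell.
                x = c * cell_w + (cell_w - img_w * scale) / 2
                y = r * cell_h + (cell_h - img_h * scale) / 2
                placements.append((x, y, scale))
        return placements

Posters invert the idea: the scale factor exceeds 1 and the grid tiles the enlarged image across several output pages.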
Panorama Image Stitching
For ImageDock, I've developed a tool to combine several photos into a single image, automatically determining:
skew angle, lens distortion correction, distance warping, photo layout (both 1D and 2D sequences), stitch position,
light adjustment, and smooth blending.
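The blending step can be illustrated with a simple linear feather across the overlap (a toy sketch; the fixed horizontal layout and ramp-based weighting here are my simplifications, not the tool's actual blend):

    import numpy as np

    def feather_blend(left, right, overlap):
        """Blend two horizontally adjacent images whose last/first
        `overlap` columns cover the same scene region."""
        h, wl = left.shape[:2]
        wr = right.shape[1]
        out = np.zeros((h, wl + wr - overlap) + left.shape[2:])
        out[:, :wl - overlap] = left[:, :wl - overlap]
        out[:, wl:] = right[:, overlap:]
        # Weight ramps from 1 (pure left) to 0 (pure right) across the seam.
        w = np.linspace(1.0, 0.0, overlap)
        if left.ndim == 3:
            w = w[:, None]  # broadcast the ramp over color channels
        out[:, wl - overlap:wl] = w * left[:, wl - overlap:] + (1 - w) * right[:, :overlap]
        return out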
Web Administration and Design
For over four years I've maintained a web server at home, which now hosts twelve domains for myself, family and friends.
I've written various tools and components in Perl, PHP, and even tinkered a bit with Python.
Administration tends to become a time-consuming hobby, especially when it comes to dealing with spam.
I've extended several tools that together nearly eliminate the spam delivered.
Xerox & Scansoft
From 10/95 to 12/02 I worked for Xerox and
Scansoft (at the time a subsidiary of Xerox),
developing image processing libraries to manipulate electronic
documents (TIFF, TIFF-FX, PDF, etc).
I worked with several teams to help bring two technologies forward:
- Mixed Raster Content (MRC) - an image document structure that complements compression. The structure represents various parts of a page with a series of layer pairs. Each pair consists of a high-contrast component, which is bi-level, and a low-contrast component, which contains the color imagery. Typically color images aren't separated into these two layers, but for documents, separating text into high-resolution binary and its associated color into low-resolution contone makes the entire document compress much better (a toy sketch of this split appears after this list).
- JBIG2 - symbol compression has become a standard in the past few years, and has recently been supported by Adobe in the Acrobat reader (version 5), with some minor caveats. Xerox has played an active role in this standardization process. I have been one of a few people who have helped bring this technology into Xerox products.
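To make the MRC idea concrete, here is a minimal sketch of the two-layer split (my simplification: a fixed global threshold and plain 4x4 averaging stand in for a real segmenter):

    import numpy as np

    def mrc_split(gray, rgb, thresh=128):
        """Toy MRC decomposition: a bi-level mask selects high-contrast
        (text) pixels at full resolution; the remaining color goes into
        a low-resolution contone layer.  Each layer then compresses with
        a codec suited to it (e.g. the mask with JBIG2, the contone with
        JPEG).  A real segmenter would also fill masked text areas in
        the contone layer before downsampling."""
        mask = gray < thresh  # bi-level, full resolution
        # Downsample the color 4x in each direction for the contone layer.
        h, w = gray.shape
        contone = rgb[:h - h % 4, :w - w % 4] \
            .reshape(h // 4, 4, w // 4, 4, 3).mean(axis=(1, 3))
        return mask, contone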
In support of these two technologies, I have worked on several components and libraries to make it all happen:
- Image compression:
- Hierarchical Vector Quantization - implemented an HVQ codec, and designed HVQ codebook components. I found some innovative tricks that can be played with HVQ, namely reordering the codevectors so that intermediate compressed images are themselves viewable. I describe HVQ on this page.
- JBIG2 - I've made several enhancements to JBIG2, and have incorporated the multi-page compression aspect of JBIG2 into three image file formats:
- XIFF - a proprietary format, see below.
- TIFF-FX - I coauthored the proposed first extension to TIFF-FX, to include MRC and components necessary for JBIG2.
- PDF - using Thomas Merz's PDFLIB I was able to extend his libraries to read and write JBIG2.
- JPEG - some optimization for JPEG compression and decompression.
- Adaptive Huffman - some G4 improvements
- Colorspace transformations - as used for ITULAB/CIELab/ICCLab, I implemented sRGB to L*a*b* conversions, and optimized them for use in TIFF-FX, with and without JPEG. I maintain an online conversion form on the web.
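For reference, the sRGB to L*a*b* conversion looks like this (a minimal sketch; the D65 white point is my assumption here, since ITULAB encodings can specify other illuminants, and the production versions replaced the power functions with lookup tables):

    def srgb_to_lab(r, g, b):
        """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white point)."""
        def linearize(c):
            c /= 255.0  # undo the sRGB transfer curve
            return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        r, g, b = linearize(r), linearize(g), linearize(b)
        # Linear RGB -> CIE XYZ (sRGB primaries).
        x = 0.4124 * r + 0.3576 * g + 0.1805 * b
        y = 0.2126 * r + 0.7152 * g + 0.0722 * b
        z = 0.0193 * r + 0.1192 * g + 0.9505 * b
        # XYZ -> L*a*b*, with the standard linear segment near black.
        def f(t):
            return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
        fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
        return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)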
For the first three years, I worked with the image library team at Scansoft West in Palo
Alto, developing various components for the flagship products:
- XIFF - a proprietary format used in Pagis. Although I inherited it, I owned this format for most of its life. This format is basically TIFF with extensions to allow a thread of information at a document level, including multi-page components used for a JBIG2 precursor.
- Automated image enhancement - a robust algorithm that dynamically determines adjustments necessary
to correct color balance, contrast, and tonal response.
- Page rotation - automated arbitrary angle page rotation such that text is preserved.
- Photo detection - automated detection of a "mostly" rectangular photo, using a watershed algorithm for edge detection; the detected photo is automatically straightened.
- Automated source code documentation - Perl-based extraction of formatted comments around key functions in all the source code, prettied up for web browsing.
- Port for Windows CE - ported some library components for Windows CE, used in a product now owned by Microsoft for tablet PCs.
- Table layout - improved cell detection and layout for automated table detection algorithm.
For the past four years, I helped bring MRC and JBIG2 forward into standards, and implemented libraries and components
to make these two technologies available across the entire corporation.
- Individual components were shared on an internal copy of the sourceforge open-source system.
- These and other components were put together to build a document conversion pipeline, which optimally compresses page components and shares the result among different file destinations of a variety of formats, including: PDF, TIFF, and TIFF-FX.
Thoughtform
Thoughtform Corporation develops software for translating multi-channel brainwave
patterns into mind states or thought identifications. The key to this process
is a new technology developed by Thoughtform, which allows computers
to process brainwaves into more usable information. I've been helping out with some of the software development
since the company's inception.
Caere
From 1/92 to 10/95 I was an optical character recognition (OCR) expert for Caere Corporation (now owned by Nuance).
My contributions include:
- Character/word error measures - the calculation of character errors, word errors, and correction difficulty
(a sketch of the core distance computation appears after this list).
When I saw that these kinds of automated calculations were missing from the tool set, I took the initiative
to generate them myself. After implementing a solution, I found that the errors reported by UNLV in their annual OCR
contest were inaccurate, and helped them identify the problem.
These calculations have guided the company's OCR improvement decisions ever since.
- Automated nightly runs and error checks - since the error measures proved so important, I created and maintained a nightly OCR regression test and error comparison, to make sure minor changes didn't have a global effect on OCR accuracy.
- Word-based character recognition - co-implemented a word-based OCR algorithm to leverage trigrams and dictionary words to aid the OCR process, scored with neural net confidences.
- Neural net integration - fine tuned the neural net activation levels to optimize OCR accuracy.
- X-window OCR tracking - took the initiative to develop an X-window-based tool to track and show the OCR decisions made on any image fragment. This tool made it dramatically easier to isolate OCR problems; much like debugging with a debugger instead of a bunch of print statements.
- Automatic page orientation - implemented an automated page orientation algorithm, to run and make a decision without backtracking.
- X-window/Motif editor - modified a Motif image editing tool to add zooming.
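The heart of the character-error measure is an edit distance between the OCR output and the ground truth. A minimal sketch (my illustration, not Caere's code; word error rates apply the same recurrence to token lists):

    def char_errors(ocr, truth):
        """Levenshtein distance: the minimum number of insertions,
        deletions, and substitutions turning `ocr` into `truth`,
        i.e. the character error count."""
        prev = list(range(len(truth) + 1))
        for i, c in enumerate(ocr, 1):
            cur = [i]
            for j, t in enumerate(truth, 1):
                cur.append(min(prev[j] + 1,              # delete c
                               cur[j - 1] + 1,           # insert t
                               prev[j - 1] + (c != t)))  # substitute c -> t
            prev = cur
        return prev[-1]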
Unisys (later Loral, Lockheed, and now L-3)
From 8/85 to 1/92 I worked in the Advanced Technology group at what started as Sperry Corporation
and became Unisys Corporation. Later my division was spun out as a subsidiary of Unisys named Paramax,
which was bought by Loral, then bought again by Lockheed, and is now owned by a
company called L-3 Communications.
Throughout, it has been the Communications System Division, and specifically
the satellite communications division.
The Advanced Technology group I worked in dealt with data compression, and we focused on
a type of image compression known as vector quantization (VQ). We experimented
with many different algorithms applied to VQ, including:
- Tree search methods:
- Full-search - no tree structure, search all possible matches
- Two-level trees - search M nodes at the first level, then N nodes at the second level
- Uniform binary - compare between two nodes, tree has N levels (all paths)
- Non-uniform binary - compare between two nodes, with an unbalanced tree that terminates where the design process dictates
- Codebook design possibilities:
- L1 vs. L2 vs. L3 norms - error distance calculated as abs(err), err * err, or abs(err)^3
- Fixed point versus floating-point - restrict the design process to target integer vectors, or even vectors lying on a more coarse grid
- Split methods - determining the direction to first split a multi-dimensional space into N regions:
- Small arbitrary perturbation - the published method of alternately adding then subtracting a small value to each vector component
- Weighted pass through the training data - allow each training vector to influence the matched codebook vector
- Furthest outliers - find the training vectors that appear to reside the furthest from the centroid
- Large training data sets - I made my training process work with in-memory data sets (if enough memory existed), but also implemented on-disk shuffling of the training data, grouping the data to match the codevectors. On one project (think hardware available in the 1989 time frame) I filled one entire 1G hard drive with a single training file, which got shuffled around 64k at a time.
- Restarting - considering computer reliability at the time, many codebook design processes took days or weeks to complete. On the Vax available, it was typical for a reboot to occur every couple of days, so the design process had to be restartable.
- Complementary algorithms:
- Mean-removal - separately encode the mean, then VQ the vector with the [decoded] mean removed
- Predictive - estimate/predict the pixel values from previously decoded pixels, then VQ the prediction error (note this is a very promising method)
- Classified VQ - also very promising, this entails classifying each vector according to the amount of variation across the vector (flat, horizontal/vertical/diagonal gradient, or mixed/other), then separately VQing each category, assigning more bits to the more detailed areas. This method trades a fixed bit rate for a constrained error, making the rate variable (like JPEG's). It worked great on a project to encode bank checks.
- Binary trees using a "dot-product threshold" - this project was innovative and fun: instead of computing the inner product for each of the two vectors at a branch of the binary tree, a single inner product is compared to a threshold, reducing the math required. Said another way, to determine which side of an imaginary multi-dimensional plane the input vector lies on, project the input vector onto the line connecting the two target vectors, then compare against the point where the plane intersects that line (see the sketch after this list). Ask me about this one if you get the chance.
- Pyramidal VQ - break the image into successively smaller vectors. For example, break the entire image into a 4x4 grid, averaging each grid cell, and encode the 4x4 as a vector. Then, removing the decoded average, break each grid cell into another 4x4 grid, and repeat the process. Continue until each grid cell is a single pixel.
- Lattice VQ - starting from a full-search codebook, break the vector space into a grid, and for each grid cell determine which codebook vectors could qualify as a match for inputs in that cell. When encoding, quickly determine which cell an input vector belongs to (say, using the upper N bits of each vector component), then search only the codevectors known to lie in or near that cell. This method guarantees the same distortion as full search, but greatly reduces the computation!
- Human visual model/domain - working with Dr. Thomas Stockham, a pioneer in work with the human visual system, we did a fair amount of work investigating ways to attempt to minimize the error as seen by the human observer. In brief, the human visual system consists of various mathematical models to represent light as it passes through the cornea, is converted to electrical stimuli at the rear of the eye, is passed through nerves to the brain, and then is interpreted by the brain. The models include light saturation (railing), high-pass filtering, a power function, and low-pass filtering.
Stockham did some later studies indicating that another power function may be required in the mathematical model, which became the project for my senior thesis, but I found that little or no gain was to be had from the added function.
- Hierarchical VQ - a fast encode method that is done completely using LUTs. I have a page describing HVQ. Note that I worked on this later at Xerox.
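Since HVQ appears twice on this page, a minimal sketch of the lookup-table encode may help (the table representation and power-of-two vector length are my assumptions; designing the table contents is the real work):

    def hvq_encode(pixels, tables):
        """Hierarchical VQ encode: no arithmetic at all, just repeated
        table lookups.  Stage k's table maps a pair of stage k-1 indices
        to one index, so 2^k pixels reduce to a single codebook index.
        `tables[k]` is a 2-D array built offline during codebook design."""
        indices = list(pixels)  # stage 0: the raw samples themselves
        for table in tables:
            # Merge adjacent index pairs through this stage's table.
            indices = [table[a][b] for a, b in zip(indices[0::2], indices[1::2])]
        return indices  # one index per 2**len(tables)-pixel vector

The codevector-reordering trick mentioned in the Xerox section amounts to choosing the index assignment so that a stage-k index is itself a displayable approximation of its 2^k pixels.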
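The dot-product threshold idea can also be written down directly. For codevectors c1 and c2, the test |x - c1|^2 < |x - c2|^2 expands to x . (c2 - c1) < (|c2|^2 - |c1|^2) / 2, so each branch needs one precomputed direction d = c2 - c1 and one precomputed threshold t. A sketch under my own conventions (negative child values marking leaves are an assumption for illustration):

    def vq_encode(x, nodes):
        """Walk a binary VQ tree using dot-product thresholds: one inner
        product and one compare per branch, instead of two full distance
        computations.  nodes[i] = (d, t, left, right), where d = c2 - c1
        and t = (|c2|^2 - |c1|^2) / 2; negative child values are leaves."""
        i = 0  # start at the root
        while i >= 0:
            d, t, left, right = nodes[i]
            dot = sum(xi * di for xi, di in zip(x, d))
            i = left if dot < t else right  # dot < t  =>  x is nearer c1
        return -i - 1  # recover the leaf's codebook index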
I was involved in the first three of four real-time video compression hardware projects:
- Two-level VQ at 15 frames/sec - interfaced to the hardware via VME back-plane interfaces to a Sun workstation. Controlled the hardware registers and loaded dynamic data (codebook vectors).
- Non-uniform binary VQ at 30 frames/sec - similar interface to faster/better hardware. This project was used to encode reconnaissance data, and sit on the front of a visually guided missile. This system took up about 10 VME breadboards.
- Non-uniform binary VQ at 60 frames/sec - using Xilinx FPGAs and custom chip designs, this system sat on a single breadboard. The system was nearly complete when I moved on.
- Non-uniform binary VQ using dot-product threshold at N frames/sec - this project occurred after I left.
I worked on many other small projects:
- Array processor implementations - for VQ codebook design and error computation.
- Huffman and Adaptive Huffman - coded up Huffman algorithms
- Arithmetic Coding - coded up a binary AC algorithm with predictive modeling, based on the literature.
- Lempel-Ziv Welch - coded up windowing variations to LZW, and hashing speed-ups.
- Encryption - coded up some encryption algorithms for data redundancy (for noisy or jammed signals).
I held a confidential security clearance.
I have a B.S. in Electrical Engineering from the University of Utah, with an emphasis in signal processing, stochastic processes, and computer design. My Senior Thesis was titled Extension to the Human Visual Model.
I have worked with: Windows, Linux, UNIX, C, C++, CSH, Apple Basic, Perl, PHP, Python, FORTRAN, assembly, etc.
Badminton, volleyball, puzzles, home construction projects.