Content-Based Image Retrieval
by Using Color and Texture Information
by
Jozsef Vass
3/16/98
Table of Contents
Outline
- Motivation
- Relation to Previous Talks
- Related Work
- Image Retrieval by Color and Spatial Information
- Image Retrieval by Texture
- Video Retrieval by Objects
- Conclusions
Motivation
- Several systems exist to index and search textual information
- Large amount of visual data available on the WWW
- Indexing visual information is very difficult; even textual
indexing often gives unsatisfactory results
- Visual data is very voluminous
- MPEG-7: Efficiently and effectively find visual information
- Local features are needed
Relation to Previous Talks
- The Virage Video Engine - Dr. Palaniappan
- Defines an open framework
- Allows global features
- Video Query: Beyond Keywords - Dr. Joshi
- Indexing is subjective and more sophisticated than textual indexing
- Video Retrieval by Example Video Clip - Dr. Shi
- Retrieval by example
- Consider spatial information
- Signatures are extracted automatically, little or no meaning to users
- Not very user friendly
Related Work
- Query by Image Content (QBIC):
- Supports query by color, texture, and shape
- No spatial information
- Virage
- Query by only global features
Image Retrieval by Color and Spatial Information
- Developed at Columbia University
- J.R. Smith and S.-F. Chang, ``VisualSEEk: A fully automated content-based
image query system,'' ACM Multimedia, 1996
- Main building blocks
- Automatic extraction of salient color regions (image segmentation)
- Color sets (color map) and color matching
- Spatial description
- Region query
- Spatial relation
- Image is decomposed into regions. Features (metadata):
- Color
- Shape (size, extent, and minimal bounding box)
- Relative location to other regions
- Retrieval: Images are compared by comparing regions
Color Set
- HSV color space is used
- Color cube -> Color cylinder

- Quantization: 18 hues, 3 saturations, and 3 values
- Color similarity: proximity in the cylindrical color space (A)
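The quantization above can be sketched as follows. This is a minimal illustration, assuming uniform bin boundaries and RGB input scaled to [0, 1]; the function names `quantize_hsv` and `color_set` are hypothetical and the bin layout is not taken from the VisualSEEk paper.

```python
import colorsys

# Quantization levels quoted on the slide: 18 * 3 * 3 = 162 bins.
N_HUES, N_SATS, N_VALS = 18, 3, 3

def quantize_hsv(r, g, b):
    """Map an RGB triple in [0, 1] to a bin index in [0, 162).

    Assumption: bins are uniform in each HSV component.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component lies in [0, 1]
    hi = min(int(h * N_HUES), N_HUES - 1)   # clamp so 1.0 falls in the last bin
    si = min(int(s * N_SATS), N_SATS - 1)
    vi = min(int(v * N_VALS), N_VALS - 1)
    return (hi * N_SATS + si) * N_VALS + vi

def color_set(pixels):
    """Binary color set of a region: the set of bin indices that occur."""
    return {quantize_hsv(*p) for p in pixels}
```

A region's color set then records only which bins are present, not how many pixels fall in each.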
Color Matching
- Bin-comparison
- QBIC system: histogram quadratic distance measure
$d_{hist} = (h_q - h_t)^T A (h_q - h_t)$
- $A = [a_{i,j}]$: color similarity matrix
- $h_q$: Query color histogram
- $h_t$: Target color histogram
- Color set distance
$d_{set} = (c_q - c_t)^T A (c_q - c_t)$
- $c_q$: Query color set
- $c_t$: Target color set
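Both distances share the quadratic form (q - t)^T A (q - t). A minimal pure-Python sketch follows; the 3-bin similarity matrix and the example vectors are made up for illustration and are not values from QBIC or VisualSEEk.

```python
def quadratic_distance(q, t, A):
    """Return (q - t)^T A (q - t) for equal-length vectors q, t and square matrix A."""
    d = [qi - ti for qi, ti in zip(q, t)]
    n = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))

# Toy similarity matrix (assumption): neighboring bins are half as similar.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]

h_q = [1.0, 0.0, 0.0]
h_t_near = [0.0, 1.0, 0.0]  # mass moved to a similar bin
h_t_far = [0.0, 0.0, 1.0]   # mass moved to a dissimilar bin
```

With this toy matrix the distance to `h_t_near` is smaller than the distance to `h_t_far`, which is the point of weighting bin differences by color similarity instead of comparing bins independently.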
Spatial Description
- Region absolute location:
- Euclidean distance of centroids
- User specified tolerance
- Minimal bounding rectangle
- Region size:
- Area
- Spatial extent (horizontal and vertical size)
Region Query
- Weighted sum of
- Color set
- Location
- Area
- Spatial extent
- User specified weights
- Multiple objects: Intersection of single region query results
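The region query can be sketched as a weighted sum of per-feature distances, with a multi-region query taken as the intersection of single-region results. Everything named here (`Region`, `region_distance`, the threshold, and the symmetric-difference stand-in for the color set distance) is an assumption for illustration, not the VisualSEEk implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Region:
    color_bins: frozenset  # quantized color set of the region
    centroid: tuple        # (x, y)
    area: float
    extent: tuple          # (width, height)

def region_distance(q, t, w_color=1.0, w_loc=1.0, w_area=1.0, w_extent=1.0):
    """Weighted sum of color, location, area, and spatial extent distances."""
    d_color = len(q.color_bins ^ t.color_bins)  # stand-in for the color set distance
    d_loc = math.dist(q.centroid, t.centroid)   # Euclidean distance of centroids
    d_area = abs(q.area - t.area)
    d_extent = math.dist(q.extent, t.extent)
    return w_color * d_color + w_loc * d_loc + w_area * d_area + w_extent * d_extent

def single_region_query(q, database, threshold, **weights):
    """Ids of images that contain at least one region within `threshold` of q."""
    return {img_id for img_id, regions in database.items()
            if any(region_distance(q, r, **weights) <= threshold for r in regions)}

def multi_region_query(query_regions, database, threshold, **weights):
    """Intersect the single-region results, one per query region."""
    results = [single_region_query(q, database, threshold, **weights)
               for q in query_regions]
    return set.intersection(*results) if results else set()
```

Here `database` is assumed to be a dict mapping an image id to its list of regions; a real system would use an index rather than a linear scan.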
Spatial Relation
- Most expensive step, so it is applied only in the final stage of the query
- Represented by 2-D strings
t1 < t4 < t3 < t2 < t5, t4 < t5 < t3 < t2 < t1
"<" left-right or bottom-top relationship
- Relations:
- Adjacency
- Nearness
- Overlap
- Surround
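A 2-D string can be sketched by ordering region labels along each axis. The centroid coordinates below are invented so that the result reproduces the example string on this slide; ties (which 2-D strings denote with "=") are ignored, and the subsequence check is a simplified stand-in for the full 2-D string matching rules.

```python
def two_d_string(centroids):
    """centroids: dict label -> (x, y), with y increasing from bottom to top.

    Returns (u, v): labels ordered left-to-right and bottom-to-top.
    """
    u = sorted(centroids, key=lambda k: centroids[k][0])
    v = sorted(centroids, key=lambda k: centroids[k][1])
    return u, v

def format_2d_string(u, v):
    return " < ".join(u) + ", " + " < ".join(v)

def is_subsequence(query, target):
    """True if the labels of `query` occur in `target` in the same order."""
    it = iter(target)
    return all(label in it for label in query)

# Hypothetical centroids chosen to match the slide's example.
centroids = {"t1": (0, 4), "t2": (3, 3), "t3": (2, 2), "t4": (1, 0), "t5": (4, 1)}
u, v = two_d_string(centroids)
```

Under this simplification, a query arrangement matches a target image when its orderings along both axes are subsequences of the target's orderings.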
Image Retrieval by Texture
- Developed at the University of Washington
- C.E. Jacobs, A. Finkelstein, and D.H. Salesin, ``Fast multiresolution
image querying,'' ACM SIGGRAPH, 1995
- Wavelet-based approach - Multiresolution
- Compare most significant wavelet coefficients
- Main building blocks:
- Wavelet decomposition
- Distortion metric
- Query
Wavelet Decomposition
- Haar wavelet (sum and difference)
- Computationally inexpensive
- Full decomposition, image size 128x128, 7 scales, size at
coarsest scale is 1x1 -> overall average intensity
- Truncate wavelet coefficients: keep only the m coefficients with the largest
magnitude; typically m < 100
- Quantization: POS or NEG
- Same decomposition is carried out for each channel (YIQ)
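The decomposition, truncation, and quantization steps can be sketched for a single channel as follows. This is an illustration under stated assumptions: averages and differences are halved (so the coefficient at (0, 0) is the overall average), rows are fully transformed before columns, and the image is a square list of lists whose side is a power of two. The paper's normalization may differ, and the name `signature` is hypothetical.

```python
def haar_1d(values):
    """Full 1-D Haar decomposition (pairwise average and difference)."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff  # averages first, then detail coefficients
        n = half
    return out

def haar_2d(image):
    """Transform every row, then every column of the row-transformed result."""
    rows = [haar_1d(r) for r in image]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def signature(image, m):
    """Return (average, {(i, j): +1 or -1}) for the m largest-magnitude coefficients."""
    coeffs = haar_2d(image)
    average = coeffs[0][0]
    detail = [((i, j), c) for i, row in enumerate(coeffs)
              for j, c in enumerate(row) if (i, j) != (0, 0) and c != 0]
    detail.sort(key=lambda item: abs(item[1]), reverse=True)
    # Quantize to sign only: POS -> +1, NEG -> -1.
    return average, {pos: (1 if c > 0 else -1) for pos, c in detail[:m]}
```

For a color image the same routine would be run once per channel, giving three such signatures.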
Distortion Metric
- Metadata
- m quantized wavelet coefficients
- Metric
$\|Q, T\| = w_{0,0}\,|Q(0,0) - T(0,0)| + \sum_{i,j} w_{i,j}\,|Q_b(i,j) - T_b(i,j)|$
where
- $Q$ - Wavelet transformed query image
- $T$ - Wavelet transformed target image
- $Q_b$ - Wavelet transformed and quantized query image (bilevel)
- $T_b$ - Wavelet transformed and quantized target image (bilevel)
- $w_{i,j}$ - Weight for coefficient (i,j)
- Weights are determined for each scale
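A minimal sketch of this metric, assuming each image is stored as a pair: its average coefficient and a dict mapping the positions of its retained coefficients to +1 or -1 (absent positions count as 0). The per-scale weight function `scale_weight` is a made-up placeholder, not the weights used by the authors.

```python
def scale_weight(i, j):
    """Hypothetical per-scale weight: coefficients at finer scales count less."""
    scale = max(i, j).bit_length()  # 1 for index 1, 2 for 2-3, 3 for 4-7, ...
    return 1.0 / scale

def distortion(q_sig, t_sig, w0=1.0, weight=scale_weight):
    """w0 * |Q(0,0) - T(0,0)| + sum over (i, j) of w(i, j) * |Qb(i, j) - Tb(i, j)|."""
    q_avg, q_b = q_sig
    t_avg, t_b = t_sig
    score = w0 * abs(q_avg - t_avg)
    # Only positions retained by at least one image can contribute.
    for pos in set(q_b) | set(t_b):
        score += weight(*pos) * abs(q_b.get(pos, 0) - t_b.get(pos, 0))
    return score

query = (2.0, {(0, 1): 1, (1, 0): 1})
target = (1.0, {(0, 1): -1})
```

Because only the few retained coefficients are visited, the cost of one comparison depends on m and not on the image size.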
Image Query
- User Interface
- Simple drawing tool
- Query by example image
- Calculate the features of the query image
- Display several top ranked candidates
Evaluation
- Fast query
- algorithm
- No multiresolution query
- Wavelets are translation variant - same problem as with video coding
- Square image constraint
- Since (1) only a few coefficients are used and (2) wavelet coefficient
magnitudes statistically decay toward finer scales, only coarse-scale
coefficients are retained, so edge information is limited
- Not very meaningful to humans
- Extension to video?
Video Query by Motion
- ``Example Video Clip'' gives very limited freedom to the user
- Object-based approach would be highly desirable
- MPEG-4: each object is described by shape, texture and motion
- How to incorporate MPEG-4 Audio-Visual Object (AVO) with indexing
- Graphical interface: user can sketch objects and motion
Conclusions
- Two content-based retrieval algorithms were presented
- Most difficult - Video retrieval
- Integration of color and texture
CECS Multimedia Communications and Visualization Laboratory