Stock Performer
Duplicate Image Detection with Go
By Oliver Rivo
Stock Performer
Analytics for the Stock Industry
"Stock" ≠ "Capital Stock"
Stock Photos
(and videos, audio, illustrations, WordPress templates...)
The Microstock Ecosystem
Non-exclusivity
- Many photographers upload to lots of agencies.
- One image sold in many different places.
- Different ID, different keywords, different titles.
- Watermarks, colour correction.
- No collaboration, agencies compete against each other.
Go to the rescue!
- Runs as an internal HTTP server
- Low-level image analysis
- Communicates with PHP
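A minimal sketch of what such an internal service could look like, using only the standard library. The `/matches` route, the `id` parameter, the JSON field names and the canned result are all assumptions made for illustration; the slides do not show the real interface. The handler is exercised in-process so the example terminates; a real service would instead call `http.ListenAndServe` on a loopback address.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// match is a hypothetical response payload; the field names are assumptions.
type match struct {
	ID    string  `json:"id"`
	Score float64 `json:"score"`
}

// findMatches stands in for the real similarity lookup, which this sketch
// does not implement. It returns one fixed result for any non-empty ID.
func findMatches(imageID string) []match {
	if imageID == "" {
		return nil
	}
	return []match{{ID: "agency-b/12345", Score: 0.97}}
}

// handleMatches answers GET /matches?id=... with a JSON array of matches.
func handleMatches(w http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("id")
	if id == "" {
		http.Error(w, "missing id parameter", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(findMatches(id)); err != nil {
		log.Printf("encoding response: %v", err)
	}
}

// call runs the handler in-process, without opening a socket, and returns
// the status code and the response body.
func call(target string) (int, string) {
	rec := httptest.NewRecorder()
	handleMatches(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := call("/matches?id=myimage")
	fmt.Print(code, " ", body)
}
```

Because the response is plain JSON over HTTP, a PHP application can consume it with any HTTP client and `json_decode`, without sharing code or memory with the Go process.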
Image Similarity Search
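package main

import (
	"image"
	"sort"

	"github.com/rivo/duplo"
)

func main() {
	// Placeholders: decode real images (e.g. with image.Decode) before hashing.
	var img, query image.Image
	store := duplo.New()              // Create an empty store.
	hash, _ := duplo.CreateHash(img)  // Hash the image to be indexed.
	store.Add("myimage", hash)        // Index it under an ID of your choice.
	hash, _ = duplo.CreateHash(query) // Hash the image to look up.
	matches := store.Query(hash)
	sort.Sort(matches) // Order the candidates so the best one comes first.
	// matches[0] is the best match.
}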
Visual hashes
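To illustrate the general idea of a visual hash, here is a minimal sketch of an "average hash" built with the standard library only. This is not the algorithm duplo uses; it is a deliberately simpler technique, swapped in to show why a hash derived from coarse brightness structure can survive changes such as colour correction. The `gradient` test images and the 8x8 grid size are assumptions made for the example.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"math/bits"
)

// averageHash reduces the image to an 8x8 grid of mean brightness values and
// sets one bit per cell: 1 if the cell is brighter than the overall mean.
func averageHash(img image.Image) uint64 {
	b := img.Bounds()
	var sum, n [64]float64
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			cell := (y-b.Min.Y)*8/b.Dy()*8 + (x-b.Min.X)*8/b.Dx()
			sum[cell] += float64(color.GrayModel.Convert(img.At(x, y)).(color.Gray).Y)
			n[cell]++
		}
	}
	var mean float64
	for i := range sum {
		if n[i] > 0 {
			sum[i] /= n[i] // sum[i] now holds the cell's mean brightness.
		}
		mean += sum[i] / 64
	}
	var hash uint64
	for i, v := range sum {
		if v > mean {
			hash |= 1 << uint(i)
		}
	}
	return hash
}

// distance is the Hamming distance: the number of bits that differ.
func distance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// gradient builds a 64x64 synthetic test image: a horizontal or vertical
// brightness ramp, uniformly brightened by offset.
func gradient(horizontal bool, offset int) *image.Gray {
	const size = 64
	img := image.NewGray(image.Rect(0, 0, size, size))
	for y := 0; y < size; y++ {
		for x := 0; x < size; x++ {
			pos := y
			if horizontal {
				pos = x
			}
			img.SetGray(x, y, color.Gray{Y: uint8(pos*200/size + offset)})
		}
	}
	return img
}

func main() {
	original := averageHash(gradient(true, 0))
	brightened := averageHash(gradient(true, 20)) // Same picture, brightened.
	other := averageHash(gradient(false, 0))      // A different picture.
	fmt.Println("brightened copy:", distance(original, brightened))
	fmt.Println("different image:", distance(original, other))
}
```

A small distance suggests the same underlying picture, a large one a different picture. Where to draw the line between the two is a tuning decision that this sketch leaves open.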
Effectiveness
Works very well for photographs.
May fail for illustrations with minor differences.
Users can make corrections manually.
Libraries used
- nfnt/resize: Image resizing
- rcrowley/goagain + braintree/manners: graceful HTTP server shutdown / restart / upgrade
- BurntSushi/toml: Configuration via TOML file
- go-sql-driver/mysql: MySQL driver
- labix.org/mgo: MongoDB driver
Conclusion
Go is so much fun!
- Concise, to the point
- Focus on the problem
- Make fewer mistakes:
- Very consistent
- Simple yet expressive
- Fast compilation
- Static typing
- Unit testing built-in
- I never missed generics
- Caveat: roll-out more complex