Stock Performer

Duplicate Image Detection with Go

By Oliver Rivo

Oliver Rivo (in a nutshell)

15+ years in the industry

M.Sc. CompSci (SFU, Canada)

Fur project

CGI / Computer Animation

  • C/C++
  • OpenGL

Medical Computing
Human-Computer Interaction

  • Java (in 1998)

Lufthansa Systems

Revenue Management

  • C++
  • Java
  • Oracle
  • PM / SCRUM

Full-Time Musician

Signed to Sony

  • Songwriting
  • Recording
  • Touring

Freelancer

Consulting / coding

  • Lufthansa Systems
  • Arvato Bertelsmann
  • EUMlab
  • ...

Stock Performer

Co-Founder

  • JavaScript, D3.js
  • PHP, CakePHP
  • MongoDB, MySQL
  • and of course Go

Stock Performer

Analytics for the Stock Industry

"Stock" ≠ "Capital Stock"

Stock Photos

(and videos, audio, illustrations, WordPress templates...)

Stock Agencies

The Microstock Ecosystem

Diagram

Analytics for the Stock Industry

[Screenshots: sales chart, top sellers, breakdown charts]

  • Revenue overview (drill down, drill up)
  • Top sellers
  • Collection analytics (per shoot, per theme, per keyword)
  • Sales breakdowns
  • ...and much more...

Non-exclusivity

  • Many photographers upload to lots of agencies.
  • One image sold in many different places.
  • Different ID, different keywords, different titles.
  • Watermarks, colour correction.
  • No collaboration, agencies compete against each other.

One image, multiple points of sale

[Images: five thumbnails]

Photographers need to know overall revenue.
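
Once the copies of an image are linked, overall revenue is a plain sum across agencies. A minimal sketch of that idea, assuming a hypothetical `sale` record and a hypothetical `canonical` mapping from agency-specific IDs to one shared key (neither comes from Stock Performer's code, and the figures are made up):

```go
package main

import "fmt"

// sale is a hypothetical record: revenue one agency reports for one of its image IDs.
type sale struct {
	Agency  string
	ImageID string
	Cents   int // revenue in cents, to avoid floating-point rounding
}

// totalPerImage sums revenue per image. canonical maps "agency/imageID" to a
// shared key for images known to be duplicates; unmapped images keep their own key.
func totalPerImage(sales []sale, canonical map[string]string) map[string]int {
	totals := make(map[string]int)
	for _, s := range sales {
		key := s.Agency + "/" + s.ImageID
		if c, ok := canonical[key]; ok {
			key = c // merge into the group of known duplicates
		}
		totals[key] += s.Cents
	}
	return totals
}

func main() {
	// Made-up sample data: the same sunset photo sold at agencies A and B.
	sales := []sale{{"A", "123", 250}, {"B", "x9", 100}, {"C", "77", 40}}
	canonical := map[string]string{"A/123": "sunset", "B/x9": "sunset"}
	fmt.Println(totalPerImage(sales, canonical))
}
```

The `canonical` mapping is the part that duplicate detection has to supply, since agencies share neither IDs nor metadata.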

Go to the rescue!

  • Runs as an internal HTTP server
  • Low-level image analysis
  • Communicates with PHP
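
The setup above could look roughly like the following. This is a sketch only, not Stock Performer's actual service: the `/match` route, the JSON field names, the `lookup` helper, the in-memory `known` map and the port are all hypothetical. It shows one way a Go process can answer JSON requests from a PHP application on the same machine.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// matchResponse is a hypothetical JSON payload returned to the PHP side.
type matchResponse struct {
	ImageID    string   `json:"image_id"`
	Duplicates []string `json:"duplicates"`
}

// known stands in for the real image store: image ID -> IDs of its duplicates.
var known = map[string][]string{
	"agencyA-123": {"agencyB-987", "agencyC-555"},
}

// lookup returns the duplicates recorded for an image ID, if any.
func lookup(id string) (matchResponse, bool) {
	dups, ok := known[id]
	if !ok {
		return matchResponse{}, false
	}
	return matchResponse{ImageID: id, Duplicates: dups}, true
}

// handleMatch answers GET /match?id=... with JSON, or 404 for unknown IDs.
func handleMatch(w http.ResponseWriter, r *http.Request) {
	resp, ok := lookup(r.URL.Query().Get("id"))
	if !ok {
		http.Error(w, "unknown image", http.StatusNotFound)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(resp); err != nil {
		log.Printf("encode response: %v", err)
	}
}

// newMux wires up the routes of the internal service.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/match", handleMatch)
	return mux
}

func main() {
	// Exercise the handler in-process so the example terminates.
	// A real deployment would instead run
	//   log.Fatal(http.ListenAndServe("127.0.0.1:8081", newMux()))
	// binding to localhost so only the local PHP app can reach it.
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest("GET", "/match?id=agencyA-123", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```

On the PHP side, any HTTP client plus `json_decode` is enough to consume such a response, which keeps the two codebases loosely coupled.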

Image Similarity Search


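package main

import (
  "image"
  "sort"

  "github.com/rivo/duplo"
)

// img and query must be decoded before use, e.g. with image.Decode.
var img, query image.Image

func main() {
  // Create an empty store.
  store := duplo.New()

  // Hash an image and add it to the store under an ID.
  hash, _ := duplo.CreateHash(img)
  store.Add("myimage", hash)

  // Hash the query image and look up similar images.
  hash, _ = duplo.CreateHash(query)
  matches := store.Query(hash)
  sort.Sort(matches)
  // matches[0] is the best match.
}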

Visual hashes
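
To give a feel for what a visual hash is, here is a sketch of the simple "average hash" technique. This is not duplo's algorithm, which the slides do not spell out; it is a simpler stand-in chosen for illustration. The `gradient` helper only generates hypothetical test images.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"math/bits"
)

// averageHash splits the image into an 8x8 grid, computes the mean brightness
// of each cell and sets one bit per cell that is brighter than the overall mean.
func averageHash(img image.Image) uint64 {
	b := img.Bounds()
	var cells [64]float64
	var counts [64]int
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			cx := (x - b.Min.X) * 8 / b.Dx()
			cy := (y - b.Min.Y) * 8 / b.Dy()
			g := color.GrayModel.Convert(img.At(x, y)).(color.Gray)
			cells[cy*8+cx] += float64(g.Y)
			counts[cy*8+cx]++
		}
	}
	var mean float64
	for i := range cells {
		if counts[i] > 0 {
			cells[i] /= float64(counts[i])
		}
		mean += cells[i] / 64
	}
	var hash uint64
	for i, v := range cells {
		if v > mean {
			hash |= 1 << uint(i)
		}
	}
	return hash
}

// distance is the Hamming distance: the number of bits in which two hashes differ.
func distance(a, b uint64) int { return bits.OnesCount64(a ^ b) }

// gradient builds a test image: a left-to-right brightness ramp plus a constant
// offset, which mimics a uniform brightness correction.
func gradient(w, h, offset int) *image.Gray {
	img := image.NewGray(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.SetGray(x, y, color.Gray{Y: uint8(x*200/w + offset)})
		}
	}
	return img
}

func main() {
	original := averageHash(gradient(64, 64, 0))
	brightened := averageHash(gradient(64, 64, 20))
	fmt.Printf("%016x %016x distance=%d\n", original, brightened, distance(original, brightened))
}
```

Because each cell is compared against the image's own mean, a uniform brightness shift leaves the hash unchanged. Small Hamming distances then indicate likely duplicates, while watermarks or crops typically flip a few bits.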

Effectiveness

Works very well for photographs.

May fail for illustrations with minor differences.

Failed matches

Users can make corrections manually.

Libraries used

  • nfnt/resize: Image resizing
  • rcrowley/goagain + braintree/manners: graceful HTTP server shutdown / restart / upgrade
  • BurntSushi/toml: Configuration via TOML file
  • go-sql-driver/mysql: MySQL driver
  • labix.org/mgo: MongoDB driver

Conclusion

Go is so much fun!

  • Concise, to the point
  • Focus on the problem
  • Make fewer mistakes:
    • Very consistent
    • Simple yet expressive
    • Fast compilation
    • Static typing
    • Unit testing built-in
  • I never missed generics
  • Caveat: rollout is more complex

Thank you

Oliver Rivo
http://www.rentafounder.com
https://www.stockperformer.com
https://github.com/rivo